Documentation Problems


This topic contains 0 replies, has 1 voice, and was last updated by  Mike McConnell 1 year ago.

  • Creator
  • #49725

    Mike McConnell

    Just an FYI,

    I’m doing some eval work with hadoop and noted some things in your documentation you might want to take a look at.

    The systems I’m working on are in my lab and are not accessible to or from the outside, so I attempted to use your reference document bk-reference-20140210.pdf. For my situation I need to deploy behind a firewall with only temporary access to the internet, so I’m setting up local repos. I’m currently using CentOS 6.5. This brings us to section Using table 4.6 I used wget to fetch the hdp.repo file for yum to use. The problem is that the arguments you recommend for the reposync commands that follow, specifically HDP and HDP-, do not match the repo names in the hdp.repo file I just downloaded, so the commands fail.
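    One way to check which names reposync will accept is to read the repository IDs straight out of the downloaded .repo file, since those IDs are the section headers that `reposync -r` (`--repoid`) matches against. The following is only a minimal sketch: the helper `repo_ids`, the sample file content, and the `example.invalid` URLs are made up for illustration, and a real hdp.repo will differ.

```python
import configparser


def repo_ids(repo_text):
    """Return the repository IDs (section names) found in yum .repo content."""
    # interpolation=None keeps '%' and '$releasever' style values from being
    # treated as configparser interpolation; strict=False tolerates duplicates.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(repo_text)
    return parser.sections()


# Hypothetical sample content, for illustration only.
SAMPLE = """\
[HDP-2.x]
name=Hortonworks Data Platform Version - HDP-2.x
baseurl=http://example.invalid/HDP/centos6/2.x/GA
enabled=1

[Updates-HDP-2.x]
name=HDP-2.x - Updates
baseurl=http://example.invalid/HDP/centos6/2.x/updates
enabled=1
"""

if __name__ == "__main__":
    # Print each ID on its own line; these are the values to pass to reposync -r.
    for rid in repo_ids(SAMPLE):
        print(rid)
```

    To use it against a real file, pass the file's text instead of the sample, for example `repo_ids(open("/etc/yum.repos.d/hdp.repo").read())`, assuming that is where the file was saved.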

    I poked around and found what I thought were the correct names (HDP-2.x and Updates-HDP-2.x) and tried wget on those, but I ended up with (from centos6/2.x/GA), so I gave up and just fetched the whole HDP- from the HDP/centos6/ directory. The utils directory seems to work from above and seems current, so I’ll run with it.

    The directory names used in the createrepo step reflect the reposync names. I haven’t gotten to the Ambari installation docs to see what names are referenced there, but you should probably check that they match too.

    I also noted that the Ambari repo has similar problems, and its repo files reflect a down rev, so I bypassed that too. I did try to use a label that matched the repo file but ultimately killed the process, as it was attempting to mirror atrpms (all 2600+), which is a little out of scope for my project at the moment. This is likely my error, but it might be helpful.

    I did search the forums and saw several threads that I suspect may be related to the hdp.repo file being out of sync with the current directory structure, which may be an underlying problem.

    Anyway, hope this is helpful and I’m not missing the obvious.


