Flume and Mahout

This topic contains 16 replies, has 4 voices, and was last updated by Vinh Ton 6 months, 2 weeks ago.

    #29124

    Hi

    I installed the Hortonworks Sandbox a couple of days ago and am wondering if anyone can guide me through installing Flume and Mahout.

Viewing 16 replies - 1 through 16 (of 16 total)

    #39798

    Vinh Ton
    Member

    Sorry, just found this in my spam. I see in your email signature who I can pass on the good word to. Thanks!

    #39657

    Dave
    Moderator

    Hi Vinh,

    Sure, drop me your mail address and I can send you a mail which you can reply to.

    OK, when you upload a file in the HUE File Browser, it goes into HDFS (not the local filesystem as seen by the HUE bash shell).
    To see it in the HUE bash shell you can run:

    su - hdfs
    hadoop dfs -ls /user/vinh

    Let me know if this helps

    Dave
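
A minimal sketch of the point above, assuming the upload landed in /user/vinh in HDFS as in Dave's command. The file name data.csv, the local target /tmp, and the helper name hdfs_to_local are hypothetical; hadoop fs -ls and hadoop fs -get are standard Hadoop shell commands. The script only calls Hadoop when the hadoop command is on the PATH.

```shell
#!/bin/sh
# Hypothetical values: adjust to the file you actually uploaded.
HDFS_DIR="/user/vinh"     # HDFS directory from the reply above
FILE_NAME="data.csv"      # assumed file name, not taken from the thread
LOCAL_DIR="/tmp"          # assumed local target directory

# List an HDFS directory, then copy one file from it to the local filesystem.
hdfs_to_local() {
  hadoop fs -ls "$1" && hadoop fs -get "$1/$2" "$3/"
}

if command -v hadoop >/dev/null 2>&1; then
  hdfs_to_local "$HDFS_DIR" "$FILE_NAME" "$LOCAL_DIR"
else
  echo "hadoop command not found; run this inside the sandbox" >&2
fi
```

Once the copy succeeds, a plain ls of the local target directory should show the file, because it then lives on the Linux filesystem as well as in HDFS.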

    #39656

    Vinh Ton
    Member

    Is there somebody I can inform of your great support, Dave?
    Ok, started ambari and set my $HADOOP_HOME properly. Thanks!

    Now I’m trying to get the data file into the Linux environment. So far I am not able to see it in the Hue shell. I would appreciate your help if possible.
    Here are the steps I took:
    1) Using the “File Browser”, I uploaded the data file to /usr/vinh
    2) Using the Hue Bash shell,
    2a) I go to /usr/vinh
    2b) I run ls and it shows nothing. I even ran ls -a and still cannot see the file. I checked permissions too, and everybody should be able to read it.

    What did I miss? It seems like the /usr/vinh in the job browser is mapped differently than what shows in the Hue shell.

    #39623

    Dave
    Moderator

    Hi Vinh,

    If you run: ambari-start

    Then you can go to 127.0.0.1:8080 (user: admin password: admin)
    You can start the services here.

    HADOOP_HOME should point to /usr/lib/hadoop

    Let me know if you have any other questions.

    Thanks

    Dave
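
A small sketch of setting and checking the variable, assuming the /usr/lib/hadoop path from the reply above applies to your sandbox version. It only sets the variable for the current shell session and reports whether the directory exists.

```shell
#!/bin/sh
# Path taken from the reply above; adjust if your install differs.
export HADOOP_HOME=/usr/lib/hadoop

if [ -d "$HADOOP_HOME" ]; then
  echo "HADOOP_HOME=$HADOOP_HOME exists"
  ls "$HADOOP_HOME"
else
  echo "HADOOP_HOME=$HADOOP_HOME not found on this machine"
fi
```

To keep the setting across logins, the export line could be added to the user's shell profile.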

    #39622

    Vinh Ton
    Member

    Thanks so much Dave, appreciate the links and am walking through it. I think the Sandbox has different configurations. If true, this is understandable.

    Where is $HADOOP_HOME supposed to point to?
    I’m on the HUE’s BASH cmd line and I’m in a hadoop dir but it doesn’t have a bin folder and therefore I am not able to run start-all.sh script like I would expect:

    [vinh@sandbox /]$ ls
    bin etc lib media proc selinux tmp var
    boot hadoop lib64 mnt root srv usr virtualization
    dev home lost+found opt sbin sys vagrant
    [vinh@sandbox /]$ cd hadoop
    [vinh@sandbox hadoop]$ ls
    hdfs mapred oozie zookeeper
    [vinh@sandbox hadoop]$

    #39617

    Dave
    Moderator

    Just an FYI:

    https://cwiki.apache.org/confluence/display/MAHOUT/Quickstart

    This looks pretty straightforward.

    #39616

    Dave
    Moderator

    Hi Vinh,

    I haven’t come across any tutorials for Mahout by Hortonworks, however if I do find some, I’ll be sure to let you know.

    Thanks

    Dave

    #39603

    Vinh Ton
    Member

    Thanks Dave! I got it to install; I had just about given up on that.

    I’m googling Mahout tutorials, but are there any specific ones Hortonworks recommends?
    I really enjoyed how simple it was to follow the ones built into the sandbox.

    #39599

    Dave
    Moderator

    Hi Vinh,

    Flume is available in the 2.0 Beta sandbox.

    There is no UI for Mahout but you can actually use it from the command line. You can either get a command line from HUE or log in via ssh/console.

    Once you do that you can run:
    yum install mahout

    Then you can use Mahout on the command line

    I hope this helps,

    Thanks

    Dave
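
A hedged sketch of checking whether the install worked. The helper name check_mahout is hypothetical; the only assumption about the package is that it puts a mahout command on the PATH, which is what "use Mahout on the command line" suggests. Note that yum install normally needs root.

```shell
#!/bin/sh
# Report whether a mahout command is available on the PATH.
check_mahout() {
  if command -v mahout >/dev/null 2>&1; then
    echo "mahout found at $(command -v mahout)"
  else
    echo "mahout not found; as root, run: yum install mahout"
    return 1
  fi
}

# Run the check; '|| true' keeps the script's exit status at 0 either way.
check_mahout || true
```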

    #39562

    Vinh Ton
    Member

    Bummer, thanks for the reply though

    #39561

    I actually could not figure anything out, as I am really new to this area. I am a graduate student and was trying my hand at learning something, but I have not had much help for some time, so I uninstalled the software a while ago :-(

    #39532

    Vinh Ton
    Member

    Did you figure out how to install Mahout on the sandbox, Ram?

    #29155

    tedr
    Moderator

    Hi Ram,

    That error usually means that the Sandbox is having trouble connecting to our git repository when it attempts to update the tutorials. Since there have been no recent updates to the tutorials since you downloaded the Sandbox, you can safely ignore this message. If you want to be able to get tutorial updates in the future, though, you will need to make sure that the Sandbox has connectivity to the internet.

    You can check whether this is the problem by logging in to the Sandbox and attempting to ping a site such as Google. If the ping is successful, then something else is preventing the connection to git. If the ping is unsuccessful, you will need to figure out why the Sandbox can’t connect to the internet. This is usually because the networking wasn’t configured correctly when you imported the Sandbox.

    Thanks,
    Ted.
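
The ping check described above could be sketched like this. The helper name check_net and the example targets are assumptions, not from the thread. Pinging an IP address as well as a hostname helps separate the two cases: if the address answers but the name does not, the problem is name resolution (DNS), which is what "could not resolve host" points to. The -W timeout flag is taken in seconds by the Linux ping.

```shell
#!/bin/sh
# Send a single ping with a 2-second timeout and report the result.
check_net() {
  if ping -c 1 -W 2 "$1" >/dev/null 2>&1; then
    echo "reachable: $1"
  else
    echo "unreachable: $1"
  fi
}

# Check every host given on the command line; with no arguments, do nothing.
for host in "$@"; do
  check_net "$host"
done
```

Saved under a name of your choice (check_net.sh here is only an example), it could be run as sh check_net.sh 8.8.8.8 google.com, where 8.8.8.8 is a public DNS server address used purely as a reachable-by-IP target.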

    #29144

    Thanks Ted for your note. I can see that in the sandbox commandline environment.

    The error says something like this: Errno 14 pycurl error 6 could not resolve host.

    I am not sure what it is. Any help on this would be awesome.

    #29143

    tedr
    Moderator

    Hi Ram,

    These are the correct usernames. When logging in through the browser, you are logging in to Hadoop/HUE, not the Linux system. When you log in to the sandbox on the command line or via ssh, you are going into the Linux system. Where does the error “could not resolve host” show?

    Thanks,
    Ted.

    #29127

    I can also see an error stating “could not resolve host”. Can anyone tell me how to get rid of this? When I log in through the browser, my user name shows up as hue, and my user name is root for logging in to the sandbox. Is that right?
