HDFS Forum

Web HDFS in Java

  • #45531
    K S
    Participant

    Hi all, is there a tutorial or an example showing how to use the WebHDFS client from Java to perform operations such as MKDIR and READ? I'd really appreciate the help.

  • #45564
    Robert Molina
    Moderator

    Hi KS,
    I found this: https://sites.google.com/site/hadoopandhive/home/hadoop-how-to-read-a-file-from-hdfs. The HDFS API docs for 1.2 are also here: http://hadoop.apache.org/docs/current1/api/

    Let me know if this helps.

    Regards,
    Robert

    #45565
    Robert Molina
    Moderator

    Hi KS,
    Sorry, my previous post covered direct Java API access rather than the web services API for HDFS. Here are some examples of the web calls:
    http://hortonworks.com/blog/webhdfs-%E2%80%93-http-rest-access-to-hdfs/

    Then you would just need a Java class that makes the HTTP calls and reads the responses.
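    A minimal sketch of such a class, using only the JDK's HttpURLConnection (Java 9+ for readAllBytes). The class and method names (WebHdfsRest, restUrl, readFile, mkdirs) are hypothetical, and the sketch assumes an unsecured cluster, a path that starts with "/" and needs no URL encoding, and a client that can reach both the NameNode and the DataNodes it redirects to. Some setups may also need a user.name query parameter, which is left out here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: wraps two WebHDFS REST calls.
class WebHdfsRest {

    // Builds http://<host>:<port>/webhdfs/v1<path>?op=<OP>.
    static String restUrl(String host, int httpPort, String hdfsPath, String op) {
        return "http://" + host + ":" + httpPort + "/webhdfs/v1" + hdfsPath + "?op=" + op;
    }

    // Reads a whole file into memory with op=OPEN (GET).
    // The NameNode answers with a redirect to a DataNode, which the connection follows.
    static String readFile(String host, int httpPort, String hdfsPath) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(restUrl(host, httpPort, hdfsPath, "OPEN")).toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    // Creates a directory with op=MKDIRS (PUT); returns true on HTTP 200.
    static boolean mkdirs(String host, int httpPort, String hdfsPath) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(restUrl(host, httpPort, hdfsPath, "MKDIRS")).toURL().openConnection();
        conn.setRequestMethod("PUT");
        try {
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder host and path; replace with values for your cluster.
        System.out.println(mkdirs("nameNode", 50070, "/user/john/newdir"));
        System.out.println(readFile("nameNode", 50070, "/user/john/abc.txt"));
    }
}
```

    Since readFile buffers the whole response, it only suits small files; for large ones you would stream from the InputStream instead.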

    Regards,
    Robert

    #45663
    Nicholas Sze
    Moderator

    For Java, the API for WebHDFS and HDFS is the same. We only have to change the URL scheme to webhdfs:// with the HTTP port, instead of hdfs:// with the RPC port. The example in https://sites.google.com/site/hadoopandhive/home/hadoop-how-to-read-a-file-from-hdfs uses HDFS. For WebHDFS, we only have to change the path URL as below.

    Path pt = new Path("webhdfs://npvm11.np.wc1.yellowpages.com:50070/user/john/abc.txt");
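    To make the scheme and port change concrete, here is a small sketch that rewrites an hdfs:// URI into its webhdfs:// form using only java.net.URI. The class name WebHdfsUri, the host name, and the RPC port 8020 are assumptions for illustration; check the actual ports in your cluster configuration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Hadoop.
class WebHdfsUri {

    // Swaps the scheme to webhdfs and the RPC port for the given HTTP port;
    // host and path are kept unchanged.
    static URI toWebHdfs(URI hdfsUri, int httpPort) {
        if (!"hdfs".equals(hdfsUri.getScheme())) {
            throw new IllegalArgumentException("expected an hdfs:// URI: " + hdfsUri);
        }
        try {
            return new URI("webhdfs", null, hdfsUri.getHost(), httpPort,
                    hdfsUri.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed values: RPC port 8020, HTTP port 50070.
        URI hdfs = URI.create("hdfs://namenode.example.com:8020/user/john/abc.txt");
        System.out.println(toWebHdfs(hdfs, 50070));
    }
}
```

    The resulting string can be passed to new Path(...) exactly as in the line above; the rest of the FileSystem code stays the same.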

    #45664
    K S
    Participant

    Hey guys,
    Thanks for the quick reply. Another question: would the following statement work for applications running outside the cluster?

    Path pt = new Path("http://nameNode:50070/webhdfs/v1/user/john/abc.txt");

    Or can I simply use a normal HTTP client in Java with the same URL (http://nameNode:50070/webhdfs/v1/user/john/abc.txt)?

    #46359
    Robert Molina
    Moderator

    Hi KS,
    When you refer to outside the cluster, do you mean a separate network or subnet? If that's the case, first make sure the client can reach the cluster over the network. As for your path, 50070 is the NameNode HTTP port (the one WebHDFS uses), not the RPC port used by hdfs:// URLs.

    Regards,
    Robert

    #46361
    K S
    Participant

    Hi Robert,
    Yes, I am referring to accessing the cluster from outside the network. Is it possible to use WebHDFS from a Java program? A related question: what is the maximum data size that WebHDFS can handle?
    Any examples would be a great help!

    #46668
    Robert Molina
    Moderator

    Hi KS,
    Yes, it is possible to use the HDFS Java APIs to access HDFS over WebHDFS, as described in Nicholas Sze's post. As for data size limits, I am not aware of any. Once you have it up and running, try uploading and retrieving the largest file you expect to handle and verify that it goes through.

    Regards,
    Robert
