Hortonworks Sandbox Forum

importing data with file browser times out

  • #27603
    alex Gordon
    Member

    I am trying to upload a 400 MB Google Ngram zip file into the Sandbox using the file browser.

    I’m getting this error:

    The following error(s) occurred:
    timed out

    Is there a limit on the size we can import? What am I doing wrong?


  • #27604
    alex Gordon
    Member

    I apologize, my description of the problem is inaccurate.

    I actually was able to upload 100% of the zip file.

    However, right after the upload finishes, it gets stuck on the unzip step:
    Uploading to: /user/hue
    The file will then be extracted in the path specified above.

    At this point it just errors out.

    Are we not allowed to import large zip files?

  • #27648
    tedr
    Moderator

    Hi Alex,

    As far as I know, there isn’t a limit on the size of the files you upload, only on the space available on the disk. If the file is 400 MB when zipped, how much space is it going to take when unzipped? On a new Sandbox there is just under 40 GB of space in HDFS.

    Thanks,
    Ted.
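
One way to answer the question of how much space the archive needs once unzipped is to read the sizes recorded in the zip's own directory before uploading. The following is a minimal sketch, not something from the thread: the function names are made up for illustration, and the free-space figure you pass in is whatever your Sandbox reports (the "just under 40 GB" above applies to a new Sandbox only).

```python
import zipfile


def uncompressed_size(zip_path):
    """Return the total uncompressed size, in bytes, of all members of a zip archive.

    The sizes come from the archive's central directory, so nothing is extracted.
    """
    with zipfile.ZipFile(zip_path) as zf:
        return sum(info.file_size for info in zf.infolist())


def fits_in(zip_path, free_bytes):
    """Return True if the extracted contents would fit in free_bytes of space.

    free_bytes is an assumption supplied by the caller, for example the free
    HDFS space reported by the Sandbox. Replication and the space taken by the
    uploaded zip itself are not accounted for here.
    """
    return uncompressed_size(zip_path) <= free_bytes
```

For example, calling `fits_in("ngrams.zip", 40 * 1024**3)` (with a hypothetical file name) checks the archive against 40 GiB. Text data such as n-gram counts often compresses heavily, so the extracted size can be many times the 400 MB of the zip.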
