HDP on Windows – Other Forum

TeraGen fails with java.io.EOFException

  • #49283
    L Vadhula
    Participant

    When trying to generate 250 GB of data, I consistently get the following exception on at least one of the nodes. I am not sure how to determine the root cause; can someone please guide me?

    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for blk_-3477522246679853993_1094 java.io.EOFException: while trying to read 65557 bytes
    <… repeats 3 more times>
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_7963516741816314694_1094 0 : Thread is interrupted.
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-6799809211252707875_1094 0 : Thread is interrupted.
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-3477522246679853993_1094 0 : Thread is interrupted.
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for blk_-6799809211252707875_1094 terminating
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for blk_7963516741816314694_1094 terminating
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_3931672859420894473_1094 0 : Thread is interrupted.
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-6799809211252707875_1094 received exception java.io.EOFException: while trying to read 65557 bytes
    2014-02-25 12:37:34,157 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for blk_-3477522246679853993_1094 terminating
    2014-02-25 12:37:34,157 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.216.58.163:50010, storageID=DS-256604440-10.216.58.163-50010-1393359808324, infoPort=50075, ipcPort=8010):DataXceiver
    java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:296)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:340)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:404)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:582)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:404)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)

    TIA,
    VS
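
    For context, TeraGen writes 100-byte rows, so a 250 GB run corresponds to roughly 2.5 billion rows. A minimal invocation sketch, assuming the stock examples jar that ships with HDP (the jar name and output path below are illustrative and vary by release):

        hadoop jar hadoop-examples.jar teragen 2500000000 /user/vs/teragen-250gb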


  • #49423
    Robert Molina
    Moderator

    Hi VS,
    Can you post your hdfs-site.xml configuration for max transfer threads? I am assuming you are using HDP 2.0. That value may need to be upped.

    Regards,
    Robert

    #49445
    L Vadhula
    Participant

    Thanks Robert. I am using HDP For Windows 1.2.0.1.3.0.0.
    Here is the hdfs-site.xml file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!-- Put site-specific property overrides in this file. -->
    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <value>D:\HDP_Data\hdfs\nn</value>
        <description>Determines where on the local filesystem the DFS name node
          should store the name table. If this is a comma-delimited list
          of directories then the name table is replicated in all of the
          directories, for redundancy.</description>
        <final>true</final>
      </property>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
        <description>to enable webhdfs</description>
        <final>true</final>
      </property>
      <property>
        <name>heartbeat.recheck.interval</name>
        <value>1</value>
        <description>None</description>
        <final>true</final>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>D:\HDP_Data\hdfs\dn</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Default block replication.
          The actual number of replications can be specified when the file is created.
          The default is used if replication is not specified in create time.
        </description>
      </property>
      <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:50010</value>
      </property>
      <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:50075</value>
      </property>
      <property>
        <name>dfs.http.address</name>
        <value>MstrNode:50070</value>
        <description>The name of the default file system. Either the
          literal string "local" or a host:port for NDFS.</description>
        <final>true</final>
      </property>
      <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:8010</value>
        <description>The datanode ipc server address and port.
          If the port is 0 then the server will start on a free port.
        </description>
      </property>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
        <description>
          If "true", enable permission checking in HDFS.
          If "false", permission checking is turned off,
          but all other behavior is unchanged.
          Switching from one parameter value to the other does not change the mode,
          owner or group of files or directories.
        </description>
      </property>
      <property>
        <name>dfs.secondary.http.address</name>
        <value>MstrNode:50090</value>
        <description>Address of secondary namenode web server</description>
      </property>
      <property>
        <name>dfs.secondary.https.port</name>
        <value>50490</value>
        <description>The https port where secondary-namenode binds</description>
      </property>
      <property>
        <name>dfs.https.address</name>
        <value>MstrNode:50470</value>
        <description>The https address where namenode binds</description>
      </property>
    </configuration>

    #52211
    Robert Molina
    Moderator

    Hi VS,
    It looks like the property is not there. Is that the full xml file you posted?

    Regards,
    Robert Molina
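
    Since the property is absent from the file above, the DataNode falls back to its compiled-in default (256 in stock Hadoop 1.x). When a heavy TeraGen write load exhausts that pool, the overloaded node rejects new writers, the write pipeline tears down, and peer DataNodes log EOFExceptions mid-packet much like the trace above. The usual next step would be adding the dfs.datanode.max.xcievers override sketched earlier to hdfs-site.xml on every DataNode and restarting the DataNode service.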

