Hive parquet problem


This topic contains 0 replies, has 1 voice, and was last updated by  Kamil Malachowski 5 months, 1 week ago.


    Kamil Malachowski

    Hi guys,
    I have a problem reading Hive tables stored in Parquet format. Queries fail with the following error:

    Caused by: java.io.IOException: can not read class parquet.format.PageHeader: null
    at parquet.format.Util.read(Util.java:50)
    at parquet.format.Util.readPageHeader(Util.java:26)
    at parquet.hadoop.ParquetFileReader$Chunk.readAllPages(ParquetFileReader.java:418)
    at parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:361)
    at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:100)
    at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:172)
    at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:130)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:95)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
    at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)

    The tables were created with Parquet 1.2.5 and copied with distcp to a Hortonworks 2.1 cluster running Hive 0.13 and, I guess, Parquet 1.3.5.
    I found that my issue may be related to https://github.com/Parquet/parquet-mr/pull/349

    Is there any quick workaround, e.g. some setting, that will resolve my problem?

    Best Regards
    Kamil
