HDP on Linux – Installation Forum

Disabling LZO compression

  • #34950

    I have an rpm installation of HDP 1.1.0 and for some reason I am not able to install the rpm packages on SUSE Linux Enterprise 11, Service Pack 1. I have read a few posts about a bad GPG key on the hadoop-lzo package but found no satisfactory answer. I tried all the options, but in vain; the package header seems to be corrupt.
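    For reference, the GPG workarounds I attempted looked like this (the package file name is from memory and may not match the exact HDP 1.1.0 rpm):

```shell
# Skip the GPG signature check when installing the rpm directly
# (package file name below is illustrative, not the exact HDP name):
rpm -ivh --nogpgcheck hadoop-lzo-0.5.0.x86_64.rpm

# Or tell zypper to ignore GPG checks for the whole transaction:
zypper --no-gpg-checks install hadoop-lzo
```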

    Anyway, I thought of disabling compression as an alternative. I modified core-site.xml:

        <property>
          <name>io.compression.codecs</name>
          <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
          <description>A list of the compression codec classes that can be used
          for compression/decompression.</description>
        </property>

        <property>
          <name>io.compression.codec.lzo.class</name>
          <value>com.hadoop.compression.lzo.LzoCodec</value>
          <description>The implementation for lzo codec.</description>
        </property>

    Also modified mapred-site.xml:

        <property>
          <name>mapred.map.output.compression.codec</name>
          <value>org.apache.hadoop.io.compress.DefaultCodec</value>
          <description>If the map outputs are compressed, how should they be
          compressed?</description>
        </property>

        <property>
          <name>mapred.output.compression.type</name>
          <value>RECORD</value>
          <description>If the job outputs are to compressed as SequenceFiles, how should
          they be compressed? Should be one of NONE, RECORD or BLOCK.</description>
        </property>

    I restarted the cluster, and when I try to run a simple wordcount example I get the following error:
    hadoop jar /usr/lib/hadoop/hadoop-examples.jar wordcount passws passws-out
    13/09/09 12:13:23 INFO input.FileInputFormat: Total input paths to process : 1
    13/09/09 12:13:23 INFO mapred.JobClient: Cleaning up the staging area hdfs://linux:8020/user/root/.staging/job_201309091212_0001
    java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
    at org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:46)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)

    Can someone please help me either install the hadoop-lzo rpm package or disable LZO compression? I don’t understand why it is still picking up the LZO codec even after I disabled it. Am I missing something?
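    One more thing I plan to check is whether some other copy of the client configuration still lists the LZO codec. Something like this (the conf dir path is the usual rpm-install location; yours may differ):

```shell
# Look for leftover LZO references in every config file the Hadoop client reads.
# /etc/hadoop/conf is the usual rpm-install location; override if yours differs.
CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
grep -Ri "lzo" "$CONF_DIR" 2>/dev/null || echo "no LZO references under $CONF_DIR"
```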

