Hortonworks Sandbox Forum

Stdout log capture

  • #52352
    Dominic Fox

    I have a WordCount MapReduce job which tries to write log messages to stdout during Map and Reduce execution, using both log4j's Logger and System.out.println() statements. Neither output is captured in the job history when I run the job against an unaltered HDP 2.1 Sandbox VM instance; in fact, the logs (viewed through the job history browser) always look like this:

    Log Type: stderr
    Log Length: 222
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.ipc.Server).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

    Log Type: stdout
    Log Length: 0

    Log Type: syslog
    Log Length: 48335
    […lots of syslog stuff, but none of my log messages]

    Is there some piece of configuration I am missing?
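    (For reference: those log4j warnings mean no appender was found on the classpath of the JVM that printed them. A minimal log4j 1.2 properties file like the sketch below, if placed on the task classpath as log4j.properties, would route the root logger to the console streams that YARN captures into the container logs; the appender name, target, and pattern here are illustrative assumptions, not Sandbox defaults.)

    # Minimal log4j 1.2 sketch: send INFO and above to stderr
    log4j.rootLogger=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.Target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n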


  • Author
  • #52355
    Dominic Fox

    (here, for reference, is the WordCount class I’m using)

    package com.opencredo.hadoop.logging;

    import org.apache.log4j.Logger;
    import org.apache.log4j.LogManager;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    public class WordCount {

        private static final Logger logger = LogManager.getLogger(WordCount.class);

        public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                String line = value.toString();

                // Write the same message via log4j, stdout and stderr to see which are captured.
                logger.warn("Mapping line " + line);
                System.out.println("Mapping line " + line);
                System.err.println("Mapping line " + line);

                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    String nextToken = tokenizer.nextToken();

                    logger.warn("Outputting word " + nextToken);

                    word.set(nextToken); // set the Text to the current token before emitting it
                    output.collect(word, one);
                }
            }
        }

        public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                logger.error("Reducing values for " + key);

                int sum = 0;
                while (values.hasNext()) {
                    sum += values.next().get();
                }

                logger.error("Outputting sum " + sum + " for key " + key);
                output.collect(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");

            // Standard job wiring for the classic mapred API.
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);
            conf.setMapperClass(Map.class);
            conf.setReducerClass(Reduce.class);

            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }
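    To run the job against the Sandbox, a submission along these lines should work (the jar name and HDFS paths are hypothetical placeholders):

    hadoop jar logging-wordcount.jar com.opencredo.hadoop.logging.WordCount /user/hdfs/input /user/hdfs/output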

    Same issue in 2.2… can’t see the logs in the console or in the job history.

    Can someone please help?

    Robert Molina

    Hi Eric,
    Can you clarify whether you are running the VMware or the VirtualBox image? On the VirtualBox side, there is a known issue with the logs not being visible through the NodeManager web UI; it is referenced in the release notes as BUG-34592. Can you run a simple pi job and see whether its logs appear? On VMware, I was able to view the logs after running a simple pi job as the hdfs user:
    yarn jar /usr/hdp/ pi 10 10
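    (The path above is truncated; on an HDP 2.2 Sandbox the examples jar typically lives under /usr/hdp/current/hadoop-mapreduce-client/, so the full command would look something like the following, though the exact location may vary by version:)

    yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 10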

    After the job finished, I was able to go to the job history server, which shows the job id I executed. Clicking on it brought me to the job page showing the map and reduce tasks. From there, there is a log link I was able to click on, which shows log information such as:

    2015-04-20 23:31:12,219 INFO [Thread-85] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://sandbox.hortonworks.com:8020/user/hdfs/.staging/job_1429552652390_0001
    2015-04-20 23:31:12,227 INFO [Thread-85] org.apache.hadoop.ipc.Server: Stopping server on 57728
    2015-04-20 23:31:12,229 INFO [IPC Server listener on 57728] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 57728
    2015-04-20 23:31:12,232 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
    2015-04-20 23:31:12,232 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted

    Another way to get the logs is to run the following command, mentioned in this blog:

    yarn logs -applicationId <application ID>

    For instance, the app id for the example job I ran is application_1429552652390_0001, so I would run the command as:

    yarn logs -applicationId application_1429552652390_0001
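    To pick out your own messages from the aggregated output, you can pipe the command through grep; the message text below matches the WordCount class posted above:

    yarn logs -applicationId application_1429552652390_0001 | grep "Mapping line"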

    Hope that helps.

