May 06, 2015

MapReduce Framework Deployment for YARN Rolling-Upgrades

This is the third post in a series exploring support for rolling upgrades and downgrades of a Hadoop YARN cluster. See the introductory post here.

Background and Motivation

Before HDP 2.2, Hadoop MapReduce applications depended on the MapReduce jars being deployed on all the nodes in a cluster. The Java classpath of all the tasks and the ApplicationMaster of a MapReduce job was set to point to the deployed jars. During cluster upgrades, the jars deployed locally on each host were replaced with newer versions.

Starting with HDP 2.2 (based on Apache Hadoop 2.6), we support rolling upgrades and downgrades of a Hadoop YARN cluster. With that in place, the previous way of deploying MR jars can cause a single MapReduce job to consume different versions of the MR jars while it is still running. For example, a job launched on a cluster at version X may still be running after the cluster is upgraded to version X+1, but any tasks of that job launched after the upgrade will pick up the new MapReduce jars. Sometimes it is not even possible to determine whether a new task will use the newer version or not. This can lead to runtime errors and some very unexpected results, as depicted in the picture below.

[Figure yarn_mr_1: a single MapReduce job picking up two different versions of the MR jars during a cluster upgrade]

To solve this problem, instead of adding a local installation of the MR jars to a job’s classpath, HDP 2.2 now relies on a shared tarball of MR jars on HDFS, which is distributed to each node at container-launch time via YARN’s Distributed Cache.

A brief recap of YARN’s Distributed Cache

Distributed Cache (as we discussed before) is a facility provided by YARN to optimally cache file resources (text, archives, jars, etc.) that applications need on cluster machines. Application frameworks built on top of YARN (like Hadoop MapReduce and Apache Tez) usually wrap this functionality and expose it to end users as well.

In the case of MapReduce, jobs can specify files to be cached via URLs (hdfs:// or http://) in the configuration object JobConf. The Distributed Cache assumes that (a) the files specified via URLs are already present on a remote file-system (e.g. HDFS) at the paths given by the URLs and (b) the specified files are accessible by every node of the cluster. During the container-launch stage (e.g. for MR tasks), YARN copies the necessary files to the specific node, and the MR framework optionally adds them to the CLASSPATH before any tasks of the job are executed on that node. The Distributed Cache’s efficiency stems from the fact that the files are copied only once per application and are transparently reused across containers, and potentially across applications.
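As a hedged illustration (not from the original post), the hadoop command’s generic options are one such wrapper: -files and -archives place resources into the Distributed Cache for a job, provided the job’s driver uses ToolRunner/GenericOptionsParser. The jar, class, and paths below are hypothetical.

# Hypothetical example: ship a lookup file and an aliased archive ("dict")
# into the Distributed Cache of an MR job. All paths are illustrative only,
# and the driver class is assumed to parse generic options via ToolRunner.
hadoop jar my-app.jar com.example.MyJob \
  -files hdfs:///data/lookup.txt \
  -archives hdfs:///data/dict.zip#dict \
  /input /output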

Deploying MapReduce through Distributed Cache to deal with cluster upgrades/downgrades

Starting with HDP 2.2, specific versions of the MapReduce application framework can be deployed at runtime via the Distributed Cache rather than relying on static versions copied during installation. This means that even after a version of HDP is installed, or an existing installation is upgraded to a newer version, a user’s MR job can, if desired, still pick up the older version of the MR jars for its tasks to run against. As a result, the runtime context of an MR job stays the same and a job’s tasks stay consistent throughout the job’s whole lifecycle, which is very helpful during rolling upgrades of a Hadoop YARN cluster.

With this architecture change, a snapshot of an upgrade in progress looks like the following. Here, Job #1 started before the cluster upgrade and uses the older MR jars. Job #5 started while the upgrade was in progress but uses the new MR jars. At the specific point in time depicted, NodeManagers #1, #2, and #5 are still on the older version, #4 is on the newer version, and #3 is going through the upgrade.

[Figure yarn_mr_2: snapshot of a rolling upgrade in progress, with running jobs continuing to use the MR jars they started with]

This architecture change is part of the bigger umbrella effort MAPREDUCE-4150 (Versioning and rolling upgrades for MR2). The specific work to deploy MR via the Distributed Cache is tracked in the Apache JIRA issue MAPREDUCE-4421 and was contributed by Jason Lowe from Yahoo!.

Configuration

The first step is to place a specific version of the MapReduce tarball on HDFS under a directory that applications can access, as shown below:
$HADOOP_HOME/bin/hdfs dfs -put hadoop-2.6.0.tar.gz /mapred/framework/
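If the target directory does not exist yet, it can be created and the upload verified first. This is a minimal sketch; the permission bits are only an example and should follow your cluster’s policy.

# Create the shared framework directory and make it readable to applications (example bits).
$HADOOP_HOME/bin/hdfs dfs -mkdir -p /mapred/framework
$HADOOP_HOME/bin/hdfs dfs -chmod 755 /mapred/framework
# After the -put above, confirm the tarball is in place.
$HADOOP_HOME/bin/hdfs dfs -ls /mapred/framework/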

Next, set the configuration property mapreduce.application.framework.path in mapred-site.xml to a URL pointing to the archive uploaded above. The property also supports creating an alias for the archive if a URL fragment identifier is specified, as follows. Here, as an example, we set the alias to mr-framework.


<property>
  <name>mapreduce.application.framework.path</name>
  <value>hdfs:/mapred/framework/hadoop-2.6.0.tar.gz#mr-framework</value>
</property>

Next, we set the property mapreduce.application.classpath to include the related MR jars in the classpath. Note that we are matching the directory with the alias used in the framework path, mr-framework.


<property>
  <name>mapreduce.application.classpath</name>
  <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.6.0/hadoop/lib/hadoop-lzo-0.6.0.2.6.0.jar:/etc/hadoop/conf/secure</value>
</property>

We can also upload multiple versions of the MR tarball to HDFS and have different mapred-site.xml files pointing to different locations. Users can then run jobs against a specific mapred-site.xml. The following is an example of running an MR job against a version-specific mapred-site.xml:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi -conf etc/hdp-2.1.*/mapred-site.xml 10 10
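As a sketch of that multi-version setup (the 2.5.0 tarball name and the etc/hadoop-2.5 conf directory are assumptions, not from the original post), one might stage a second tarball and keep a separate client-side configuration directory for it:

# Hypothetical: stage an additional MR tarball alongside the default one.
$HADOOP_HOME/bin/hdfs dfs -put hadoop-2.5.0.tar.gz /mapred/framework/
# In a separate conf directory (e.g. etc/hadoop-2.5/mapred-site.xml), point
# mapreduce.application.framework.path at
#   hdfs:/mapred/framework/hadoop-2.5.0.tar.gz#mr-framework
# and adjust mapreduce.application.classpath for that version, then pass the file with -conf.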

In HDP 2.2, the classpath template may look like:

$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure

The hdp.version variable points to the version of HDP that should be used for MR jobs. By default, it resolves to the currently active HDP version. In a version-specific mapred-site.xml, users can manually replace this variable with a fixed version. For example, in a mapred-site.xml under a hadoop-2.5 directory, users may replace ${hdp.version} with 2.5.0.
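A minimal sketch of that substitution, assuming a copied configuration directory named etc/hadoop-2.5 (the path and version string are examples only):

# Pin the copied mapred-site.xml to a fixed version string instead of ${hdp.version}.
sed -i 's/\${hdp\.version}/2.5.0/g' etc/hadoop-2.5/mapred-site.xml
# Double-check the result before submitting jobs with -conf etc/hadoop-2.5/mapred-site.xml.
grep '2\.5\.0' etc/hadoop-2.5/mapred-site.xml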

A few troubleshooting tips

  1. If you want to know the exact version of the MR jars being used by a specific running job, check the logs of the MapReduce ApplicationMaster (for example, via yarn logs -applicationId <application_id>) and look for the localization of the related jars from the Distributed Cache. For example:

    2014-11-20 08:19:30,199 INFO [main] org.mortbay.log: Extract jar: file://filecache/{…}/hadoop-2.6.0.tar.gz/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-common-2.6.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_42544_mapreduce____.pryk9q/webapp

  2. If shuffle encryption is also enabled in the cluster, it is possible that MR jobs fail with exceptions like the one below:

    2014-10-10 02:17:16,600 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to junping-du-centos6.x-3.cs1cloud.internal:13562 with 1 map outputs
    javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731)
    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241)
    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235)
    at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:11206)
    at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136)
    at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593)
    at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:81)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:61)
    at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:584)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1193)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427)
    ….

To fix this problem, create a sub-directory under $HADOOP_CONF ($HADOOP_HOME/etc/hadoop by default) and copy ssl-client.xml into it. Then add this new directory to the classpath of MR jobs via the configuration property mapreduce.application.classpath, similar to the other jars mentioned above.
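A minimal sketch of that fix follows; the sub-directory name ssl-conf is a hypothetical example, and the copy needs to be present on the nodes where the tasks actually run.

# Hypothetical: expose ssl-client.xml to MR tasks via a conf sub-directory.
mkdir -p $HADOOP_HOME/etc/hadoop/ssl-conf
cp $HADOOP_HOME/etc/hadoop/ssl-client.xml $HADOOP_HOME/etc/hadoop/ssl-conf/
# Then append the new directory to mapreduce.application.classpath in mapred-site.xml,
# e.g. "...:/etc/hadoop/conf/secure:<expanded path to ssl-conf>" (use the absolute path,
# since variables like $HADOOP_HOME are not expanded there).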

Conclusion and Future Work

By deploying MapReduce over the Distributed Cache, HDP 2.2 achieves the goal of separating the version of the MR framework from the version of the underlying YARN cluster. This makes running MapReduce jobs on a Hadoop YARN cluster that is going through rolling upgrades a much smoother exercise.

There are a few areas that are being worked on for future improvements, including:

  • MAPREDUCE-6146: Thinned down MR tarball to include only necessary JARs.
  • MAPREDUCE-5534: Improve job configuration handling when MapReduce deployed via HDFS.
  • YARN-2464: Provide Hadoop as a local resource (on HDFS), which can be used by other projects.