MapJoinMemoryExhaustionException on local job


This topic contains 6 replies, has 3 voices, and was last updated by  Prabhu Ramakrishnan 1 year ago.

  • Creator
  • #44100

    Hi, I am getting the following error when running a query that converts to a local map join in HDP 2.0 installed with Ambari:

    ERROR mr.MapredLocalTask ( – Hive Runtime Error: Map local work exhausted memory
    org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2013-11-20 07:30:48 Processing rows: 1700000 Hashtable size: 1699999 Memory usage: 965243784 percentage: 0.906
    at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(


    At the beginning of the query execution Hive shows the following message:
    Starting to launch local task to process map join; maximum memory = 1065484288

    I haven’t found the options I need to set to increase the maximum memory allotted to this process. Could anyone tell me?
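    For context, an assumption not stated in the post: “maximum memory = 1065484288” is roughly 1 GB and corresponds to the heap of the client-side JVM that builds the map-join hash table, so raising it generally means raising the Hive client heap, for example via HADOOP_HEAPSIZE in hive-env.sh:

    ```shell
    # Hedged sketch: raise the heap available to the local map-join task.
    # The value is illustrative, in MB. Set in hive-env.sh on the client node:
    export HADOOP_HEAPSIZE=4096
    ```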

Viewing 6 replies - 1 through 6 (of 6 total)


  • Author
  • #58025

    Prabhu Ramakrishnan

    I just overcame this issue by reducing the number of rows I select to join with the other table. Instead of joining all the possible data in a single query, I split the data into small subsets and achieved the same result in multiple steps.

    The total execution time is far less than that of the single consolidated query.
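    A minimal sketch of the split-and-join approach described above, using hypothetical table and column names:

    ```sql
    -- Hypothetical tables: fact_table joined against dim_table.
    -- Instead of one large join, materialize a filtered subset and
    -- join it in several smaller passes.
    CREATE TABLE tmp_subset AS
    SELECT * FROM fact_table WHERE ds = '2014-07-01';

    INSERT INTO TABLE joined_result
    SELECT f.id, d.name
    FROM tmp_subset f JOIN dim_table d ON f.id = d.id;
    -- Repeat with the next filter predicate until all rows are covered.
    ```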


    Hi Prabhu. I would suggest you use:
    before running your query to disable local in-memory joins and force the join to be done as a distributed MapReduce phase. After running your query you should set the value back to true with:
    Note that this just circumvents the actual issue and is not a real fix, but it works fine if you just need to run the query with no regard to performance.
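    The SET statements in the reply above were stripped when the page was archived. Based on the description (a boolean that disables local in-memory joins and is later set back to true), the setting is presumably hive.auto.convert.join; a sketch under that assumption:

    ```sql
    -- Assumed setting (the original SET lines were stripped): disable the
    -- automatic conversion of common joins into map joins, forcing a
    -- distributed shuffle join instead of the local in-memory hash table.
    SET hive.auto.convert.join=false;

    -- ... run the problematic query ...

    -- Restore the default behaviour afterwards.
    SET hive.auto.convert.join=true;
    ```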


    Prabhu Ramakrishnan


    I am using HDP 2.0 in production with hundreds of data nodes. While running a simple join query I am getting the below error related to insufficient memory. I tried to

    but it had no impact; the map joins still fail with the same error.

    2014-07-28 10:43:43     Starting to launch local task to process map join;      maximum memory = 1065484288
    2014-07-28 10:43:46     Processing rows:        200000  Hashtable size: 199999  Memory usage:   87561240        percentage:     0.082
    2014-07-28 10:43:47     Processing rows:        300000  Hashtable size: 299999  Memory usage:   128557528       percentage:     0.121
    2014-07-28 10:43:48     Processing rows:        400000  Hashtable size: 399999  Memory usage:   173836496       percentage:     0.163
    2014-07-28 10:44:03     Processing rows:        2100000 Hashtable size: 2099999 Memory usage:   886909320       percentage:     0.832
    2014-07-28 10:44:07     Processing rows:        2200000 Hashtable size: 2199999 Memory usage:   915933544       percentage:     0.936
    Execution failed with exit status: 3
    Obtaining error information
    Task failed!
    Task ID:
    FAILED: Execution Error, return code 3 from
    Error from: /tmp/prabhunkl/hive.log
    2014-07-28 10:44:08,289 ERROR mr.MapredLocalTask ( - Hive Runtime Error: Map local work exhausted memory
    org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-07-28 10:44:08	Processing rows:	2400000	Hashtable size:	2399999	Memory usage:	997667888	percentage:	0.936
    	at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(
    	at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(
    	at org.apache.hadoop.hive.ql.exec.Operator.process(
    	at org.apache.hadoop.hive.ql.exec.Operator.forward(
    	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(
    	at org.apache.hadoop.hive.ql.exec.Operator.process(
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    	at java.lang.reflect.Method.invoke(
    	at org.apache.hadoop.util.RunJar.main(
    Any recommendations?
    I appreciate your help.

    Hi Yi,

    I tried setting the property to lower values and finally found one (150000000) that lets the query finish successfully, although with more steps.

    Thanks for your help.


    Yi Zhang

    Hi Juan,

    Can you try these settings: (default 1000000000), (default 0.5)

    Also check the mapper size and the NodeManager memory settings.
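    The setting names were stripped from the reply above; two Hive parameters that are commonly tuned for this error, offered here as an assumption rather than a reconstruction, are:

    ```sql
    -- Assumed parameters (the names are a guess; only the quoted default
    -- values survived in the post). Size threshold below which Hive
    -- converts a join into a map join on the noconditionaltask path:
    SET hive.auto.convert.join.noconditionaltask.size=150000000;

    -- Fraction of the local task heap the hash table may consume before
    -- MapJoinMemoryExhaustionHandler aborts (the "percentage" in the log;
    -- the stock default is 0.90):
    SET hive.mapjoin.localtask.max.memory.usage=0.90;
    ```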



    From what I see, Hive does not seem to take the parameter hive.mapjoin.smalltable.filesize=25000000 into account when I set it in .hiverc or manually on the hive command line.

    I also tried setting that parameter to a very low value such as 5, and Hive still tries to convert a join against a table of several hundred megabytes into a map join.

    Any ideas what might be happening? Is this a known issue?
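    One plausible explanation, offered as an assumption rather than something confirmed in the thread: when hive.auto.convert.join.noconditionaltask is enabled (as it is by default on HDP 2.x), Hive sizes map joins with hive.auto.convert.join.noconditionaltask.size, and hive.mapjoin.smalltable.filesize is not consulted on that code path:

    ```sql
    -- Sketch, assuming the noconditionaltask path is active: this is the
    -- threshold Hive would actually check, not smalltable.filesize.
    SET hive.auto.convert.join.noconditionaltask.size=25000000;

    -- Alternatively, disable that path so hive.mapjoin.smalltable.filesize
    -- takes effect again.
    SET hive.auto.convert.join.noconditionaltask=false;
    ```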
