February 02, 2015

Cost-Based Optimizer Makes Apache Hive 0.14 More Than 2.5X Faster

Hortonworks Data Platform (HDP) provides Hadoop for the Enterprise, with a centralized architecture of core enterprise services, for any application and any data. HDP is uniquely built around native YARN services to enable a centralized architecture through which multiple data access applications interact with a shared data set. Apache Hive is one of the most important of those data access applications—the de facto standard for interactive SQL queries over petabytes of data in Hadoop.

This blog shares benchmark-testing results that show performance improvements made by the Apache Hive community with the Cost-Based Optimizer (CBO). The CBO is part of Phase 1 of the Stinger.next Initiative to further enhance Hive’s speed, scale and SQL capabilities.

As more analysts and data scientists adopt Hive across their organizations, performance becomes crucial. Many users rely on BI tools to generate queries that are far from optimal in Hive, so Apache Hive 0.14 includes the CBO to maintain high performance while also allowing your team members to continue using their preferred BI tools.

The CBO does this by planning how a query will be executed, using statistics about the data and sophisticated algorithms to produce query plans that optimize performance and maximize resource utilization across the Hadoop cluster. We want Hive to just work, without users having to concern themselves with performance.
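Because the CBO leans on those statistics, it only helps once table and column statistics exist. A minimal sketch in HiveQL is shown below; the configuration properties are standard Hive settings, the web_sales table name is just an example, and the exact commands used for this benchmark are not reproduced here (syntax details can vary slightly by Hive version):

    -- Turn on the cost-based optimizer and let it use column statistics
    SET hive.cbo.enable=true;
    SET hive.stats.fetch.column.stats=true;
    SET hive.stats.fetch.partition.stats=true;

    -- Gather the table- and column-level statistics the optimizer relies on
    ANALYZE TABLE web_sales COMPUTE STATISTICS;
    ANALYZE TABLE web_sales COMPUTE STATISTICS FOR COLUMNS;

With statistics in place, the optimizer can estimate row counts and data sizes at each operator and choose cheaper join orders automatically.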

We’re delighted with the performance gains shown in the table below, using the TPC-DS benchmark:

  • 6 out of the 50 queries showed speedups greater than 5x
  • Across all 50 queries, the average speedup was 3x
  • Total workload runtime shrank from 20.6 hours to 9 hours

We would like to thank the members of the Apache Hive community who contributed code to make these improvements possible – Ashutosh Chauhan, Gunther Hagleitner, Harish Butani, Jesus Camacho Rodriguez, Julian Hyde, John Pullokkaran, Mostafa Mokhtar, and Pengcheng Xiong.

Results: Hive Query Performance Numbers on TPC-DS

Below are the highlights comparing run times of “data mining”, “interactive” and “reporting” queries in Apache Hive 0.14. The numbers compare run times for each query with the CBO enabled against the same query with the CBO disabled. We categorized queries by these six CBO effects:

CBO Effect | Description
Same plan | The execution plan was the same with CBO ON and CBO OFF, because the query explicitly specified a join order.
Lucky query | The order in which the tables were referenced in the TPC-DS SQL query happened to match the optimal join order.
Predicate push down | A filter was pushed below the join, avoiding an extra filter operator.
Join ordering | The join order was changed by the CBO.
Bushy join | A wide (bushy) join tree was produced instead of a traditional left-deep tree.
Join simplification | The CBO identified common join predicates across conjunction and disjunction clauses and created an implied join predicate, avoiding costly cross-product joins.
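One way to see which effect applies to a particular query (not part of the published benchmark scripts, just an illustration) is to compare the plans Hive produces with the optimizer toggled on and off. The query below is only a small TPC-DS-style example:

    -- Plan with the cost-based optimizer enabled
    SET hive.cbo.enable=true;
    EXPLAIN
    SELECT ss.ss_item_sk, SUM(ss.ss_sales_price) AS total_sales
    FROM store_sales ss
    JOIN date_dim d ON ss.ss_sold_date_sk = d.d_date_sk
    WHERE d.d_year = 2002
    GROUP BY ss.ss_item_sk;

    -- Plan with the cost-based optimizer disabled, for comparison
    SET hive.cbo.enable=false;
    EXPLAIN
    SELECT ss.ss_item_sk, SUM(ss.ss_sales_price) AS total_sales
    FROM store_sales ss
    JOIN date_dim d ON ss.ss_sold_date_sk = d.d_date_sk
    WHERE d.d_year = 2002
    GROUP BY ss.ss_item_sk;

Differences in join order, filter placement, or join-tree shape between the two EXPLAIN outputs correspond to the categories above.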
TPC-DS Query Results

All query times below are in seconds.

DATA MINING @ 30TB SCALE

Query | CBO ON | CBO OFF | X Factor | CBO Effect
Q31 | 956 | 1,020 | 1.1 | Same plan
Q34 | 207 | 465 | 2.2 | Predicate push down
Q39 | 129 | 176 | 1.4 | Join ordering
Q71 | 332 | 374 | 1.1 | Same plan
Q73 | 174 | 391 | 2.2 | Predicate push down
Q88 | 1,125 | 1,691 | 1.5 | Predicate push down, Join ordering
Q98 | 208 | 432 | 2.1 | Predicate push down, Join ordering

INTERACTIVE @ 30TB SCALE

Query | CBO ON | CBO OFF | X Factor | CBO Effect
Q3 | 143 | 226 | 1.6 | Predicate push down, Join ordering
Q7 | 266 | 263 | 1.0 | Same plan
Q12 | 106 | 162 | 1.5 | Join ordering
Q13 | 5,903 | 10,000 | 1.7 | Join simplification
Q15 | 1,557 | 10,000 | 6.4 | Bushy join, Join ordering
Q19 | 228 | 215 | 0.9 | Same plan
Q26 | 173 | 178 | 1.0 | Same plan
Q27 | 249 | 252 | 1.0 | Predicate push down
Q42 | 143 | 148 | 1.0 | Lucky query
Q43 | 540 | 512 | 0.9 | Lucky query
Q52 | 149 | 149 | 1.0 | Lucky query
Q55 | 143 | 164 | 1.1 | Lucky query
Q82 | 1,295 | 2,223 | 1.7 | Bushy join, Join ordering
Q84 | 73 | 138 | 1.9 | Bushy join, Join ordering
Q91 | 28.57 | 100 | 3.5 | Bushy join, Join ordering
Q96 | 156 | 175 | 1.1 | Join ordering

REPORTING @ 30TB SCALE

Query | CBO ON | CBO OFF | X Factor | CBO Effect
Q17 | 429 | 2,957 | 6.9 | Bushy join, Join ordering
Q20 | 132 | 267 | 2.0 | Join ordering
Q21 | 28 | 46 | 1.6 | Join ordering
Q25 | 443 | 3,187 | 7.2 | Bushy join, Join ordering
Q28 | 754 | 2,902 | 3.1 | Lucky query
Q29 | 950 | 747 | 1.0 | Bushy join, Join ordering
Q32 | 339 | 351 | 1.0 | Lucky query
Q40 | 1,929 | 2,462 | 1.3 | Join ordering
Q45 | 88 | 1,094 | 12.4 | Bushy join, Join ordering
Q46 | 318 | 407 | 1.3 | Bushy join, Join ordering
Q48 | 257 | 10,000 | 39.0 | Join simplification
Q49 | 390 | 2,304 | 5.9 | Bushy join, Join ordering
Q50 | 1,113 | 1,380 | 1.2 | Bushy join, Join ordering
Q54 | 347 | 1,525 | 4.4 | Bushy join, Join ordering
Q56 | 361 | 407 | 1.1 | Bushy join, Join ordering
Q58 | 335 | 1,897 | 5.7 | Bushy join, Join ordering
Q60 | 356 | 403 | 1.1 | Bushy join, Join ordering
Q66 | 381 | 1,331 | 3.5 | Join ordering
Q68 | 298 | 327 | 1.1 | Lucky query
Q75 | 2,659 | 3,511 | 1.3 | Join ordering
Q79 | 438 | 535 | 1.2 | Predicate push down
Q85 | 1,355 | 2,252 | 1.7 | Join ordering
Q87 | 1,398 | 1,468 | 1.1 | Same plan
Q89 | 288 | 273 | 0.9 | Same plan
Q90 | 113 | 116 | 1.0 | Lucky query
Q92 | 922 | 1,028 | 1.1 | Same plan
Q97 | 1,270 | 1,258 | 1.0 | Same plan

Testing Environment & Parameters

Software

Hortonworks Data Platform 2.2 with:

  • Hadoop 2.6.0
  • Tez 0.5.2
  • Hive 0.14.0

HDP was deployed using Apache Ambari 1.7.0. For the most part, the cluster used the Ambari defaults (except where noted below). Hive 0.14.0 runs were done using Java 7 (default JVM).

Apache Tez and MapReduce were tuned to process all queries using 4 GB containers at a target container-to-disk ratio of 2.0. The ratio is important because it minimizes disk thrash and maximizes throughput.
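As a quick sanity check on that ratio, using the numbers listed elsewhere in this post: yarn.nodemanager.resource.memory-mb of 49152 gives 48 GB per node, so 48 GB / 4 GB = 12 concurrent containers per node, and with 6 data drives per node that is 12 / 6 = 2.0 containers per disk.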

Other Settings (a session-level configuration sketch follows this list):

  • yarn.nodemanager.resource.memory-mb was set to 49152
  • Default virtual memory for a job’s map task and reduce task was set to 4096 MB
  • hive.tez.dynamic.partition.pruning was set to false
  • hive.stats.fetch.column.stats was set to true
  • hive.tez.container.size was set to 4096
  • hive.tez.java.opts was set to -Xmx3800m
  • Tez app masters were given 8 GB
  • mapreduce.map.java.opts and mapreduce.reduce.java.opts were set to -Xmx3800m. This is smaller than the 4096 MB container size to allow for some garbage-collection overhead.
  • hive.auto.convert.join.noconditionaltask.size was set to 320000000.
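As a rough sketch of how the Hive and Tez values above could be applied at the session level (an illustration only; in the benchmark most of these were set cluster-wide through Ambari rather than per session):

    -- Hive/Tez session settings mirroring the list above
    SET hive.tez.container.size=4096;
    SET hive.tez.java.opts=-Xmx3800m;
    SET hive.tez.dynamic.partition.pruning=false;
    SET hive.stats.fetch.column.stats=true;
    SET hive.auto.convert.join.noconditionaltask.size=320000000;
    -- yarn.nodemanager.resource.memory-mb, the Tez AM size and the MapReduce heap
    -- settings are cluster-side properties (yarn-site.xml / tez-site.xml /
    -- mapred-site.xml) rather than per-session SET statements.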

The following additional optimizations were used for Hive 0.14.0 (a sketch of how they are typically enabled follows the list):

  • Vectorized Query enabled
  • ORCFile formatted data
  • Map-join auto conversion enabled
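A hedged sketch of how those three optimizations are typically switched on; the property names are standard Hive settings, while the table names are placeholders:

    -- Vectorized query execution and automatic map-join conversion
    SET hive.vectorized.execution.enabled=true;
    SET hive.auto.convert.join=true;

    -- ORCFile-formatted data: rewrite a text-format table into ORC
    CREATE TABLE store_sales_orc STORED AS ORC
    AS SELECT * FROM store_sales_text;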

Hardware

20 physical nodes, each with:

  • 2x Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz for a total of 16 CPU cores per machine
  • Hyper-threading enabled
  • 256GB RAM per node
  • 6x 4TB WDC WD4000FYYZ-0 drives per node
  • 10 Gigabit interconnect between the nodes

Note: Based on the YARN NodeManager memory setting noted earlier (yarn.nodemanager.resource.memory-mb = 49152), only 48 GB of RAM per node was dedicated to query processing; the remaining roughly 200 GB of RAM was available for system caches and HDFS.

Linux Configurations
  • /proc/sys/net/core/somaxconn = 512
  • /proc/sys/vm/dirty_writeback_centisecs = 6000
  • /proc/sys/vm/swappiness = 0
  • /proc/sys/vm/zone_reclaim_mode = 0
  • /sys/kernel/mm/redhat_transparent_hugepage/defrag = never
  • /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag = no
  • /sys/kernel/mm/transparent_hugepage/khugepaged/defrag = 0

Data

  • TPC-DS data at scale factor 30,000 (30 TB), partitioned by day
  • Hive 0.14.0 was run against data stored in ORCFile (an illustrative table definition follows this list)
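For illustration only, a day-partitioned ORC table in this layout might be declared roughly as follows; the real schema and partition column come from the hive-testbench setup scripts, and the column list here is abbreviated:

    -- Abbreviated, illustrative TPC-DS-style fact table, partitioned by day, stored as ORC
    CREATE TABLE store_sales (
      ss_sold_time_sk  BIGINT,
      ss_item_sk       BIGINT,
      ss_customer_sk   BIGINT,
      ss_quantity      INT,
      ss_sales_price   DECIMAL(7,2)
      -- remaining TPC-DS columns omitted for brevity
    )
    PARTITIONED BY (ss_sold_date STRING)
    STORED AS ORC;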

Queries

  • The test was driven by the Hive Testbench (https://github.com/hortonworks/hive-testbench).
  • First, the tpcds-build and tpcds-setup scripts were used to generate 30 terabytes (TB) of data. tpcds-setup first generated the data in text format, then converted it into ORCFile and RCFile formatted data, partitioned by day. After data was generated, “hadoop balancer” was used to balance data across the cluster.
  • A total of 50 queries from the industry standard TPC-DS benchmark were run.

Note: Queries were run as-is with no additional hinting, and there was no special tuning used for any of the queries.


Comments

  • I cannot reproduce the plan changes described above as “Predicate push down”. Could you share the execution plans with and without CBO for any of the queries in that category (Q1, Q27, Q34, Q73, Q88,…)?

  • In the benchmark results you posted on 6/2/14 that compared Hive 0.10 to 0.13, you reported that “Total time to run all queries decreased from 7.8 days to 9.3 hours”. In this benchmark of Hive 0.14 with and without CBO, you report that “Total workload runtime shrank from 20.6 to 9 hours”. In both cases you use a 30TB dataset, and presumably the same TPC-DS queries were used in both. It also looks like the cluster configurations are identical. What am I missing that would account for the total runtime going from 9.3 hours with Hive 0.13 to 20.6 hours with Hive 0.14 (without CBO)?
