Hey guys –
I’m running Hue on top of HDP 2.1. It appears that when Hive is configured to use Tez as its execution engine (which is the default), Hue generates an excessive number of Tez applications for each query. This becomes a real problem when several queries are executed one after the other, because the applications fill up the ResourceManager queue and cause all jobs to freeze until they time out.
This does not happen when the execution engine is set to “mr” (MapReduce), or when the query is executed through “hive” or “hiveserver2” directly. It looks like a Hue-only problem.
For example, see the log below. I ran a single query, which generated 3 different Tez applications (jobs):
“application_1401210649023_0227” and “application_1401210649023_0228” were created straight away. “application_1401210649023_0229” was added once the query had completed successfully.
(I published the log as a Google Drive document:)
My assumption is that “application_1401210649023_0227” is a job used to describe the table, “application_1401210649023_0228” is the query itself, and “application_1401210649023_0229” is used to format the output.
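If anyone wants to check how many distinct applications a single query spawns on their own cluster, a quick sketch like the one below could help. It just pulls YARN application IDs out of a chunk of log text. The function name is hypothetical, and the sample text is a made-up placeholder that only contains the three IDs mentioned above; it is not real Hue or Tez log output.

```python
import re

# YARN application IDs look like application_<cluster timestamp>_<sequence number>.
APP_ID = re.compile(r"application_\d+_\d+")


def distinct_app_ids(log_text):
    """Return the distinct application IDs found in log_text, in order of first appearance."""
    seen = []
    for app_id in APP_ID.findall(log_text):
        if app_id not in seen:
            seen.append(app_id)
    return seen


# Placeholder text only (NOT real log output); paste your own log here instead.
sample = """\
placeholder line mentioning application_1401210649023_0227
placeholder line mentioning application_1401210649023_0228
placeholder line mentioning application_1401210649023_0228 again
placeholder line mentioning application_1401210649023_0229
"""

ids = distinct_app_ids(sample)
print(len(ids), "distinct applications:", ids)
```

Running the same function over the log of one query under Tez and one under “mr” should make the difference in application count easy to compare.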