Hi, I’ve been looking at the features of Tez.
1. From this blog, http://hortonworks.com/blog/introducing-tez-faster-hadoop-processing/ , it seems that Tez is a non-MR framework that enables the execution of a whole DAG in ONE job. This is not feasible in the MapReduce framework, since an MR job can only consist of two steps, i.e., map and reduce. So you cannot do map-map-reduce or map-reduce-map-reduce in a single job.
2. However, when I look at the Tez manual here: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-18.104.22.168/bk_installing_manually_book/content/rpm-chap-tez.html , it says: “The Tez AMPoolService or Tez Service is a service that launches and makes available a pool of pre-launched MapReduce AMs ( Tez AMs ). These AMs in the pool can, in turn, be configured to pre-allocate a number of containers to allow jobs to be launched and completed faster. To use the Tez Service, the clients must submit the jobs to this service instead of the ResourceManager.”
From this, it seems that Tez is still conceptually within the MR framework, and that performance is improved over the out-of-the-box MR framework by (1) pre-launching AMs for MapReduce jobs and (2) reusing containers for MR tasks.
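To make reading 1 concrete, here is a toy Python sketch of how I picture the difference. It is not Tez or Hadoop code: the function names `split_into_mr_jobs` and `as_single_dag` are made up for illustration, and it assumes (as in point 1) that an MR job is at most one map step followed by one reduce step.

```python
def split_into_mr_jobs(stages):
    """Split a linear pipeline of 'map'/'reduce' stages into MR-style jobs.

    Assumption (hypothetical model): each job is a single map step,
    optionally followed by a single reduce step.
    """
    jobs = []
    current = []
    for stage in stages:
        if stage == "map":
            if current:
                # The open job already has its map step, so close it
                # as a map-only job and start a new one.
                jobs.append(current)
            current = ["map"]
        elif stage == "reduce":
            if not current:
                # A reduce with no preceding map needs a pass-through map.
                current = ["identity-map"]
            current.append("reduce")
            jobs.append(current)
            current = []
        else:
            raise ValueError(f"unknown stage: {stage!r}")
    if current:
        jobs.append(current)
    return jobs


def as_single_dag(stages):
    """Under reading 1, the whole pipeline would run as one job."""
    return [list(stages)]


pipeline = ["map", "reduce", "map", "reduce"]
print(len(split_into_mr_jobs(pipeline)), "MR-style jobs")
print(len(as_single_dag(pipeline)), "DAG-style job")
```

Under this model, map-reduce-map-reduce needs two MR-style jobs (with intermediate output handed from one job to the next), whereas under reading 1 it would be a single job.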
So which understanding is correct, 1 or 2?
Thanks in advance for the clarification!