Posted On: Jun 2, 2016
You can now use Apache Tez 0.8.3, Apache Phoenix 4.7.0, and upgraded versions of Presto (0.147), Apache HBase (1.2.1), and Apache Mahout (0.12.0) on Amazon EMR release 4.7.0. Also, Amazon Redshift JDBC drivers are now included on Amazon EMR clusters for use with components such as the Spark-Redshift connector.
Tez is an execution framework on Hadoop YARN that improves performance through optimized query plans and more efficient resource management. You can use Tez with Apache Hive and Apache Pig instead of Hadoop MapReduce by enabling it with an Amazon EMR configuration object. Phoenix provides low-latency SQL with ACID transaction capabilities over data stored in Apache HBase. You can create secondary indexes for additional performance, and define multiple views over the same underlying HBase table.
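As a rough illustration of the configuration object mentioned above, the following Python sketch builds one that switches Hive and Pig to Tez. The classification names (hive-site, pig-properties) and property keys (hive.execution.engine, exectype) are assumptions based on the general pattern of Amazon EMR 4.x configuration objects, and the function name tez_configurations is hypothetical; check the Amazon EMR documentation for release 4.7.0 before relying on them.

```python
import json


def tez_configurations():
    """Return EMR configuration objects that ask Hive and Pig to run on Tez.

    Assumption: the classification names and property keys below match
    what Amazon EMR release 4.7.0 expects; verify against the documentation.
    """
    return [
        {
            # Assumed classification for Hive settings
            "Classification": "hive-site",
            "Properties": {"hive.execution.engine": "tez"},
        },
        {
            # Assumed classification for Pig settings
            "Classification": "pig-properties",
            "Properties": {"exectype": "tez"},
        },
    ]


if __name__ == "__main__":
    # Print the JSON form, which could be supplied when creating a cluster.
    print(json.dumps(tez_configurations(), indent=2))
```

The printed JSON could be saved to a file and supplied as the cluster's configurations at creation time.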
You can create an Amazon EMR cluster with release 4.7.0 by choosing release label “emr-4.7.0” in the AWS Management Console, AWS CLI, or SDK. To install these applications on your cluster, specify Tez, Phoenix, Presto-Sandbox, HBase, and Mahout when you create it. Please visit the Amazon EMR documentation for more information about release 4.7.0, Tez 0.8.3, Phoenix 4.7.0, Presto 0.147, HBase 1.2.1, and Mahout 0.12.0.
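For the SDK route, a minimal sketch using the AWS SDK for Python (boto3) might look like the following. The cluster name, instance types, instance count, region, and IAM role names are illustrative assumptions, not values from this announcement, and the helper names build_cluster_request and launch are hypothetical. Only the release label and application names come from the text above.

```python
def build_cluster_request(name="emr-4.7.0-demo"):
    """Build keyword arguments for the EMR RunJobFlow API call.

    The release label and application names come from the announcement;
    everything else (name, instance settings, roles) is an assumed example.
    """
    applications = ("Tez", "Phoenix", "Presto-Sandbox", "HBase", "Mahout")
    return {
        "Name": name,
        "ReleaseLabel": "emr-4.7.0",
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": "m3.xlarge",  # assumed instance type
            "SlaveInstanceType": "m3.xlarge",   # assumed instance type
            "InstanceCount": 3,                 # assumed cluster size
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",   # assumes default roles exist
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(region="us-east-1"):
    """Create the cluster and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("emr", region_name=region)
    response = client.run_job_flow(**build_cluster_request())
    return response["JobFlowId"]
```

Calling launch() starts a real, billable cluster, so it is worth inspecting the output of build_cluster_request() first and adjusting the assumed values to match your account.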