1)One of the key ones is low latency for executing SQL queries on top of Hadoop. And part of this has to do with bypassing the MapReduce infrastructure which involves significant overhead, especially when starting and stopping JBMs.
2)Cloudera also claims several magnitudes of improvement in performance compared to executing the same SQL queries using Hive.
3)Another benefit is that if we really wanted to look under the hood at what Cloudera has provided in Impala or if we wanted to tinker with the code, the source code is available for you to access and download.
Asked In: Many Interviews |