Apache Tajo at the Bay Area HUG

On Nov. 5, 2013, the Bay Area Hadoop User Group (HUG) held a special meetup at the LinkedIn headquarters in Mountain View, CA. The event showcased two new Apache Incubator projects: Tajo and Samza.

Dr. Hyun-sik Choi introducing Apache Tajo

In the session “Apache Tajo: A Big Data Warehouse on Hadoop”, Dr. Hyun-sik Choi—Research Director at Gruter and lead Apache Tajo developer—gave a talk outlining the Tajo project.

Choi emphasized that Tajo is designed to run low-latency, scalable ETL and ad-hoc queries on large data sets. He also noted that, because it builds on advanced database techniques, Tajo retains the benefits of fault tolerance without being constrained by the shortcomings of the older MapReduce processing model.

Apache Tajo Introduction – Bay Area HUG Nov. 2013: LinkedIn Special Event from Gruter Corp

Jeong-shik Jang, VP of Gruter, followed the talk with a case study on the deployment of Tajo at South Korea’s largest mobile carrier, SK Telecom. The project passed benchmark testing with flying colors and has made strong progress since.

“Thanks to features such as dynamic task scheduling, Tajo has performed remarkably well when pitted against both Hive and Impala,” he explained.

“Surprisingly, even some short-duration queries outperformed Impala, despite Tajo needing to materialize its query output to HDFS and its intermediate output to the local disk so as to support fault-tolerance.”

Jeong-shik Jang presenting an Apache Tajo deployment case study

Apache Tajo Case Study – Bay Area HUG Nov 2013 from Gruter Corp

Choi reported that the event was a clear success for the Apache Tajo project: it raised awareness of the SQL-on-Hadoop solution’s capabilities in the Bay Area and let developers see the project’s impressive progress firsthand.

Said Choi, “It was an excellent opportunity for us to introduce Silicon Valley data infra engineers to the work we’ve been doing with Tajo. Our test results have been very strong, but nothing beats communicating with people face-to-face.”

He went on to explain that he had received a lot of positive feedback from those in attendance, as well as some important guidance on the project’s roadmap. “We’ve been able to get a better sense of the priorities of developers on the ground, and we’ve picked up another serious Apache Tajo committer—which is, of course, the lifeblood of open-source software.”

Beer time after the meetup

Young-kil Kwon feels welcome at the LinkedIn cafe

The Gruter team in Silicon Valley
