Tag Archives: Apache Tajo

Apache Tajo™ 0.11.0 released!

The Apache Software Foundation announced the release of Apache Tajo v0.11.0 on Oct 28. The new release brings significant enhancements for easier data integration and interoperability with other Hadoop ecosystem projects. "Tajo 0.11.0 represents a very important milestone. It introduced critical features and functions that let us build out a modern data warehouse system," said Hyunsik Choi, Vice ...

Tajo at ApacheCon – Big Data Europe 2015

Gruter senior developer and Apache Tajo co-founder Dr. Jihoon Son presented at Apache:Big Data Europe, held in Budapest, Hungary, on September 30. Son's challenging session looked at the new features in the upcoming Tajo 0.11 release, including query federation, JDBC-based storage support, and self-describing data format support, among others. With the coming release scheduled ...

Broadcast Join in Tajo

(This post was originally published at https://jihoonson.wordpress.com/.) Join is one of the most expensive operations in the relational world, and many researchers have studied efficient join processing. In distributed systems, there are two well-known join execution algorithms: repartition join and broadcast join. (There are many other join algorithms, including the recently introduced Track join, hyper shuffle join, and Tributary join, but ...

Setting up an Apache Tajo Cluster on Amazon EMR

Note: A bootstrap action script for EMR 4.x has been added. Check out the differences introduced in 4.x with the release of EMR 4.0 in July 2015. Apache Tajo™, or simply “Tajo”, is an open-source relational and distributed big data warehouse (“Big DW”) system which runs on Apache Hadoop and other stores. Tajo is designed for low-latency and scalable ...

Apache Tajo™ 0.10.0 now available!

The Apache Software Foundation announced the release of Apache Tajo v0.10 on Mar 9. The release heralds significant enhancements to the enterprise “SQL-on-Hadoop” big data warehouse solution, including performance improvements and wider ecosystem integration. "Tajo has evolved over the last couple of years into a mature 'SQL-on-Hadoop' engine," said Hyunsik Choi, Vice President of Apache Tajo and Gruter ...

Apache Tajo on Hadoopsphere.com

In a two-part article series entitled "Technical Deep Dive Into Apache Tajo", HadoopSphere.com conducts a Q&A with Dr. Hyunsik Choi, PMC Chair of Apache Tajo. In the first article of the series, Choi explains Tajo's design logic, including its unique distributed processing framework, pluggable storage manager, and advanced query optimization capabilities. See: http://www.hadoopsphere.com/2015/02/technical-deep-dive-into-apache-tajo.html

Gruter at Big Data World Convention 2014

Gruter Senior Developer Jaehwa Jung spoke at BWC 2014, which was held in Busan, South Korea, on Oct. 22-23. The Big Data World Convention is an initiative sponsored by the South Korean Ministry of Science, ICT and Future Planning, among other influential South Korean national IT bodies. Gruter Senior Developer Jaehwa Jung speaks to the BWC 2014 ...

Tajo now on Gartner’s SQL-on-Hadoop radar

Nick Heudecker, Gartner Research Director, recently posted a note looking at the features and performance of Apache Tajo on his Gartner blog. Nick is an analyst in the Gartner Intelligence Information Management Group, and is responsible for coverage of big data and NoSQL technologies. In his post, he noted the robust feature set of Tajo, and ...