The Apache Tajo team has been hard at work since being granted Top-Level Project status by the Apache Software Foundation in March, with Tajo 0.8 representing a major leap forward for the big data warehouse system. Check out the full story in the Tajo 0.8.0 release announcement.
The latest version includes a number of new features and enhancements aimed at keeping the SQL-on-Hadoop engine at the forefront of serious, long-running data queries.
According to Hyunsik Choi, Apache Tajo PMC Chair, Tajo 0.8 sees the resolution of 363 issues in all, including the addition of 25 new features, 81 minor tweaks, and the fixing of 164 bugs.
The source and binary release tarballs for the new release are available for free download at http://tajo.apache.org/downloads.
Apache Tajo 0.8 Select Features
- More comprehensive SQL functionality, including support for quoted identifiers, datetime data types, and a number of new SQL functions
- Enhanced performance and scalability through I/O and query optimization, and reduced GC overheads and memory usage
- Support for new storage types, including Parquet, Avro and Amazon S3
- Additional Hadoop version support (2.2.0, 2.3.0 and 2.4.0)
- Hive integration through Hive metastore support
- Improved usability, including the addition of a web UI, the tsql shell, and a JDBC driver
Apache Tajo 0.9 Roadmap
- Multi-tenancy support, including YARN clusters, basic user authentication, and the Sparrow scheduler
- More comprehensive SQL functionality, including window functions, IN/EXISTS subqueries, and scalar subqueries
- Query optimization, such as Parquet statistics utilization
- DAG framework, such as parallel stage execution
- Query execution with JSON
- and more
Apache Tajo Further Information
- Apache Tajo™ site (web)
- Apache Tajo™ solution overview (slideshare)
- Apache Tajo™ field test work (slideshare)