Partner Integration: Apache Hudi + StarRocks
What is Apache Hudi?
Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi replaces slow, old-school batch data processing with a powerful incremental processing framework for low-latency, minute-level analytics.
What is StarRocks?
StarRocks is a next-generation, blazing-fast massively parallel processing (MPP) database designed to make real-time analytics easy for enterprises. It is built to power sub-second queries at scale. StarRocks can read data stored in Apache Hudi.
StarRocks + Apache Hudi = The Modern Open Data Lake
Technical Benefits
- Our performance tests have shown that StarRocks can get close to local-disk performance when querying data in Apache Hudi.
- No lock-in on the query layer. You can swap out the query layer when it no longer meets your technical or financial requirements.
- Get all the capabilities of an OLAP database, such as JOINs and materialized views, on the data within Apache Hudi (you can also JOIN across Apache Iceberg, Apache Hudi, and Apache Hive tables).
- Many database tools just work out of the box, thanks to StarRocks' support for the MySQL wire protocol.
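As a rough illustration of the last two points, the sketch below builds the SQL a client might send to StarRocks over the MySQL protocol: one statement that registers a Hudi external catalog backed by a Hive metastore, and one that joins a Hudi table with an Iceberg table. The catalog names, table names, metastore URI, and helper functions are all hypothetical, and the property keys should be checked against the StarRocks Hudi catalog documentation for your version.

```python
def create_hudi_catalog_sql(catalog: str, metastore_uri: str) -> str:
    """Build a CREATE EXTERNAL CATALOG statement for a Hudi catalog.

    Assumes the Hudi tables are registered in a Hive metastore.
    Inputs are interpolated directly, so pass trusted values only.
    """
    return (
        f"CREATE EXTERNAL CATALOG {catalog} "
        'PROPERTIES ("type" = "hudi", '
        '"hive.metastore.type" = "hive", '
        f'"hive.metastore.uris" = "{metastore_uri}")'
    )


def cross_catalog_join_sql(left: str, right: str, key: str) -> str:
    """Build a JOIN across two fully qualified tables (catalog.db.table).

    The tables may live in different external catalogs, for example
    one in Hudi and one in Iceberg.
    """
    return (
        f"SELECT l.{key}, COUNT(*) AS n "
        f"FROM {left} AS l JOIN {right} AS r ON l.{key} = r.{key} "
        f"GROUP BY l.{key}"
    )


def run(conn, sql: str):
    """Execute one statement on any DB-API style connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()
```

With a MySQL-compatible driver such as pymysql (an assumption; any DB-API driver should behave similarly), you would connect to the StarRocks FE query port, 9030 by default, and pass the generated statements to `run`.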
Apache Hudi + StarRocks Webinar
Try out our hands-on lab!
One of the best ways to understand our product is through our hands-on labs at https://killercoda.com/starrocks/
Resources
Tutorial: How to query data in Apache Hudi using StarRocks
Video: Apache Hudi Community Call May 2023
Documentation: StarRocks Apache Hudi External Catalog
Documentation: StarRocks querying on hudi.apache.org