Kafka Meets Iceberg: Real-Time Data Streaming into Modern Data Lakes and Warehouses

Kasun Indrasiri | GOTO Chicago 2024

Transcript

In this talk, we'll explore how Kafka serves as a platform for capturing real-time streaming data and how organizations are increasingly adopting the Apache Iceberg table format to store data in data lakes and data warehouses. We'll discuss the key benefits of using Apache Iceberg tables in your data lake, such as schema evolution, ACID transactions, hidden partitioning, time travel, and efficient querying.

Next, we'll dive into how to efficiently stream data from Kafka into Iceberg-based data lakes. We'll introduce Confluent Tableflow as a potential solution for streamlining the ingestion of Kafka streams into Iceberg tables within your data lake. A live demo will show Kafka and Iceberg working together, giving participants practical knowledge they can apply to their own data architectures for real-time analytics.

  • The role of Kafka in real-time data streaming
  • Why Apache Iceberg is essential for data lakes and data warehouses
  • Iceberg fundamentals: Core concepts and key features
  • Streaming data from Kafka to Iceberg tables in data lakes
  • Use case: Leveraging Confluent Tableflow to stream Kafka data into data lakes and warehouses
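
To make the ideas above concrete, here is a minimal, purely illustrative Python sketch of the pattern the talk describes: messages from a stream are committed to a table in micro-batches, each commit produces a new snapshot, partition values are derived from a timestamp column rather than supplied by the writer, and older snapshots remain readable. This is a toy in-memory model, not the Iceberg API, a Kafka client, or Confluent Tableflow; `ToyTable`, `day_partition`, `micro_batches`, and the sample `topic` data are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def day_partition(ts_ms: int) -> str:
    """Derive a partition value from an event timestamp (a 'day'-style transform)."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")


@dataclass
class ToyTable:
    # Each snapshot is an immutable tuple of (partition, record) rows.
    # Snapshot 0 is the empty table.
    snapshots: list = field(default_factory=lambda: [()])

    def commit(self, batch):
        """Append a batch as one new snapshot: readers see all of it or none of it."""
        rows = tuple((day_partition(r["ts_ms"]), r) for r in batch)
        self.snapshots.append(self.snapshots[-1] + rows)
        return len(self.snapshots) - 1  # id of the new snapshot

    def scan(self, snapshot_id=None, day=None):
        """Read a snapshot (latest by default), optionally pruned to one day."""
        rows = self.snapshots[-1 if snapshot_id is None else snapshot_id]
        return [r for p, r in rows if day is None or p == day]


def micro_batches(messages, size):
    """Split a message list into fixed-size batches."""
    for i in range(0, len(messages), size):
        yield messages[i:i + size]


# Stand-in for messages consumed from a Kafka topic (hypothetical data).
topic = [
    {"order_id": 1, "ts_ms": 1_700_000_000_000},
    {"order_id": 2, "ts_ms": 1_700_000_060_000},
    {"order_id": 3, "ts_ms": 1_700_090_000_000},
]

table = ToyTable()
snapshot_ids = [table.commit(batch) for batch in micro_batches(topic, 2)]
```

Calling `table.scan()` reads the latest snapshot, `table.scan(snapshot_id=1)` reads the table as it was after the first commit (the time-travel idea), and `table.scan(day="2023-11-15")` returns only rows whose derived partition matches, without the writer ever having supplied a partition column.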

About the speakers

Kasun Indrasiri

Author of "Microservices for the Enterprise"
