The Big Data world is evolving from batch-oriented to stream-oriented. Instead of capturing data and then running batch jobs to process it, systems process data as it arrives, so that valuable information can be extracted more quickly.
This talk describes the architecture patterns, tools, and techniques that have emerged for these systems. We'll focus on the following topics:
- The trade-offs that guide design decisions, such as latency and throughput requirements
- The unique challenges of analytics over streams, such as handling late-arriving data and performing expensive calculations like machine learning model training
- The major tools available for streaming data
- How to integrate these tools to implement specific scenarios