Course Outline
Introduction
Overview of Spark Streaming Features and Architecture
- Supported data sources
- Core APIs
Preparing the Environment
- Dependencies
- Spark and streaming context
- Connecting to Kafka
Processing Messages
- Parsing inbound messages as JSON
- ETL processes
- Starting the streaming context
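To illustrate the message-parsing topic above, here is a minimal sketch in plain Python. It is not the course's own code and uses no Spark or Kafka API. It assumes each inbound message value arrives as UTF-8 bytes; the function name `parse_message` and the sample payloads are hypothetical.

```python
import json
from typing import Optional


def parse_message(raw: bytes) -> Optional[dict]:
    """Decode one message value as JSON; return None if it is malformed."""
    try:
        record = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    # Keep only JSON objects; bare numbers, strings, or lists are rejected here.
    return record if isinstance(record, dict) else None


# Hypothetical sample payloads standing in for Kafka message values.
messages = [b'{"user": "a", "amount": 3}', b"not json", b"[1, 2]"]
parsed = [m for m in map(parse_message, messages) if m is not None]
```

Returning None instead of raising lets a streaming job skip or divert bad records rather than fail the whole batch; whether that is the right policy depends on the pipeline.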
Performing Windowed Stream Processing
- Slide interval
- Checkpoint delivery configuration
- Launching the environment
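The relationship between window length and slide interval can be sketched in plain Python. This is an illustration under stated assumptions, not Spark code: `batches` is a hypothetical list of per-batch counts, and both `window` and `slide` are expressed as a number of batches (in Spark Streaming, window and slide durations must be multiples of the batch interval).

```python
from typing import Iterator, List, Tuple


def sliding_windows(
    batches: List[int], window: int, slide: int
) -> Iterator[Tuple[int, int]]:
    """Yield (end_index, total) for a window of `window` batches, emitted every `slide` batches."""
    for end in range(slide, len(batches) + 1, slide):
        # Early windows are partial until enough batches have arrived.
        start = max(0, end - window)
        yield end, sum(batches[start:end])


counts = [1, 2, 3, 4, 5, 6]  # hypothetical per-batch record counts
results = list(sliding_windows(counts, window=3, slide=2))
# results == [(2, 3), (4, 9), (6, 15)]
```

With a window of 3 and a slide of 2, consecutive windows overlap by one batch, so a single record can contribute to more than one result.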
Prototyping the Processing Code
- Connecting to a Kafka topic
- Retrieving JSON from a data source using Paw
- Variations and additional processing
Streaming the Code
- Job control variables
- Defining values to match
- Functions and conditions
Acquiring Stream Output
- Counters
- Kafka output (matched and non-matched)
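The matching, counter, and output topics above can be sketched together in plain Python. This is a hypothetical illustration, not the course's implementation: the function `route`, the field name `country`, and the sample records are all assumptions, and the two output lists stand in for the matched and non-matched Kafka topics.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def route(
    records: Iterable[dict], field: str, wanted: Set[str]
) -> Tuple[Counter, Dict[str, List[dict]]]:
    """Split records by whether record[field] is one of the wanted values."""
    counters: Counter = Counter()
    outputs: Dict[str, List[dict]] = {"matched": [], "non_matched": []}
    for record in records:
        # Records missing the field count as non-matched.
        key = "matched" if record.get(field) in wanted else "non_matched"
        counters[key] += 1
        outputs[key].append(record)
    return counters, outputs


# Hypothetical records; field values are assumed to be strings.
sample = [{"country": "DE"}, {"country": "FR"}, {"country": "DE"}, {"id": 7}]
counters, outputs = route(sample, field="country", wanted={"DE"})
```

In a real job each output list would be written to its own topic by a Kafka producer, and the counters would typically be reported per batch.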
Troubleshooting
Summary and Conclusion
Requirements
- Experience with Python and Apache Kafka
- Familiarity with stream-processing platforms
Audience
- Data engineers
- Data scientists
- Programmers
Testimonials (5)
The labs and the slides combine well with Jorge's knowledge and love for Kafka.
Willem - BMW SA
Course - Apache Kafka for Developers
Very interactive...
Richard Langford
Course - SMACK Stack for Data Science
Sufficient hands-on, trainer is knowledgeable
Chris Tan
Course - A Practical Introduction to Stream Processing
Great skills, examples, very good exercises
Marek Konieczny - G2A.COM Limited
Course - Kafka for Administrators
The course was excellent. Our trainer Andreas was very well prepared and answered all the questions that we asked. He also helped us when we had trouble and explained in detail when needed. The best course that I have ever been part of.