Course Outline

Introduction

Overview of Apache Kafka Features and Architecture for Python

  • Core APIs (Producer, Consumer, Streams, Connect)
  • Concepts and uses

Accessing Kafka in Python

  • Available Python client libraries
  • Supported compression formats
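
A minimal sketch of what client access looks like in Python, assuming the third-party kafka-python and confluent-kafka packages and a broker listening on localhost:9092 (the package choice, topic names, and address are illustrative, not part of the outline):

    # Option 1: kafka-python
    from kafka import KafkaProducer
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        compression_type="gzip",  # gzip, snappy, lz4, and zstd are supported when the codecs are installed
    )

    # Option 2: confluent-kafka (librdkafka-based)
    from confluent_kafka import Producer
    producer = Producer({"bootstrap.servers": "localhost:9092", "compression.type": "gzip"})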

Installing Apache Kafka

  • Local machine installation
  • Virtual private server and virtual machine installation

Starting Kafka Broker Server

  • Reading and editing configuration files using an IDE (Integrated Development Environment)
  • Running ZooKeeper
  • Logs folder
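
Once ZooKeeper and the broker are running, connectivity can be verified from Python. A minimal sketch, assuming kafka-python and a broker on the default localhost:9092:

    from kafka import KafkaConsumer

    # Opening a consumer without subscribing is enough to confirm the broker answers;
    # topics() asks the cluster for the topic names it currently knows about.
    consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
    print(consumer.topics())
    consumer.close()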

Creating a Kafka Topic

  • Connecting to a Kafka cluster
  • Reading topic details
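
A sketch of creating a topic and reading back its details programmatically, assuming kafka-python; the topic name, partition count, and replication factor below are illustrative:

    from kafka import KafkaConsumer
    from kafka.admin import KafkaAdminClient, NewTopic

    # Connect to the cluster and create a new topic.
    admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
    admin.create_topics([NewTopic(name="demo-topic", num_partitions=3, replication_factor=1)])

    # Read back basic topic details: which partitions exist for the new topic.
    consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
    print(consumer.partitions_for_topic("demo-topic"))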

Sending Messages Using Producers

  • Initiating a producer
  • Examining incoming messages
  • Running multiple producers
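
A minimal producer sketch, assuming kafka-python and the illustrative topic "demo-topic"; running the same script in several terminals is one simple way to exercise multiple producers against one topic:

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    # Values are sent as bytes; send() is asynchronous, flush() blocks until delivery.
    for i in range(10):
        producer.send("demo-topic", value=f"message {i}".encode("utf-8"))
    producer.flush()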

Consuming Messages

  • Kafka Console Consumer
  • Running multiple consumers
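
A minimal consumer sketch, assuming kafka-python; the kafka-console-consumer shell tool covered in the course does the same job from the command line. Starting several copies with the same group_id spreads partitions across consumers, which is how running multiple consumers is typically demonstrated:

    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "demo-topic",                       # illustrative topic name
        bootstrap_servers="localhost:9092",
        group_id="demo-group",              # consumers sharing a group split the partitions
        auto_offset_reset="earliest",       # start from the beginning if no offset is committed
    )

    for message in consumer:
        print(message.partition, message.offset, message.value)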

Troubleshooting

Summary and Conclusion

Requirements

  • Experience with Python programming language
  • Familiarity with stream-processing platforms

Audience

  • Data engineers
  • Data scientists
  • Programmers

Duration

  • 7 hours

Related Courses

Apache Ignite for Developers

 14 hours

Apache Ignite is an in-memory computing platform that sits between the application and data layer to improve speed, scale, and availability.

Apache Apex: Processing Big Data-in-Motion

 21 hours

Apache Apex is a YARN-native platform that unifies stream and batch processing. It processes big data-in-motion in a way that is scalable, performant, fault-tolerant, stateful, secure, distributed, and easily operable.

Unified Batch and Stream Processing with Apache Beam

 14 hours

Apache Beam is an open source, unified programming model for defining and executing parallel data processing pipelines. Its power lies in its ability to run both batch and streaming pipelines.

Scaling Data Analysis with Python and Dask

 14 hours

Dask is a flexible and high-performance Python library for parallel computing. It scales and accelerates big data processing with other Python-based data science libraries, such as Pandas, NumPy, and Scikit-Learn.

Data Analysis with Python, Pandas, and Numpy

 14 hours

Pandas is a Python package that provides data structures for working with structured (tabular, multidimensional, potentially heterogeneous) and time series data.

Accelerating Python Pandas Workflows with Modin

 14 hours

Modin is a parallel data frame system designed to speed up Pandas workflows. It can be used to handle large datasets, leveraging Ray or Dask as the backend framework for distributed computing in Python.

Machine Learning with Python and Pandas

 14 hours

Pandas is a Python library for data manipulation and analysis. Using Pandas, users can perform predictive analysis through machine learning.

FARM (FastAPI, React, and MongoDB) Full Stack Development

 14 hours

FARM (FastAPI, React, and MongoDB) is similar to MERN, but performs faster with Python and FastAPI replacing Node.js and Express as the backend. FastAPI is a high-performance Python web framework used by top companies such as Microsoft and Uber.

Developing APIs with Python and FastAPI

 14 hours

FastAPI is an open source, high-performance web framework for building APIs with Python. It is used by many large companies, such as Uber, Netflix, and Microsoft.

Apache Flink Fundamentals

 28 hours

Apache Flink is an open-source framework for scalable stream and batch data processing. This instructor-led, live training introduces the principles and approaches behind distributed stream and batch data processing.

Confluent KSQL

 7 hours

Confluent KSQL is a stream processing framework built on top of Apache Kafka. It enables real-time data processing using SQL operations.

Apache NiFi for Administrators

 21 hours

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking, and automation of data between systems. It is written using flow-based programming.

Apache NiFi for Developers

 7 hours

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking, and automation of data between systems. It is written using flow-based programming.

Spark Streaming with Python and Kafka

 7 hours

Apache Spark Streaming is a scalable, open source stream processing system that allows users to process real-time data from supported sources. Spark Streaming enables fault-tolerant processing of data streams.

Apache Storm

 28 hours

Apache Storm is a distributed, real-time computation engine used for enabling real-time business intelligence. It does so by enabling applications to reliably process unbounded streams of data (a.k.a. stream processing).