Course Outline

To request a customized course outline for this training, please contact us.

 

Requirements

  • An understanding of Scala and Java
  • An understanding of Apache Kafka and YARN
  14 Hours
 

Testimonials

Related Courses

Apache Ambari: Efficiently Manage Hadoop Clusters

 21 hours

Apache Ambari is an open-source management platform for provisioning, managing, monitoring and securing Apache Hadoop clusters. In this instructor-led live training participants will learn the management tools and practices provided by Ambari to

Apache Ignite for Developers

 14 hours

Apache Ignite is an in-memory computing platform that sits between the application and data layer to improve speed, scale, and availability. In this instructor-led, live training, participants will learn the principles behind persistent and pure

Apache Apex: Processing Big Data-in-Motion

 21 hours

Apache Apex is a YARN-native platform that unifies stream and batch processing. It processes big data-in-motion in a way that is scalable, performant, fault-tolerant, stateful, secure, distributed, and easily operable. This instructor-led, live

Unified Batch and Stream Processing with Apache Beam

 14 hours

Apache Beam is an open source, unified programming model for defining and executing parallel data processing pipelines. It's power lies in its ability to run both batch and streaming pipelines, with execution being carried out by one of

Apache Flink Fundamentals

 28 hours

Apache Flink is an open-source framework for scalable stream and batch data processing. This instructor-led, live training introduces the principles and approaches behind distributed stream and batch data processing, and walks participants

Hortonworks Data Platform (HDP) for Administrators

 21 hours

Hortonworks Data Platform (HDP) is an open-source Apache Hadoop support platform that provides a stable foundation for developing big data solutions on the Apache Hadoop ecosystem. This instructor-led, live training (online or onsite) introduces

Data Analysis with Hive/HiveQL

 7 hours

This course covers how to use Hive SQL language (AKA: Hive HQL, SQL on Hive, HiveQL) for people who extract data from Hive

Impala for Business Intelligence

 21 hours

Cloudera Impala is an open source massively parallel processing (MPP) SQL query engine for Apache Hadoop clusters. Impala enables users to issue low-latency SQL queries to data stored in Hadoop Distributed File System and Apache

Confluent KSQL

 7 hours

Confluent KSQL is a stream processing framework built on top of Apache Kafka. It enables real-time data processing using SQL operations. This instructor-led, live training (online or onsite) is aimed at developers who wish to implement Apache

Real-Time Stream Processing with MapR

 7 hours

In this instructor-led, live training, participants will learn the core concepts behind MapR Stream Architecture as they develop a real-time streaming application. By the end of this training, participants will be able to build producer and

Tigon: Real-time Streaming for the Real World

 14 hours

Tigon is an open-source, real-time, low-latency, high-throughput, native YARN, stream processing framework that sits on top of HDFS and HBase for persistence. Tigon applications address use cases such as network intrusion detection and analytics,

Apache NiFi for Administrators

 21 hours

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is written using flow-based programming and provides a

Apache NiFi for Developers

 7 hours

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is written using flow-based programming and provides a

Spark Streaming with Python and Kafka

 7 hours

Apache Spark Streaming is a scalable, open source stream processing system that allows users to process real-time data from supported sources. Spark Streaming enables fault-tolerant processing of data streams. This instructor-led, live

Apache Storm

 28 hours

Apache Storm is a distributed, real-time computation engine used for enabling real-time business intelligence. It does so by enabling applications to reliably process unbounded streams of data (a.k.a. stream processing). "Storm is for