Course Outline
Introduction:
- Apache Spark in Hadoop Ecosystem
- Short introduction to Python and Scala
Basics (theory):
- Architecture
- RDD
- Transformations and Actions (see the sketch after this list)
- Stage, Task, Dependencies
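Before the hands-on part, a minimal PySpark sketch (not from the course materials; data and names are illustrative) of the transformation/action distinction above: transformations only record lineage, while an action submits a job that Spark splits into stages and tasks.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdd-basics").getOrCreate()
    sc = spark.sparkContext

    # Transformations are lazy: nothing runs yet, Spark only records lineage.
    numbers = sc.parallelize(range(1, 11))         # RDD of 1..10
    squares = numbers.map(lambda x: x * x)         # narrow dependency
    evens = squares.filter(lambda x: x % 2 == 0)   # still no job submitted

    # An action triggers a job, which Spark splits into stages and tasks.
    print(evens.collect())   # [4, 16, 36, 64, 100]
    print(evens.count())     # a second action runs a second job

    spark.stop()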
Using the Databricks environment to understand the basics (hands-on workshop):
- Exercises using the RDD API (see the PairRDD sketch after this list)
- Basic action and transformation functions
- PairRDD
- Join
- Caching strategies
- Exercises using the DataFrame API (see the DataFrame sketch after this list)
- SparkSQL
- DataFrame: select, filter, group, sort
- UDF (User Defined Function)
- A look at the Dataset API
- Streaming (see the Structured Streaming sketch after this list)
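To illustrate the RDD exercises above, a short hypothetical PySpark sketch of pair RDDs, per-key aggregation, a join, and caching; the data and names are invented for this example.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pair-rdd-demo").getOrCreate()
    sc = spark.sparkContext

    # Pair RDDs are RDDs of (key, value) tuples.
    orders = sc.parallelize([("alice", 30), ("bob", 20), ("alice", 15)])
    users = sc.parallelize([("alice", "DE"), ("bob", "HU")])

    # reduceByKey aggregates per key; join matches the two RDDs on their keys.
    totals = orders.reduceByKey(lambda a, b: a + b)   # ("alice", 45), ("bob", 20)
    joined = totals.join(users)                       # ("alice", (45, "DE")), ...

    # cache() keeps the joined RDD in memory, so the two actions below
    # reuse it instead of recomputing the whole lineage twice.
    joined.cache()
    print(joined.count())
    print(joined.collect())

    spark.stop()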
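Likewise, a hedged sketch of the DataFrame topics: the same aggregation written with the DataFrame API and with SparkSQL, plus a simple UDF. Column and table names are invented.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("dataframe-demo").getOrCreate()

    df = spark.createDataFrame(
        [("alice", "books", 30), ("bob", "games", 20), ("alice", "books", 15)],
        ["user", "category", "amount"])

    # select / filter / group / sort with the DataFrame API
    per_user = (df.filter(F.col("amount") > 10)
                  .groupBy("user")
                  .agg(F.sum("amount").alias("total"))
                  .orderBy(F.desc("total")))
    per_user.show()

    # The same query expressed through SparkSQL
    df.createOrReplaceTempView("purchases")
    spark.sql("""SELECT user, SUM(amount) AS total
                 FROM purchases WHERE amount > 10
                 GROUP BY user ORDER BY total DESC""").show()

    # A user defined function; plain Python, so slower than built-in functions
    shout = F.udf(lambda s: s.upper(), StringType())
    df.select(shout(F.col("user")).alias("user_upper")).show()

    spark.stop()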
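Finally, a minimal Structured Streaming sketch; it uses the built-in rate source purely because it needs no external system, and the window sizes are arbitrary.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("streaming-demo").getOrCreate()

    # The "rate" source emits (timestamp, value) rows, handy for testing.
    stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

    # Count events per 5-second window, tolerating 10 seconds of lateness.
    counts = (stream
              .withWatermark("timestamp", "10 seconds")
              .groupBy(F.window("timestamp", "5 seconds"))
              .count())

    query = (counts.writeStream
                   .outputMode("update")
                   .format("console")
                   .start())
    query.awaitTermination(30)   # let it run for ~30 seconds
    query.stop()
    spark.stop()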
Using the AWS environment to understand deployment (hands-on workshop):
- Basics of AWS Glue (see the Glue job sketch after this list)
- Understand the differences between AWS EMR and AWS Glue
- Example jobs on both environments
- Understand pros and cons
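For a feel of the Glue side, a sketch of what a minimal Glue job script could look like; it assumes the awsglue libraries that Glue provides at runtime, and the catalog database, table, and S3 path are hypothetical.

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions

    # Glue passes job parameters on the command line.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])

    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read a table registered in the Glue Data Catalog (hypothetical names).
    dyf = glue_context.create_dynamic_frame.from_catalog(
        database="sales_db", table_name="orders")

    # DynamicFrames convert to ordinary Spark DataFrames and back.
    df = dyf.toDF().filter("amount > 0")

    # Write the result to S3 as Parquet (hypothetical bucket).
    df.write.mode("overwrite").parquet("s3://my-bucket/clean/orders/")

    job.commit()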
Extra:
- Introduction to Apache Airflow orchestration (see the DAG sketch below)
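As a taste of orchestration, a hedged sketch of an Airflow DAG that submits a Spark job daily; it assumes Airflow 2.4+ with the apache-airflow-providers-apache-spark package installed, and the script path and connection id are placeholders.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    # One DAG with a single task that runs spark-submit once a day.
    with DAG(
        dag_id="daily_spark_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",   # on Airflow < 2.4 use schedule_interval instead
        catchup=False,
    ) as dag:
        SparkSubmitOperator(
            task_id="run_etl",
            application="/opt/jobs/etl_job.py",   # placeholder PySpark script
            conn_id="spark_default",              # default Spark connection id
        )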
Requirements
Programming skills (preferably Python or Scala)
SQL basics
Testimonials
Getting to learn Spark Streaming, Databricks, and AWS Redshift.
Lim Meng Tee - Jobstreet.com Shared Services Sdn. Bhd.
The content and the knowledge.
Jobstreet.com Shared Services Sdn. Bhd.
It was very informative. I've had very little experience with Spark before and so far this course has provided a very good introduction to the subject.
Intelligent Medical Objects
It was great to get an understanding of what is going on under the hood of Spark. Knowing what's going on under the hood helps to better understand why your code is or is not doing what you expect it to do. A lot of the training was hands on which is always great and the section on optimizations was exceptionally relevant to my current work which was nice.
Intelligent Medical Objects
This is a great class! I most appreciate that Andras explains very clearly what Spark is all about, where it came from, and what problems it is able to solve. Much better than other introductions I've seen that just dive into how to use it. Andras has a deep knowledge of the topic and explains things very well.
Intelligent Medical Objects
The live examples that were given, which showed the basic aspects of Spark.
Intelligent Medical Objects
1. The right balance between high-level concepts and technical details. 2. Andras is very knowledgeable about what he teaches. 3. The exercises.
Steven Wu - Intelligent Medical Objects
Having hands-on sessions / assignments.
Poornima Chenthamarakshan - Intelligent Medical Objects
The trainer adjusted the training slightly based on audience requests, shedding light on a few different topics that we had asked about.
Intelligent Medical Objects
His pace was great. I loved the fact that he went into theory too, so that I understood WHY I would do the things he was asking.
Intelligent Medical Objects
Related Courses
Artificial Intelligence - the most applied stuff - Data Analysis + Distributed AI + NLP
21 hours. This course is aimed at developers and data scientists who wish to understand and implement AI within their applications. Special focus is given to Data Analysis, Distributed AI, and NLP.
Apache Spark MLlib
35 hours. MLlib is Spark's machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. It consists of common learning algorithms and utilities, including classification, regression, clustering, and collaborative filtering.
Alluxio: Unifying Disparate Storage Systems
7 hours. Alluxio is an open-source virtual distributed storage system that unifies disparate storage systems and enables applications to interact with data at memory speed. It is used by companies such as Intel, Baidu, and Alibaba.
Big Data Analytics in Health
21 hours. Big data analytics involves the process of examining large amounts of varied data sets in order to uncover correlations, hidden patterns, and other useful insights.
Apache Spark for .NET Developers
21 hours. Apache Spark is a distributed processing engine for analyzing very large data sets. It can process data in batches and in real time, as well as carry out machine learning, ad-hoc queries, and graph processing. .NET for Apache Spark is a free, open-source, and cross-platform big data analytics framework.
Apache Spark Fundamentals
21 hours. Apache Spark is an analytics engine designed to distribute data across a cluster in order to process it in parallel. It contains modules for streaming, SQL, machine learning, and graph processing.
Spark for Developers
21 hours. OBJECTIVE: This course will introduce Apache Spark. The students will learn how Spark fits into the Big Data ecosystem and how to use Spark for data analysis. The course covers, among other things, the Spark shell for interactive data analysis.
Apache Spark SQL
7 hours. Spark SQL is Apache Spark's module for working with structured and unstructured data. Spark SQL provides information about the structure of the data as well as the computation being performed. This information can be used to perform additional optimizations.
Introduction to Graph Computing
28 hours. Many real-world problems can be described in terms of graphs: for example, the Web graph, the social network graph, the train network graph, and the language graph. These graphs tend to be extremely large; processing them requires a specialized set of tools and techniques.
Hortonworks Data Platform (HDP) for Administrators
21 hours. Hortonworks Data Platform (HDP) is an open-source Apache Hadoop support platform that provides a stable foundation for developing big data solutions on the Apache Hadoop ecosystem.
A Practical Introduction to Stream Processing
21 hours. Stream Processing refers to the real-time processing of "data in motion", that is, performing computations on data as it is being received. Such data is read as continuous streams from data sources such as sensor events and website user activity.
Magellan: Geospatial Analytics on Spark
14 hours. Magellan is an open-source distributed execution engine for geospatial analytics on big data. Implemented on top of Apache Spark, it extends Spark SQL and provides a relational abstraction for geospatial analytics.
SMACK Stack for Data Science
14 hours. SMACK is a collection of data platform software, namely Apache Spark, Apache Mesos, Apache Akka, Apache Cassandra, and Apache Kafka. Using the SMACK stack, users can create and scale data processing platforms.
Python and Spark for Big Data (PySpark)
21 hours. Python is a high-level programming language famous for its clear syntax and code readability. Spark is a data processing engine used in querying, analyzing, and transforming big data. PySpark allows users to interface Spark with Python.
Apache Spark Streaming with Scala
21 hours. Scala is a concise JVM language for large-scale functional and object-oriented programming. Apache Spark Streaming is an extended component of the Spark API for processing big data sets as real-time streams.