Course Outline

Introduction

  • Apache Spark vs Hadoop MapReduce

Overview of Apache Spark Features and Architecture

Choosing a Programming Language

Setting up Apache Spark

Creating a Sample Application

Choosing the Data Set

Running Data Analysis on the Data

Processing of Structured Data with Spark SQL

Processing Streaming Data with Spark Streaming

Integrating Apache Spark with Third-Party Machine Learning Tools

Using Apache Spark for Graph Processing

Optimizing Apache Spark

Troubleshooting

Summary and Conclusion

Requirements

  • Experience with the Linux command line
  • A general understanding of data processing
  • Programming experience with Java, Scala, Python, or R

Audience

  • Developers

Duration

  • 21 Hours
