Course Outline

Week 1 Big Data Concepts

  • Definition of the four Vs (Velocity, Volume, Variety, Veracity)
  • Limits to traditional data processing capacity
  • Distributed Processing
  • Statistical Analysis
  • Machine Learning Analysis Types
  • Data Visualization
  • Distributed Processing (e.g. map-reduce)
  • Introduction to the languages used
  • R language crash course
  • Python crash course

Weeks 2 & 3 Performing Data Analysis

  • Statistical Analysis
  • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Inferential Statistics (estimation)
  • Forecasting with Correlation and Regression models
  • Time Series analysis
  • Basics of Machine Learning
  • Supervised vs unsupervised learning
  • Classification and clustering
  • Estimating the cost of specific methods
  • Filter

Week 4 Natural Language Processing

  • Processing text
  • Understanding the meaning of text
  • Automatic text generation
  • Sentiment/Topic Analysis
  • Computer Vision

Weeks 5 & 6 Tooling Concepts

  • Data storage solutions (SQL, NoSQL, hierarchical, object-oriented, document-oriented)
  • (MySQL, Cassandra, MongoDB, Elasticsearch, HDFS, etc.)
  • Choosing the right solution for the problem
  • Distributed Processing
  • Spark
  • Machine Learning with Spark (MLlib)
  • Spark SQL
  • Scalability
  • Public cloud (AWS, Google, etc.)
  • Private cloud (OpenStack, Cloud Foundry)
  • Autoscaling

Week 7 Soft Skills

  • Advisory & Leadership Skills
  • Making an impact: data-driven storytelling
  • Understanding your audience
  • Effective data presentation - getting your message across
  • Influencing effectively and leading change
  • Handling difficult situations


  • End-of-programme graduation exam


Participants should have a good grounding in maths, at least to high-school level.

Programming skills are not required, but any programming experience will be useful.

Participants will be assessed and interviewed prior to participation in this training programme.

 245 Hours
