Course Outline

Introduction to Data Mining and Machine Learning

  • Statistical learning vs. Machine learning
  • Iteration and evaluation
  • Bias-Variance trade-off

Regression

  • Linear regression
  • Generalizations and Nonlinearity
  • Exercises

Classification

  • Bayesian refresher
  • Naive Bayes
  • Discriminant analysis
  • Logistic regression
  • K-Nearest neighbors
  • Support Vector Machines
  • Neural networks
  • Decision trees
  • Exercises

Cross-validation and Resampling

  • Cross-validation approaches
  • Bootstrap
  • Exercises

Unsupervised Learning

  • K-means clustering
  • Examples
  • Challenges of unsupervised learning and beyond K-means

Advanced topics

  • Ensemble models
  • Mixed models
  • Boosting
  • Examples

Dimensionality reduction

  • Factor Analysis
  • Principal Component Analysis
  • Examples

Requirements

This course is part of the Data Scientist skill set (Domain: Analytical Techniques and Methods)

  14 Hours
 

Related Courses

AdaBoost Python for Machine Learning

 14 hours

AdaBoost is an algorithm that is used together with other machine learning models for optimal performance. It uses ensemble learning techniques, combining weaker models to form more accurate predictions. This instructor-led, live training (online

Artificial Intelligence (AI) with H2O

 14 hours

H2O is an open source predictive analytics platform. It supports R, Python, Scala, Java and REST. This instructor-led, live training (online or onsite) is aimed at technical persons who wish to build machine learning models using algorithms such

AutoML with Auto-Keras

 14 hours

Auto-Keras (also known as Autokeras or Auto Keras) is an open source Python library for automated machine learning (AutoML). This instructor-led, live training (online or onsite) is aimed at data scientists as well as less technical persons who

AutoML

 14 hours

AutoML is user-friendly machine learning software that automates much of the work needed to select an ideal machine learning algorithm, its parameter settings, and pre-processing methods. This instructor-led, live training (online or onsite) is

Google Cloud AutoML

 7 hours

Google Cloud AutoML is a machine learning (ML) platform that enables users to build, train, and deploy customized ML models specific to their business needs. This instructor-led, live training (online or onsite) is aimed at data scientists, data

AutoML with Auto-sklearn

 14 hours

Auto-sklearn is a Python package built around the scikit-learn machine learning library. It automatically searches for the right learning algorithm for a new machine learning dataset and optimizes its parameters. This instructor-led, live

Pattern Recognition

 21 hours

This instructor-led, live course provides an introduction to the field of pattern recognition and machine learning. It touches on practical applications in statistics, computer science, signal processing, computer vision, data mining, and

DataRobot

 7 hours

DataRobot is a machine learning platform that streamlines the building and deployment of predictive models. DataRobot accelerates predictive analytics, helping businesses make smarter decisions. This instructor-led, live training (online or

Data Mining with Weka

 14 hours

Waikato Environment for Knowledge Analysis (Weka) is an open-source data mining visualization software. It provides a collection of machine learning algorithms for data preparation, classification, clustering, and other data mining

H2O AutoML

 14 hours

H2O AutoML is an artificial intelligence platform that automates the process of building, selecting and optimizing large numbers of machine learning models. This instructor-led, live training (online or onsite) is aimed at data scientists who

Machine Learning for Mobile Apps using Google’s ML Kit

 14 hours

ML Kit is a mobile SDK provided by Google for integrating machine learning technologies into Android and iOS apps. It features usable APIs for barcode scanning, face detection, image labeling, object detection and tracking, text recognition,

Pattern Matching

 14 hours

Pattern Matching is a technique used to locate specified patterns within an image. It can be used to determine the existence of specified characteristics within a captured image, for example the expected label on a defective product in a factory

Machine Learning with Random Forest

 14 hours

Random Forest is an algorithm for machine learning that is used mostly for classification and regression. It utilizes multiple decision trees to generate more precise and accurate predictions. This instructor-led, live training (online or onsite)

RapidMiner for Machine Learning and Predictive Analytics

 14 hours

RapidMiner is an open source data science software platform for rapid application prototyping and development. It includes an integrated environment for data preparation, machine learning, deep learning, text mining, and predictive

Apache SystemML for Machine Learning

 14 hours

Apache SystemML is a distributed and declarative machine learning platform. SystemML provides declarative large-scale machine learning (ML) that aims at flexible specification of ML algorithms and automatic generation of hybrid runtime plans