Course Outline

Introduction

This section offers a general overview of machine learning: when to apply it, key considerations, and the underlying concepts, including its advantages and limitations. Topics covered include data types (structured, unstructured, static, streamed); data validity and volume; data-driven versus user-driven analytics; statistical models compared to machine learning models; challenges in unsupervised learning; the bias-variance trade-off; iteration and evaluation; cross-validation techniques; and the distinctions between supervised, unsupervised, and reinforcement learning.

KEY TOPICS

1. Comprehending Naive Bayes

  • Fundamental concepts of Bayesian methods
  • Probability principles
  • Joint probability
  • Conditional probability via Bayes' theorem
  • The Naive Bayes algorithm
  • Naive Bayes classification
  • The Laplace estimator
  • Utilizing numeric features with Naive Bayes
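As an illustrative sketch (not the course's own code), Naive Bayes with numeric features can be run via scikit-learn's GaussianNB; the toy data below is made up, and Laplace-style smoothing corresponds to the `alpha` parameter of the related MultinomialNB:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Two made-up numeric features, two classes
X = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 3.9]])
y = np.array([0, 0, 1, 1])

# GaussianNB models each feature per class as a normal distribution,
# then applies Bayes' theorem to combine the per-feature likelihoods
model = GaussianNB()
model.fit(X, y)
pred = model.predict([[1.1, 2.0], [4.0, 4.0]])
```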

2. Comprehending Decision Trees

  • Divide and conquer strategies
  • The C5.0 decision tree algorithm
  • Selecting the optimal split
  • Pruning decision trees
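A minimal taste of these topics, assuming scikit-learn (whose trees implement CART rather than C5.0; here `max_depth` stands in for pruning, and `criterion="entropy"` mirrors the information-gain split selection discussed above):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Entropy-based splits; limiting depth pre-prunes the tree
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)
acc = tree.score(X, y)
```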

3. Comprehending Neural Networks

  • Transition from biological to artificial neurons
  • Activation functions
  • Network topology
  • Determining the number of layers
  • Direction of information flow
  • Node count per layer
  • Training neural networks using backpropagation
  • Deep Learning
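The topology and backpropagation topics can be previewed with scikit-learn's MLPClassifier, which trains by backpropagation; the dataset and single hidden layer of 8 nodes are arbitrary illustrative choices:

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# A small non-linear toy problem
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# One hidden layer with 8 nodes; weights are fit via backpropagation
mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
mlp.fit(X, y)
train_acc = mlp.score(X, y)
```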

4. Comprehending Support Vector Machines

  • Classification using hyperplanes
  • Identifying the maximum margin
  • Handling linearly separable data
  • Handling non-linearly separable data
  • Employing kernels for non-linear spaces
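To preview the kernel idea, a hedged sketch comparing a linear SVM with an RBF-kernel SVM on data that is not linearly separable (toy concentric circles via scikit-learn; not the course's own example):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no separating hyperplane exists in the original space
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)  # kernel maps to a space where a hyperplane works

linear_acc = linear_svm.score(X, y)
rbf_acc = rbf_svm.score(X, y)
```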

5. Comprehending Clustering

  • Clustering as a machine learning objective
  • The k-means clustering algorithm
  • Using distance metrics for cluster assignment and updating
  • Determining the appropriate number of clusters
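A minimal k-means sketch, assuming scikit-learn; the two obvious groups in the made-up points let the assignment-and-update loop converge immediately:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated made-up groups
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])

# k-means alternates: assign points to the nearest centroid, then move
# each centroid to the mean of its assigned points
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
```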

6. Evaluating Performance for Classification

  • Working with classification prediction data
  • Examining confusion matrices
  • Using confusion matrices to assess performance
  • Beyond accuracy – alternative performance metrics
  • The kappa statistic
  • Sensitivity and specificity
  • Precision and recall
  • The F-measure
  • Visualizing performance trade-offs
  • ROC curves
  • Estimating future performance
  • The holdout method
  • Cross-validation
  • Bootstrap sampling
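Several of the metrics above can be computed directly with scikit-learn; the predictions below are invented so the numbers are easy to check by hand (3 true positives, 1 false positive, 1 false negative):

```python
from sklearn.metrics import (cohen_kappa_score, confusion_matrix,
                             f1_score, precision_score, recall_score)

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)      # rows: actual, columns: predicted
prec = precision_score(y_true, y_pred)     # TP / (TP + FP)
rec = recall_score(y_true, y_pred)         # TP / (TP + FN), a.k.a. sensitivity
f1 = f1_score(y_true, y_pred)              # harmonic mean of precision and recall
kappa = cohen_kappa_score(y_true, y_pred)  # agreement corrected for chance
```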

7. Optimizing Standard Models for Enhanced Performance

  • Utilizing caret for automated parameter tuning
  • Developing a simple tuned model
  • Customizing the tuning process
  • Enhancing model performance through meta-learning
  • Understanding ensembles
  • Bagging
  • Boosting
  • Random forests
  • Training random forests
  • Evaluating random forest performance
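caret is an R package; as a rough Python analogue (an assumption, not necessarily the course's tooling), scikit-learn's GridSearchCV can tune a random forest over a small parameter grid with cross-validation:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Try each parameter combination with 3-fold cross-validation
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 50], "max_depth": [2, None]},
    cv=3,
)
grid.fit(X, y)
best = grid.best_params_
```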

ADDITIONAL TOPICS

8. Comprehending Classification via Nearest Neighbors

  • The kNN algorithm
  • Calculating distance
  • Selecting an appropriate k value
  • Preparing data for kNN application
  • Why the kNN algorithm is considered lazy
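An illustrative kNN sketch, assuming scikit-learn; rescaling the made-up features first shows why data preparation matters when one feature's range dwarfs another's:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Second feature spans a much larger range than the first
X = np.array([[1.0, 200.0], [1.2, 180.0], [3.0, 900.0], [3.3, 950.0]])
y = np.array([0, 0, 1, 1])

# Rescale to [0, 1] so both features contribute to the distance
X_scaled = MinMaxScaler().fit_transform(X)

# "Lazy": fit just stores the data; distances are computed at predict time
knn = KNeighborsClassifier(n_neighbors=3).fit(X_scaled, y)
pred = knn.predict(X_scaled)
```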

9. Comprehending Classification Rules

  • Separate and conquer approach
  • The One Rule algorithm
  • The RIPPER algorithm
  • Extracting rules from decision trees
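The One Rule idea fits in a few lines of plain Python; the tiny weather-style dataset below is invented for illustration:

```python
from collections import Counter, defaultdict

# (outlook, windy) -> play; made-up rows
rows = [
    ("sunny", "no", "yes"),
    ("sunny", "yes", "no"),
    ("rain", "no", "yes"),
    ("rain", "yes", "no"),
    ("sunny", "no", "yes"),
]

def one_rule(feature_index):
    """Majority-class rule per feature value; return (rules, total errors)."""
    buckets = defaultdict(Counter)
    for row in rows:
        buckets[row[feature_index]][row[-1]] += 1
    rules = {v: c.most_common(1)[0][0] for v, c in buckets.items()}
    errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in buckets.values())
    return rules, errors

# One Rule keeps the single feature whose rule set makes the fewest mistakes
best_feature = min(range(2), key=lambda i: one_rule(i)[1])
```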

10. Comprehending Regression

  • Simple linear regression
  • Ordinary least squares estimation
  • Correlations
  • Multiple linear regression
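A minimal simple-linear-regression sketch using scikit-learn's ordinary-least-squares LinearRegression, with made-up data that lies exactly on the line y = 2x + 1:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])  # exactly y = 2x + 1

# Ordinary least squares: minimizes the sum of squared residuals
reg = LinearRegression().fit(X, y)
slope, intercept = reg.coef_[0], reg.intercept_
```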

11. Comprehending Regression Trees and Model Trees

  • Incorporating regression into trees
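A hedged sketch of the core idea, assuming scikit-learn's DecisionTreeRegressor: a depth-one tree splits the data once and predicts each leaf's mean value, which is how regression is incorporated into trees:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Two made-up groups of x values with distinct target levels
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([1.0, 1.2, 0.9, 5.0, 5.1, 4.9])

# One split; each leaf predicts the mean of its training targets
rt = DecisionTreeRegressor(max_depth=1).fit(X, y)
preds = rt.predict([[2.0], [11.0]])
```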

12. Comprehending Association Rules

  • The Apriori algorithm for association rule learning
  • Measuring rule interest – support and confidence
  • Constructing a set of rules using the Apriori principle
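Support and confidence can be computed from first principles in a few lines of Python; the four transactions below are made up:

```python
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Support of the combined itemset divided by support of the antecedent."""
    return support(antecedent | consequent) / support(antecedent)

s = support({"milk", "bread"})       # appears in 2 of 4 transactions
c = confidence({"milk"}, {"bread"})  # of 3 milk baskets, 2 contain bread
```

The Apriori principle prunes the search: if an itemset fails a minimum-support threshold, every superset of it can be skipped.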

Supplementary Material

  • Spark/PySpark/MLlib and Multi-armed bandits
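As a standalone preview of the multi-armed-bandit topic (plain Python rather than Spark/MLlib; the arm payout probabilities are invented), an epsilon-greedy agent that balances exploration and exploitation:

```python
import random

random.seed(0)
true_means = [0.2, 0.5, 0.8]  # assumed (hidden) payout probability per arm
counts = [0] * 3              # pulls per arm
values = [0.0] * 3            # running estimate of each arm's mean reward
epsilon = 0.1

for _ in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(3)                          # explore
    else:
        arm = max(range(3), key=lambda a: values[a])       # exploit
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
```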

Requirements

Proficiency in Python

Duration: 21 hours
