Course Outline

Introduction

  • Artificial neural networks vs. decision-tree-based algorithms

Overview of XGBoost Features

  • Elements of a Gradient Boosting algorithm
  • Focus on computational speed and model performance
  • XGBoost vs Logistic Regression, Random Forest, and standard Gradient Boosting

The Evolution of Tree-Based Algorithms

  • Decision Trees, Bagging, Random Forest, Boosting, Gradient Boosting
  • System optimization
  • Algorithmic enhancements

Preparing the Environment

  • Installing SciPy and scikit-learn

Creating an XGBoost Model

  • Downloading a dataset
  • Solving a common classification problem
  • Training the XGBoost model for classification
  • Solving a common regression task

Monitoring Performance

  • Evaluating and reporting performance
  • Early stopping

Plotting Features by Importance

  • Calculating feature importance
  • Deciding which input variables to keep or discard

Configuring Gradient Boosting

  • Reviewing the learning curves on training and validation datasets
  • Adjusting the learning rate
  • Adjusting the number of trees

Hyperparameter Tuning

  • Improving the performance of an XGBoost model
  • Designing a controlled experiment to tune hyperparameters
  • Searching combinations of parameters

Creating a Pipeline

  • Incorporating an XGBoost model into an end-to-end machine learning pipeline
  • Tuning hyperparameters within the pipeline
  • Advanced preprocessing techniques

Troubleshooting

Summary and Conclusion

Requirements

  • Experience writing machine learning models

Audience

  • Data scientists
  • Machine learning engineers

Duration

  • 14 hours
