Course Outline
Data preprocessing
- Data Cleaning
- Data integration and transformation
- Data reduction
- Discretization and concept hierarchy generation
Statistical inference
- Probability distributions, Random variables, Central limit theorem
- Sampling
- Confidence intervals
- Statistical Inference
- Hypothesis testing
Multivariate linear regression
- Specification
- Subset selection
- Estimation
- Validation
- Prediction
Classification methods
- Logistic regression
- Linear discriminant analysis
- K-nearest neighbours
- Naive Bayes
- Comparison of Classification methods
Neural Networks
- Fitting neural networks
- Issues in training neural networks
Decision trees
- Regression trees
- Classification trees
- Trees Versus Linear Models
Bagging, Random Forests, Boosting
- Bagging
- Random Forests
- Boosting
Support Vector Machines and Flexible Discriminants
- Maximal Margin classifier
- Support vector classifiers
- Support vector machines
- SVMs with two or more classes
- Relationship to logistic regression
Principal Components Analysis
Clustering
- K-means clustering
- K-medoids clustering
- Hierarchical clustering
- Density based clustering
Model Assessment and Selection
- Bias, Variance and Model complexity
- In-sample prediction error
- The Bayesian approach
- Cross-validation
- Bootstrap methods
Testimonials
I like the exercises done.
Nour Assaf
The hands-on exercise and the trainer capacity to explain complex topics in simple terms.
Youssef Chamoun
The information given was interesting, and the best part came towards the end, when we were provided with data from Durex, worked on data we are familiar with, and performed operations to get results.
Jessica Chaar
I really enjoyed the all the best.
Halil Polat - Amazon Development Center Poland Sp. z o.o.
The trainer concentrated on the key topics.
- Amazon Development Center Poland Sp. z o.o.
Expertise and huge knowledge of the trainer.
- Amazon Development Center Poland Sp. z o.o.
I benefited from the guidance, the sharing of real-life examples, and having all questions answered.
Marta Melloch - Amazon Development Center Poland Sp. z o.o.
Very well transferred knowledge by the teacher. No unanswered questions.
Karolin Papaj - Mowi Poland SA
Related Courses
Programming with Big Data in R
21 hours. Big Data is a term that refers to solutions designed for storing and processing large data sets. Initially developed by Google, these Big Data solutions have evolved and inspired other similar projects, many of which are available as open source. R
Marketing Analytics using R
21 hours. Audience: business owners (marketing managers, product managers, customer base managers) and their teams; customer insights professionals. Overview: the course follows the customer life cycle from acquiring new customers, managing the
Introduction to R
21 hours. R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has also found followers among
Neural Network in R
14 hours. This course is an introduction to applying neural networks to real-world problems using the R-project software.
Advanced R Programming
7 hours. This course is for data scientists and statisticians who already have basic R and C++ coding skills and need advanced R coding skills. The purpose is to give a practical advanced R programming course to participants interested in
Data Mining with R
14 hours. R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data
Econometrics: Eviews and Risk Simulator
21 hours. Econometrics is the application of economic data and statistical methods to provide quantitative analysis of economic phenomena. This instructor-led, live training (online or onsite) is aimed at anyone who wishes to learn and master the
Statistical Analysis using SPSS
21 hours. SPSS is software for editing and analyzing data.
Statistical and Econometric Modelling
21 hours. The objective of the course is to enable participants to gain a mastery of the fundamentals of statistical and econometric modelling.
Forecasting with R
14 hours. This course allows delegates to fully automate the process of forecasting with R.
R for Data Analysis and Research
7 hours. Audience: managers, developers, scientists, students. Format of the course: online instruction and discussion, or face-to-face workshops.
HR Analytics for Public Organisations
14 hours. This instructor-led, live training (online or onsite) is aimed at HR professionals who wish to use analytical methods to improve organisational performance. This course covers qualitative as well as quantitative, empirical and
Talent Acquisition Analytics
14 hours. This instructor-led, live training (online or onsite) is aimed at HR professionals and recruitment specialists who wish to use analytical methods to improve organisational performance. This course covers qualitative as well as
Knowledge Discovery in Databases (KDD)
21 hours. Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications for this data mining technique include marketing, fraud detection, telecommunication and manufacturing. In
Introduction to Data Visualization with Tidyverse and R
7 hours. The Tidyverse is a collection of versatile R packages for cleaning, processing, modeling, and visualizing data. Some of the packages included are: ggplot2, dplyr, tidyr, readr, purrr, and tibble. In this instructor-led, live training,