Course Outline
Quick Overview
- Data Sources
- Mining Data
- Recommender systems
- Target Marketing
Datatypes
- Structured vs unstructured
- Static vs streamed
- Attitudinal, behavioural and demographic data
- Data-driven vs user-driven analytics
- Data validity
- Volume, velocity and variety of data
Models
- Building models
- Statistical Models
- Machine learning
Data Classification
- Clustering
- kGroups, k-means, nearest neighbours
- Ant colonies, bird flocking
Predictive Models
- Decision trees
- Support vector machines
- Naive Bayes classification
- Neural networks
- Markov models
- Regression
- Ensemble methods
ROI
- Benefit/Cost ratio
- Cost of software
- Cost of development
- Potential benefits
Building Models
- Data Preparation (MapReduce)
- Data cleansing
- Choosing methods
- Developing the model
- Testing the model
- Model evaluation
- Model deployment and integration
Overview of Open Source and commercial software
- Selected R-project packages
- Python libraries
- Hadoop and Mahout
- Selected Apache projects related to Big Data and Analytics
- Selected commercial solutions
- Integration with existing software and data sources
Requirements
An understanding of traditional data management and analysis methods such as SQL, data warehouses, business intelligence and OLAP. An understanding of basic statistics and probability (mean, variance, probability, conditional probability, etc.).
Testimonials
The content, as I found it very interesting and think it would help me in my final year at University.
Krishan Mistry - NBrown Group
Richard's training style kept it interesting, the real world examples used helped to drive the concepts home.
Jamie Martin-Royle - NBrown Group