Course Outline

Introduction

  • Data mining as the analysis step of the Knowledge Discovery in Databases (KDD) process
  • Subfield of computer science
  • Discovering patterns in large data sets

Sources of methods

  • Artificial intelligence
  • Machine learning
  • Statistics
  • Database systems

What is involved?

  • Database and data management aspects
  • Data pre-processing
  • Model and inference considerations
  • Interestingness metrics
  • Complexity considerations
  • Post-processing of discovered structures
  • Visualization
  • Online updating

Main tasks of data mining

  • Automatic or semi-automatic analysis of large quantities of data
  • Extracting previously unknown interesting patterns
    • groups of data records (cluster analysis)
    • unusual records (anomaly detection)
    • dependencies (association rule mining)
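
As a rough illustration of the "unusual records" task above, the following minimal sketch flags values that lie far from the mean of a small sample. The function name `find_unusual`, the sample readings, and the z-score threshold of 2.0 are assumptions chosen for this example only; real anomaly detection usually needs a method suited to the data.

```python
from statistics import mean, pstdev


def find_unusual(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Illustrative only: the threshold of 2.0 is an assumption, not a
    recommended default.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical sensor readings with one obviously unusual record.
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0, 9.7]
print(find_unusual(readings))
```

With this sample, only the reading of 25.0 is flagged. A simple z-score is sensitive to the very outliers it is looking for, so it serves here only to make the idea concrete.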

Classes of data mining tasks

  • Anomaly detection (outlier/change/deviation detection)
  • Association rule learning (dependency modeling)
  • Clustering
  • Classification
  • Regression
  • Summarization
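
To make association rule learning from the list above more concrete, here is a small sketch that computes the two standard measures of a rule, support and confidence, over a toy set of transactions. The basket contents and the helper names (`count_containing`, `support`, `confidence`) are hypothetical and exist only for this illustration.

```python
# Hypothetical shopping baskets; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction
    that also contain the consequent."""
    both = count_containing(antecedent | consequent, transactions)
    return both / count_containing(antecedent, transactions)


print(support({"diapers", "beer"}, transactions))       # 3 of 5 baskets
print(confidence({"diapers"}, {"beer"}, transactions))  # 3 of 4 baskets
```

A rule such as "diapers → beer" is typically reported only when both measures exceed chosen minimums; those cut-offs are a modelling decision and are not fixed by the definitions.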

Use and applications

  • Able Danger
  • Behavioral analytics
  • Business analytics
  • Cross Industry Standard Process for Data Mining
  • Customer analytics
  • Data mining in agriculture
  • Data mining in meteorology
  • Educational data mining
  • Human genetic clustering
  • Inference attack
  • Java Data Mining
  • Open-source intelligence
  • Path analysis
  • Reactive business intelligence

Data dredging, data fishing, data snooping
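
The risk behind these terms can be sketched with a small simulation: if enough unrelated variables are tested against a target, some will appear correlated purely by chance. Everything below is an illustrative assumption, including the function names, the sample size of 20, and the use of Gaussian noise; it is not a formal statistical test.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def strongest_chance_correlation(n_features, n_samples=20, seed=0):
    """Largest |correlation| between a random target and n_features
    random features that are independent of it."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


# The more features are searched, the stronger the best spurious match.
for k in (1, 10, 100, 1000):
    print(k, round(strongest_chance_correlation(k), 3))
```

Because every feature here is pure noise, any apparent relationship is spurious. This is why patterns found by exhaustive search need validation on data that was not used to find them.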

Requirements

Working knowledge of relational data structures and SQL

  21 hours
