Course Outline
Introduction
- Data mining as the analysis step of the KDD process ("Knowledge Discovery in Databases")
- Subfield of computer science
- Discovering patterns in large data sets
Sources of methods
- Artificial intelligence
- Machine learning
- Statistics
- Database systems
What is involved?
- Database and data management aspects
- Data pre-processing
- Model and inference considerations
- Interestingness metrics
- Complexity considerations
- Post-processing of discovered structures
- Visualization
- Online updating
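As an illustration of the "interestingness metrics" item above, the sketch below computes support, confidence, and lift for a single association rule. The function names and the toy shopping baskets are hypothetical and chosen only for this example; it assumes the antecedent and consequent each occur in at least one transaction.

```python
from typing import FrozenSet, List

Transactions = List[FrozenSet[str]]


def support(transactions: Transactions, itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: Transactions,
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Estimated P(consequent | antecedent); assumes support(antecedent) > 0."""
    return (support(transactions, antecedent | consequent)
            / support(transactions, antecedent))


def lift(transactions: Transactions,
         antecedent: FrozenSet[str],
         consequent: FrozenSet[str]) -> float:
    """Confidence relative to the consequent's baseline frequency."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Toy data, invented for illustration only.
baskets = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
]
rule_from = frozenset({"bread"})
rule_to = frozenset({"butter"})

rule_confidence = confidence(baskets, rule_from, rule_to)
rule_lift = lift(baskets, rule_from, rule_to)
print(f"confidence={rule_confidence:.3f} lift={rule_lift:.3f}")
```

A lift above 1 suggests the two item sets co-occur more often than independence would predict; on real data such a value still needs checking against sample size.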
Main data mining tasks
- Automatic or semi-automatic analysis of large quantities of data
- Extracting previously unknown, interesting patterns:
  - groups of data records (cluster analysis)
  - unusual records (anomaly detection)
  - dependencies (association rule mining)
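To make "unusual records (anomaly detection)" concrete, here is a minimal sketch that flags values lying far from the mean, measured in standard deviations. The z-score rule, the threshold of 2.0, and the sample readings are assumptions made for this example; practical anomaly detection often uses more robust methods.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def zscore_outliers(values: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Return indices of values whose |z-score| exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Invented readings: five similar values and one that is clearly different.
readings = [10, 11, 9, 10, 12, 50]
outlier_indices = zscore_outliers(readings)
print(outlier_indices)
```

Note that a single extreme value inflates the standard deviation itself, which is one reason median-based variants are often preferred on small samples.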
Common classes of data mining tasks
- Anomaly detection (Outlier/change/deviation detection)
- Association rule learning (Dependency modeling)
- Clustering
- Classification
- Regression
- Summarization
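As a small illustration of the classification task listed above, the sketch below assigns a label to a new point by copying the label of its closest training example (a 1-nearest-neighbour rule). The choice of technique, the function name, and the training points are assumptions for this example and are not taken from the course material.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str]  # (feature vector, class label)


def nearest_neighbor_label(train: List[Example], query: Sequence[float]) -> str:
    """Return the label of the training example closest to `query`."""
    closest = min(train, key=lambda example: math.dist(example[0], query))
    return closest[1]


# Invented two-dimensional training data with two classes.
training_set: List[Example] = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((5.0, 5.2), "high"),
    ((4.8, 5.1), "high"),
]

predicted_low = nearest_neighbor_label(training_set, (1.1, 0.9))
predicted_high = nearest_neighbor_label(training_set, (5.1, 5.0))
print(predicted_low, predicted_high)
```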
Use and applications
- Able Danger
- Behavioral analytics
- Business analytics
- Cross Industry Standard Process for Data Mining
- Customer analytics
- Data mining in agriculture
- Data mining in meteorology
- Educational data mining
- Human genetic clustering
- Inference attack
- Java Data Mining
- Open-source intelligence
- Path analysis (computing)
- Reactive business intelligence
Data dredging, data fishing, data snooping
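The risk behind data dredging can be sketched with purely random data: if enough unrelated features are tested against a target, the strongest correlation found is at least as large as any single test would give, even though no real relationship exists. The feature count, row count, and seed below are arbitrary assumptions chosen for illustration.

```python
import math
import random
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows, n_features = 20, 200

# Target and features are independent noise: no true relationship exists.
target = [rng.gauss(0, 1) for _ in range(n_rows)]
features = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)]

correlations = [abs(pearson(feature, target)) for feature in features]
single_test = correlations[0]       # what one pre-planned test would report
best_of_many = max(correlations)    # what searching all features reports
print(f"single test: {single_test:.2f}, best of {n_features}: {best_of_many:.2f}")
```

Reporting only the best of many such tests, without correcting for the number of comparisons, is the pattern the terms data dredging, data fishing, and data snooping describe.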
Requirements
Working knowledge of relational data structures and SQL