Course Outline

Introduction

Overview of Data Cleaning

  • Why is Data Cleaning Important?

Case Study: When Big Data Is Dirty

Developing A Thorough Data Cleaning Strategy

Common Data Cleaning Tools

  • Drake
  • OpenRefine
  • pandas (for Python)
  • dplyr (for R)

Achieving High Data Integrity

  • Complete
  • Correct
  • Accurate
  • Relevant
  • Consistent
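
As a rough illustration of how a few of these integrity properties might be checked in practice, here is a minimal sketch in plain Python. The records, the field names (`id`, `country`, `age`), the 0 to 120 age range, and the `integrity_report` helper are all assumptions made up for this example; they are not part of the course material. Accuracy and relevance usually depend on domain knowledge and are not covered here.

```python
from collections import Counter

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"id": 1, "country": "US", "age": 34},
    {"id": 2, "country": "us", "age": None},
    {"id": 3, "country": "DE", "age": -5},
    {"id": 3, "country": "DE", "age": 41},
]


def integrity_report(rows, required=("id", "country", "age")):
    """Count simple integrity problems in a list of dict records."""
    # Complete: every required field is present and not None.
    incomplete = sum(
        1 for r in rows if any(r.get(f) is None for f in required)
    )
    # Correct: ages, when present, fall inside an assumed plausible range.
    out_of_range = sum(
        1 for r in rows
        if r.get("age") is not None and not 0 <= r["age"] <= 120
    )
    # Consistent: country codes follow one convention (upper case, assumed).
    inconsistent_case = sum(
        1 for r in rows
        if isinstance(r.get("country"), str)
        and r["country"] != r["country"].upper()
    )
    # Consistent: an id that appears more than once signals conflicting rows.
    id_counts = Counter(r.get("id") for r in rows)
    duplicate_ids = sorted(i for i, n in id_counts.items() if n > 1)
    return {
        "incomplete": incomplete,
        "out_of_range": out_of_range,
        "inconsistent_case": inconsistent_case,
        "duplicate_ids": duplicate_ids,
    }


report = integrity_report(records)
print(report)
```

A report like this only counts problems; deciding whether to drop, repair, or flag the affected rows is a separate step that depends on the dataset.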

Automating the Data Cleaning Process

Monitoring Your Data Cleaning System

Summary and Conclusion

Requirements

  • An understanding of data analytics concepts.

Audience

  • Data Scientists
  • Data Analysts
  • Business Analysts

Duration

  7 hours

Related Courses

A Practical Introduction to Data Analysis and Big Data

  35 hours

Datameer for Data Analysts

  14 hours

Excel For Statistical Data Analysis

  14 hours

Data and Analytics - from the ground up

  42 hours

NLP: Natural Language Processing with R

  21 hours

SQL Advanced level for Analysts

  21 hours

Data Analytics With R

  21 hours

Elasticsearch for Developers

  14 hours

Data Analysis with Hive/HiveQL

  7 hours

Embedding Projector: Visualizing Your Training Data

  14 hours

kdb+ and q: Analyze Time Series Data

  21 hours

MATLAB Fundamentals, Data Science & Report Generation

  35 hours

Data Analysis with Python, Pandas, and Numpy

  14 hours

Knowledge Discovery in Databases (KDD)

  21 hours

Apache Kylin: From Classic OLAP to Real-Time Data Warehouse

  14 hours