Course Outline

Introduction

Overview of MemSQL

Understanding the MemSQL Architecture

Quick Start with MemSQL Using MemSQL Ops

Understanding Essential MemSQL Concepts

  • Overview of MemSQL Commands
  • Working with Rowstore and Columnstore
  • Implementing Data Distribution
  • Using Shard Keys
  • Implementing Distributed Joins
  • Using Reference Tables
  • Understanding Application Cluster Topologies
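
The concepts above can be previewed in a few lines of MemSQL DDL. This is a sketch with hypothetical table and column names; SHARD KEY, CLUSTERED COLUMNSTORE, and REFERENCE TABLE are the actual MemSQL keywords:

```sql
-- A rowstore fact table, distributed across leaf nodes by its shard key:
CREATE TABLE page_views (
    user_id BIGINT NOT NULL,
    url     VARCHAR(2048),
    ts      DATETIME,
    SHARD KEY (user_id)   -- rows with the same user_id land on the same partition
);

-- A columnstore table for large analytical scans:
CREATE TABLE page_views_history (
    user_id BIGINT NOT NULL,
    url     VARCHAR(2048),
    ts      DATETIME,
    KEY (ts) USING CLUSTERED COLUMNSTORE
);

-- A reference table is replicated in full to every node, so joins
-- against it never require cross-node data movement:
CREATE REFERENCE TABLE countries (
    code CHAR(2) PRIMARY KEY,
    name VARCHAR(64)
);
```

Choosing the shard key well matters: joins on the shard key can be executed locally on each partition, while joins on other columns become distributed joins with data shuffling.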

Installing and Upgrading MemSQL

  • Designing a Cluster
  • Doing Manual Installation
  • Expanding a Cluster
  • Implementing an Upgrade
  • Securing MemSQL

Working with Schema Design and Query Optimization

  • Working with Transactions
  • Working with Geospatial Data
  • Understanding Index Types
  • Using Sparsity and Normalized Forms
  • Hands-on: Using a Reference Table to Query JSON with Variant Array Lengths
  • Working with Shard Key Strategies
  • Identifying a Sharding Strategy
  • Understanding Analyze, Explain, and Profile
  • Implementing Schema Optimization for Query Performance
  • Using Query Hints
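
The Analyze, Explain, and Profile workflow above can be sketched against a hypothetical schema; the statements themselves (EXPLAIN, PROFILE, SHOW PROFILE, ANALYZE TABLE) are standard MemSQL commands:

```sql
-- Inspect the plan the optimizer chooses for a query:
EXPLAIN SELECT c.name, COUNT(*)
FROM page_views p JOIN countries c ON p.country = c.code
GROUP BY c.name;

-- Run the query and collect per-operator runtime statistics:
PROFILE SELECT c.name, COUNT(*)
FROM page_views p JOIN countries c ON p.country = c.code
GROUP BY c.name;
SHOW PROFILE;

-- Refresh table statistics so the optimizer has current row counts:
ANALYZE TABLE page_views;
```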

Diving Deep into Administering MemSQL Operations

  • Using the MemSQL Ops Command Line Interface
  • Administering a Cluster
  • Understanding Administrator Key Concepts
  • Backing Up and Restoring Data
  • Scaling Cluster Size
  • Dealing with Cluster Failures
  • Managing High Availability
  • Monitoring MemSQL
  • Working with the Trace Log
  • Using Durability and Recovery
  • Running Diagnostics
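
Backup and restore, covered above, are single statements in MemSQL. A minimal sketch, assuming a database named memsql_demo and a backup directory reachable from every node:

```sql
-- Back up a database to a shared directory:
BACKUP DATABASE memsql_demo TO "/var/backups/memsql/";

-- Restore it later, e.g. after a cluster failure:
RESTORE DATABASE memsql_demo FROM "/var/backups/memsql/";
```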

Working with MemSQL Procedural SQL (MPSQL)

  • Using Table-Valued Functions
  • Using User-Defined Functions
  • Using User-Defined Aggregate Functions
  • Using Stored Procedures
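
A minimal MPSQL stored procedure, for orientation; the table and procedure names are hypothetical, and the DELIMITER lines are only needed in interactive clients:

```sql
DELIMITER //
CREATE OR REPLACE PROCEDURE record_login(uid BIGINT) AS
BEGIN
    -- Insert one audit row per call:
    INSERT INTO logins (user_id, ts) VALUES (uid, NOW());
END //
DELIMITER ;

CALL record_login(42);
```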

Implementing Performance Benchmarking and Fine-Tuning

  • Using Experimental Metrics
  • Performance Testing with dbbench
  • Hands-on: Working with a Database Workload Generator
  • Using Management Views
  • Implementing Workload Profiling
  • Hands-on: MemSQL Top
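
Workload profiling with management views boils down to querying information_schema. A sketch assuming MemSQL 6.x; the exact view and column names may vary slightly by version:

```sql
-- Top activities by cumulative elapsed time across the cluster:
SELECT activity_name, database_name, cpu_time_ms, elapsed_time_ms
FROM information_schema.mv_activities_cumulative
ORDER BY elapsed_time_ms DESC
LIMIT 10;
```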

Working with MemSQL Pipelines and Real-Time Data Ingestion

  • Using the MemSQL Connector for Apache Spark
  • Using MemSQL Pipelines with Apache Kafka and AWS S3
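
A pipeline is created with a single statement. A sketch with placeholder broker, topic, bucket, and table names:

```sql
-- Continuously ingest a Kafka topic into a table:
CREATE PIPELINE clicks_pipeline AS
LOAD DATA KAFKA 'kafka-broker.example.com:9092/clicks'
INTO TABLE clicks
FIELDS TERMINATED BY ',';

START PIPELINE clicks_pipeline;

-- An S3 source uses the same statement shape:
-- CREATE PIPELINE s3_clicks AS
-- LOAD DATA S3 'my-bucket/clicks/*.csv'
-- CREDENTIALS '{"aws_access_key_id": "...", "aws_secret_access_key": "..."}'
-- INTO TABLE clicks;
```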

Creating Real-Time Applications

  • Working with Business Intelligence Dashboards
  • Using MemSQL Pipelines for Machine Learning
  • Implementing a Real-Time Dashboard
  • Implementing Predictive Analytics

Troubleshooting MemSQL

Summary and Conclusion

Requirements

  • Experience with Linux, relational database systems, and SQL platforms
  • Experience with Scala, Java, or Python programming
Duration: 28 hours


Related Courses

Aerospike for Developers

 14 hours

This course covers everything a database developer needs to know to successfully develop applications using Aerospike.

Amazon Redshift

 21 hours

Amazon Redshift is a petabyte-scale cloud-based data warehouse service in AWS. In this instructor-led, live training, participants will learn the fundamentals of Amazon Redshift.

Big Data & Database Systems Fundamentals

 14 hours

The course is part of the Data Scientist skill set (Domain: Data and Technology).

Pivotal Greenplum for Developers

 21 hours

Pivotal Greenplum is a Massively Parallel Processing (MPP) Data Warehouse platform based on PostgreSQL. This instructor-led, live training (online or onsite) is aimed at developers who wish to set up a multi-node Greenplum database.

Greenplum Database

 14 hours

Greenplum Database is a database software for business intelligence and data warehousing. Users can run Greenplum Database for massive parallel data processing. This instructor-led, live training (online or onsite) is aimed at administrators.

Big Data Storage Solution - NoSQL

 14 hours

When traditional storage technologies cannot handle the amount of data you need to store, there are hundreds of alternatives. This course guides participants through the alternatives for storing and analyzing Big Data.

HBase for Developers

 21 hours

This course introduces HBase – a NoSQL store on top of Hadoop. The course is intended for developers who will be using HBase to develop applications, and administrators who will manage HBase clusters.

OrientDB for Developers

 14 hours

OrientDB is a NoSQL Multi-Model Database that works with Graph, Document, Key-Value, GeoSpatial, and Reactive models. Its flexibility allows users to manage different kinds of data under one centralized database.

Riak: Build Applications with High Data Accuracy

 14 hours

Riak is an Erlang-based open-source document database, similar to CouchDB. It is created and maintained by Basho. In this instructor-led, live training, participants will learn how to build, run, and operate a Riak-based web application.

Scylla Database

 21 hours

Scylla is an open-source distributed NoSQL data store. It is compatible with Apache Cassandra but performs at significantly higher throughputs and lower latencies. In this course, participants will learn about Scylla's features.

NoSQL Database with Microsoft Azure Cosmos DB

 14 hours

Microsoft Azure Cosmos DB is a fully managed NoSQL database service designed for high-speed data processing and storage scaling. It supports multiple data models and open-source APIs, such as MongoDB and Cassandra.

Data Vault: Building a Scalable Data Warehouse

 28 hours

Data Vault Modeling is a database modeling technique that provides long-term historical storage of data that originates from multiple sources. A data vault stores a single version of the facts, or "all the data, all the time".

Apache Druid for Real-Time Data Analysis

 21 hours

Apache Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data.

Apache Kylin: From Classic OLAP to Real-Time Data Warehouse

 14 hours

Apache Kylin is an extreme, distributed analytics engine for big data. In this instructor-led live training, participants will learn how to use Apache Kylin to set up a real-time data warehouse.