Course Outline

Greenplum Database Overview

  • What is Greenplum Database?
  • Greenplum Database features
  • Greenplum Database architecture

Preparing the Development Environment

  • Installing and configuring Greenplum Database

Administration

  • Creating users
  • Creating a group
  • Adding users

Database

  • Creating a database
  • Granting privileges
  • Creating a schema

psql Command Line

  • Loading and exporting data
  • Executing scripts

Queries and Performance

  • Analyzing tables
  • Changing optimizers
  • Partitioning tables

In-Database Analytics

  • Setting up Apache Zeppelin
  • Aggregating data
  • Assembling results
  • Using Apache MADlib
  • Performing linear regression

Summary and Conclusion

Requirements

  • An understanding of RDBMS (Relational Database Management Systems)

Audience

  • Administrators
  14 Hours
 

Related Courses

Amazon Redshift

 21 hours

Amazon Redshift is a petabyte-scale cloud-based data warehouse service in AWS. In this instructor-led, live training, participants will learn the fundamentals of Amazon Redshift. By the end of this training, participants will be able

Big Data & Database Systems Fundamentals

 14 hours

The course is part of the Data Scientist skill set (Domain: Data and Technology).

Pivotal Greenplum for Developers

 21 hours

Pivotal Greenplum is a Massively Parallel Processing (MPP) Data Warehouse platform based on PostgreSQL. This instructor-led, live training (online or onsite) is aimed at developers who wish to set up a multi-node Greenplum database. By the end

MemSQL

 28 hours

MemSQL is an in-memory, distributed, SQL database management system for cloud and on-premises. It's a real-time data warehouse that immediately delivers insights from live and historical data. In this instructor-led, live training,

Big Data Business Intelligence for Govt. Agencies

 35 hours

Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of

Big Data Architect

 35 hours

Day 1 - provides a high-level overview of essential Big Data topic areas. The module is divided into a series of sections, each of which is accompanied by a hands-on exercise. Day 2 - explores a range of topics that relate analysis practices and

Vespa: Serving Large-Scale Data in Real-Time

 14 hours

Vespa is an open-source big data processing and serving engine created by Yahoo. It is used to respond to user queries, make recommendations, and provide personalized content and advertisements in real-time. This instructor-led, live

Programming with Big Data in R

 21 hours

Big Data is a term that refers to solutions destined for storing and processing large data sets. Developed by Google initially, these Big Data solutions have evolved and inspired other similar projects, many of which are available as open-source. R

Big Data Storage Solution - NoSQL

 14 hours

When traditional storage technologies can't handle the amount of data you need to store, there are hundreds of alternatives. This course tries to show participants what the alternatives for storing and analyzing Big Data are and what their

A Practical Introduction to Data Analysis and Big Data

 35 hours

Participants who complete this instructor-led, live training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools. Participants will have the opportunity to put this knowledge into

From Data to Decision with Big Data and Predictive Analytics

 21 hours

Audience: If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to

Data Vault: Building a Scalable Data Warehouse

 28 hours

Data Vault Modeling is a database modeling technique that provides long-term historical storage of data that originates from multiple sources. A data vault stores a single version of the facts, or "all the data, all the time". Its

Apache Druid for Real-Time Data Analysis

 21 hours

Apache Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business

Data Science for Big Data Analytics

 35 hours

Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer,

Apache Kylin: From Classic OLAP to Real-Time Data Warehouse

 14 hours

Apache Kylin is an extreme, distributed analytics engine for big data. In this instructor-led live training, participants will learn how to use Apache Kylin to set up a real-time data warehouse. By the end of this training, participants will