Introduction to Impala
- What is Impala?
- How Impala Differs from Relational Databases
- Limitations and Future Directions
- Using the Impala Shell
- The Impala Daemon, Statestore, and Catalog Service
- Explore a New Impala Instance
- Load CSV Data from Local Files
- Point an Impala Table at Existing Data Files
Analyzing Data with Impala
- Describe the Impala Table
- Basic Syntax and Querying
- Data Types
- Filtering, Sorting, and Limiting Results
- Joining and Grouping Data
- Data Loading and Querying Examples
- Improving Impala Performance
- How Impala Works with Hadoop File Formats
- Hands-On Exercise: Interactive Analysis with Impala
Programming Impala Applications
- Overview of the Impala SQL Dialect
- Overview of Impala Programming Interfaces
- Troubleshooting Impala SQL Syntax Issues
- Troubleshooting I/O Capacity Problems
- Impala Web User Interface for Debugging
- Knowledge of SQL
The fact that all the data and software were ready to use on an already prepared VM, provided by the trainer on external disks.
I mostly liked the trainer giving real-life examples.
I genuinely enjoyed the trainer's great competence.
I genuinely enjoyed the many hands-on sessions.
It was very hands-on; we spent half the time actually doing things in Cloudera/Hadoop, running different commands, checking the system, and so on. The extra materials (books, websites, etc.) were really appreciated, as we will have to continue to learn. The installations were quite fun and very handy, and the cluster setup from scratch was really good.
Lots of hands-on exercises.
The Ambari management tool. The ability to discuss practical Hadoop experiences from business cases other than telecom.
I liked the VM very much. The teacher was very knowledgeable about the topic as well as other topics, and he was very nice and friendly. I liked the facility in Dubai.
Safar Alqahtani - Elm Information Security
Training topics and engagement of the trainer
- Izba Administracji Skarbowej w Lublinie
Communication with people attending training.
Andrzej Szewczuk - Izba Administracji Skarbowej w Lublinie
The practical exercises; the theory was also presented well by Ajay.
Dominik Mazur - Capgemini Polska Sp. z o.o.
- Capgemini Polska Sp. z o.o.
usefulness of exercises
- Algomine sp.z.o.o sp.k.
I found the training good and very informative, but it could have been spread over 4 or 5 days, allowing us to go into more detail on different aspects.
- Veterans Affairs Canada
I really enjoyed the training. Anton has a lot of knowledge and laid out the necessary theory in a very accessible way. It is great that the training included a lot of interesting exercises, so we were in contact with the technology we were getting to know from the very beginning.
Szymon Dybczak - Algomine sp.z.o.o sp.k.
I found this course gave a great overview and quickly touched some areas I wasn't even considering.
- Veterans Affairs Canada
I genuinely liked the practical exercises with the cluster, which let us see the performance of nodes across the cluster and the extended functionality.
The trainer's in-depth knowledge of the subject.
Ajay was a very experienced consultant and was able to answer all our questions and even made suggestions on best practices for the project we are currently engaged on.
That I had it in the first place.
Peter Scales - CACI Ltd
The NiFi workflow exercises
answers to our specific questions
Alluxio: Unifying Disparate Storage Systems - 7 hours
Alluxio is an open-source virtual distributed storage system that unifies disparate storage systems and enables applications to interact with data at memory speed. It is used by companies such as Intel, Baidu and Alibaba. In this instructor-led,
Administrator Training for Apache Hadoop - 35 hours
Audience: The course is intended for IT specialists looking for a solution to store and process large data sets in a distributed system environment. Goal: Deep knowledge on Hadoop cluster
Apache Hadoop: Manipulation and Transformation of Data Performance - 21 hours
This course is intended for developers, architects, data scientists or any profile that requires access to data either intensively or on a regular basis. The major focus of the course is data manipulation and transformation. Among the tools
Hadoop Administration - 21 hours
The course is intended for IT specialists who are looking for a solution to store and process large data sets in a distributed system environment. Course goal: Getting knowledge regarding Hadoop cluster
Hadoop for Administrators - 21 hours
Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. In this three-day (optionally four-day) course, attendees will learn about the business benefits and use cases for Hadoop and its ecosystem, how to plan
Hadoop for Business Analysts - 21 hours
Apache Hadoop is the most popular framework for processing Big Data. Hadoop provides rich and deep analytics capability, and it is making inroads into the traditional BI analytics world. This course will introduce an analyst to the core components of
Hadoop for Developers (4 days) - 28 hours
Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. This course will introduce a developer to various components (HDFS, MapReduce, Pig, Hive and HBase) Hadoop
Advanced Hadoop for Developers - 21 hours
Apache Hadoop is one of the most popular frameworks for processing Big Data on clusters of servers. This course delves into data management in HDFS, advanced Pig, Hive, and HBase. These advanced programming techniques will be beneficial to
Hadoop for Developers and Administrators - 21 hours
Hadoop is the most popular Big Data processing framework.
Hadoop for Project Managers - 14 hours
As more and more software and IT projects migrate from local processing and data management to distributed processing and big data storage, Project Managers are finding the need to upgrade their knowledge and skills to grasp the concepts and
Hadoop Administration on MapR - 28 hours
Audience: This course is intended to demystify big data/Hadoop technology and to show that it is not difficult to understand.
HBase for Developers - 21 hours
This course introduces HBase – a NoSQL store on top of Hadoop. The course is intended for developers who will be using HBase to develop applications, and administrators who will manage HBase clusters. We will walk a developer
Impala for Business Intelligence - 21 hours
Cloudera Impala is an open source massively parallel processing (MPP) SQL query engine for Apache Hadoop clusters. Impala enables users to issue low-latency SQL queries to data stored in Hadoop Distributed File System and Apache
Apache Avro: Data Serialization for Distributed Applications - 14 hours
Audience: Developers. Format of the course: lectures, hands-on practice, small tests along the way to gauge understanding.
Samza for Stream Processing - 14 hours
Apache Samza is an open-source near-realtime, asynchronous computational framework for stream processing. It uses Apache Kafka for messaging, and Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource