Course Outline
Introduction
- Graph databases and libraries
Understanding Graph Data
- The graph as a data structure
- Using vertices (dots) and edges (lines) to model real-world scenarios
Using Graph Databases to Model, Persist and Process Graph Data
- Local graph algorithms/traversals
- Neo4j, OrientDB and Titan
Exercise: Modeling Graph Data with Neo4j
- Whiteboard data modeling
Beyond Graph Databases: Graph Computing
- Understanding the property graph
- Modeling different scenarios as graphs (software graph, discussion graph, concept graph)
Solving Real-World Problems with Traversals
- Algorithmic/directed walk over the graph
- Determining circular dependencies
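As an illustration of the circular-dependency topic above, here is a minimal sketch in plain Python. It is not taken from the course material: the `find_cycle` helper and the sample `deps` graph are hypothetical, and the approach (a depth-first traversal that marks nodes as unvisited, in progress, or finished) is one common way to detect a cycle in a directed graph.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first node repeated at the end),
    or None if the directed graph has no cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    # Nodes that only appear as targets still need a color.
    for targets in graph.values():
        for target in targets:
            color.setdefault(target, WHITE)

    path: List[str] = []  # nodes on the current depth-first path

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            if color[nxt] == GRAY:
                # Back edge: nxt is already on the current path, so we found a cycle.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(color):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical software dependency graph: module -> modules it depends on.
deps = {
    "app": ["auth", "db"],
    "auth": ["db"],
    "db": ["cache"],
    "cache": ["auth"],
}
cycle = find_cycle(deps)
print(cycle)
```

The recursive traversal keeps the sketch short; for very deep graphs an explicit stack would avoid Python's recursion limit.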
Case Study: Ranking Discussion Contributors
- Ranking by number and depth of contributed discussions
- A note on sentiment and concept analysis
Graph Computing: Local, In-Memory Graph Toolkits
- Graph analysis and visualization
- JUNG, NetworkX, and igraph
Exercise: Modeling Graph Data with NetworkX
- Using NetworkX to model a complex system
Graph Computing: Batch Processing Graph Frameworks
- Leveraging Hadoop for storage (HDFS) and processing (MapReduce)
- Overview of iterative algorithms
- Hama, Giraph, and GraphLab
Graph Computing: Graph-Parallel Computation
- Unifying ETL, exploratory analysis, and iterative graph computation within a single system
- GraphX
Setup and Installation
- Hadoop and Spark
GraphX Operators
- Property, structural, join, neighborhood aggregation, caching and uncaching
Iterating with Pregel API
- Passing arguments for sending, receiving and computing
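To make the send/receive/compute idea above concrete, the following is a toy emulation of a Pregel-style loop in plain Python. It is not the GraphX API: the `pregel` function here, its parameter names, and the sample graph are assumptions made for illustration, loosely modeled on the vertex program, send-message, and merge-message roles. For simplicity it only sends along edges whose source vertex received a message in the previous superstep. The example computes single-source shortest paths.

```python
import math
from typing import Callable, Dict, Iterable, List, Tuple

Edge = Tuple[str, str, float]  # (source, destination, weight)


def pregel(
    vertices: Dict[str, float],
    edges: List[Edge],
    initial_msg: float,
    vprog: Callable[[str, float, float], float],
    send_msg: Callable[[str, float, str, float, float], Iterable[Tuple[str, float]]],
    merge_msg: Callable[[float, float], float],
    max_iterations: int = 20,
) -> Dict[str, float]:
    """Toy vertex-centric loop (hypothetical helper, not a library function)."""
    # Superstep 0: every vertex processes the initial message.
    state = {v: vprog(v, attr, initial_msg) for v, attr in vertices.items()}
    active = set(state)
    for _ in range(max_iterations):
        inbox: Dict[str, float] = {}
        # Send phase: only edges leaving an active vertex may produce messages.
        for src, dst, weight in edges:
            if src not in active:
                continue
            for target, msg in send_msg(src, state[src], dst, state[dst], weight):
                # Receive phase: combine messages addressed to the same vertex.
                inbox[target] = merge_msg(inbox[target], msg) if target in inbox else msg
        if not inbox:
            break  # no messages in flight, so the computation has converged
        # Compute phase: vertices that received a message update their value.
        for v, msg in inbox.items():
            state[v] = vprog(v, state[v], msg)
        active = set(inbox)
    return state


# Single-source shortest paths from "a" on a small hypothetical graph.
vertices = {"a": 0.0, "b": math.inf, "c": math.inf, "d": math.inf}
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]

distances = pregel(
    vertices,
    edges,
    initial_msg=math.inf,
    vprog=lambda vid, dist, msg: min(dist, msg),
    send_msg=lambda src, sd, dst, dd, w: [(dst, sd + w)] if sd + w < dd else [],
    merge_msg=min,
)
print(distances)
```

Each superstep reads the vertex values from the previous superstep and only then applies updates, which is what lets the three functions run independently per vertex or per edge in a distributed setting.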
Building a Graph
- Using vertices and edges in an RDD or on disk
Designing Scalable Algorithms
- GraphX Optimization
Accessing Additional Algorithms
- PageRank, Connected Components, Triangle Counting
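Two of the algorithms named above can be sketched in a few lines of plain Python to show what they compute. This is an illustrative, single-machine sketch, not how GraphX implements them: the function names and the sample edge list are hypothetical. Connected components uses union-find and labels each vertex with the smallest vertex ID in its component; triangle counting reports, per vertex, the number of triangles that pass through it.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def connected_components(edges: List[Tuple[int, int]]) -> Dict[int, int]:
    """Map each vertex to the smallest vertex ID in its component."""
    parent: Dict[int, int] = {}

    def find(x: int) -> int:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            # Attach the larger root under the smaller one, so the root of a
            # component is always its smallest vertex ID.
            parent[max(ru, rv)] = min(ru, rv)
    return {node: find(node) for node in parent}


def triangle_counts(edges: List[Tuple[int, int]]) -> Dict[int, int]:
    """Count, for each vertex, the triangles it takes part in (undirected view)."""
    neighbors: Dict[int, Set[int]] = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return {
        node: sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        for node, nbrs in neighbors.items()
    }


# Hypothetical undirected graph: one triangle (1, 2, 3) with a tail, plus a separate pair.
sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (5, 6)]
components = connected_components(sample_edges)
triangles = triangle_counts(sample_edges)
print(components)
print(triangles)
```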
Exercise: PageRank and Top Users
- Building and processing graph data using text files as input
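The shape of this exercise can be previewed with a small single-machine sketch in plain Python. Everything here is an assumption made for illustration: the edge-list text format (one `follower followee` pair per line), the user names, and the `parse_edges` and `pagerank` helpers. The sketch uses standard power iteration with a damping factor and normalizes ranks to sum to 1; the values produced by a distributed framework's built-in PageRank may be scaled differently.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def parse_edges(text: str) -> List[Tuple[str, str]]:
    """Parse 'source destination' pairs, skipping blank lines and '#' comments."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((src, dst))
    return edges


def pagerank(
    edges: List[Tuple[str, str]], damping: float = 0.85, iterations: int = 50
) -> Dict[str, float]:
    """Power-iteration PageRank; ranks sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links: Dict[str, List[str]] = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if node not in out_links)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src, targets in out_links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical input, as it might appear in a text file of "follower followee" pairs.
edge_text = """
# follower followee
alice bob
carol bob
bob dave
dave bob
erin alice
"""

ranks = pagerank(parse_edges(edge_text))
top_users = sorted(ranks.items(), key=lambda item: item[1], reverse=True)[:3]
for user, score in top_users:
    print(f"{user}\t{score:.4f}")
```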
Deploying to Production
Closing Remarks
Requirements
- An understanding of Java programming and frameworks
- A general understanding of Python is helpful but not required
- A general understanding of database concepts
Audience
- Developers
Testimonials
Richard is very calm and methodical, with an analytic insight - exactly the qualities needed to present this sort of course.
Kieran Mac Kenna
Sharing the concept diagram and also samples for getting our hands dirty.
Mark Yang - FMR
Applicable scenarios and cases
zhaopeng liu - Fmr
case analysis
国栋 张
all parts of this session
Eric Han - Fmr
We know a lot more about the whole environment.
John Kidd
The trainer made the class interesting and entertaining which helps quite a bit with all day training.
Ryan Speelman
I think the trainer had an excellent style of combining humor and real life stories to make the subjects at hand very approachable. I would highly recommend this professor in the future.
Ernesto did a great job explaining the high level concepts of using Spark and its various modules.
Michael Nemerouf
Richard was very willing to digress when we wanted to ask semi-related questions about things not on the syllabus. Explanations were clear and he was up front about caveats in any advice he gave us.
- ARM Limited
I liked the VM very much. The teacher was very knowledgeable about the topic as well as other topics, and he was very nice and friendly. I liked the facility in Dubai.
Safar Alqahtani - Elm Information Security
Small group (4 trainees), so we could progress together. The trainer could also help everybody.
- ICE International Copyright Enterprise Germany GmbH
Ajay was very friendly, helpful and also knowledgeable about the topic he was discussing.
Biniam Guulay - ICE International Copyright Enterprise Germany GmbH
The lab exercises. Applying the theory from the first day in subsequent days.
- Dell
The trainer was passionate and knew his subject well. I appreciate his help and that he answered all our questions and suggested cases.
Doing similar exercises in different ways really helps in understanding what each component (Hadoop/Spark, standalone/cluster) can do on its own and together. It gave me ideas on how I should test my application on my local machine when I develop vs. when it is deployed on a cluster.
Thomas Carcaud - IT Frankfurt GmbH
Getting to learn Spark Streaming, Databricks and AWS Redshift.
Lim Meng Tee - Jobstreet.com Shared Services Sdn. Bhd.
The content and the knowledge.
Jobstreet.com Shared Services Sdn. Bhd.
It was very informative. I've had very little experience with Spark before and so far this course has provided a very good introduction to the subject.
Intelligent Medical Objects
It was great to get an understanding of what is going on under the hood of Spark. Knowing what's going on under the hood helps to better understand why your code is or is not doing what you expect it to do. A lot of the training was hands on which is always great and the section on optimizations was exceptionally relevant to my current work which was nice.
Intelligent Medical Objects
This is a great class! I most appreciate that Andras explains very clearly what Spark is all about, where it came from, and what problems it is able to solve. Much better than other introductions I've seen that just dive into how to use it. Andras has a deep knowledge of the topic and explains things very well.
Intelligent Medical Objects
The live examples that were given and showed the basic aspects of Spark.
Intelligent Medical Objects
1. Right balance between high level concepts and technical details. 2. Andras is very knowledgeable about his teaching. 3. Exercise
Steven Wu - Intelligent Medical Objects
Having hands on session / assignments
Poornima Chenthamarakshan - Intelligent Medical Objects
The trainer adjusted the training slightly based on audience requests, shedding some light on a few different topics that we had requested.
Intelligent Medical Objects
His pace was great. I loved the fact that he went into theory too, so that I understand WHY I would do the things he is asking.
Intelligent Medical Objects
Related Courses
Apache Jena: Creating a Semantic Web Application
21 hours
Apache Jena is an open source Java framework for building Semantic Web and Linked Data applications. In this instructor-led, live training, participants will learn how to use Apache Jena to build and deploy a Semantic Web Application. By
Blazegraph: Creating a Graph Database Application
21 hours
Blazegraph is an open source, Java-based RDF graph database for storing and representing data with complex relationships. It supports Blueprints and RDF/SPARQL 1.1. In this instructor-led, live training, participants will learn how to use
FlockDB: A Simple Graph Database for Social Media
7 hours
FlockDB is an open source distributed, fault-tolerant graph database for managing wide but shallow network graphs. It was initially used by Twitter to store relationships among users. In this instructor-led, live training, participants will learn
JanusGraph
14 hours
JanusGraph is a graph database for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. This instructor-led, live training (online or onsite) is aimed at engineers who wish
Introduction to Semantic MediaWiki
7 hours
MediaWiki is a free and open-source wiki software. This one-day course provides participants with an introduction to Semantic MediaWiki.
Beyond the Relational Database: Neo4j
21 hours
Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly
Building Graph Databases with Neo4j AuraDB
14 hours
Neo4j AuraDB is a fully-managed graph database service. It is fast, reliable, and fully-automated, making it easy to build graph database applications in the cloud. This instructor-led, live training (online or onsite) is aimed at developers who
Semantic Web Overview
7 hours
The Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) that promotes common formats for data on the World Wide Web. The Semantic Web provides a common framework that allows data to be shared and reused across
SPARQL
14 hours
SPARQL is a query language for querying RDF (Resource Description Framework) data. It is similar to SQL for relational data in databases. This instructor-led, live training (online or onsite) is aimed at technical persons who wish to query