Spark GraphX

GraphX is a distributed graph processing framework built on top of Spark. A graph is a data structure comprising vertices and the edges connecting them. GraphX provides functions for building graphs, which are represented as vertex and edge RDDs. It provides an API for expressing graph computations on user-defined graphs through the Pregel abstraction, along with an optimized runtime for that abstraction. GraphX also ships with implementations of several widely used graph algorithms, such as PageRank, connected components, shortest paths, SVD++, and others.
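To make the Pregel abstraction more concrete, here is a minimal single-machine sketch of the vertex-centric, message-passing loop that it describes, applied to single-source shortest paths. It is written in plain Python and does not use Spark or the GraphX API. The function name `pregel_sssp`, the edge-list format, and the sample graph are assumptions made purely for illustration. The three phases marked in the comments correspond loosely to the vertex program, send-message, and merge-message functions that a Pregel-style API asks the user to supply.

```python
from collections import defaultdict
import math


def pregel_sssp(edges, source, max_supersteps=50):
    """Single-source shortest paths as a Pregel-style superstep loop.

    edges: iterable of (src, dst, weight) tuples (hypothetical format).
    Returns a dict mapping each vertex to its distance from `source`.
    """
    # Collect every vertex id and build an adjacency list of out-edges.
    vertices = {source}
    out_edges = defaultdict(list)
    for src, dst, weight in edges:
        vertices.update((src, dst))
        out_edges[src].append((dst, weight))

    # Vertex state: best known distance from the source.
    dist = {v: math.inf for v in vertices}
    # Initial message: the source learns that it is at distance 0.
    inbox = {source: 0.0}

    for _ in range(max_supersteps):
        if not inbox:
            break  # no messages in flight, so the computation has converged

        # Vertex program: keep the smaller of the old state and the message.
        updated = []
        for v, msg in inbox.items():
            if msg < dist[v]:
                dist[v] = msg
                updated.append(v)

        # Send phase: each updated vertex offers a distance to its neighbours.
        # Merge phase: messages bound for the same vertex are combined by min.
        outbox = {}
        for v in updated:
            for dst, weight in out_edges[v]:
                candidate = dist[v] + weight
                if candidate < outbox.get(dst, math.inf):
                    outbox[dst] = candidate
        inbox = outbox

    return dist


# Hypothetical sample graph: a -> b -> c -> d, plus a slower direct a -> c edge.
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]
distances = pregel_sssp(edges, "a")
```

In a distributed runtime such as GraphX, each superstep runs in parallel across partitions of the graph, and only vertices that receive messages do any work. This sketch keeps the same control flow but runs it in a single process.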


We cover Spark GraphX in detail in Chapter 10, Everything is Connected - GraphX.

A newer module known as GraphFrames is in development; it makes graph processing easier by using DataFrame-based graphs. GraphFrames are to DataFrames/Datasets what GraphX is to RDDs. GraphFrames is currently separate from GraphX and is expected to support all of GraphX's functionality in the future, at which point there might be a switchover to GraphFrames.
