Example 1 – counting

The graph has been loaded, and we know the data volumes in the data files. But what about the content of the graph itself, in terms of vertices and edges? This information is easy to extract by calling count on the graph's vertices and edges, as follows:

println( "vertices : " + graph.vertices.count )
println( "edges : " + graph.edges.count )

Running the graph1 example, using its class name and the .jar file created earlier, provides the count information. The master URL is supplied to connect to the Spark cluster, and some default values are supplied for the executor memory and total executor cores:

spark-submit \
--class graph1 \
--master spark://localhost:7077 \
--executor-memory 700M \
--total-executor-cores 100 \
/home/hadoop/spark/graphx/target/scala-2.10/graph-x_2.10-1.0.jar
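
The same resource settings can also be expressed in code rather than on the command line. The following is only a sketch of that alternative, not the way the graph1 example itself is configured: the object name SubmitConfSketch and the helper buildConf are hypothetical, and the application name is assumed to be graph1. The property spark.executor.memory corresponds to --executor-memory, and spark.cores.max corresponds to --total-executor-cores.

```scala
import org.apache.spark.SparkConf

// Hypothetical object name; this is not part of the book's graph1 example.
object SubmitConfSketch {

  // Builds a configuration that mirrors the spark-submit flags used above.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("graph1")                  // assumed application name
      .setMaster("spark://localhost:7077")   // same as --master
      .set("spark.executor.memory", "700m")  // same as --executor-memory 700M
      .set("spark.cores.max", "100")         // same as --total-executor-cores 100

  def main(args: Array[String]): Unit = {
    // Print the settings; no cluster connection is made here.
    println(buildConf().toDebugString)
  }
}
```

Values set on a SparkConf in code take precedence over the matching spark-submit flags, so keeping these settings on the command line, as the example does, makes it easier to change them between runs.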

The Spark cluster job graph1 produces the following output, which is as expected and matches the data files:

vertices : 6
edges : 12
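
To try the two count calls without the data files or a cluster, a small in-memory graph is enough. The following is a minimal sketch under stated assumptions: the object name CountSketch, the helper buildGraph, and the three vertices and two edges are hypothetical toy data, not the graph loaded earlier, and the local[*] master is assumed only so that the sketch can run on a single machine.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Hypothetical object name; the data below is toy data, not the book's files.
object CountSketch {

  // Builds a tiny graph with three vertices and two edges.
  def buildGraph(sc: SparkContext): Graph[String, String] = {
    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges = sc.parallelize(Seq(Edge(1L, 2L, "x"), Edge(2L, 3L, "y")))
    Graph(vertices, edges)
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for a quick single-machine run.
    val conf = new SparkConf().setAppName("CountSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val graph = buildGraph(sc)

    // The same count calls as in the example above.
    println("vertices : " + graph.vertices.count())
    println("edges : " + graph.edges.count())

    sc.stop()
  }
}
```

With this toy data, the sketch reports 3 vertices and 2 edges.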