Constructing the graph using edge

As we saw in the previous sections, a graph has edges and vertices, and each of them is an RDD. Because graph.edges is an RDD, the methods of a normal RDD are available on it. We can use max, min, and sum (after mapping the edges to values that can be ordered or added), as well as all the other actions. We will apply the reduce method, which takes two edges, e1 and e2, and lets us perform some logic on them.
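The actions mentioned above can be sketched as follows. This is a minimal illustration, not the book's listing: the object name EdgeActionsSketch, the run helper, and the "keep the edge with the larger destination ID" logic inside reduce are assumptions chosen only to show the shape of the calls. The sample data matches the graph used later in this section.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, used only for this illustration
object EdgeActionsSketch {
  // Returns (number of edges, largest destination ID, edge chosen by reduce)
  def run(): (Long, VertexId, Edge[String]) = {
    val sc = SparkSession.builder()
      .master("local[2]")
      .appName("edge-actions-sketch")
      .getOrCreate()
      .sparkContext

    val users: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d")))
    val relationships: RDD[Edge[String]] =
      sc.parallelize(Seq(
        Edge(1L, 2L, "friend"),
        Edge(1L, 3L, "friend"),
        Edge(2L, 4L, "wife")
      ))
    val graph = Graph(users, relationships)

    // graph.edges is an RDD[Edge[String]], so ordinary RDD actions apply
    val edgeCount = graph.edges.count()
    // max needs an ordering, so map each edge to its destination ID first
    val maxDst = graph.edges.map(_.dstId).max()
    // reduce combines two edges at a time; here we keep the one with the larger dstId
    val farthest = graph.edges.reduce((e1, e2) => if (e1.dstId >= e2.dstId) e1 else e2)

    (edgeCount, maxDst, farthest)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The function passed to reduce should be associative and commutative, because Spark combines partial results from different partitions in no guaranteed order.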

The e1 edge is an Edge object with three fields: an attribute (attr), a destination vertex ID (dstId), and a source vertex ID (srcId).
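To make the three parts of an edge concrete, here is a small sketch that builds a single GraphX Edge and reads its srcId, dstId, and attr fields. The object name EdgeFieldsSketch and the describe helper are assumptions made for illustration only.

```scala
import org.apache.spark.graphx.Edge

// Hypothetical helper object, used only for this illustration
object EdgeFieldsSketch {
  // Formats an edge as "source -[attribute]-> destination"
  def describe(e: Edge[String]): String =
    s"${e.srcId} -[${e.attr}]-> ${e.dstId}"

  def main(args: Array[String]): Unit = {
    // Edge(srcId, dstId, attr): source vertex 1, destination vertex 2, label "friend"
    val e1 = Edge(1L, 2L, "friend")
    println(describe(e1)) // prints "1 -[friend]-> 2"
  }
}
```

No SparkContext is needed here, because Edge is a plain case class.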

Since an edge links two vertices, we can perform some logic on it. For example, if we want to keep only the edges whose attribute is equal to friend, we can use the filter operation. The filter method takes a single edge, and if the attribute of the e1 edge is friend, that edge is kept in the result. Because the standard Spark RDD API is available to us, we can finish by calling collect and then toList on the result. The following code implements this logic:

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx.{Edge, Graph, VertexId}
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.SparkSession
    import org.scalatest.FunSuite

    class EdgeAPI extends FunSuite {
      val spark: SparkContext = SparkSession.builder().master("local[2]").getOrCreate().sparkContext

      test("Should use Edge API") {
        //given
        // vertices: (vertex ID, user name)
        val users: RDD[(VertexId, String)] =
          spark.parallelize(Array(
            (1L, "a"),
            (2L, "b"),
            (3L, "c"),
            (4L, "d")
          ))

        // edges: Edge(source ID, destination ID, relationship label)
        val relationships =
          spark.parallelize(Array(
            Edge(1L, 2L, "friend"),
            Edge(1L, 3L, "friend"),
            Edge(2L, 4L, "wife")
          ))

        val graph = Graph(users, relationships)

        //when
        // keep only the edges labeled "friend"
        val resFromFilter = graph.edges.filter(e1 => e1.attr == "friend").collect().toList
        println(resFromFilter)

        // the mapEdges and println lines shown later in this section also go here
      }
    }

The graph also has a couple of methods on top of the standard RDD API. For example, the mapEdges method takes a function from an edge to a new attribute. Here, we take the attribute of each edge and map every label to uppercase, as follows:

    val res = graph.mapEdges(e => e.attr.toUpperCase)

On the graph, we can also call groupEdges. Grouping edges is similar to GROUP BY, but only for edges: it merges parallel edges (edges with the same source and the same destination) into a single edge.
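A minimal sketch of grouping edges follows. It is an illustration under assumptions rather than part of the original listing: the object name GroupEdgesSketch, the integer edge weights, the "unknown" default vertex attribute, and the choice to sum the weights are all made up for this example. In GraphX, groupEdges only merges edges that sit in the same partition, so the graph is repartitioned with partitionBy first.

```scala
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, used only for this illustration
object GroupEdgesSketch {
  // Returns the merged edges as (source ID, destination ID, summed weight)
  def run(): Set[(Long, Long, Int)] = {
    val sc = SparkSession.builder()
      .master("local[2]")
      .appName("group-edges-sketch")
      .getOrCreate()
      .sparkContext

    // Two parallel edges from 1 to 2, plus one edge from 2 to 3
    val weighted = sc.parallelize(Seq(
      Edge(1L, 2L, 1),
      Edge(1L, 2L, 2),
      Edge(2L, 3L, 5)
    ))
    // Build the graph from edges only; every vertex gets the same placeholder attribute
    val graph = Graph.fromEdges(weighted, defaultValue = "unknown")

    // Co-locate edges that share source and destination, then merge their attributes
    val grouped = graph
      .partitionBy(PartitionStrategy.RandomVertexCut)
      .groupEdges((a, b) => a + b)

    grouped.edges.map(e => (e.srcId, e.dstId, e.attr)).collect().toSet
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The merge function plays the role of the aggregate in a GROUP BY: here it sums the weights, but it could just as well keep the maximum or concatenate labels.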

Add the following line to print the mapped edges:

    println(res.edges.collect().toList)

Let's run our code. We can see in the output that our code has filtered out the wife edge. Only the friend edges remain, from vertex ID 1 to ID 2 and from vertex ID 1 to ID 3. The mapped edges are printed with their labels in uppercase.
