The in-degree

The in-degree tells us how many edges point into a given vertex, rather than out of it. This time, we can see that for the 2L instance of VertexId, there's only one inbound edge. There is an edge from 1L to 2L, another from 1L to 3L, and one from 2L to 4L, so 2L, 3L, and 4L each have an in-degree of 1. In the following resulting dataset, there will be no entry for VertexId 1L, because nothing points to it. So, 1L is only ever a source and never a destination:

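  test("should calculate in-degree of vertices") {
    // given: four users as vertices
    // (`spark` is assumed to be a SparkContext provided by the enclosing test class)
    val users: RDD[(VertexId, (String))] =
      spark.parallelize(Array(
        (1L, "a"),
        (2L, "b"),
        (3L, "c"),
        (4L, "d")
      ))

    // directed edges: 1 -> 2, 1 -> 3, 2 -> 4
    val relationships =
      spark.parallelize(Array(
        Edge(1L, 2L, "friend"),
        Edge(1L, 3L, "friend"),
        Edge(2L, 4L, "wife")
      ))

    val graph = Graph(users, relationships)

    // when: collect the number of inbound edges per vertex
    val degrees = graph.inDegrees.collect().toList

    // then: vertex 1L is absent because it has no inbound edges
    degrees should contain theSameElementsAs List(
      (2L, 1L),
      (3L, 1L),
      (4L, 1L)
    )
  }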
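
To make the counting explicit, here is a minimal sketch in plain Scala, with no Spark involved, of the same idea: group the edges by destination and count them. The names `InDegreeSketch`, `SimpleEdge`, and `inDegrees` are hypothetical helpers for illustration and are not part of GraphX; this only mirrors the result shape, not how GraphX computes it internally.

```scala
object InDegreeSketch {
  // Hypothetical stand-in for a GraphX Edge: source id, destination id, label.
  final case class SimpleEdge(src: Long, dst: Long, label: String)

  // Same edges as in the test: 1 -> 2, 1 -> 3, 2 -> 4.
  val edges: Seq[SimpleEdge] = Seq(
    SimpleEdge(1L, 2L, "friend"),
    SimpleEdge(1L, 3L, "friend"),
    SimpleEdge(2L, 4L, "wife")
  )

  // In-degree = number of edges whose destination is the vertex.
  // A vertex that is never a destination gets no entry at all.
  def inDegrees(es: Seq[SimpleEdge]): Map[Long, Int] =
    es.groupBy(_.dst).map { case (dst, inbound) => dst -> inbound.size }
}
```

Calling `InDegreeSketch.inDegrees(InDegreeSketch.edges)` gives one entry each for vertices 2, 3, and 4, and none for vertex 1, which matches the expectation in the test.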

The preceding characteristic of the in-degree is a very useful property. We use the in-degree when we want to find out which of our pages are important because many other pages link to them, rather than because they link out to other pages.
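
As a rough illustration of that use case, the following plain Scala sketch ranks pages by how many inbound links they receive. The page names, the `links` data, and the `InboundLinkRanking` object are all made up for this example. Pages with no inbound links are given an explicit count of 0 so that they still appear in the ranking.

```scala
object InboundLinkRanking {
  // Hypothetical link graph: each pair is (linking page, linked page).
  val pages: Set[String] = Set("home", "docs", "blog", "api")
  val links: Seq[(String, String)] = Seq(
    "home" -> "docs",
    "blog" -> "docs",
    "home" -> "blog",
    "docs" -> "api"
  )

  // Count inbound links per page, default missing pages to 0,
  // then sort by count (descending) and by name to break ties.
  def rankByInDegree(ps: Set[String], ls: Seq[(String, String)]): Seq[(String, Int)] = {
    val counts = ls.groupBy(_._2).map { case (page, inbound) => page -> inbound.size }
    ps.toSeq
      .map(page => page -> counts.getOrElse(page, 0))
      .sortBy { case (page, degree) => (-degree, page) }
  }
}
```

Here `docs` ranks first because two pages link to it, while `home` ranks last: it links out to other pages but nothing links to it.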

Running the in-degree test confirms that it works as expected.
