The in-degree

The in-degree tells us how many edges point into a vertex, as opposed to how many leave it. In this example, each of the vertices 2L, 3L, and 4L has exactly one inbound edge: 2L and 3L each receive an edge from 1L, and 4L receives an edge from 2L. The resulting dataset contains no entry for VertexId 1L, because 1L is only ever a source and never a destination:

  test("should calculate in-degree of vertices") {
    //given
    // four users, keyed by VertexId
    val users: RDD[(VertexId, String)] =
      spark.parallelize(Array(
        (1L, "a"),
        (2L, "b"),
        (3L, "c"),
        (4L, "d")
      ))

    // directed edges: 1 -> 2, 1 -> 3, and 2 -> 4
    val relationships =
      spark.parallelize(Array(
        Edge(1L, 2L, "friend"),
        Edge(1L, 3L, "friend"),
        Edge(2L, 4L, "wife")
      ))

    val graph = Graph(users, relationships)

    //when
    // inDegrees only returns vertices that have at least one inbound edge
    val degrees = graph.inDegrees.collect().toList

    //then
    degrees should contain theSameElementsAs List(
      (2L, 1L),
      (3L, 1L),
      (4L, 1L)
    )
  }
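To make the definition concrete, the same counting can be sketched in plain Scala without Spark. This is only an illustration of what the in-degree means, not how GraphX computes it; DirectedEdge, InDegreeSketch, and inDegreesWithZeros are hypothetical names introduced here, not part of the GraphX API.

```scala
// Plain-Scala sketch (no Spark): count inbound edges per destination vertex.
// DirectedEdge and InDegreeSketch are hypothetical names used for illustration.
final case class DirectedEdge(src: Long, dst: Long, label: String)

object InDegreeSketch {
  // The same three edges as in the test above.
  val edges: Seq[DirectedEdge] = Seq(
    DirectedEdge(1L, 2L, "friend"),
    DirectedEdge(1L, 3L, "friend"),
    DirectedEdge(2L, 4L, "wife")
  )

  // In-degree: the number of edges whose destination is a given vertex.
  // Vertices that are never a destination do not appear in the result.
  def inDegrees(es: Seq[DirectedEdge]): Map[Long, Int] =
    es.groupBy(_.dst).map { case (dst, incoming) => dst -> incoming.size }

  // Optional helper: report an explicit 0 for vertices with no inbound edges.
  def inDegreesWithZeros(vertexIds: Seq[Long], es: Seq[DirectedEdge]): Map[Long, Int] = {
    val counted = inDegrees(es)
    vertexIds.map(id => id -> counted.getOrElse(id, 0)).toMap
  }
}
```

Calling InDegreeSketch.inDegrees(InDegreeSketch.edges) gives an entry for 2L, 3L, and 4L only, which mirrors the absence of 1L in the test. If a zero entry for 1L is needed, the second helper fills it in from the full list of vertex IDs.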

The preceding characteristic of the in-degree is a very useful property. We can use the in-degree to find out which of our pages are important: an important page is one that many other pages link to, regardless of how many pages it links to itself.
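As a rough sketch of that idea, pages can be ordered by how many inbound links they receive. The page names and the LinkCountSketch object below are made up for illustration, and this is a simple inbound-link count in plain Scala rather than anything GraphX provides out of the box.

```scala
// Hypothetical link graph: each pair is (linking page, linked page).
object LinkCountSketch {
  val links: Seq[(String, String)] = Seq(
    ("home", "docs"),
    ("blog", "docs"),
    ("home", "blog"),
    ("docs", "api"),
    ("blog", "api"),
    ("home", "api")
  )

  // Count inbound links per page, then sort by count, highest first.
  // Ties are broken alphabetically so the order is deterministic.
  def rankByInDegree(ls: Seq[(String, String)]): Seq[(String, Int)] =
    ls.groupBy(_._2)
      .map { case (page, inbound) => page -> inbound.size }
      .toSeq
      .sortBy { case (page, count) => (-count, page) }
}
```

With this sample data, "api" comes first with three inbound links, followed by "docs" with two and "blog" with one. The "home" page links out to three pages but receives no links, so it does not appear in the ranking at all.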

Running the in-degree test shows that it passes as expected.
