Example 5 – connected components

When a large graph is created from data, it might contain unconnected sub graphs or sub graphs that are isolated from each other and might contain no bridging or connecting edges between them. These algorithms provide a measure of that connectivity. It might be important depending on your processing to know that all vertices are connected.

The Scala code for this example calls two graph methods, connectedComponents and stronglyConnectedComponents. The strong method required a maximum iteration count, which has been set to 1000. These counts are acting on the graph vertices:

val iterations = 1000
val connected = graph.connectedComponents().vertices
val connectedS = graph.stronglyConnectedComponents(iterations).vertices

The vertex counts are then joined with the original vertex records so that connection counts can be associated with the vertex information such as person name:

val connByPerson = vertices.join(connected).map {
   case (id, ( (person,age) , conn )) => (conn, id, person)
}
val connByPersonS = vertices.join(connectedS).map {
   case (id, ( (person,age) , conn )) => (conn, id, person)
}

The results are then output using a case statement and formatted for printing:

connByPerson.collect().foreach {
   case (conn, id, person) =>
     println ( f"Weak $conn $id $person" )
}

As expected, for the connectedComponents algorithm, the results show that, for each vertex, there is only one component. That means that all the vertices are members of a single graph as the graph diagram earlier in the chapter showed:

Weak 1 4 Jim
Weak 1 6 Flo
Weak 1 2 Sarah
Weak 1 1 Mike
Weak 1 3 John
Weak 1 5 Kate

The stronglyConnectedComponents method gives a measure of the connectivity in a graph, taking into account the direction of the relationships between them. The results for the stronglyConnectedComponents algorithm are output as follows:

connByPersonS.collect().foreach {
   case (conn, id, person) =>
     println ( f"Strong $conn $id $person" )
}

You might notice from the graph that the relationships Sister and Friend act from vertices Flo (6) and Jim (4) to Mike (1) as the edge and vertex data shows:

6,1,Sister
4,1,Friend

1,Mike,48
4,Jim,53
6,Flo,52

So the strong method output shows that, for most vertices, there is only one graph component signified by 1 in the second column. However, vertices 4 and 6 are not reachable due to the direction of their relationship, and so they have a vertex ID instead of a component ID:

Strong 4 4 Jim
Strong 6 6 Flo
Strong 1 2 Sarah
Strong 1 1 Mike
Strong 1 3 John
Strong 1 5 Kate

Table of Contents for Example 5 – connected components

Create new playlist

Sign In

Sign Up

Table of Contents for
Example 5 – connected components