For this project, I will create a src/main/scala/InfluenceDiagram.scala
file. For demo purpose, I will just recreate the graph from Chapter 2, Data Pipelines and Modeling:
import scalax.collection.Graph import scalax.collection.edge._ import scalax.collection.GraphPredef._ import scalax.collection.GraphEdge._ import scalax.collection.edge.Implicits._ object InfluenceDiagram extends App { var g = Graph[String, LDiEdge](("'Weather'"~+>"'Weather Forecast'")("Forecast"), ("'Weather Forecast'"~+>"'Vacation Activity'")("Decision"), ("'Vacation Activity'"~+>"'Satisfaction'")("Deterministic"), ("'Weather'"~+>"'Satisfaction'")("Deterministic")) println(g.mkString(";")) println(g.isDirected) println(g.isAcyclic) }
The ~+>
operator is used to create a directed labeled edge between two nodes defined in scalax/collection/edge/Implicits.scala
, which, in our case, are of the String
type. The list of other edge types and operators is provided in the following table:
The following table shows graph edges from scalax.collection.edge.Implicits
(from http://www.scala-graph.org/guides/core-initializing.html)
Edge Class |
Shortcut/Operator |
Description |
---|---|---|
Hyperedges | ||
|
|
hyperedge |
|
|
weighted hyperedge |
|
|
key-weighted hyperedge |
|
|
labeled hyperedge |
|
|
key-labeled hyperedge |
|
|
weighted labeled hyperedge |
|
|
key-weighted labeled hyperedge |
|
|
weighted key-labeled hyperedge |
|
|
key-weighted key-labeled hyperedge |
Directed hyperedges | ||
|
| |
|
|
weighted directed hyperedge |
|
|
key-weighted directed hyperedge |
|
|
labeled directed hyperedge |
|
|
key-labeled directed hyperedge |
|
|
weighted labeled directed hyperedge |
|
|
key-weighted labeled directed hyperedge |
|
|
weighted key-labeled directed hyperedge |
|
|
key-weighted key-labeled directed hyperedge |
Undirected edges | ||
|
|
undirected edge |
|
|
weighted undirected edge |
|
|
key-weighted undirected edge |
|
| |
|
|
key-labeled undirected edge |
|
|
weighted labeled undirected edge |
|
|
key-weighted labeled undirected edge |
|
|
weighted key-labeled undirected edge |
|
|
key-weighted key-labeled undirected edge |
Directed edges | ||
|
|
directed edge |
|
|
weighted directed edge |
|
|
key-weighted directed edge |
|
|
labeled directed edge |
|
|
key-labeled directed edge |
|
|
weighted labeled directed edge |
|
|
key-weighted labeled directed edge |
|
|
weighted key-labeled directed edge |
|
|
key-weighted key-labeled directed edge |
You saw the power of graph for Scala: the edges can be weighted and we may potentially construct a multigraph (key-labeled edges allow multiple edges for a pair of source and destination nodes).
If you run SBT on the preceding project with the Scala file in the src/main/scala
directory, the output will be as follows:
[akozlov@Alexanders-MacBook-Pro chapter07(master)]$ sbt [info] Loading project definition from /Users/akozlov/Src/Book/ml-in-scala/chapter07/project [info] Set current project to Working with Graph Algorithms (in build file:/Users/akozlov/Src/Book/ml-in-scala/chapter07/) > run [warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list Multiple main classes detected, select one to run: [1] org.akozlov.chapter07.ConstranedDAG [2] org.akozlov.chapter07.EnronEmail [3] org.akozlov.chapter07.InfluenceDiagram [4] org.akozlov.chapter07.InfluenceDiagramToJson Enter number: 3 [info] Running org.akozlov.chapter07.InfluenceDiagram 'Weather';'Vacation Activity';'Satisfaction';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Weather Forecast'~>'Vacation Activity' 'Decision Directed: true Acyclic: true 'Weather';'Vacation Activity';'Satisfaction';'Recommend to a Friend';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Satisfaction'~>'Recommend to a Friend' 'Probabilistic;'Weather Forecast'~>'Vacation Activity' 'Decision Directed: true Acyclic: true
If continuous compilation is enabled, the main method will be run as soon as SBT detects that the file has changed (in the case of multiple classes having the main method, SBT will ask you which one to run, which is not great for interactivity; so you might want to limit the number of executable classes).
I will cover different output formats in a short while, but let's first see how to perform simple operations on the graph.
First, we already know that the graph is directed and acyclic, which is a required property for all decision diagrams so that we know we did not make a mistake. Let's say that I want to make the graph more complex and add a node that will indicate the likelihood of me recommending a vacation in Portland, Oregon to another person. The only thing I need to add is the following line:
g += ("'Satisfaction'" ~+> "'Recommend to a Friend'")("Probabilistic")
If you have continuous compilation/run enabled, you will immediately see the changes after pressing the Save File button:
'Weather';'Vacation Activity';'Satisfaction';'Recommend to a Friend';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Satisfaction'~>'Recommend to a Friend' 'Probabilistic;'Weather Forecast'~>'Vacation Activity' 'Decision Directed: true Acyclic: true
Now, if we want to know the parents of the newly introduced node, we can simply run the following code:
println((g get "'Recommend to a Friend'").incoming) Set('Satisfaction'~>'Recommend to a Friend' 'Probabilistic)
This will give us a set of parents for a specific node—and thus drive the decision making process. If we add a cycle, the acyclic method will automatically detect it:
g += ("'Satisfaction'" ~+> "'Weather'")("Cyclic") println(g.mkString(";")) println("Directed: " + g.isDirected) println("Acyclic: " + g.isAcyclic) 'Weather';'Vacation Activity';'Satisfaction';'Recommend to a Friend';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Satisfaction'~>'Recommend to a Friend' 'Probabilistic;'Satisfaction'~>'Weather' 'Cyclic;'Weather Forecast'~>'Vacation Activity' 'Decision Directed: true Acyclic: false
Note that you can create the graphs completely programmatically:
var n, m = 0; val f = Graph.fill(45){ m = if (m < 9) m + 1 else { n = if (n < 8) n + 1 else 8; n + 1 }; m ~ n } println(f.nodes) println(f.edges) println(f) println("Directed: " + f.isDirected) println("Acyclic: " + f.isAcyclic) NodeSet(0, 9, 1, 5, 2, 6, 3, 7, 4, 8) EdgeSet(9~0, 9~1, 9~2, 9~3, 9~4, 9~5, 9~6, 9~7, 9~8, 1~0, 5~0, 5~1, 5~2, 5~3, 5~4, 2~0, 2~1, 6~0, 6~1, 6~2, 6~3, 6~4, 6~5, 3~0, 3~1, 3~2, 7~0, 7~1, 7~2, 7~3, 7~4, 7~5, 7~6, 4~0, 4~1, 4~2, 4~3, 8~0, 8~1, 8~2, 8~3, 8~4, 8~5, 8~6, 8~7) Graph(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1~0, 2~0, 2~1, 3~0, 3~1, 3~2, 4~0, 4~1, 4~2, 4~3, 5~0, 5~1, 5~2, 5~3, 5~4, 6~0, 6~1, 6~2, 6~3, 6~4, 6~5, 7~0, 7~1, 7~2, 7~3, 7~4, 7~5, 7~6, 8~0, 8~1, 8~2, 8~3, 8~4, 8~5, 8~6, 8~7, 9~0, 9~1, 9~2, 9~3, 9~4, 9~5, 9~6, 9~7, 9~8) Directed: false Acyclic: false
Here, the element computation provided as the second parameter to the fill method is repeated 45
times (the first parameter). The graph connects every node to all of its predecessors, which is also known as a clique in the graph theory.
Graph for Scala enables us to set constraints that cannot be violated by any future graph update. This comes in handy when we want to preserve some detail in the graph structure. For example, a Directed Acyclic Graph (DAG) should not contain cycles. Two constraints are currently implemented as a part of the scalax.collection.constrained.constraints
package—connected and acyclic, as follows:
package org.akozlov.chapter07 import scalax.collection.GraphPredef._, scalax.collection.GraphEdge._ import scalax.collection.constrained.{Config, ConstraintCompanion, Graph => DAG} import scalax.collection.constrained.constraints.{Connected, Acyclic} object AcyclicWithSideEffect extends ConstraintCompanion[Acyclic] { def apply [N, E[X] <: EdgeLikeIn[X]] (self: DAG[N,E]) = new Acyclic[N,E] (self) { override def onAdditionRefused(refusedNodes: Iterable[N], refusedEdges: Iterable[E[N]], graph: DAG[N,E]) = { println("Addition refused: " + "nodes = " + refusedNodes + ", edges = " + refusedEdges) true } } } object ConnectedWithSideEffect extends ConstraintCompanion[Connected] { def apply [N, E[X] <: EdgeLikeIn[X]] (self: DAG[N,E]) = new Connected[N,E] (self) { override def onSubtractionRefused(refusedNodes: Iterable[DAG[N,E]#NodeT], refusedEdges: Iterable[DAG[N,E]#EdgeT], graph: DAG[N,E]) = { println("Subtraction refused: " + "nodes = " + refusedNodes + ", edges = " + refusedEdges) true } } } class CycleException(msg: String) extends IllegalArgumentException(msg) object ConstranedDAG extends App { implicit val conf: Config = ConnectedWithSideEffect && AcyclicWithSideEffect val g = DAG(1~>2, 1~>3, 2~>3, 3~>4) // Graph() println(g ++ List(1~>4, 3~>1)) println(g - 2~>3) println(g - 2) println((g + 4~>5) - 3) }
Here is the command to run the program that tries to add or remove nodes that violate the constraints:
[akozlov@Alexanders-MacBook-Pro chapter07(master)]$ sbt "run-main org.akozlov.chapter07.ConstranedDAG" [info] Loading project definition from /Users/akozlov/Src/Book/ml-in-scala/chapter07/project [info] Set current project to Working with Graph Algorithms (in build file:/Users/akozlov/Src/Book/ml-in-scala/chapter07/) [info] Running org.akozlov.chapter07.ConstranedDAG Addition refused: nodes = List(), edges = List(1~>4, 3~>1) Graph(1, 2, 3, 4, 1~>2, 1~>3, 2~>3, 3~>4) Subtraction refused: nodes = Set(), edges = Set(2~>3) Graph(1, 2, 3, 4, 1~>2, 1~>3, 2~>3, 3~>4) Graph(1, 3, 4, 1~>3, 3~>4) Subtraction refused: nodes = Set(3), edges = Set() Graph(1, 2, 3, 4, 5, 1~>2, 1~>3, 2~>3, 3~>4, 4~>5) [success] Total time: 1 s, completed May 1, 2016 1:53:42 PM
Adding or subtracting nodes that violate one of the constraints is rejected. The programmer can also specify a side effect if an attempt to add or subtract a node that violates the condition is made.
Graph for Scala supports importing/exporting graphs to JSON, as follows:
object InfluenceDiagramToJson extends App { val g = Graph[String,LDiEdge](("'Weather'" ~+> "'Weather Forecast'")("Forecast"), ("'Weather Forecast'" ~+> "'Vacation Activity'")("Decision"), ("'Vacation Activity'" ~+> "'Satisfaction'")("Deterministic"), ("'Weather'" ~+> "'Satisfaction'")("Deterministic"), ("'Satisfaction'" ~+> "'Recommend to a Friend'")("Probabilistic")) import scalax.collection.io.json.descriptor.predefined.{LDi} import scalax.collection.io.json.descriptor.StringNodeDescriptor import scalax.collection.io.json._ val descriptor = new Descriptor[String]( defaultNodeDescriptor = StringNodeDescriptor, defaultEdgeDescriptor = LDi.descriptor[String,String]("Edge") ) val n = g.toJson(descriptor) println(n) import net.liftweb.json._ println(Printer.pretty(JsonAST.render(JsonParser.parse(n)))) }
To produce a JSON representation for a sample graph, run:
[kozlov@Alexanders-MacBook-Pro chapter07(master)]$ sbt "run-main org.akozlov.chapter07.InfluenceDiagramToJson" [info] Loading project definition from /Users/akozlov/Src/Book/ml-in-scala/chapter07/project [info] Set current project to Working with Graph Algorithms (in build file:/Users/akozlov/Src/Book/ml-in-scala/chapter07/) [info] Running org.akozlov.chapter07.InfluenceDiagramToJson { "nodes":[["'Recommend to a Friend'"],["'Satisfaction'"],["'Vacation Activity'"],["'Weather Forecast'"],["'Weather'"]], "edges":[{ "n1":"'Weather'", "n2":"'Weather Forecast'", "label":"Forecast" },{ "n1":"'Vacation Activity'", "n2":"'Satisfaction'", "label":"Deterministic" },{ "n1":"'Weather'", "n2":"'Satisfaction'", "label":"Deterministic" },{ "n1":"'Weather Forecast'", "n2":"'Vacation Activity'", "label":"Decision" },{ "n1":"'Satisfaction'", "n2":"'Recommend to a Friend'", "label":"Probabilistic" }] } [success] Total time: 1 s, completed May 1, 2016 1:55:30 PM
For more complex structures, one might need to write custom descriptors, serializers, and deserializers (refer to http://www.scala-graph.org/api/json/api/#scalax.collection.io.json.package).
3.147.238.1