Drawing Connections between Entity Columns

Correlation is a statistical measure of how two features are related. When feature columns have discrete values, we can measure co-occurrence by plotting joint distributions. But co-occurrence alone is sometimes insufficient for discovering the semantics of a relationship.
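As a rough illustration of what a joint distribution over two discrete columns looks like before it is plotted, here is a minimal sketch using only the Python standard library. The rows, the column meanings, and the function names joint_counts and joint_distribution are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical toy data: each row pairs two discrete feature values,
# for example (person, skill). Not taken from any real dataset.
rows = [
    ("alice", "python"),
    ("alice", "sql"),
    ("bob", "python"),
    ("alice", "python"),
    ("carol", "sql"),
]

def joint_counts(pairs):
    """Count how often each (a, b) value pair occurs together."""
    return Counter(pairs)

def joint_distribution(pairs):
    """Normalize the joint counts into empirical joint probabilities."""
    counts = joint_counts(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

counts = joint_counts(rows)
dist = joint_distribution(rows)
```

The resulting dictionary maps each value pair to its share of the rows, which is the quantity a heatmap or bubble chart of co-occurrence would typically encode.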

The concept of co-occurrence can be generalized if we think of it as one type of interaction. Many such types of interaction can be defined between discrete entities. Using several interaction variables together gives us a better chance of capturing some form of relatedness.

Interactions can be directed or undirected. Datasets containing entities can be visualized as directed (or undirected) flow graphs. The edges between entities are the interaction variables that we can analyze. Our goal in this chapter is to make charts that help us visualize and analyze datasets containing entity identifiers together with directed (or undirected) values that define pair-wise relationships between those entities.
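To make the distinction concrete, the sketch below turns a small list of pair-wise interactions into a directed and an undirected representation. The edge list, the weights, and the function names directed_graph and undirected_graph are assumptions made for illustration; summing the values of repeated or opposite edges is one possible convention, not the only one.

```python
from collections import defaultdict

# Hypothetical edge list: (source entity, target entity, interaction value).
edges = [
    ("A", "B", 3.0),
    ("B", "A", 1.0),
    ("A", "C", 2.0),
]

def directed_graph(edge_list):
    """Build a nested dict {source: {target: weight}}; direction is kept."""
    graph = defaultdict(dict)
    for source, target, weight in edge_list:
        # Repeated (source, target) pairs accumulate their weights.
        graph[source][target] = graph[source].get(target, 0.0) + weight
    return {node: dict(neighbors) for node, neighbors in graph.items()}

def undirected_graph(edge_list):
    """Build a dict {(u, v): weight} with u <= v; direction is discarded."""
    graph = defaultdict(float)
    for source, target, weight in edge_list:
        # Sorting the endpoints makes A->B and B->A share one key.
        key = tuple(sorted((source, target)))
        graph[key] += weight
    return dict(graph)

directed = directed_graph(edges)
undirected = undirected_graph(edges)
```

In the directed form, A to B and B to A stay separate edges with their own values; in the undirected form they collapse into a single A-B edge whose value is their sum.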

We will create charts that visualize entities and relationships as networks of nodes and edges and as flow graphs. In this chapter, we will specifically cover:

  • Datasets
  • Directed force networks
  • Chord diagrams
  • Sunburst charts
  • Sankey diagrams
  • Partitioning