Using Gephi generators

Gephi provides a wide range of network generators that can be used to better understand the formation and growth patterns predicted by some of the most prominent models in the network literature. These generators are valuable for building some knowledge of how different models work, and of what happens to a network when we change various assumptions. In this section, we'll cover several of the most prominent models from the network graph literature, and then share examples from Gephi that will help further our understanding of how each model works.

If you're familiar with some of the network literature from Easley and Kleinberg, Barabasi, Strogatz and Watts, Newman, or any other good source, you will be familiar with most, if not all, of the following methods. Do not be concerned if you are not familiar with these sources. We will not be discussing the methods at any sort of deep technical level; that is both beyond the scope of this book and can be better learned from the creators and practitioners of the models. Many resources are provided in the Appendix, Data Sources and Other Web Resources, of this book.

The idea behind the various generators is to provide insight into how each of the methods works, using the assumptions developed within each model. Gephi provides you with the ability to adjust different assumptions and to quickly view the results, helping to create a sort of interactive visual learning course without having to venture beyond the software. So with that minimal background, let's move into some of the generators, where I will provide a brief synopsis of each model and then share a sample output.

We'll begin with a simple random graph, found by navigating to the File | Generate | Random Graph... menu location. This is the most basic of all network models, built on the assumption that any single node has the same probability of connecting with any other node in the graph. We can adjust the probability level in Gephi as well as the number of nodes to be generated. The higher we set the probability, the denser the network will become. With a level of 0.05 and 50 nodes, I generated the first graph, followed by a second graph with a 0.10 probability level (note that you must delete the first workspace prior to creating a new graph, otherwise Gephi will place them all in the same window). The first three graphs in this section use the Force Atlas 2 layout to show the network patterns effectively. The following diagram shows the graph with just a 0.05 connection probability:

Random graph with 0.05 probability

Notice how the network is not fully connected—we have a single large component as well as a few nodes with no edges at all. Now, the following diagram illustrates the network with a 0.10 connection probability:

Random graph with 0.10 probability

Notice how the increased probability created far more connections between nodes, resulting in a much denser, yet still random graph.
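For readers who want to see the mechanics outside of Gephi, here is a minimal sketch of the same assumption in plain Python: every possible pair of nodes is linked with the same probability. This is not Gephi's own implementation, and the function names (random_graph, edge_count, density, component_sizes) are my own inventions for illustration. With 50 nodes there are 50 × 49 / 2 = 1,225 possible pairs, so we should expect roughly 61 edges at a 0.05 probability and roughly 122 at 0.10.

```python
import random
from itertools import combinations


def random_graph(n, p, seed=None):
    """Build an undirected random graph as an adjacency dict.

    Each of the n * (n - 1) / 2 possible node pairs is linked
    independently with probability p.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_count(adj):
    """Count undirected edges (each edge is stored at both endpoints)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def density(adj):
    """Share of all possible edges that are actually present."""
    n = len(adj)
    if n < 2:
        return 0.0
    return edge_count(adj) / (n * (n - 1) / 2)


def component_sizes(adj):
    """Sizes of the connected components, largest first."""
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    # Same settings as the two diagrams: 50 nodes, p = 0.05 and p = 0.10.
    for p in (0.05, 0.10):
        g = random_graph(50, p, seed=1)
        isolated = sum(1 for nbrs in g.values() if not nbrs)
        print(p, edge_count(g), round(density(g), 3),
              component_sizes(g)[0], isolated)
```

Running the script prints, for each probability, the number of edges, the density, the size of the largest component, and the number of isolated nodes. The exact figures depend on the seed, so treat them as one draw from the model rather than as a fixed result.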

Next, we'll take a look at the Barabasi-Albert scale free model (there are three other Barabasi-Albert models to play with), once again using 50 nodes, but this time added one at a time. In theory, each new node arrives with a single edge, which can connect to any existing node in the network. What we would anticipate seeing with this model is the emergence of hubs with large degrees surrounded by many nodes with a lower level of connectedness. While this doesn't fully capture all the nuances of preferential attachment, it takes us in that direction: earlier nodes have more opportunities to be connected to by later nodes, and thus to become hubs (high-degree nodes) within the network. Let's take a look at the following diagram:

Barabasi-Albert scale free model

I elected to size the nodes based on degree, making it easier to spot the network hubs. Here, it is easy to see the presence of multiple hubs in the network, akin to an airline industry hub-and-spoke model. It is interesting to see several secondary hubs that are often (but not always) directly connected to the primary one. One of the strengths of the Barabasi-Albert model is that it more closely models existing network structures, such as the Web, than any sort of random model does. However, in this case, time of entry into the network is the single most critical growth criterion; we know from the real world that this is just one of many factors that determine the hub-and-spoke system seen here.
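The growth process can also be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not Gephi's generator: nodes arrive one at a time, each brings exactly one edge, and the target is chosen with probability proportional to its current degree. The names preferential_attachment and degrees are hypothetical helpers. The trick is the endpoints list, in which every node appears once for each edge it touches, so a uniform draw from that list is automatically a degree-weighted draw.

```python
import random
from collections import Counter


def preferential_attachment(n, seed=None):
    """Grow a network of n >= 2 nodes, one node and one edge at a time.

    Returns a list of (new_node, target) edges. Targets are drawn in
    proportion to their current degree.
    """
    if n < 2:
        raise ValueError("need at least two nodes")
    rng = random.Random(seed)
    edges = [(1, 0)]      # seed network: node 1 attached to node 0
    endpoints = [0, 1]    # each node listed once per incident edge
    for new in range(2, n):
        target = rng.choice(endpoints)  # degree-weighted choice
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Degree of every node that appears in the edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    deg = degrees(preferential_attachment(50, seed=1))
    # The five best-connected nodes; low IDs are the early arrivals.
    print(deg.most_common(5))
```

Because node IDs record the order of arrival, the printed list makes it easy to check whether the best-connected nodes are also the early entrants, which is the time-of-entry effect described above. With one edge per new node the result is always a tree of 49 edges, so the hubs emerge purely from where those edges land.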

Another interesting category of models is the so-called small world networks, made famous through the idea of six degrees of separation, where no two people on the planet, regardless of distance or dissimilarity, are more than six degrees away from one another. While the number has varied in various experiments and real-world situations, the answer has not strayed far from the original figure. We are indeed living in a relatively small world, in spite of the physical distance between people.

We have three generators within this category: two from Watts and Strogatz, and one from Kleinberg. For this discussion, I'll work with the Watts-Strogatz small world Alpha model, although each of the three will illustrate the principle of small worlds very effectively. In this case, the number of nodes has been set to 50 for consistency, with an average degree of 4 and an alpha setting of 3.5. Small world patterns are often masked by force-directed layout algorithms, so I have opted for a simple circular layout. Here's the result:

Watts-Strogatz small world model

Notice the difference in this model relative to both the random and scale free models shown previously. Even though I have once again sized the nodes to reflect their degree, the level of variation is much smaller than in the scale free model. In simple terms, there are no hubs in the graph, and in their place are a lot of well-connected nodes, and importantly, no isolated members of the network. Also, notice that many of the edges connect distant points on the graph rather than being restricted to more localized connections. This is the essence of small world models: member nodes are highly interconnected, have few (if any) hubs, and are typically not subjected to a high degree of homophily, which would make traversing the graph far more difficult.
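The effect of those long-range edges can be measured with a short Python sketch. Please note the substitution: this is not the Alpha model used for the diagram, but a simpler construction that starts from a ring of 50 nodes, each linked to its 4 nearest neighbors, and then adds a handful of random shortcuts. The function names (ring_lattice, add_shortcuts, average_path_length) and the choice of 5 shortcuts are my own assumptions for illustration.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k // 2 nearest neighbors per side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def add_shortcuts(adj, count, seed=None):
    """Return a copy of adj with `count` new random edges added."""
    n = len(adj)
    existing = sum(len(nbrs) for nbrs in adj.values()) // 2
    if count > n * (n - 1) // 2 - existing:
        raise ValueError("not enough unconnected pairs for that many shortcuts")
    rng = random.Random(seed)
    new = {v: set(nbrs) for v, nbrs in adj.items()}
    nodes = list(new)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)
        if v not in new[u]:  # skip pairs that are already linked
            new[u].add(v)
            new[v].add(u)
            added += 1
    return new


def average_path_length(adj):
    """Mean shortest-path length over all node pairs (graph must be connected)."""
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
    return total / (n * (n - 1))


if __name__ == "__main__":
    lattice = ring_lattice(50, 4)
    small_world = add_shortcuts(lattice, 5, seed=1)
    print(average_path_length(lattice), average_path_length(small_world))
```

The first printed value is the average number of steps between two nodes when only local connections exist, and the second is the same measure after the shortcuts are added. Adding edges can only shorten paths, so the second value is always the smaller one; how much smaller depends on where the random shortcuts happen to land.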

It is also important to understand the implications of a small world network for the diffusion of ideas, information, and innovation. Small worlds tend to be very effective at sharing and dispersing information and connecting persons who are physically remote. This obviously has a significant impact on information flow, whether it is trivial (cat videos going viral) or more serious (information about NSA surveillance). While a small world network clearly has profound consequences with respect to information flow, it typically has a smaller effect in spreading contagion, given that the latter requires more direct physical contact than the former. However, there are still potential instances where a small world network can influence the spread of disease, often related to the ability to travel the globe far more easily than in previous eras. Today, individuals are more likely than ever before to have first degree relationships with others in another country or even on a different continent. To the extent that these relationships lead to overseas travel, the potential rises for once-localized disease strains to be transmitted to new regions and eventually to affect a far larger population than was previously within reach.

Now that we have walked through several network models, we'll use Gephi to construct and analyze some contagion networks, where we can merge various network models with some contagion scenarios.
