Primary filtering functions in Gephi

Gephi filters are categorized into several groups, shown as individual folders in the Filters tab. Within each of these folders are multiple filtering selections that can be used on their own as simple filters, or combined to create complex filters. Primary filter categories include:

  • Attributes: This folder houses many options for filtering on nodes, edges, partitions, clusters, and graph measures such as eccentricity and the various centrality scores. User-defined attributes (such as a new column) can also be found and acted on from this folder.
  • Edges: This filter is applied strictly to the connections within the network.
  • Operator: This folder holds the operators (such as INTERSECTION, UNION, and NOT) that combine or invert other filters to build complex ones.
  • Topology: This filter offers a range of options where you can use graph measures such as degree ranges to filter the network.

Here's a screenshot showing the primary and secondary filtering folders:

Filtering options in the Filters tab

Let's take a more detailed look at these folders so that we understand them well enough to work with them on our example network. The next few sections separate the filters into groups just as they are laid out in Gephi. Not every filtering option is covered here. Additional options exist for semantic and dynamic filtering. While we won't explore them here, you should note that the dynamic filters can be used when we have a time-based network in which node and edge values change over time. The semantic filters can be used when working with RDF data, which is typically queried with SPARQL. For more information on RDF, you can visit http://www.w3.org/RDF/.

The following functions will nonetheless provide you with a powerful toolkit for navigating your network graphs.

Attributes

The Attributes filters allow you to query your graph based on specific values within the network, including ID, Label, Weight, and Modularity Class. This set of tools is highly useful when you wish to focus on a specific element or group of values within the larger network. We'll provide basic instructions here for how you can expect these individual filters to operate, and then we'll apply the most essential ones to an actual dataset later in the chapter:

  • The Equal function operates much as you might anticipate. Several options are available to take advantage of; for nodes, we can look for specific values based on Id, Label, or Modularity Class (assuming that you have clustered or partitioned your data). If we are looking to learn more about edges, then we can likewise use the ID and Label values, while also being able to specify the edge weight value to filter by.
  • Inter Edges can be used to focus only on those edges that connect nodes in different partitions or clusters. This is particularly effective when we are less interested in within-group communication and more interested in the patterns between groups, and in determining which nodes are critical to these paths. As is the case with other edge functions, one of the most compelling uses for this filter is to help clarify the connection patterns in an otherwise dense network.
  • Intra Edges plays the opposite role to the Inter Edges functionality, highlighting only those connections that occur within a group. This will be useful when the focus is on connections inside a group, and could certainly be used to shed light on networks with high levels of homophily.
  • The Non-null condition simply helps to hide missing values from the network graph, allowing us to focus on populated variables only. Options here are the same as for the Equal function we just discussed, giving us the ability to remove both nodes and edges that have missing values.
  • The Partition filters are especially adept at creating custom views of individual partitions or clusters within the larger network, making it possible to quickly create subset versions of the entire network. Not only does this help in making the network more navigable for analysis, it also leads to visual results that can be far easier for viewers to comprehend.
  • The Partition Count filter works on partitions as well, but does so using the number of members within each partition, as opposed to the number that identifies each group. If our goal is to learn more about partitions with few members, the threshold can be set to remove larger groups from the graph, leaving only the smaller partitions visible. The opposite is true as well, if our focus is on heavily populated groups.
  • With the Range filter condition, we have the capability to extend some of what was made available in the Equal filter. For instance, we can now specify a range of edge weights to display (say from 2 through 5). This can also be used to display a range of partitions in the same fashion, which differentiates this tool from the other partition filters.
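To make the behavior of these attribute filters concrete, here is a minimal plain-Python sketch of what each one computes. It does not use Gephi's API; the toy network, the attribute values, and all function names are invented for illustration. It takes the usual reading of the prefixes: intra edges stay within a partition, while inter edges cross between partitions.

```python
from collections import Counter

# Hypothetical toy network: node -> partition (for example, a modularity class).
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 2}

# Edges as (source, target, weight) tuples; the values are invented.
edges = [
    ("a", "b", 1.0),
    ("b", "c", 3.0),
    ("c", "d", 4.0),
    ("d", "e", 2.0),
    ("e", "f", 6.0),
]


def equal_nodes(partition, value):
    """Equal: keep nodes whose attribute matches one exact value."""
    return {n for n, p in partition.items() if p == value}


def range_edges(edges, low, high):
    """Range: keep edges whose weight lies in [low, high]."""
    return [e for e in edges if low <= e[2] <= high]


def intra_edges(edges, partition):
    """Keep edges whose endpoints share a partition (within a group)."""
    return [e for e in edges if partition[e[0]] == partition[e[1]]]


def inter_edges(edges, partition):
    """Keep edges whose endpoints sit in different partitions."""
    return [e for e in edges if partition[e[0]] != partition[e[1]]]


def partition_count_nodes(partition, min_size, max_size):
    """Partition Count: keep nodes in partitions of an allowed size."""
    sizes = Counter(partition.values())
    return {n for n, p in partition.items() if min_size <= sizes[p] <= max_size}
```

On this toy data, a weight range of 2 through 5 keeps three of the five edges, and a partition count of 1 through 2 hides the three-member partition while leaving the two smaller ones visible.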

Edges

Beyond those already discussed in the Attributes section, a pair of edge filters gives us the ability to further highlight desired patterns in the network:

  • If our goal is to examine or highlight a range of connection strength, then the Edge Weight filter is a highly useful tool. With this filter, all edge weights within specified minimum and maximum values can be highlighted, making it quite simple to draw attention to critical network paths.
  • The Self-Loop filters can be applied in cases where a node connects with itself. This filter requires a subfilter (equal, partition, not equal, and so on) to activate the filter. We can then use these conditions to focus our attention on those nodes either with or without self-loops.
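The two edge filters can be sketched in a few lines of plain Python. This is only an illustration of the logic, not Gephi's implementation; the edge list and the function names are hypothetical.

```python
# Hypothetical (source, target, weight) edges; two of them are self-loops.
edges = [
    ("a", "a", 1.0),
    ("a", "b", 0.5),
    ("b", "c", 2.5),
    ("c", "c", 4.0),
    ("c", "d", 7.0),
]


def edge_weight_filter(edges, minimum, maximum):
    """Keep edges whose weight falls between the two slider bounds."""
    return [e for e in edges if minimum <= e[2] <= maximum]


def split_self_loops(edges):
    """Separate edges that start and end on the same node from the rest."""
    loops = [e for e in edges if e[0] == e[1]]
    ordinary = [e for e in edges if e[0] != e[1]]
    return loops, ordinary


def nodes_with_self_loops(edges):
    """Return the nodes that connect to themselves."""
    return {source for source, target, _ in edges if source == target}
```

Subtracting the result of nodes_with_self_loops from the full node set gives the complementary view, that is, the nodes without self-loops.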

Operator

Several operator functions exist that can help us to build more complex filters. In our section titled Working with complex filters, described later in this chapter, we'll explore the practical use of these functions through a series of examples. In this section, our focus will be on providing a theoretical construct to help you understand when and how you might put each of these operators to use.

  • The INTERSECTION operator enables the construction of highly complex filters using multiple conditions to narrow a network dataset. This can be thought of as resembling a database query where multiple conditions must be satisfied to return a set of records.
  • MASK (Edges) can be used to customize the edges that are shown within a network graph. The filter provides four possible criteria via a set of radio buttons. These selections include any, both, source, and target.
  • NOT (Edges) is used to remove certain edges from a view, either for practical or perhaps cosmetic reasons. As with the other operator functions, you must choose another filter to supply the criteria. For instance, we could elect to hide all of the edges that go across groups (inter edges), or conversely, all those within a group (intra edges).
  • NOT (Nodes) can be employed to remove specific nodes from the network graph, and can be applied using other attributes that group the data, such as a class or other categorization. When used in this fashion, all nodes that belong to a specified group will be hidden from the view.
  • The UNION operator is used to combine multiple conditions within a single data attribute. For example, in the case where we have categories from 1 through 25, we could use a union query to display both categories 1 through 5 and 20 through 25 while hiding the remainder. As with the other operator functions, we need to use separate filters such as Equal or Partition to build the union query.
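One way to reason about these operators is as set operations on the results of other filters. The sketch below is plain Python rather than Gephi's API; the node names, category numbers, and degree values are invented, and the mask_edges function reflects only one plausible reading of the any, both, source, and target options, so treat it as an assumption.

```python
# Hypothetical node attributes.
category = {"n1": 1, "n2": 4, "n3": 12, "n4": 21, "n5": 25}
degree = {"n1": 5, "n2": 1, "n3": 7, "n4": 2, "n5": 6}
edges = [("n1", "n3"), ("n3", "n4"), ("n2", "n5"), ("n3", "n3")]


def range_filter(values, low, high):
    """A simple sub-filter: keep nodes whose value lies in [low, high]."""
    return {n for n, v in values.items() if low <= v <= high}


all_nodes = set(category)

# UNION: categories 1 through 5 together with categories 20 through 25.
union = range_filter(category, 1, 5) | range_filter(category, 20, 25)

# INTERSECTION: nodes must also satisfy a second, unrelated condition.
intersection = union & range_filter(degree, 5, 10)

# NOT (Nodes): everything the sub-filter did not select.
not_nodes = all_nodes - union


def mask_edges(edges, kept_nodes, mode):
    """MASK (Edges), assumed semantics: test endpoints against kept_nodes."""
    tests = {
        "any": lambda s, t: s in kept_nodes or t in kept_nodes,
        "both": lambda s, t: s in kept_nodes and t in kept_nodes,
        "source": lambda s, t: s in kept_nodes,
        "target": lambda s, t: t in kept_nodes,
    }
    keep = tests[mode]
    return [(s, t) for s, t in edges if keep(s, t)]
```

Seen this way, INTERSECTION behaves like an AND between conditions, UNION like an OR, and NOT like a complement, which mirrors the database-query analogy used above.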

Topology

Some of the most interesting filtering options are found in the Topology folder of the Filters tab. This is where you should go when you wish to learn more about behaviors within the network, as opposed to highlighting specific elements based on their group or position. In this section, you'll learn more about how to navigate the network using a variety of filters that examine network structure and the role played by specific entities.

  • One of the starting points to understand influence within a network is to focus on the importance of specific individual nodes. In Gephi, this can be done using the Degree Range filter, which enables filtering based on the number of connections each node possesses. In an undirected network, we are indifferent to the direction of the connection; in fact, it plays no role whatsoever. If the network is directed, then we may wish to turn to the In Degree Range and Out Degree Range options to better understand the patterns of influence within the network.
  • Using the Ego Network function allows us to easily understand which other entities a single network node is connected to, at the first, second, third, and max degree levels. This allows us to see how the network is accessed by a specific individual working through the network and illustrates the possible paths required to access the second or third degree connections.
  • The Giant Component filter enables users to hide portions of the network that are not part of the giant component, that is, the largest connected part of the network. If the whole network already forms a single connected component, this filter will have no effect, but in other cases it will help to drive visual focus to the largest component in the network.
  • We previously noted the In Degree Range filter, and how it can be useful to determine the levels of influence within a network. In a directed graph, this filter helps us to set thresholds that expose the nodes with the highest numbers of inbound edges from other nodes. This is a critical element for understanding which nodes serve as hubs and are relied on by other members for information or indirect connections.
  • The Neighbors Network filter can be used in a similar fashion to the Ego Network filter, but with the ability to move beyond a single node. Thus, we can examine the neighbor network for a specific group within a network, rather than a single member, and extend it to include first, second, and third degree connections.
  • The Out Degree Range filter lets us examine the degree to which nodes connect to other nodes; a sort of reverse hub effect if you will. While nodes with a high out-degree (relative to their in-degree) are typically not the influencers in a network, they might serve valuable functions as transmitters of information, acting as a conduit to many external information sources. The ability to isolate these nodes can help us to understand the network structure and how information flows between members.
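The topology filters described above can be approximated with a short plain-Python sketch. It is not Gephi's implementation: the directed edge list and the function names are hypothetical, and the ego network and giant component here ignore edge direction, which is an assumption made to keep the example simple.

```python
from collections import defaultdict, deque

# Hypothetical directed edges as (source, target) pairs.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("e", "f")]
nodes = {n for edge in edges for n in edge}


def degrees(edges):
    """Count inbound and outbound edges per node."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg


def degree_range(nodes, degree, low, high):
    """Keep nodes whose degree count lies in [low, high]."""
    return {n for n in nodes if low <= degree[n] <= high}


def undirected_adjacency(edges):
    """Build a neighbor lookup that ignores edge direction."""
    adj = defaultdict(set)
    for source, target in edges:
        adj[source].add(target)
        adj[target].add(source)
    return adj


def ego_network(adj, ego, depth):
    """Breadth-first search: nodes within `depth` steps of `ego`."""
    seen = {ego}
    frontier = deque([(ego, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


def giant_component(adj):
    """Return the largest connected component."""
    remaining, best = set(adj), set()
    while remaining:
        start = next(iter(remaining))
        component = ego_network(adj, start, depth=len(adj))
        remaining -= component
        if len(component) > len(best):
            best = component
    return best
```

In this toy graph, node c has the most inbound edges and node a the most outbound ones, so an In Degree Range threshold of 2 isolates c while an Out Degree Range threshold of 2 isolates a. A neighbors-style view for a group could be sketched by taking the union of ego_network results over the group's members.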