Case study – fraud analytics

Let's look at how we can use SNA to detect fraud. Humans are social animals, and a person's behavior is said to be affected by the people who surround them. The word homophily has been coined to represent the effect that a person's social network has on them. Extending this concept, a homophilic network is a group of people who are likely to be associated with each other due to some common factor; for example, having the same origin or hobbies, being part of the same gang or the same university, or some combination of other factors.

If we want to analyze fraud in a homophilic network, we can take advantage of the relationships between the person under investigation and other people in the network, whose risk of involvement in fraud has already been carefully calculated. Flagging a person due to their company is sometimes also called guilt by association.
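To make the idea of guilt by association concrete, here is a minimal sketch that scores a person by the share of their direct contacts who are already flagged. The network, the names, and the function flagged_share are hypothetical and chosen only for illustration; this is one simple way to express the idea, not necessarily the scoring used later in this case study.

```python
from collections import defaultdict

# Hypothetical toy network: the names, links, and flags are made up.
links = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("ben", "cy")]
known_fraud = {"ben", "dee"}

# Build an undirected adjacency map from the list of links.
neighbors = defaultdict(set)
for a, b in links:
    neighbors[a].add(b)
    neighbors[b].add(a)


def flagged_share(person):
    """Return the fraction of a person's direct contacts flagged as fraud."""
    contacts = neighbors[person]
    if not contacts:
        return 0.0  # no contacts, so no evidence either way
    return len(contacts & known_fraud) / len(contacts)


print(flagged_share("ana"))
print(flagged_share("cy"))
```

In this toy network, two of the three contacts of "ana" are flagged, so her share is about 0.67, while "cy" has one flagged contact out of two.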

In an effort to understand the process, let's first look at a simple case. For that, let's use a network with nine vertices and ten edges. In this network, four of the vertices are known fraud cases and are classified as fraud (F). The remaining five people have no fraud-related history and are classified as non-fraud (NF).

We will write code in the following steps to generate this graph:

  1. Let's import the packages that we need:
import networkx as nx
import matplotlib.pyplot as plt
  2. Define the data structures of vertices and edges:
vertices = range(1, 10)
edges = [(7, 2), (2, 3), (7, 4), (4, 5), (7, 3), (7, 5), (1, 6), (1, 7), (2, 8), (2, 9)]
  3. Let's first instantiate the graph:
G = nx.Graph()
  4. Now, let's add the vertices and edges to the graph and compute a layout:
G.add_nodes_from(vertices)
G.add_edges_from(edges)
pos = nx.spring_layout(G)
  5. Let's draw the NF nodes in green:
# draw_networkx_nodes has no with_labels argument; labels are drawn in step 7
nx.draw_networkx_nodes(G, pos,
                       nodelist=[1, 4, 3, 8, 9],
                       node_color='g',
                       node_size=1300)
  6. Now, let's draw the nodes that are known to be involved in fraud in red:
nx.draw_networkx_nodes(G, pos,
                       nodelist=[2, 5, 6, 7],
                       node_color='r',
                       node_size=1300)
  7. Finally, let's draw the edges and create labels for the nodes:
nx.draw_networkx_edges(G, pos, edges, width=3, alpha=0.5, edge_color='b')
labels = {}
labels[1] = r'1 NF'
labels[2] = r'2 F'
labels[3] = r'3 NF'
labels[4] = r'4 NF'
labels[5] = r'5 F'
labels[6] = r'6 F'
labels[7] = r'7 F'
labels[8] = r'8 NF'
labels[9] = r'9 NF'
nx.draw_networkx_labels(G, pos, labels, font_size=16)
plt.show()  # display the figure when running outside a notebook
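As a quick sanity check on the steps above, the following sketch rebuilds the same graph without drawing it, confirms its size, and stores the F/NF classification on the nodes themselves. The attribute name status and the variable names are illustrative assumptions rather than part of the original code; keeping the label on each node is just one convenient way to avoid maintaining separate node lists and a hand-written label dictionary.

```python
import networkx as nx

# Same vertices, edges, and classification as in the steps above.
vertices = range(1, 10)
edges = [(7, 2), (2, 3), (7, 4), (4, 5), (7, 3), (7, 5),
         (1, 6), (1, 7), (2, 8), (2, 9)]
fraud_nodes = {2, 5, 6, 7}

G = nx.Graph()
G.add_nodes_from(vertices)
G.add_edges_from(edges)

# Store each classification on the node ("status" is an illustrative name).
for node in G.nodes:
    G.nodes[node]["status"] = "F" if node in fraud_nodes else "NF"

num_nodes = G.number_of_nodes()
num_edges = G.number_of_edges()

# Derive the NF node list and the label dictionary from the attribute.
non_fraud_nodes = sorted(n for n, s in G.nodes(data="status") if s == "NF")
labels = {n: f"{n} {s}" for n, s in G.nodes(data="status")}

print(num_nodes, num_edges)
print(non_fraud_nodes)
print(labels)
```

The derived labels dictionary has the same shape as the one built by hand in the last step, so it could be passed to nx.draw_networkx_labels in its place.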

Once the drawing code in the preceding steps runs, it will show us a graph like this:

Note that we have already conducted a detailed analysis to classify each node as fraud or non-fraud. Let's assume that we add another vertex, named q, to the network, as shown in the following figure. We have no prior information about this person or whether they are involved in fraud. We want to classify this person as NF or F based on their links to the existing members of the social network:

We have devised two ways to classify this new person, represented by node q, as F or NF:

  • Using a simple method that uses neither centrality metrics nor additional information about the type of fraud
  • Using a watchtower methodology, which is an advanced technique that uses the centrality metrics of the existing nodes, as well as additional information about the type of fraud
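The second approach relies on centrality metrics of the existing nodes. As a hedged illustration of what such metrics look like for this network (and not an implementation of the watchtower methodology itself), the sketch below computes degree and betweenness centrality with NetworkX. The choice of these two metrics is an assumption made for the example.

```python
import networkx as nx

# Edge list of the case-study network; every vertex appears in some edge.
edges = [(7, 2), (2, 3), (7, 4), (4, 5), (7, 3), (7, 5),
         (1, 6), (1, 7), (2, 8), (2, 9)]
G = nx.Graph(edges)

# Degree centrality: share of the other nodes a node is directly linked to.
degree = nx.degree_centrality(G)
# Betweenness centrality: how often a node lies on shortest paths
# between pairs of other nodes.
betweenness = nx.betweenness_centrality(G)

for node in sorted(G.nodes):
    print(node, round(degree[node], 3), round(betweenness[node], 3))

most_connected = max(degree, key=degree.get)
main_broker = max(betweenness, key=betweenness.get)
print(most_connected, main_broker)
```

Node 7 is linked to five of the eight other vertices, giving it a degree centrality of 0.625, and it also ranks highest on betweenness, so it is the most central member of this network under both metrics.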

We will discuss each method in detail.
