Segmentation overview

What is segmentation? It is the process of grouping data into segments, or clusters. Suppose we have a million customer records and want to know what those customers have in common. We start by bunching customers together based on similarities in their profiles, which in effect forms close-knit clusters of customers. If all the customers were similar, the process would be easy: we would end up with one cluster, and we could describe it easily. But this seldom happens. There are almost always some demographic or transactional properties by which we want to segregate customers. This is important, as we want to customize our offerings to various clusters of customers. Remember the general rule of marketing: no two customers are the same in their intrinsic needs and wants.

One thing to note is that clustering isn't the only way to conduct segmentation; it is merely one of the most popular analytical choices. Let's look at the following example to understand what clustering is about.

Here is a clustering illustration:

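/* Six pupils: female (1 = female, 0 = male), tall (height), grade (test score) */
data class;
   input id female tall grade;
datalines;
1 1 1 3
2 0 3 1
3 0 3 1
4 0 1 1
5 1 2 4
6 1 2 4
;
run;

/* Hierarchical clustering with the centroid method.            */
/* OUTTREE= names the data set that stores the cluster tree.    */
proc cluster data=class method=centroid outtree=tree;
   id id;
   var female tall grade;
run;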

The class data set above has six records and three variables, apart from the id variable. The female variable has the value 1 for females and 0 for males. The variables tall and grade hold each pupil's height and test score, with higher values representing a greater height or a higher grade. Our hypothesis is that, instead of describing the pupils in this class data in six different ways, we can perform segmentation and cluster them into three segments.

The cluster chart in Figure 7.9 shows how the individuals have been grouped into clusters. Individuals 1, 5, and 6 are in cluster A, individuals 2 and 3 are in cluster B, and individual 4 forms a separate cluster, C. Leaving statistics aside for a moment, is there any logic in this clustering? Cluster A consists of females only, with the highest grades in the class, and two-thirds of its members are of medium height. Cluster B consists of males who are the tallest in the class and have the lowest grades. Individual 4, or cluster C, is quite different from cluster B, even though both are male: the individual in cluster C is short and has the lowest grade in the class. If we wanted a two-segment solution, then perhaps clusters B and C could be joined together.

However, in a three-segment solution, individual 4 is perhaps best suited for cluster C:

Figure 7.9: Clustering illustration
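
The three-segment and two-segment solutions discussed above can also be read off as cluster assignments rather than from the chart. The following is a minimal sketch, assuming the class data set from the earlier listing already exists. It uses PROC TREE with the NCLUSTERS= option to cut the tree at a chosen number of clusters. The output data set names segments3 and segments2 are illustrative and are not part of the original example. The cluster numbers that PROC TREE assigns need not correspond to the labels A, B, and C used in the text.

```sas
/* Rebuild the cluster tree so that this sketch does not depend on an */
/* earlier step. Assumes the class data set has already been created. */
proc cluster data=class method=centroid outtree=tree noprint;
   id id;
   var female tall grade;
run;

/* Three-segment solution: one row per pupil with its cluster number. */
/* _NAME_ holds the formatted id value of each pupil.                 */
proc tree data=tree nclusters=3 out=segments3 noprint;
   copy female tall grade;
run;

proc sort data=segments3;
   by cluster;
run;

proc print data=segments3 noobs;
   var _name_ cluster female tall grade;
run;

/* Two-segment solution, for comparison with the three-segment one.   */
proc tree data=tree nclusters=2 out=segments2 noprint;
   copy female tall grade;
run;

proc sort data=segments2;
   by cluster;
run;

proc print data=segments2 noobs;
   var _name_ cluster female tall grade;
run;
```

Comparing the two printed listings shows which pupils change segment when the number of clusters is reduced from three to two.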
