Clustering

This certainly helps, but the colors are so granular that we have over 450 unique colors in total. Let's use a bit of clustering to get this down to a more manageable range. Since we have the RGB values for each color, we can treat them as points in a three-dimensional space and cluster them using the k-means algorithm. I won't go into the details of the algorithm here, but it is a fairly simple iterative algorithm: it assigns each point to its nearest cluster center, moves each center to the mean of the points assigned to it, and repeats until the assignments settle. The algorithm does require us to select k, the number of clusters we expect. Because each RGB channel can take 256 values (0 to 255), we'll use the square root of 256, which is 16. That should give us a manageable number while retaining the characteristics of our palette.
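To make the assign-and-update loop described above a little more concrete, here is a minimal pure-Python sketch of k-means on RGB tuples. It is only an illustration, not what scikit-learn does internally: the function name kmeans, the fixed iteration cap, the random seeding, and the sample colors are all assumptions made for this example.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Toy k-means over 3-D points (for example, RGB tuples); returns k centers."""
    rng = random.Random(seed)
    # Pick k of the input points at random as the initial centers.
    centers = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center
        # (squared Euclidean distance is enough for comparing).
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for center, group in zip(centers, groups):
            if group:
                new_centers.append(
                    tuple(sum(axis) / len(group) for axis in zip(*group))
                )
            else:
                new_centers.append(center)  # leave an empty cluster's center alone
        if new_centers == centers:
            break  # nothing moved, so the clustering has settled
        centers = new_centers
    return centers

# Hypothetical sample data: two reddish, two bluish, and two greenish colors.
colors = [(250, 10, 10), (240, 20, 5), (10, 10, 245),
          (20, 5, 250), (5, 240, 10), (15, 250, 20)]
centers = kmeans(colors, k=3)
print(centers)
```

Because the starting centers are random, a toy version like this can settle on a poor grouping. Library implementations typically guard against that with smarter initialization and multiple restarts, which is one reason to rely on scikit-learn for the real analysis.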

First, we'll split our RGB values into individual columns:

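def get_csplit(x):
    # Return the (r, g, b) components, or Nones when the value is missing or too short
    try:
        return x[0], x[1], x[2]
    except (TypeError, IndexError):
        return None, None, None

# Unzip the per-row tuples into three new columns
dfc['reds'], dfc['greens'], dfc['blues'] = zip(*dfc['main_rgb'].map(get_csplit))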
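If the zip(*...) idiom is unfamiliar, the following sketch shows the same unpacking on a plain list standing in for dfc['main_rgb']. The sample values are hypothetical, and the helper is a copy of get_csplit that catches only the errors a missing or short value would raise.

```python
def get_csplit(x):
    # Return the three channels, or Nones if x is not a usable triple.
    try:
        return x[0], x[1], x[2]
    except (TypeError, IndexError):
        return None, None, None

# Hypothetical sample values; None stands in for a row with no image.
main_rgb = [(212, 200, 180), None, (40, 42, 54)]

# map() yields one (r, g, b) triple per row; zip(*...) transposes
# those rows into three per-channel tuples.
reds, greens, blues = zip(*map(get_csplit, main_rgb))

print(reds)    # (212, None, 40)
print(greens)  # (200, None, 42)
print(blues)   # (180, None, 54)
```

The same transposition is what fills the reds, greens, and blues columns of dfc, with None marking the rows that had no usable color.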

Next, we'll use this to run our k-means model and retrieve the center values:

from sklearn.cluster import KMeans 
 
clf = KMeans(n_clusters=16) 
clf.fit(dfc[['reds', 'greens', 'blues']].dropna()) 
 
clusters = pd.DataFrame(clf.cluster_centers_, columns=['r', 'g', 'b']) 
 
clusters 

This generates the following output:

Now, we have the sixteen most popular dominant colors from the first image in each story. Let's check what they are using the pandas DataFrame.style property and the function we created previously to color our cells. We'll need to set our index equal to the hex value of the three columns to use our color_cells function, so we'll do that as well:

def hexify(x):
    # Round each cluster center to whole channel values, then scale to 0-1 for matplotlib
    rgb = [round(x['r']), round(x['g']), round(x['b'])]
    hxc = mpc.rgb2hex([(c / 255) for c in rgb])
    return hxc
 
clusters.index = clusters.apply(hexify, axis=1) 
 
clusters['color'] = ' ' 
 
clusters.style.apply(color_cells, subset=['color'], axis=0) 

This generates the following output:

So there you have it; those are the most common colors you will see (at least for the first image) in the most frequently shared content. The palette is a bit more drab than I had expected, as the first several all seem to be shades of beige and gray.
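As an aside on the hexify step above, the conversion from a cluster center to a hex string can also be sketched without matplotlib. The helper name rgb_to_hex is hypothetical, and the snippet assumes the channel values already fall in the 0 to 255 range; for such values it should produce the same kind of '#rrggbb' string that rgb2hex returns.

```python
def rgb_to_hex(r, g, b):
    """Format three 0-255 channel values as a '#rrggbb' string."""
    # Round fractional cluster centers to whole channel values,
    # then write each one as two lowercase hex digits.
    return '#{:02x}{:02x}{:02x}'.format(round(r), round(g), round(b))

print(rgb_to_hex(211.6, 200.2, 180.0))  # prints #d4c8b4
```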

Now, let's move on and examine the headlines of our stories.
