This score is complementary to the previous one. Its purpose is to provide information about the assignment of samples belonging to the same class. More precisely, a good clustering algorithm should assign all samples with the same true label to the same cluster. From our previous analysis, we know that, for example, the digit 7 has been wrongly assigned to both clusters 9 and 1; therefore, we expect a non-perfect completeness score. The definition is symmetric to the homogeneity score:
$$c = 1 - \frac{H(Y_{pred} \mid Y_{true})}{H(Y_{pred})}$$
The rationale is very intuitive. When H(Ypred|Ytrue) is low (c → 1), knowing the ground truth strongly reduces the uncertainty about the predictions. Therefore, if we know that all the samples of a subset A have the same label yi, we can be quite sure that all the corresponding predictions have been assigned to the same cluster. The completeness score for our example is:
from sklearn.metrics import completeness_score
print(completeness_score(digits['target'], Y))
0.747718831945
Again, the value confirms our hypothesis. The residual uncertainty reflects a lack of completeness: a few samples with the same label have been split into blocks that are assigned to different clusters. A perfect scenario is characterized by both homogeneity and completeness scores being equal to 1.
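To make the definition more concrete, the following minimal sketch computes the completeness score directly from the two entropies and compares it with scikit-learn's completeness_score. The helper functions (entropy, conditional_entropy, manual_completeness) and the toy label arrays are hypothetical and introduced only for illustration; they are not part of the digits example above.

```python
import numpy as np
from sklearn.metrics import completeness_score


def entropy(labels):
    """Shannon entropy (natural log) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))


def conditional_entropy(y_pred, y_true):
    """H(Ypred | Ytrue): entropy of the predictions inside each true class,
    weighted by the fraction of samples belonging to that class."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    total = 0.0
    for label in np.unique(y_true):
        mask = y_true == label
        total += mask.mean() * entropy(y_pred[mask])
    return total


def manual_completeness(y_true, y_pred):
    """c = 1 - H(Ypred | Ytrue) / H(Ypred), with c = 1 when H(Ypred) = 0."""
    h_pred = entropy(y_pred)
    if h_pred == 0.0:
        return 1.0
    return 1.0 - conditional_entropy(y_pred, y_true) / h_pred


# Hypothetical toy labels (three true classes, two samples each)
y_true = [0, 0, 1, 1, 2, 2]
y_merged = [0, 0, 0, 0, 1, 1]  # classes 0 and 1 share a cluster: still complete
y_split = [0, 1, 1, 1, 2, 2]   # class 0 is split across two clusters

print(manual_completeness(y_true, y_merged), completeness_score(y_true, y_merged))
print(manual_completeness(y_true, y_split), completeness_score(y_true, y_split))
```

In the merged case, every true class ends up entirely in a single cluster, so the assignment is complete even though it is not homogeneous. In the split case, the samples of class 0 are scattered across two clusters, and the score drops below 1.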