Entropy maximization/reduction

In this section, we'll see how we can prevent task bias by maximizing and minimizing entropy. Entropy is a measure of randomness, so we maximize the entropy by allowing the model to make a random guess over the predicted labels with equal probability. When the initial model guesses randomly over the labels, it doesn't prefer any one task over the others, and so we prevent task bias.

How do we compute the entropy? Let's denote the entropy of a task $T_i$ under the model $f_\theta$ by $H_{T_i}(f_\theta)$. The entropy for $T_i$ is computed by sampling $x_i$ from $P_{T_i}(x)$ and averaging over the model's output probabilities across the $N$ predicted labels:

$$H_{T_i}(f_\theta) = -\mathbb{E}_{x_i \sim P_{T_i}(x)} \left[ \sum_{n=1}^{N} \hat{y}_{i,n} \log \hat{y}_{i,n} \right]$$

In the previous equation, $\hat{y}_{i,n} = f_{\theta,n}(x_i)$ is the probability the model assigns to the $n$th label.
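To make this concrete, here is a minimal NumPy sketch of the entropy computation; the function name prediction_entropy and the probs array are illustrative assumptions for this sketch, not part of the book's code:

```python
import numpy as np

def prediction_entropy(probs, eps=1e-12):
    """Average entropy of the model's predicted label distributions.

    probs: array of shape (num_samples, N) holding softmax outputs,
    one row of probabilities over the N labels per sampled input x_i.
    """
    # Per-sample entropy: -sum_n y_hat_{i,n} * log(y_hat_{i,n});
    # eps guards against log(0) for fully confident predictions
    per_sample = -np.sum(probs * np.log(probs + eps), axis=1)
    # Monte Carlo estimate of the expectation over x_i ~ P_{T_i}(x)
    return per_sample.mean()

# A uniform (random) guess over 5 labels gives the maximum entropy, log(5),
# while a confident prediction gives an entropy close to zero
print(prediction_entropy(np.full((1, 5), 0.2)))                        # ~1.609
print(prediction_entropy(np.array([[0.96, 0.01, 0.01, 0.01, 0.01]])))  # ~0.22
```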

So, we maximize the entropy before updating the model parameters and minimize it after updating them. What do we mean by minimizing the entropy? It means that we remove the randomness over the predicted labels and allow the model to predict the label with high confidence.

So, our goal is to maximize the entropy reduction for each of the tasks, that is, the drop in entropy from the initial parameter $\theta$ to the task-updated parameter $\theta_i'$. It can be represented as follows:

$$H_{T_i}(f_\theta) - H_{T_i}(f_{\theta_i'})$$

Maximizing this reduction is the same as minimizing $-H_{T_i}(f_\theta) + H_{T_i}(f_{\theta_i'})$, which is the form we'll use in the objective. A rough sketch of the reduction term follows.
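As a sketch, the reduction term can be computed from two sets of predictions, reusing the prediction_entropy helper defined previously; probs_before and probs_after are assumed to come from the model with the initial parameter $\theta$ and the task-updated parameter $\theta_i'$, respectively:

```python
def entropy_reduction(probs_before, probs_after):
    # H_{T_i}(f_theta) - H_{T_i}(f_theta_i'): large when the initial model
    # guesses randomly (high entropy) and the updated model predicts
    # confidently (low entropy)
    return prediction_entropy(probs_before) - prediction_entropy(probs_after)
```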
We incorporate this entropy term into the meta objective and try to find the optimal parameter $\theta$, so our meta objective becomes the following:

$$\min_\theta \sum_{T_i \sim p(T)} \mathcal{L}_{T_i}(f_{\theta_i'}) + \lambda \left[ -H_{T_i}(f_\theta) + H_{T_i}(f_{\theta_i'}) \right]$$
Here, $\lambda$ is the balancing coefficient between the two terms.
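Putting it together, here is an illustrative sketch of the combined meta objective, again reusing prediction_entropy from earlier; the per-task inputs (the adapted loss $\mathcal{L}_{T_i}(f_{\theta_i'})$ and the two prediction arrays) and the name taml_meta_objective are assumptions for illustration, not the book's implementation:

```python
def taml_meta_objective(tasks, lam=0.1):
    """Sum of adapted-task losses plus the weighted entropy terms.

    tasks: iterable of (task_loss, probs_before, probs_after) tuples,
    one per sampled task T_i; lam is the balancing coefficient lambda.
    """
    total = 0.0
    for task_loss, probs_before, probs_after in tasks:
        # Minimizing -H(f_theta) + H(f_theta_i') maximizes the
        # entropy reduction for this task
        total += task_loss + lam * (
            -prediction_entropy(probs_before)
            + prediction_entropy(probs_after)
        )
    return total
```

In a full implementation, this scalar would be minimized with respect to $\theta$ by gradient descent over batches of sampled tasks, just as with the usual meta objective.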
