Weight calculation

We've seen that, by associating weights with the gradients, we can understand which tasks have strong gradient agreement and which have strong gradient disagreement.

We know that these weights are proportional to the inner product of the gradient of a task and the average of the gradients of all the tasks in the sampled batch. But how exactly can we calculate these weights?

The weights are calculated as follows:

Let's say we sampled a batch of tasks. Then, for each task T_i in the batch, we sample k data points, calculate the loss, minimize it using gradient descent, and find the optimal parameter theta'_i for each of the tasks. Along with this, we also store the gradient update vector of each task in g_i. It can be calculated as g_i = theta - theta'_i.
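To make this step concrete, here is a minimal sketch of the inner loop, assuming each task is a simple linear-regression problem with randomly generated data and the adaptation is a single gradient-descent step; the values of num_tasks, k, dim, and alpha are illustrative assumptions rather than part of the algorithm:

import numpy as np

#toy setup (assumed values for illustration)
num_tasks = 4     #number of tasks in the sampled batch
k = 10            #number of data points sampled per task
dim = 5           #dimensionality of the model parameter
alpha = 0.1       #learning rate of the inner gradient-descent step

theta = np.random.randn(dim)    #current model parameter
theta_ = []                     #optimal parameter theta'_i of each task

for i in range(num_tasks):
    #sample k data points for the task (random toy data here)
    x = np.random.randn(k, dim)
    y = np.random.randn(k)

    #gradient of the mean squared error loss with respect to theta
    grad = 2 * x.T.dot(x.dot(theta) - y) / k

    #one gradient-descent step gives the task-specific parameter theta'_i
    theta_.append(theta - alpha * grad)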

So, the weight w_i for a task T_i is the sum of the inner products of g_i and g_j over all tasks in the batch, divided by a normalization factor. The normalization factor is the sum of the absolute values of the inner products over every pair of gradient update vectors:

w_i = (Σ_j g_i^T g_j) / (Σ_l Σ_m |g_l^T g_m|)
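To make the formula concrete, consider a toy batch of two tasks whose gradient update vectors are g_1 = [1, 0] and g_2 = [1, 1]. The pairwise inner products are 1, 1, 1, and 2, so the normalization factor is 1 + 1 + 1 + 2 = 5, and the weights come out to w_1 = (1 + 1)/5 = 0.4 and w_2 = (1 + 2)/5 = 0.6. The task whose update vector agrees more strongly with the rest of the batch receives the larger weight.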

Let's better understand how these weights are calculated exactly by looking at the following code:

import numpy as np

#calculate the gradient update vector g_i = theta - theta'_i for each task
g = []
for i in range(num_tasks):
    g.append(theta - theta_[i])

#calculate the normalization factor: the sum of the absolute values of the
#inner products between every pair of gradient update vectors
normalization_factor = 0

for i in range(num_tasks):
    for j in range(num_tasks):
        normalization_factor += np.abs(np.dot(g[i].T, g[j]))

#calculate the weights: the weight of task i is the sum of the inner products
#of g_i with every g_j, divided by the normalization factor
w = np.zeros(num_tasks)

for i in range(num_tasks):
    for j in range(num_tasks):
        w[i] += np.dot(g[i].T, g[j])

    w[i] = w[i] / normalization_factor
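As a side note, the same computation can be written more compactly by stacking the gradient update vectors into a matrix and forming their Gram matrix; the following is just an equivalent vectorized sketch of the loops above:

G = np.stack(g)          #shape: (num_tasks, dim)
gram = G.dot(G.T)        #gram[i][j] is the inner product of g_i and g_j
w_vectorized = gram.sum(axis=1) / np.abs(gram).sum()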