Read operation

Unlike the NTM, in MANN, we use two different weight vectors to perform the read and write operations. The read operation in MANN works in the same way as in the NTM. Since, in MANN, we perform the read operation using content-based similarity, we compare the key vector, $k_t$, emitted by the controller with each of the rows in the memory matrix, $M_t$, to learn the similarity. We use cosine similarity as our similarity measure, which can be expressed as follows:

$$K(k_t, M_t(i)) = \frac{k_t \cdot M_t(i)}{\lVert k_t \rVert \, \lVert M_t(i) \rVert}$$
So, our weight vector becomes as follows:

$$w_t^r(i) \propto K(k_t, M_t(i))$$
But, unlike the NTM, we don't use the key strength, $\beta_t$, here. The superscript $r$ in $w_t^r$ denotes that it is a read weight vector. Our final weight vector is the softmax over the weights, that is, the following:

$$w_t^r(i) = \frac{\exp\left(K(k_t, M_t(i))\right)}{\sum_j \exp\left(K(k_t, M_t(j))\right)}$$
Our read vector, $r_t$, is the linear combination of the read weight vector, $w_t^r$, and the memory matrix, $M_t$, as follows:

$$r_t = \sum_i w_t^r(i) \, M_t(i)$$
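To make these steps concrete, here is a small worked example (the memory and key values are illustrative, not from the text). Suppose the memory has two rows, $M_t(1) = (1, 0)$ and $M_t(2) = (0, 1)$, and the controller emits the key $k_t = (1, 0)$. The cosine similarities are $K(k_t, M_t(1)) = 1$ and $K(k_t, M_t(2)) = 0$, so the softmax gives the following:

$$w_t^r = \left( \frac{e^1}{e^1 + e^0}, \frac{e^0}{e^1 + e^0} \right) \approx (0.73, 0.27)$$

The read vector is then $r_t = 0.73 \cdot (1, 0) + 0.27 \cdot (0, 1) = (0.73, 0.27)$, a blend of the two memory rows weighted by their similarity to the key.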
Let's see how to build this in TensorFlow.

First, we calculate the read weight vector using content-based similarity:

    # TensorFlow 1.x
    import tensorflow as tf

    def read_head_addressing(k, prev_M):

        # reshape the key for batched matrix multiplication:
        # [batch_size, key_dim] -> [batch_size, key_dim, 1]
        k = tf.expand_dims(k, axis=2)

        # dot product of every memory row with the key: [batch_size, memory_size, 1]
        inner_product = tf.matmul(prev_M, k)

        # norms of the key and of each memory row
        k_norm = tf.sqrt(tf.reduce_sum(tf.square(k), axis=1, keep_dims=True))
        M_norm = tf.sqrt(tf.reduce_sum(tf.square(prev_M), axis=2, keep_dims=True))
        norm_product = M_norm * k_norm

        # cosine similarity, with a small epsilon to avoid division by zero
        K = tf.squeeze(inner_product / (norm_product + 1e-8))

        # softmax over the similarities gives the read weight vector
        K_exp = tf.exp(K)
        w = K_exp / tf.reduce_sum(K_exp, axis=1, keep_dims=True)

        return w
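As a quick sanity check, here is a minimal sketch of calling read_head_addressing (the batch size, memory size, and key dimension below are illustrative assumptions, not values from the text). Since the weights are a softmax, each row of the result should sum to 1:

    import numpy as np
    import tensorflow as tf

    batch_size, memory_size, key_dim = 4, 128, 40  # assumed shapes
    k = tf.placeholder(tf.float32, [batch_size, key_dim])
    prev_M = tf.placeholder(tf.float32, [batch_size, memory_size, key_dim])
    w_r = read_head_addressing(k, prev_M)

    with tf.Session() as sess:
        weights = sess.run(w_r, feed_dict={
            k: np.random.randn(batch_size, key_dim),
            prev_M: np.random.randn(batch_size, memory_size, key_dim),
        })
        print(weights.shape)        # (4, 128)
        print(weights.sum(axis=1))  # each row sums to ~1.0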

Then, we get the read weight vector:

    w_r = read_head_addressing(k, prev_M)

We perform the read operation, which is the linear combination of the read weight vector and the memory. With multiple read heads, the per-head read weight vectors are collected in w_r_list:

    read_vector_list = []
    with tf.variable_scope('reading'):
        for i in range(self.head_num):
            # r_t = sum over memory rows of w_t^r(i) * M_t(i), for each read head
            read_vector = tf.reduce_sum(tf.expand_dims(w_r_list[i], axis=2) * M, axis=1)
            read_vector_list.append(read_vector)
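The tf.reduce_sum line is just a batched form of the equation for $r_t$. Here is a small NumPy check that makes the equivalence explicit (the shapes are illustrative assumptions):

    import numpy as np

    batch_size, memory_size, memory_dim = 4, 128, 40  # assumed shapes
    w_r = np.random.rand(batch_size, memory_size)
    w_r /= w_r.sum(axis=1, keepdims=True)             # normalized read weights
    M = np.random.randn(batch_size, memory_size, memory_dim)

    # batched form: r_t = sum_i w_t^r(i) * M_t(i)
    r = np.sum(np.expand_dims(w_r, axis=2) * M, axis=1)

    # the same computation row by row, directly from the formula
    r_check = np.stack([w.dot(m) for w, m in zip(w_r, M)])
    print(r.shape)                  # (4, 40)
    print(np.allclose(r, r_check))  # True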