We will define all the operations in a new Python file named models.py. First, let's create some operations to compute loss and accuracy:
def compute_loss(logits, labels):
    labels = tf.squeeze(tf.cast(labels, tf.int32))

    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    tf.add_to_collection('losses', cross_entropy_mean)
    return tf.add_n(tf.get_collection('losses'), name='total_loss')


def compute_accuracy(logits, labels):
    labels = tf.squeeze(tf.cast(labels, tf.int32))
    batch_predictions = tf.cast(tf.argmax(logits, 1), tf.int32)
    predicted_correctly = tf.equal(batch_predictions, labels)
    accuracy = tf.reduce_mean(tf.cast(predicted_correctly, tf.float32))
    return accuracy
In these methods, logits is the output of the model and labels is the ground-truth data from the dataset. In the compute_loss method, we use tf.nn.sparse_softmax_cross_entropy_with_logits, so we don't need to normalize the logits with a softmax ourselves, and we don't need to convert the labels into one-hot vectors either. In the compute_accuracy method, we take the index of the max value in logits with tf.argmax and compare it with the labels to compute the fraction of correct predictions.
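The TensorFlow code above only builds graph operations. To see what those operations compute numerically, here is a minimal NumPy sketch of sparse softmax cross-entropy and accuracy, using dummy logits and integer labels (the values are illustrative, not from the dataset):

```python
import numpy as np

def softmax(x):
    # Shift by the row max for numerical stability
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Dummy batch: 2 samples, 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])  # integer class indices, not one-hot

# Sparse softmax cross-entropy: -log(probability of the true class)
probs = softmax(logits)
cross_entropy = -np.log(probs[np.arange(len(labels)), labels])
loss = cross_entropy.mean()

# Accuracy: fraction of samples where argmax(logits) matches the label
predictions = logits.argmax(axis=1)
accuracy = (predictions == labels).mean()

print(loss, accuracy)  # both predictions are correct here, so accuracy is 1.0
```

Note that the labels stay as plain class indices throughout, which is exactly why the sparse variant of the loss saves us the one-hot conversion.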
Next, we are going to define the operations for the learning_rate and the optimizer:
def get_learning_rate(global_step, initial_value, decay_steps, decay_rate):
    learning_rate = tf.train.exponential_decay(
        initial_value, global_step, decay_steps, decay_rate, staircase=True)
    return learning_rate


def train(total_loss, learning_rate, global_step, train_vars):
    optimizer = tf.train.AdamOptimizer(learning_rate)
    train_variables = train_vars.split(",")

    grads = optimizer.compute_gradients(
        total_loss,
        [v for v in tf.trainable_variables() if v.name in train_variables]
    )
    train_op = optimizer.apply_gradients(grads, global_step=global_step)
    return train_op
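With staircase=True, tf.train.exponential_decay lowers the learning rate in discrete jumps rather than continuously: the rate is multiplied by decay_rate once every decay_steps steps. The following plain-Python sketch reproduces that formula with assumed hyperparameter values (0.001, 1000 and 0.5 are illustrative, not from the book):

```python
def staircase_exponential_decay(initial_value, global_step, decay_steps, decay_rate):
    # staircase=True behavior: lr = initial_value * decay_rate ** floor(global_step / decay_steps)
    return initial_value * decay_rate ** (global_step // decay_steps)

# Assumed hyperparameters for illustration only
initial_value, decay_steps, decay_rate = 0.001, 1000, 0.5

for step in (0, 999, 1000, 2500):
    # The rate stays flat within each 1000-step window, then halves
    print(step, staircase_exponential_decay(initial_value, step, decay_steps, decay_rate))
```

Without staircase=True, the exponent would be the real-valued ratio global_step / decay_steps, giving a smooth decay instead of steps.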
In the train method, we configure the optimizer to compute and apply gradients only for the variables named in the train_vars string. This lets us update just the weights and biases of the last layer, fc8, while freezing all the other layers. train_vars is a comma-separated string of variable names, for example, models/fc8-pets/weights:0,models/fc8-pets/biases:0.
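The freezing mechanism is just a name filter over the trainable variables. This small sketch mimics it without TensorFlow, using a stand-in Var class and hypothetical variable names in place of the tensors tf.trainable_variables() would return:

```python
class Var:
    # Stand-in for a TensorFlow variable, which exposes a .name attribute
    def __init__(self, name):
        self.name = name

# Hypothetical trainable variables of the model
trainable_variables = [
    Var('models/conv1/weights:0'),
    Var('models/fc8-pets/weights:0'),
    Var('models/fc8-pets/biases:0'),
]

train_vars = 'models/fc8-pets/weights:0,models/fc8-pets/biases:0'
train_variables = train_vars.split(',')

# Same list comprehension as in train(): keep only the named variables,
# so gradients are computed (and applied) for fc8 alone
selected = [v for v in trainable_variables if v.name in train_variables]
print([v.name for v in selected])
```

Variables left out of the list receive no gradient updates at all, which is what freezing a layer means here.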