The output of this model will be a class prediction from 0 to 9, so we will use a 10-node softmax layer, just as we did with MNIST. Because this problem also has 10 classes, nothing changes in our output layer. We will use the following code to define the output:
output = Dense(10, activation="softmax", name="softmax")(d2)
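To make the role of the 10-node softmax concrete, here is a minimal sketch in plain Python of what the activation computes: it turns 10 raw scores into probabilities that sum to 1, and the predicted class is the index with the highest probability. The `softmax` helper and the `example_logits` values are illustrative assumptions, not part of the model code above; Keras performs the equivalent computation internally.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum improves numerical stability
    # and does not change the resulting probabilities.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the 10 classes (illustrative values only).
example_logits = [0.1, 2.0, -1.0, 0.5, 0.0, 3.0, -0.5, 1.0, 0.2, -2.0]

probs = softmax(example_logits)

# The class prediction (0-9) is the index of the largest probability.
predicted_class = max(range(len(probs)), key=lambda i: probs[i])
```

Each of the 10 output nodes corresponds to one class, which is why the layer width matches the number of classes.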