As before, we can borrow metrics from scikit-learn to measure our model. To do so, however, we will need to make a couple of easy conversions, as scikit-learn expects class labels, not the matrix of class probabilities that the model's softmax output gives us (or the binary class indicators our labels are encoded as).
To make the leap, we will start by making our prediction, using the following code:
y_softmax = model.predict(data["test_X"])
Then, we will choose the index of the class with the largest probability, which will conveniently be the predicted class, using the following code:
y_hat = y_softmax.argmax(axis=-1)
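To make the argmax step concrete, here is a minimal sketch with a hypothetical softmax output for three test samples (the probabilities are made up for illustration; each row sums to 1, and the largest entry marks the predicted class):

```python
import numpy as np

# Hypothetical softmax output: 3 samples x 10 classes
y_softmax = np.array([
    [0.05, 0.70, 0.05, 0.02, 0.03, 0.03, 0.04, 0.03, 0.03, 0.02],
    [0.01, 0.01, 0.01, 0.90, 0.01, 0.01, 0.02, 0.01, 0.01, 0.01],
    [0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.82, 0.02],
])

# argmax over the last axis picks the index of the largest probability per row
y_hat = y_softmax.argmax(axis=-1)
print(y_hat)  # [1 3 8]
```

The `axis=-1` argument matters: it takes the argmax across the class dimension for each sample, rather than across samples.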
Then, we can use scikit-learn's classification report, as before. The code to do so is as follows:
from sklearn.metrics import classification_report
# The test labels are one-hot encoded, so convert them back to class labels too
test_y = data["test_y"].argmax(axis=-1)
print(classification_report(test_y, y_hat))
We can actually look at the precision, recall, and f1-score for all 10 classes now. The following figure illustrates the output from sklearn.metrics.classification_report():
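The full pipeline can be sketched end to end with synthetic stand-ins for the real test labels and predictions (the arrays below are fabricated purely to produce a report; with the real model, `test_y` and `y_hat` come from the conversions above):

```python
import numpy as np
from sklearn.metrics import classification_report

rng = np.random.default_rng(42)

# Fabricated stand-ins: 200 samples over 10 classes, with predictions
# that agree with the labels most of the time
test_y = rng.integers(0, 10, size=200)
y_hat = test_y.copy()
flip = rng.random(200) < 0.2                     # corrupt ~20% of predictions
y_hat[flip] = rng.integers(0, 10, size=flip.sum())

# One row of precision/recall/f1-score per class, plus averaged summaries
print(classification_report(test_y, y_hat))
```

Because both arguments are 1D arrays of integer class labels, scikit-learn treats this as a multiclass problem and reports per-class metrics for all 10 classes, which is exactly the view of model performance we want here.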