In the previous few chapters, we have extensively discussed LIME and SHAP, and you have seen the practical aspects of applying their Python frameworks to explain black-box models. One major limitation of both frameworks is that their method of explanation is not consistent with how non-technical end users would explain an observation. For example, if you have an image of a glass filled with Coke and use LIME or SHAP to explain a black-box model that correctly classifies the image as Coke, both frameworks would highlight the regions of the image that lead to the model's prediction. But if you ask a non-technical user to describe the image, the user would classify it as Coke due to the presence of a dark-colored carbonated liquid in a glass that resembles a cola drink. In other words, human beings tend to relate any observation to known concepts in order to explain it.
Testing with Concept Activation Vectors (TCAV) from Google AI follows a similar approach, explaining model predictions in terms of known human concepts. So, in this chapter, we will cover how TCAV can be used to provide concept-based, human-friendly explanations. Unlike LIME and SHAP, TCAV works beyond feature attribution and refers to concepts such as color, gender, race, shape, any known object, or an abstract idea to explain model predictions. We will discuss the workings of the TCAV algorithm, cover some of the advantages and disadvantages of the framework, and discuss using it for practical problem-solving. You already got some exposure to TCAV in Chapter 2, Model Explainability Methods, under Representation-based explanation, but in this chapter, we will cover the following topics:
It's time to get started now!
This code tutorial and the requisite resources can be downloaded or cloned from the GitHub repository for this chapter at https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/tree/main/Chapter08. As in other chapters, Python and Jupyter notebooks are used to implement the practical application of the theoretical concepts covered here. However, I recommend that you run the notebooks only after you have gone through the chapter, for a better understanding.
The idea of TCAV was first introduced by Kim et al. in their work Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) (https://arxiv.org/pdf/1711.11279.pdf). The framework was designed to provide interpretability beyond feature attribution, particularly for deep learning models that rely on low-level, transformed features that are not human-interpretable. TCAV aims to explain the opaque internal state of a deep learning model using abstract, high-level, human-friendly concepts. In this section, I will give you an intuitive understanding of TCAV and explain how it provides human-friendly explanations.
So far, we have covered many methods and frameworks that explain ML models through feature-based approaches. But it might occur to you that since most ML models operate on low-level features, feature-based explanation approaches might highlight features that are not human-interpretable. For example, when explaining image classifiers, pixel intensity values or pixel coordinates might not be useful to end users without a technical background in data science and ML. So, these features are not user-friendly. Moreover, feature-based explanations are always restricted by the choice and number of features present in the dataset, and end users might be interested in a particular concept that is not captured by any feature the algorithm picks.
So, instead, concept-based approaches provide a much wider abstraction that is human-friendly and more relevant, as interpretability is provided in terms of the importance of high-level concepts. TCAV is a model interpretability framework from Google AI that implements this idea of concept-based explanation in practice. The algorithm depends on Concept Activation Vectors (CAVs), which provide an interpretation of the internal state of an ML model in terms of human-friendly concepts. In a more technical sense, TCAV uses directional derivatives to quantify the importance of high-level, human-friendly concepts for model predictions. For example, while describing hairstyles, TCAV can use concepts such as curly hair, straight hair, or hair color. These user-defined concepts are not input features of the dataset used by the algorithm during training.
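To make the directional-derivative idea more precise, here is the core formulation, following the notation of the original paper: let $f_l(x)$ denote the activations of layer $l$ for an input $x$, let $h_{l,k}$ map those activations to the logit of class $k$, and let $v_C^l$ be the unit CAV for concept $C$ at layer $l$. The conceptual sensitivity of class $k$ to concept $C$ at $x$ is the directional derivative

S_{C,k,l}(x) = \nabla h_{l,k}\left(f_l(x)\right) \cdot v_C^{l}

and the TCAV score is simply the fraction of the class-$k$ examples $X_k$ whose sensitivity is positive:

\mathrm{TCAV}_{C,k,l} = \frac{\left|\left\{x \in X_k : S_{C,k,l}(x) > 0\right\}\right|}{\left|X_k\right|}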
The following figure illustrates the key question addressed by TCAV:
In the next section, let's try to understand the idea of model explanation using abstract concepts.
By now, you have an intuitive understanding of how explanations can be provided with abstract concepts. But why is this an effective approach? Let's take another example. Suppose you are building a deep learning-based image classifier to detect doctors in images. After applying TCAV, suppose you find that the concept importance of white male is the highest, followed by stethoscope and white coat. The importance of stethoscope and white coat is expected, but the high importance of white male indicates a biased dataset. Hence, TCAV can help to evaluate fairness in trained models.
Essentially, the goal of CAVs is to estimate the importance of a concept (such as color, gender, or race) for the predictions of a trained model, even though the concept was not used during training. This is possible because TCAV learns concepts from example images. For instance, to learn a gender concept, TCAV needs a few data instances containing the male concept and a few non-male examples. TCAV can then quantitatively estimate the trained model's sensitivity to that concept for a given class. To generate explanations, TCAV perturbs data points toward a concept that is relatable to humans, and so it is a type of global perturbation method.
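Under the hood, a CAV is the vector normal to a hyperplane that separates the activations of concept examples from those of random counterexamples at a chosen layer. The following is a minimal sketch of that idea using scikit-learn; the activation arrays here are random stand-ins for real layer activations, and the TCAV library automates all of this internally:

import numpy as np
from sklearn.linear_model import SGDClassifier

# Stand-ins for real layer activations of concept and random examples
concept_acts = np.random.randn(50, 1024)
random_acts = np.random.randn(50, 1024)

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 50 + [0] * 50)

# Fit a linear separator; the normal to its decision boundary is the CAV
linear_clf = SGDClassifier(alpha=0.1).fit(X, y)
cav = linear_clf.coef_[0]

Next, let's try to learn the main objectives of TCAV.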
I find the approach of TCAV to be quite unique compared to other explanation methods. One of the main reasons is that the developers of this framework established clear goals that resonate with my own understanding of human-friendly explanations. The following are the established goals of TCAV:
Now that we have an idea of what can be achieved using TCAV, let's discuss the general approach to how TCAV works.
In this section, we will cover the workings of TCAV in more depth. The overall working of the algorithm can be summarized in the following steps:
Now, let me further simplify the approach of TCAV without using too much mathematical notation. Let's assume we have a model for identifying zebras from images. To apply TCAV, the following approach can be taken:
Thus, we have covered the intuitive workings of the TCAV algorithm. Next, let's cover how TCAV can actually be implemented in practice.
In this section, we will explore the practical application of TCAV for explaining a pre-trained image classifier with concept importance. The entire notebook tutorial is available in the code repository of this chapter at https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/blob/main/Chapter08/Intro_to_TCAV.ipynb. This tutorial is based on the notebook provided in the original GitHub project repository of TCAV at https://github.com/tensorflow/tcav. I recommend that you also refer to the main project repository, since the credit for the implementation goes to the developers and contributors of TCAV.
In this tutorial, we will cover how to apply TCAV to validate the concept importance of the concept of stripes as compared to the honeycomb pattern for identifying tigers. The following images illustrate the flow of the approach used by TCAV to ascertain concept importance using a simple visualization:
Let's begin by setting up our Jupyter notebook.
Similar to the other tutorial examples covered in the previous chapters, to install the necessary Python modules required to run the notebook, you can use the pip install command in a Jupyter notebook:
!pip install --upgrade pandas numpy matplotlib tensorflow tcav
You can import all the modules to validate the successful installation of these frameworks:
import tensorflow as tf
import tcav
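The rest of the tutorial also refers to several TCAV submodules through short aliases. Assuming the module layout used in the example notebooks of the tensorflow/tcav repository, the imports look as follows:

import tcav.activation_generator as act_gen
import tcav.model as model
import tcav.tcav as tcav
import tcav.utils as utils
import tcav.utils_plot as utils_plot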
Next, let's take a look at the data that we'll be working with.
I felt that the data preparation process provided in the original project repository of TCAV is slightly time-consuming, so I have already prepared the necessary datasets, which you can get from this chapter's repository. Since we will be validating the importance of the striped concept for images of tigers, we need an image dataset of tigers. The data is collected from the ImageNet collection and is provided in the project repository at https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/tree/main/Chapter08/images/tiger. The images were randomly curated and collected using the data collection script provided in the TCAV repository: https://github.com/tensorflow/tcav/tree/master/tcav/tcav_examples/image_models/imagenet.
In order to run TCAV, you need the necessary concept images, target class images, and random dataset images. For this tutorial, I have prepared the concept images from the Broden dataset (http://netdissect.csail.mit.edu/data/broden1_224.zip), as suggested in the main project example. Please go through the research work that led to the creation of this dataset: https://github.com/CSAILVision/NetDissect. You can also explore the Broden dataset texture images provided at https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/tree/main/Chapter08/concepts/broden_concepts to learn more. I recommend that you experiment with other concepts or other images and play around with TCAV-based concept importance!
Broden dataset
David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, and Antonio Torralba. Network Dissection: Quantifying Interpretability of Deep Visual Representations. Computer Vision and Pattern Recognition (CVPR), 2017.
As TCAV also requires some random datasets, used as counterexamples when ascertaining the statistical significance of a learned concept, I have provided some sample random images in the project repository, thereby simplifying the running of the tutorial notebook! But as always, you should experiment with other random image examples as well. These random images were also collected using the image fetcher script provided in the main project: https://github.com/tensorflow/tcav/blob/master/tcav/tcav_examples/image_models/imagenet/download_and_make_datasets.py.
To proceed further, you need to define the variables for the target class and the concepts:
target = 'tiger'
concepts = ['honeycombed', 'striped']
You also need to create the necessary paths and directories to store the generated activations and CAVs, as mentioned in the notebook tutorial.
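A minimal sketch of that setup is shown here; the directory names are illustrative and should point to wherever you have placed the chapter's image folders and where you want the generated artifacts to be written:

import os

source_dir = 'images'  # contains the target, concept, and random image folders
activation_dir = './activations/'
cav_dir = './cavs/'

for directory in [activation_dir, cav_dir]:
    os.makedirs(directory, exist_ok=True)

Next, let's discuss the model used in this example.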
In this example, we will use a pre-trained deep learning model to highlight the fact that even though TCAV is considered a model-specific approach (it is only applicable to neural networks), it makes no assumptions about the network architecture itself and works well with most deep neural network models.
For this example, we will use the pre-trained GoogleNet model (https://paperswithcode.com/method/googlenet), trained on the ImageNet dataset (https://www.image-net.org/). The model files are provided in the code repository: https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/tree/main/Chapter08/models/inception5h. You can load the trained model using the following lines of code:
model_to_run = 'GoogleNet'
sess = utils.create_session()
GRAPH_PATH = "models/inception5h/tensorflow_inception_graph.pb"
LABEL_PATH = "models/inception5h/imagenet_comp_graph_label_strings.txt"
trained_model = model.GoogleNetWrapper_public(sess,
GRAPH_PATH,
LABEL_PATH)
The model wrapper is used to get the internal state and tensors of the trained model. This is important because concept importance is computed from the internal neuron activations. For more details about the workings of the internal API, please refer to the following link: https://github.com/tensorflow/tcav/blob/master/Run_TCAV_on_colab.ipynb.
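As a quick sanity check, you can confirm that the target class is known to the loaded model; the label_to_id method used here is part of the model wrapper API in the tensorflow/tcav repository:

# Should print the class index of 'tiger' from the label file
print(trained_model.label_to_id('tiger'))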
Next, we need to generate the concept activations using the ImageActivationGenerator method:
act_generator = act_gen.ImageActivationGenerator(
trained_model, source_dir, activation_dir,
max_examples=100)
Next, we will explore model explainability using TCAV.
As discussed before, TCAV is currently used to explain neural networks through their inner layers, so it is not model-agnostic but rather a model-specific explainability method. This requires us to define the bottleneck layer of the network:
bottlenecks = ['mixed4c']
The next step is to apply the TCAV algorithm to create the concept activation vectors. The process also includes statistical significance testing, using a two-sided t-test, between the concept importance for the target class and that for the random samples:
num_random_exp = 15  # number of random experiments for significance testing

mytcav = tcav.TCAV(sess,
                   target,
                   concepts,
                   bottlenecks,
                   act_generator,
                   [0.1],  # alphas: regularization values for the linear CAV classifier
                   cav_dir=cav_dir,
                   num_random_exp=num_random_exp)
The original experiment in the TCAV paper (https://arxiv.org/abs/1711.11279) used at least 500 random experiments to identify the statistically significant concepts. For the sake of simplicity, and to achieve faster results, we are using 15 random experiments here, but you can experiment with more.
Finally, we can get the results and visualize the concept importance:
results = mytcav.run(run_parallel=False)
utils_plot.plot_results(results,
num_random_exp=num_random_exp)
This will generate the following plot that helps us to compare concept importance:
As you can observe from Figure 8.3, the striped concept has significantly higher concept importance than the honeycombed concept for identifying tigers.
Now that we have covered the practical application, let me give you a similar challenge as an exercise. Can you use the ImageNet dataset to ascertain the importance of the concept of water to ships, and of clouds or sky to airplanes? This will help you understand TCAV in more depth and give you more confidence in applying it.
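As a starting point for this exercise, you only need to change the target and concept definitions and supply the corresponding folders of example images; the label and folder names below are hypothetical, and the target must match an entry in the model's label file:

target = 'container ship'  # must match a label in imagenet_comp_graph_label_strings.txt
concepts = ['water', 'sky']  # folders of concept example images under source_dir

Next, we will discuss some advantages and limitations of TCAV.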
In the previous section, we covered the practical aspects of TCAV. TCAV is indeed a very interesting and novel approach to explaining complex deep learning models. Although it has many advantages, I did find some limitations in the current framework that could definitely be improved in a revised version.
Let's discuss the following advantages first:
These are some of the distinct advantages of TCAV, which makes it more human-friendly than LIME and SHAP. Next, let's discuss some known limitations of TCAV.
The following are some of the known disadvantages of the TCAV approach:
TCAV is highly prone to data drift, adversarial effects, and other data quality issues discussed in Chapter 3, Data-Centric Approaches. If you are using TCAV, you need to ensure that the training data, inference data, and even the concept data have similar statistical properties. Otherwise, the generated concepts can be affected by noise or data impurity issues.
I think all these limitations can indeed be addressed, making TCAV a much more robust and widely adopted framework. If you are interested, you can also reach out to the authors and developers of the TCAV framework and contribute to the open source community! In the next section, let's discuss some potential applications of concept-based explanations.
I see great potential for concept-based explanations such as TCAV! In this section, you will get exposure to some potential applications of concept-based explanations, which can be important research topics for the entire AI community:
For example, if we say that it is going to rain today although there is a clear sky now, we usually add a further explanation: the clear sky can become covered with clouds, which increases the probability of rainfall. In other words, a clear sky is a concept related to a sunny day, while a cloudy sky is a concept related to rainfall. This example suggests that the forecast can be flipped if the concept describing the situation is also flipped. This is the idea of a concept-based counterfactual. The idea is not far-fetched, as the concept bottleneck models (CBMs) presented in the research work by Koh et al. (https://arxiv.org/abs/2007.04612) can implement a similar idea of concept-based counterfactuals by manipulating the neuron activations of the bottleneck layer.
Figure 8.5 illustrates an example of a concept-based counterfactual. There is no existing algorithm or framework that can help us achieve this, yet it can be a useful application of concept-based approaches in computer vision.
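To make the idea more concrete, here is a toy sketch of a concept-bottleneck-style intervention; the concept_model, label_model, and concept indices are entirely hypothetical placeholders rather than any existing API:

def predict_with_intervention(x, concept_model, label_model, interventions):
    # concept_model: maps a raw input to predicted concept values (the bottleneck)
    # label_model: maps concept values to the final prediction
    concept_values = concept_model(x)
    for concept_index, new_value in interventions.items():
        concept_values[concept_index] = new_value  # e.g., flip 'clear sky' to 'cloudy sky'
    return label_model(concept_values)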
I feel this is a wide-open research field, and the potential to come up with game-changing applications using concept-based explanations is immense. I sincerely hope that more and more researchers and AI developers start working in this area to make significant progress in the coming years! Thus, we have arrived at the end of this chapter. Let me now summarize what has been covered in the next section.
This chapter covered TCAV, a novel approach and framework developed by Google AI. You have gained a conceptual understanding of TCAV, practical exposure to applying the Python TCAV framework, and an awareness of some key advantages and limitations of TCAV. Finally, I presented some interesting ideas regarding potential research problems that can be solved using concept-based explanations.
In the next chapter, we will explore other popular XAI frameworks and apply these frameworks to solving practical problems.
Please refer to the following resources to gain additional information: