# Notes on: Montavon, Grégoire, Bach, S., Binder, A., Samek, W., & Müller, Klaus-Robert (2015): Explaining Nonlinear Classification Decisions With Deep Taylor Decomposition

## Overview

The paper proposes to expand the relevance of each neuron as a function of the inputs / activations of the layer below, about a root point, using a local Taylor decomposition. This yields the recursion

$$R_i = \sum_j \frac{\partial R_j}{\partial x_i}\bigg|_{\{\tilde{x}_i\}^{(j)}} \cdot \big(x_i - \tilde{x}_i^{(j)}\big)$$

where

• $R_j$ denotes the relevance score of the j-th neuron in a given layer
• $R_i$ denotes the relevance score of the i-th neuron in the previous layer (i.e. before the layer which contains the neurons $j$)

Intuitively, one can think of it as follows:

• The gradient $\partial f / \partial x$ measures the sensitivity of some class label to each pixel when the classifier is evaluated at the root point $\tilde{x}$.
• A good root point $\tilde{x}$ is one that removes the object (e.g. as detected by the function $f$), but that minimally deviates from the original point $x$.
• If $x_p$ is far away from $\tilde{x}_p$, then $(x_p - \tilde{x}_p)$ is large, hence we assign large relevance to pixel $p$.
• If the gradient $\partial f / \partial x_p$ evaluated at the root point is large, then the function is very sensitive to the input around this root point, hence we assign high relevance to nearby points, which includes $x_p$.

This method starts out by looking for a relevance score for the input which satisfies the properties of

• conservation, i.e. the sum of assigned relevances in input space corresponds to the total relevance detected by the model (i.e. $\sum_p R_p(x) = f(x)$)
• positivity, i.e. $R_p(x) \ge 0$ for all $p$

## Notation

• $f: \mathbb{R}^d \to \mathbb{R}^+$ is a positive-valued function
• $x \in \mathbb{R}^d$ is an image
• $f(x)$ quantifies the presence (or amount) of a certain type of object in the image
• $f(x) = 0$ indicates the absence of the object in the image
• we associate to each pixel $x_p$ in the image a relevance score $R_p(x)$
• $R(x) = \{R_p(x)\}$ is the heatmap which contains the relevance score of each pixel
• $\tilde{x}$ denotes the point at which one performs the Taylor expansion, usually an infinitesimally small distance from the actual point $x$, in the direction of maximum descent, i.e. $\tilde{x} = x - \epsilon \, \nabla f(x)$ with $\epsilon$ small.

• $(l)$ denotes the set of neurons in the l-th layer (notation introduced by me)
• $\{x_i\}$ and $\{R_i\}$ denote the activations and relevances in some layer, i.e. $i \in (l)$
• $\{x_j\}$ and $\{R_j\}$ denote the activations and relevances in the next layer (relative to the layer containing all the $x_i$), i.e. $j \in (l+1)$
• $\{\tilde{x}_i\}^{(j)}$ is a root point, which is the nearest point to $\{x_i\}$ such that $R_j(\{\tilde{x}_i\}^{(j)}) = 0$

## Pixel-wise Decomposition of a function

One would like the heatmap to have the following properties

• Conservative: $\sum_p R_p(x) = f(x)$, i.e. the sum of assigned relevances in pixel space corresponds to the total relevance detected by the model
• Positive: $R_p(x) \ge 0$ for all $x$ and $p$

We say a heatmapping is consistent if and only if it is both conservative and positive.
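These two properties are easy to check numerically for any candidate heatmapping. A minimal sketch (the function names and tolerance are my own, not from the paper):

```python
import numpy as np

def is_conservative(f_x, R, tol=1e-6):
    """Conservation: the pixel relevances sum to the model output f(x)."""
    return abs(np.sum(R) - f_x) < tol

def is_positive(R):
    """Positivity: every pixel relevance is non-negative."""
    return bool(np.all(R >= 0))

def is_consistent(f_x, R, tol=1e-6):
    """A heatmapping is consistent iff it is both conservative and positive."""
    return is_conservative(f_x, R, tol) and is_positive(R)
```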

## Taylor Decomposition

The Taylor expansion of the function $f$ at some well-chosen root point $\tilde{x}$, where $f(\tilde{x}) = 0$, gives

$$f(x) = f(\tilde{x}) + \sum_p \frac{\partial f}{\partial x_p}\bigg|_{x = \tilde{x}} (x_p - \tilde{x}_p) + \varepsilon = 0 + \sum_p R_p(x) + \varepsilon$$

where we identify the summed elements $R_p(x) = \frac{\partial f}{\partial x_p}\big|_{x = \tilde{x}} \cdot (x_p - \tilde{x}_p)$ as the relevances assigned to the pixels of the image.

We can write this as the element-wise product between the gradient of the function at the root point and the difference between the image and the root:

$$R(x) = \frac{\partial f}{\partial x}\bigg|_{x = \tilde{x}} \odot (x - \tilde{x})$$

• The gradient measures the sensitivity of some class label to each pixel when the classifier is evaluated at the root point $\tilde{x}$.
• A good root point $\tilde{x}$ is one that removes the object (as detected by the function $f$), but that minimally deviates from the original point $x$.
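To make the formula concrete, here is a small numpy sketch for a toy one-neuron detector $f(x) = \max(0, w^\top x)$, where the nearest root of the linear part is the projection of $x$ onto the hyperplane $w^\top x = 0$. The detector and all names are my own illustrative assumptions, not the paper's code:

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5])   # detector weights
x = np.array([3.0, 0.5, 1.0])    # input "image" (here just 3 pixels)

def f(v):
    return max(0.0, w @ v)       # toy detection function

# Nearest root point: project x onto the hyperplane w.x = 0, so f(x_root) = 0.
x_root = x - (w @ x) / (w @ w) * w

# First-order Taylor term: gradient at the root (= w on the active side of the
# ReLU) times the difference between the image and the root, element-wise.
R = w * (x - x_root)

print(f(x), R.sum())             # conservation: both equal w.x = 2.5
print(np.all(R >= 0))            # positivity: each term is (w.x/||w||^2) * w_p^2 >= 0
```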

### Deep Taylor Decomposition

• Don't consider the whole NN function $f$ at once.
• Consider the mapping of a set of neurons $\{x_i\}$ at a given layer to the relevance $R_j$ assigned to a neuron in the next layer, that is: $\{x_i\} \mapsto R_j$.
• Assume the two objects are related functionally by some function $R_j(\{x_i\})$ => we would like to apply a Taylor decomposition to this local function in order to redistribute $R_j$ onto lower-layer relevances.
• Assume a neighboring root point $\{\tilde{x}_i\}^{(j)}$ such that $R_j(\{\tilde{x}_i\}^{(j)}) = 0$. We can then write the Taylor decomposition of $R_j$ at $\{\tilde{x}_i\}^{(j)}$ as

$$R_j = \sum_i \underbrace{\frac{\partial R_j}{\partial x_i}\bigg|_{\{\tilde{x}_i\}^{(j)}} \cdot \big(x_i - \tilde{x}_i^{(j)}\big)}_{R_{ij}} + \varepsilon_j$$

which redistributes relevance from one layer to the layer below, where $\varepsilon_j$ denotes the Taylor residual.

• If each local Taylor decomposition is conservative, then $f(x) = \dots = \sum_j R_j = \sum_i R_i = \dots = \sum_p R_p$ (with $R_i = \sum_j R_{ij}$), which is referred to as layer-wise relevance conservation.

• Positivity of relevance is also ensured, provided each local decomposition assigns non-negative terms $R_{ij} \ge 0$.
• If Taylor decompositions of local subfunctions are consistent, then the whole deep Taylor decomposition is also consistent
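As a concrete instance of one such local redistribution step, here is a minimal numpy sketch for a layer with non-negative inputs, using the $z^+$-rule that the paper derives for that case (shapes, names, and the small stabilizer `eps` are my own assumptions):

```python
import numpy as np

def zplus_redistribute(x, W, R_upper, eps=1e-9):
    """Redistribute upper-layer relevances R_j onto lower-layer inputs x_i
    in proportion to the positive contributions z_ij = x_i * max(w_ij, 0)."""
    Zp = x[:, None] * np.maximum(W, 0.0)   # z+_ij, shape (I, J)
    denom = Zp.sum(axis=0) + eps           # sum_i z+_ij per upper neuron j
    return (Zp / denom) @ R_upper          # R_i = sum_j (z+_ij / denom_j) * R_j

x = np.array([1.0, 2.0, 0.5])              # lower-layer activations (>= 0)
W = np.array([[0.5, -1.0],                 # weights to two upper-layer neurons
              [1.0,  0.2],
              [-0.3, 0.8]])
R_upper = np.array([0.7, 0.3])             # relevances of the upper layer

R_lower = zplus_redistribute(x, W, R_upper)
print(R_lower.sum(), R_upper.sum())        # layer-wise conservation (up to eps)
```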

The only problem left is to identify the root points; the paper handles this with different rules depending on the domain of the input (unconstrained real-valued input, non-negative activations, or box-constrained pixel values).
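For reference, the propagation rules that result from these root-point choices (reconstructed from memory of the paper; here $z_{ij} = x_i w_{ij}$, $w_{ij}^+ = \max(0, w_{ij})$ and $w_{ij}^- = \min(0, w_{ij})$):

• $w^2$-rule (unconstrained real-valued input): $R_i = \sum_j \frac{w_{ij}^2}{\sum_{i'} w_{i'j}^2} R_j$
• $z^+$-rule (non-negative input, e.g. ReLU activations): $R_i = \sum_j \frac{z_{ij}^+}{\sum_{i'} z_{i'j}^+} R_j$
• $z^{\mathcal{B}}$-rule (box-constrained input $l_i \le x_i \le h_i$, e.g. pixel values): $R_i = \sum_j \frac{z_{ij} - l_i w_{ij}^+ - h_i w_{ij}^-}{\sum_{i'} z_{i'j} - l_{i'} w_{i'j}^+ - h_{i'} w_{i'j}^-} R_j$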