Notes on: Montavon, Grégoire, Bach, S., Binder, A., Samek, W., & Müller, Klaus-Robert (2015): Explaining Nonlinear Classification Decisions With Deep Taylor Decomposition

Overview

The paper proposes an expansion for some input or activation montavon15_explain_nonlin_class_decis_with_ed39d9a397196f8f0ce6388b0ea4e0c1dd8becee.png about a root point montavon15_explain_nonlin_class_decis_with_01788aa15d8738a5050de73beeafe6bf26dfd15f.png using local Taylor decompositions, where the expansion is given by the recursion

montavon15_explain_nonlin_class_decis_with_d76f6aa8b994fd1abd53f77e0c3541122be08c7a.png

where

  • montavon15_explain_nonlin_class_decis_with_1e0e1801bb7cacb116a81cdaf5f3f08ee59bc18f.png denotes the relevance score of the j-th neuron in a given layer
  • montavon15_explain_nonlin_class_decis_with_d9e2fed90d2d92222b37a4b9f57c4bedecd8a86f.png denotes the relevance score for the i-th neuron in the previous layer (i.e. before the layer which contains montavon15_explain_nonlin_class_decis_with_e27f5d96a277220941afcc8233c49d665a7defe0.png)

Intuitively, one can think of it as follows:

  • Gradient montavon15_explain_nonlin_class_decis_with_163f641ca971407b4376960d3acef9afbe741a82.png measures sensitivity of some class-label to each pixel when the classifier montavon15_explain_nonlin_class_decis_with_cdd1cc131da6040eca078917132a377727053c44.png is evaluated at the root point montavon15_explain_nonlin_class_decis_with_01788aa15d8738a5050de73beeafe6bf26dfd15f.png.
  • A good root point is one that removes the object (e.g. as detected by the function montavon15_explain_nonlin_class_decis_with_689c66a323bc8a7e92d22e177edbabd1cb957d7c.png), but that minimally deviates from the original point montavon15_explain_nonlin_class_decis_with_ed39d9a397196f8f0ce6388b0ea4e0c1dd8becee.png.
    • If montavon15_explain_nonlin_class_decis_with_d2227659243b803d2a340cf2c101aa9aa9e4b3ab.png is far away from montavon15_explain_nonlin_class_decis_with_e9f4d218474bc10aa94958ec30139aee865c0173.png, then montavon15_explain_nonlin_class_decis_with_81c8c3eb2dc7b4483565ea2ef0260937182acf88.png is large, hence we assign large relevance to montavon15_explain_nonlin_class_decis_with_e9f4d218474bc10aa94958ec30139aee865c0173.png.
    • If montavon15_explain_nonlin_class_decis_with_2cb88e88987c946cd9cb27dd6834a72b8c68c720.png evaluated at the root point montavon15_explain_nonlin_class_decis_with_d2227659243b803d2a340cf2c101aa9aa9e4b3ab.png is large, then the output is very sensitive to the input around this root point, hence we assign high relevance to nearby points, which include montavon15_explain_nonlin_class_decis_with_e9f4d218474bc10aa94958ec30139aee865c0173.png.

This method starts out by looking for a relevance score montavon15_explain_nonlin_class_decis_with_d9e2fed90d2d92222b37a4b9f57c4bedecd8a86f.png for the input which satisfies properties of

  • conservation, i.e. the sum of assigned relevances in input space corresponds to the total relevance detected by the model (i.e. the relevance in the output)

    montavon15_explain_nonlin_class_decis_with_5afb5eb5fe26f316fad90d4b51e79e30d07b528e.png

  • positivity, i.e. montavon15_explain_nonlin_class_decis_with_82741a4630a8f9f36be4ea74f4d082f73a988c4e.png

Notation

  • montavon15_explain_nonlin_class_decis_with_638875588ba08140aee02f20607e321de55ecf13.png is a positive-valued function
  • montavon15_explain_nonlin_class_decis_with_dc8da2ee866cefceb970fed7eaa642b27d0dcb34.png is an image
  • montavon15_explain_nonlin_class_decis_with_689c66a323bc8a7e92d22e177edbabd1cb957d7c.png quantifies the presence (or amount) of a certain type of object in the image
  • montavon15_explain_nonlin_class_decis_with_bb40af7193b40c85f4cc5e5cc3ef64083bb0e660.png indicates absence of object in image
  • montavon15_explain_nonlin_class_decis_with_218179d0a94a589fd378960eb40bcae644028098.png associates a relevance score with each pixel montavon15_explain_nonlin_class_decis_with_fefe9e556d399665a26a37824ec578cbffb0cabe.png in the image
  • montavon15_explain_nonlin_class_decis_with_7d56d23c709081d5daae477efc00ea2b26bc7727.png is the heatmap which contains the relevance score of each pixel
  • montavon15_explain_nonlin_class_decis_with_b0a80e48efaa59cc7ae0693ac59677243d0f5cbb.png denotes the point where one performs the Taylor expansion, usually an infinitesimally small distance from the actual point montavon15_explain_nonlin_class_decis_with_ed39d9a397196f8f0ce6388b0ea4e0c1dd8becee.png, in the direction of steepest descent, i.e.

    montavon15_explain_nonlin_class_decis_with_a057f859ebdae759faea041fa93dd4c25dea384d.png

    with montavon15_explain_nonlin_class_decis_with_734ace1ca706cad07392531389b813068bc44391.png small.

  • montavon15_explain_nonlin_class_decis_with_d6598a5b19b41bfa2536bd29f229ce2d914ae3ba.png denotes the set of neurons in the l-th layer (notation introduced by me)
  • montavon15_explain_nonlin_class_decis_with_e9f4d218474bc10aa94958ec30139aee865c0173.png and montavon15_explain_nonlin_class_decis_with_d9e2fed90d2d92222b37a4b9f57c4bedecd8a86f.png denote the activations and relevances in some layer, i.e. montavon15_explain_nonlin_class_decis_with_190ec11f9fc2897c2c7cbd78ff9600d61fe210de.png
  • montavon15_explain_nonlin_class_decis_with_e27f5d96a277220941afcc8233c49d665a7defe0.png and montavon15_explain_nonlin_class_decis_with_1e0e1801bb7cacb116a81cdaf5f3f08ee59bc18f.png denote the activations and relevances in the next layer (relative to layer containing all the montavon15_explain_nonlin_class_decis_with_e9f4d218474bc10aa94958ec30139aee865c0173.png), i.e. montavon15_explain_nonlin_class_decis_with_577220575e4f4726b8af27d3615537c9c4b85304.png
  • montavon15_explain_nonlin_class_decis_with_01788aa15d8738a5050de73beeafe6bf26dfd15f.png is a root point, which is the nearest point to montavon15_explain_nonlin_class_decis_with_ed39d9a397196f8f0ce6388b0ea4e0c1dd8becee.png such that montavon15_explain_nonlin_class_decis_with_287c188e057e537f66954a710e4d0919c20a010b.png

Pixel-wise Decomposition of a function

One would like the heatmap montavon15_explain_nonlin_class_decis_with_7ae8468531bae9ca24f2be33ddc6148de2ba463e.png to have the following properties

  • Conservative, i.e. the sum of assigned relevances in pixel space corresponds to the total relevance detected by the model

    montavon15_explain_nonlin_class_decis_with_5afb5eb5fe26f316fad90d4b51e79e30d07b528e.png

  • Positive:

    montavon15_explain_nonlin_class_decis_with_d9becdc57a939ad26b48d588ee1732a830a8b8f5.png

We say a heatmapping is consistent if and only if it is both conservative and positive.
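
The two properties are easy to check numerically. Below is a minimal sketch in plain Python; the function names and the two toy heatmaps are illustrative assumptions, not taken from the paper. It shows that consistency alone is a weak requirement (a uniform spread of the output over all pixels already satisfies it), and that a gradient-times-input heatmap of a linear function with a negative weight is conservative but not positive.

```python
def is_conservative(relevances, f_x, tol=1e-9):
    """True if the relevances sum to the function value f(x)."""
    return abs(sum(relevances) - f_x) <= tol


def is_positive(relevances):
    """True if every relevance score is non-negative."""
    return all(r >= 0.0 for r in relevances)


def is_consistent(relevances, f_x, tol=1e-9):
    """Consistent = conservative and positive."""
    return is_conservative(relevances, f_x, tol) and is_positive(relevances)


# Toy heatmap 1 (assumption): spread f(x) uniformly over d pixels.
f_x = 1.0
d = 4
uniform = [f_x / d] * d

# Toy heatmap 2 (assumption): f(x) = w . x with a negative weight,
# and relevances R_p = w_p * x_p (gradient times input).
w = [2.0, -1.0]
x = [1.0, 1.0]
f_lin = sum(wp * xp for wp, xp in zip(w, x))
grad_times_input = [wp * xp for wp, xp in zip(w, x)]

print(is_consistent(uniform, f_x))
print(is_conservative(grad_times_input, f_lin), is_positive(grad_times_input))
```

The uniform heatmap passes both checks without saying anything about which pixels matter, which is why the properties are treated as necessary conditions rather than a full definition of a good heatmap.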

Taylor Decomposition

Taylor expansion of the function montavon15_explain_nonlin_class_decis_with_cdd1cc131da6040eca078917132a377727053c44.png at some well-chosen root point montavon15_explain_nonlin_class_decis_with_01788aa15d8738a5050de73beeafe6bf26dfd15f.png, where

montavon15_explain_nonlin_class_decis_with_ed7ee2b0214012ea463c1e7076985dab1e2640b5.png

which gives

montavon15_explain_nonlin_class_decis_with_cf14d6ea78992467c1b3b21c26de2b0d14e0b89a.png

where we identify the summed elements as the relevances montavon15_explain_nonlin_class_decis_with_218179d0a94a589fd378960eb40bcae644028098.png assigned to pixels in the image.

We can write this as the element-wise product montavon15_explain_nonlin_class_decis_with_9614489c6ea0bf61e8d0d12bf6f5e161c9486733.png between the gradient of the function at the root point and the difference between the image and the root montavon15_explain_nonlin_class_decis_with_062c0d86dbee04a1d2f8f3db5a7246bdb30c4b44.png:

montavon15_explain_nonlin_class_decis_with_583392e625cba088687d8d58d3ee69715f6cd06f.png
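
For reference, the first-order expansion described in this section can be written out as follows. This is a reconstruction from the surrounding text, not a transcription of the paper: the symbols \tilde{x} for the root point, R_p for the relevance of pixel p, and \varepsilon for the higher-order residual are assumed to match the notation above, and the constant term vanishes because the root point is chosen such that the function is zero there.

```latex
\begin{align*}
f(x) &= f(\tilde{x})
      + \left( \frac{\partial f}{\partial x}\Big|_{x=\tilde{x}} \right)^{\top} (x - \tilde{x})
      + \varepsilon \\
     &= 0 + \sum_{p} \underbrace{\frac{\partial f}{\partial x_p}\Big|_{x=\tilde{x}} \, (x_p - \tilde{x}_p)}_{R_p(x)}
      + \varepsilon
\end{align*}
```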

  • Gradient montavon15_explain_nonlin_class_decis_with_163f641ca971407b4376960d3acef9afbe741a82.png measures sensitivity of some class-label to each pixel when the classifier montavon15_explain_nonlin_class_decis_with_cdd1cc131da6040eca078917132a377727053c44.png is evaluated at the root point montavon15_explain_nonlin_class_decis_with_01788aa15d8738a5050de73beeafe6bf26dfd15f.png.
  • A good root point is one that removes the object (e.g. as detected by the function montavon15_explain_nonlin_class_decis_with_689c66a323bc8a7e92d22e177edbabd1cb957d7c.png), but that minimally deviates from the original point montavon15_explain_nonlin_class_decis_with_ed39d9a397196f8f0ce6388b0ea4e0c1dd8becee.png.
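
To make the root-point idea concrete, here is a minimal sketch for a toy detector that is a single ReLU neuron, f(x) = max(0, w·x + b). The detector, its weights, and all function names are assumptions chosen for illustration. For this function the nearest root can be found in closed form: it is the orthogonal projection of x onto the hyperplane w·x + b = 0. The gradient at the root is taken as the one-sided limit from the active side, which is simply w.

```python
def relu_neuron(x, w, b):
    """Toy detector (assumption): f(x) = max(0, w . x + b)."""
    return max(0.0, sum(wi * xi for wi, xi in zip(w, x)) + b)


def nearest_root(x, w, b):
    """Closest point to x on the hyperplane w . x + b = 0, where f vanishes."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    return [xi - (z / norm_sq) * wi for xi, wi in zip(x, w)]


def taylor_relevances(x, w, b):
    """R_p = (df/dx_p at the root) * (x_p - root_p)."""
    if relu_neuron(x, w, b) == 0.0:
        # Nothing was detected, so there is no relevance to distribute.
        return [0.0] * len(x)
    root = nearest_root(x, w, b)
    grad = w  # one-sided gradient of f on the active side of the hinge
    return [g * (xi - ri) for g, xi, ri in zip(grad, x, root)]


x = [1.0, 2.0, 0.5]
w = [0.5, -0.25, 2.0]
b = 0.1

R = taylor_relevances(x, w, b)
print("f(x)      =", relu_neuron(x, w, b))
print("sum of R  =", sum(R))
print("relevances:", R)
```

Because f is linear on the active side, the residual is zero and the relevances sum exactly to f(x). Each relevance works out to w_p² / ‖w‖² · f(x), so positivity also holds, even for pixels with negative weights.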

Deep Taylor Decomposition

  • Don't consider whole NN function montavon15_explain_nonlin_class_decis_with_cdd1cc131da6040eca078917132a377727053c44.png
  • Consider the mapping of a set of neurons montavon15_explain_nonlin_class_decis_with_11a687af8d90d1f3e1d5a38656049056ec322012.png at a given layer to the relevance montavon15_explain_nonlin_class_decis_with_1e0e1801bb7cacb116a81cdaf5f3f08ee59bc18f.png assigned to a neuron montavon15_explain_nonlin_class_decis_with_e27f5d96a277220941afcc8233c49d665a7defe0.png in the next layer, that is:

    montavon15_explain_nonlin_class_decis_with_2ec4158568bc46076eee61f5c35ac2ef4e8f4974.png

  • Assume the two objects are functionally related by some function montavon15_explain_nonlin_class_decis_with_90e4fde2e64e2498ed60fa1b0b3dbd39b51ed63a.png => we would like to apply Taylor decomposition to this local function in order to redistribute the relevance montavon15_explain_nonlin_class_decis_with_1e0e1801bb7cacb116a81cdaf5f3f08ee59bc18f.png onto the lower-layer relevances montavon15_explain_nonlin_class_decis_with_bf4aaae18d4b334bb8df29e7d522225e89553c3d.png
  • Assuming a neighboring root point montavon15_explain_nonlin_class_decis_with_f42fc4ba8e26bf62ab870a1f2690dba80faea328.png such that montavon15_explain_nonlin_class_decis_with_4a7bdf5fd1ecbb38afa6870e69bf3950a2c9cc82.png, we can write the Taylor decomposition of montavon15_explain_nonlin_class_decis_with_b29ac53bb7187c3edf55046b89c5b17e5c0fb599.png at montavon15_explain_nonlin_class_decis_with_11a687af8d90d1f3e1d5a38656049056ec322012.png as

    montavon15_explain_nonlin_class_decis_with_b938b6f410f9c596de69c5db5d25cdd0a630e8af.png

    which redistributes relevance from one layer to the layer below, where montavon15_explain_nonlin_class_decis_with_03d96c52df5ea6a510607e2260e0745d11e4bd85.png denotes the Taylor residual.

  • If each local Taylor decomposition is conservative, then

    montavon15_explain_nonlin_class_decis_with_50a6835840cbe42b8063635b0168c1e64cf634d5.png

    which is referred to as layer-wise relevance conservation.

  • Positivity of relevance is also ensured
  • If Taylor decompositions of local subfunctions are consistent, then the whole deep Taylor decomposition is also consistent
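
Layer-wise conservation can be checked on a small example. The sketch below assumes a toy network with one ReLU hidden layer followed by sum-pooling, so the output is the sum of the hidden activations. Each hidden neuron's relevance is redistributed to the inputs with a local Taylor decomposition at the nearest root of that neuron, which yields shares proportional to the squared weights. To my understanding this corresponds to the rule the paper gives for unconstrained inputs, but treat that correspondence, the network shape, and all names and numbers here as assumptions.

```python
def forward(x, W, b):
    """Hidden activations x_j = max(0, sum_i x_i W[i][j] + b[j]) and their sum."""
    n_in, n_hid = len(x), len(b)
    hidden = [
        max(0.0, sum(x[i] * W[i][j] for i in range(n_in)) + b[j])
        for j in range(n_hid)
    ]
    return hidden, sum(hidden)


def propagate_relevance(x, W, b):
    """Redistribute the output onto hidden neurons, then onto inputs."""
    n_in = len(x)
    hidden, output = forward(x, W, b)

    # Sum-pooling layer: each hidden neuron receives its own activation.
    R_hidden = list(hidden)

    # Hidden -> input: local Taylor decomposition of each neuron at its
    # nearest root gives shares proportional to the squared weights.
    R_input = [0.0] * n_in
    for j, R_j in enumerate(R_hidden):
        norm_sq = sum(W[i][j] ** 2 for i in range(n_in))  # assumed non-zero
        for i in range(n_in):
            R_input[i] += (W[i][j] ** 2 / norm_sq) * R_j
    return output, R_hidden, R_input


x = [1.0, -0.5]
W = [[1.0, -2.0, 0.5],
     [0.5, 1.0, -1.0]]  # W[i][j]: weight from input i to hidden neuron j
b = [0.0, -0.1, -0.2]

output, R_hidden, R_input = propagate_relevance(x, W, b)
print("f(x)          =", output)
print("sum R_hidden  =", sum(R_hidden))
print("sum R_input   =", sum(R_input))
```

Each local step is conservative and positive, so the totals at every layer agree with the network output and no input receives negative relevance.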

The only remaining problem is to identify the root points. The paper handles this case by case, depending on whether or not the input is constrained, etc.