Notes on: Park, M., Jitkrittum, W., & Sejdinovic, D. (2016): K2-ABC: Approximate Bayesian Computation with Kernel Embeddings

Overview

  1. Adopt Maximum Mean Discrepancy (MMD) [gretton_2012] as a non-parametric distance between the empirical distributions of simulated and observed data
    • No need to select a summary statistic first, since the kernel embedding itself plays this role
  2. Apply an additional Gaussian smoothing kernel that operates on the corresponding RKHS

Terminology

ABC
Approximate Bayesian Computation, a paradigm that enables simulation-based posterior inference when the likelihood is intractable, by measuring the similarity between simulated and observed data in terms of a chosen statistic.

Notation

  • park_jitkrittum_2016_78504c98de3602d3ca0f9c9e1bc9f50301460233.png parameters
  • park_jitkrittum_2016_c3f64048c5805d9a401423a533acbb0372039c00.png generated samples from model with parameters park_jitkrittum_2016_e7a86163f39fa21d4a2ed66946369cdeb900ef42.png
  • park_jitkrittum_2016_688e6e5747f484cd8a4e57dfea18e23bd317e255.png denotes observed data
  • park_jitkrittum_2016_78fe10aaa74d6bed9ab2238be53774fe93bfbf0e.png is the domain of the observations
  • park_jitkrittum_2016_9ab766fe2c081b82a304866d50f951fadbdb5704.png is a metric on park_jitkrittum_2016_78fe10aaa74d6bed9ab2238be53774fe93bfbf0e.png
  • park_jitkrittum_2016_78bb582c5738c8937eda79522ab2d8814eba0be7.png
  • For a probability distribution park_jitkrittum_2016_e31ea3431be6b65896df63147dde19ef615f4e36.png on a domain park_jitkrittum_2016_76879b948635123ecd29d7cd65a1060145ba040e.png, its kernel embedding is defined as

    park_jitkrittum_2016_0653453121f3b0ffa0c35a74d960d29997e0c4fa.png

    i.e. an element of an RKHS park_jitkrittum_2016_04016c2db0a754f30f8b3b0b87bb4fa9ea1528a8.png with an associated kernel park_jitkrittum_2016_4824fc1e4be75bef4827a4f5bc60b7d58d5cdce3.png
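
For reference, the standard kernel mean embedding and its empirical counterpart can be written as below. The symbols (P for the distribution, \mathcal{X} for the domain, \mathcal{H} for the RKHS, k for the kernel, x_i for samples) are assumed here and may not match the notation in the rendered equations above.

```latex
\mu_P \;=\; \mathbb{E}_{x \sim P}\bigl[ k(\cdot, x) \bigr]
      \;=\; \int_{\mathcal{X}} k(\cdot, x) \,\mathrm{d}P(x) \;\in\; \mathcal{H},
\qquad
\hat{\mu}_P \;=\; \frac{1}{n} \sum_{i=1}^{n} k(\cdot, x_i),
\quad x_1, \dots, x_n \overset{\text{i.i.d.}}{\sim} P .
```

With this embedding, the squared MMD between two distributions P and Q is the squared RKHS distance between their embeddings, \| \mu_P - \mu_Q \|_{\mathcal{H}}^2.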

Background

  • Consider cases where computation of the likelihood of park_jitkrittum_2016_688e6e5747f484cd8a4e57dfea18e23bd317e255.png is intractable

Kernel MMD

We can obtain an unbiased estimator for MMD. Given

park_jitkrittum_2016_4f92ea683dca16165e0b14e01d2980b83a77bac4.png

an unbiased estimator of the MMD is given by

park_jitkrittum_2016_45e7601337e4941045ca92c5ddff70e6c1fb693b.png
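
A minimal sketch of the standard unbiased estimator of the squared MMD (Gretton et al., 2012), assuming a Gaussian kernel. The function names `gaussian_kernel` and `mmd2_unbiased` and the `bandwidth` parameter are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gram matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2)).

    a has shape (n, d) and b has shape (m, d).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of MMD^2 between the distributions behind x and y."""
    n, m = len(x), len(y)
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j), which removes the bias.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    # Cross term averages over all n * m pairs.
    term_xy = k_xy.mean()
    return term_xx + term_yy - 2.0 * term_xy


rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 1))
y = rng.normal(0.0, 1.0, size=(200, 1))  # same distribution as x
z = rng.normal(3.0, 1.0, size=(200, 1))  # shifted distribution
print(mmd2_unbiased(x, y), mmd2_unbiased(x, z))
```

Because the estimator is unbiased, it can come out slightly negative when the two samples share a distribution; the estimate for the shifted sample should be clearly larger.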

K2-ABC

  • Given park_jitkrittum_2016_24461f9faecf54c43adc07c5fe2e7f0654b9fbd7.png and park_jitkrittum_2016_dfb6a0a43cb771db6819a97c2f1bc22620d31122.png of i.i.d. observations (can be relaxed in practice)
  • Non-parametric distance park_jitkrittum_2016_9ab766fe2c081b82a304866d50f951fadbdb5704.png between empirical distributions
  • Use park_jitkrittum_2016_d357aa65c9a7f225122ad7e5fab27a5ca6abbcd8.png to measure distance between park_jitkrittum_2016_2411b53055213fe164b0ee8b7323de1b53f4342e.png:

    park_jitkrittum_2016_4d124841127e854923086f0d8b95113891e3fe7f.png

    i.e. park_jitkrittum_2016_8a87c5385b8c9859cbaa2caaf9c4454d7a8564d3.png is an unbiased estimate of park_jitkrittum_2016_2c7378c1532095d95b4ad00c128dd37941cc6b92.png between probability distributions used to generate park_jitkrittum_2016_a3a7f43f807b9e381fc50e0fab140c0df0a03e17.png and park_jitkrittum_2016_688152878982e1f3ca99e20b967e66668fedbf17.png

  • Introduce a second kernel, which operates directly on the probability measures, and use it to compute the ABC posterior sample weights,

    park_jitkrittum_2016_01463580b38687b4b36431abb1b9b19516e576e7.png

    with a suitably chosen parameter park_jitkrittum_2016_95253ea0c2082f08ac36eed736be50b9577033e8.png.

  • Compare datasets using estimated similarity park_jitkrittum_2016_ac2acc6c169b2ea7020ce4a767228a515ae8ed2f.png between generating distributions
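
The steps above can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes the second kernel has the exponential form exp(-MMD^2 / epsilon), uses a Gaussian kernel inside the MMD estimate, and draws parameters from the prior. The names `k2abc_weights`, `prior_sampler`, `simulator`, and the Gaussian location model are hypothetical.

```python
import numpy as np


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased MMD^2 estimate with a Gaussian kernel; x is (n, d), y is (m, d)."""
    def gram(a, b):
        sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

    n, m = len(x), len(y)
    k_xx, k_yy, k_xy = gram(x, x), gram(y, y), gram(x, y)
    return (
        (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
        + (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
        - 2.0 * k_xy.mean()
    )


def k2abc_weights(observed, prior_sampler, simulator, n_draws, epsilon,
                  bandwidth=1.0, rng=None):
    """Draw parameters from the prior and weight each draw by exp(-MMD^2 / epsilon)."""
    rng = np.random.default_rng() if rng is None else rng
    thetas = np.array([prior_sampler(rng) for _ in range(n_draws)])
    # First kernel: MMD^2 between each simulated dataset and the observed data.
    mmd2 = np.array(
        [mmd2_unbiased(simulator(theta, rng), observed, bandwidth) for theta in thetas]
    )
    # Second kernel (assumed exponential form); shift by the max for numerical stability.
    log_w = -mmd2 / epsilon
    weights = np.exp(log_w - log_w.max())
    return thetas, weights / weights.sum()


# Toy problem (assumed for illustration): infer the mean of a unit-variance Gaussian.
rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=(100, 1))
thetas, weights = k2abc_weights(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    simulator=lambda theta, r: r.normal(theta, 1.0, size=(100, 1)),
    n_draws=300,
    epsilon=0.05,
    rng=rng,
)
posterior_mean = np.sum(weights * thetas)
print(posterior_mean)
```

The weighted draws approximate the ABC posterior, so posterior expectations are weighted averages over `thetas`. A smaller `epsilon` concentrates the weight on draws whose simulated data are closest to the observations in MMD, at the cost of fewer effective samples.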