# Notes on: Gorham, J., & Mackey, L. (2015): Measuring Sample Quality With Stein's Method

## Overview

- Motivates Stein's method
  - Stein's method
    - $g$, the function selected in the maximization that defines the **Stein discrepancy**, can be intuitively thought of as a *feature selection* function: it finds the features producing the most *discrepancy* between our approximate density and the actual density

- Classical Stein set and discrepancy
  - Methods for constructing a Stein operator for sufficiently smooth functions
  - Proves an upper bound for the classical Stein discrepancy

- Methods for computing Stein discrepancies
  - Graph Stein set
  - Spanner Stein discrepancies using spanner graphs
  - Linear programs used to solve the finite-dimensional subproblems of maximizing the function in each of the components to obtain the Stein discrepancy

- Experiments

## Notation

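- $\| \cdot \|$: a generic norm on $\mathbb{R}^d$, with associated dual norm $\| w \|^* = \sup_{v : \| v \| = 1} \langle w, v \rangle$ for vectors $w \in \mathbb{R}^d$
- $P$: the *target distribution*, with open convex support $\mathcal{X} \subseteq \mathbb{R}^d$ and continuously differentiable density $p$
- $Q = \sum_{i=1}^n q(x_i) \delta_{x_i}$: a *weighted sample* of distinct sample points $x_1, \dots, x_n \in \mathcal{X}$ with weights $q(x_i)$ *encoded in the probability mass function* $q$
- The probability mass function $q$ induces the discrete distribution $Q$ and the approximation

  $$\mathbb{E}_Q[h(X)] = \sum_{i=1}^n q(x_i) \, h(x_i)$$

  for any target expectation $\mathbb{E}_P[h(Z)]$.
- $\mathcal{T}$: a real-valued operator
- $\mathcal{G}$: a set of $\mathbb{R}^d$-valued functions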

## Stuff

Goal: construct a quality measure with the following properties:

- detects when a sequence of samples is converging to the target
- detects when a sequence of samples is not converging to the target
- computationally feasible

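First consider the **maximum deviation between sample and target expectations over a class of real-valued test functions $\mathcal{H}$**, an integral probability metric (IPM):

$$d_{\mathcal{H}}(Q, P) = \sup_{h \in \mathcal{H}} \left| \mathbb{E}_Q[h(X)] - \mathbb{E}_P[h(Z)] \right|$$

- If the class of test functions $\mathcal{H}$ is sufficiently large, then $d_{\mathcal{H}}(Q_m, P) \to 0$ implies that the sequence of sample measures $(Q_m)_{m \ge 1}$ converges weakly to $P$

Varying the class of test functions of the IPM, we recover many well-known probability metrics:

Total variation distance, generated by:

$$\mathcal{H} = \left\{ h : \mathcal{X} \to \mathbb{R} \;\middle|\; \sup_{x \in \mathcal{X}} |h(x)| \le 1 \right\}$$

Wasserstein metric, generated by:

$$\mathcal{H} = \mathcal{W}_{\| \cdot \|} = \left\{ h : \mathcal{X} \to \mathbb{R} \;\middle|\; \sup_{x \ne y \in \mathcal{X}} \frac{|h(x) - h(y)|}{\| x - y \|} \le 1 \right\}$$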
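To make the IPM definition concrete, here is a minimal Python sketch that takes the maximum deviation over a small, finite class of test functions. The target $P = N(0, 1)$, the three test functions, and the helper names `weighted_expectation` and `finite_class_ipm` are assumptions chosen for illustration and do not come from the paper. A finite class only lower-bounds the IPM of any larger class containing it.

```python
import numpy as np

def weighted_expectation(points, weights, h):
    """E_Q[h(X)] = sum_i q(x_i) h(x_i) for a weighted sample Q."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * h(points)))

def finite_class_ipm(points, weights, test_functions):
    """Maximum deviation over a finite list of (h, E_P[h(Z)]) pairs."""
    return max(
        abs(weighted_expectation(points, weights, h) - target)
        for h, target in test_functions
    )

# Hypothetical test class for P = N(0, 1); the target expectations are
# known in closed form, which is exactly what is unavailable in general.
TEST_FUNCTIONS = [
    (lambda x: x, 0.0),       # E[Z] = 0
    (lambda x: x ** 2, 1.0),  # E[Z^2] = 1
    (np.cos, np.exp(-0.5)),   # E[cos Z] = exp(-1/2)
]

rng = np.random.default_rng(0)
n = 5000
uniform_weights = np.full(n, 1.0 / n)
on_target = rng.normal(0.0, 1.0, size=n)   # sample drawn from P
off_target = rng.normal(0.5, 1.0, size=n)  # sample with a shifted mean

d_on = finite_class_ipm(on_target, uniform_weights, TEST_FUNCTIONS)
d_off = finite_class_ipm(off_target, uniform_weights, TEST_FUNCTIONS)
print(f"on-target deviation:  {d_on:.3f}")
print(f"off-target deviation: {d_off:.3f}")
```

The sketch needs $\mathbb{E}_P[h(Z)]$ for every test function, which requires integrating under $P$. That requirement is the practical obstacle to using an IPM directly as a quality measure.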

## Stein's method

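Identify a real-valued operator $\mathcal{T}$ acting on a set $\mathcal{G}$ of $\mathbb{R}^d$-valued functions of $\mathcal{X}$ for which

$$\mathbb{E}_P[(\mathcal{T} g)(Z)] = 0 \quad \text{for all } g \in \mathcal{G}$$

Together, $\mathcal{T}$ and $\mathcal{G}$ define the **Stein discrepancy**,

$$S(Q, \mathcal{T}, \mathcal{G}) = \sup_{g \in \mathcal{G}} \left| \mathbb{E}_Q[(\mathcal{T} g)(X)] \right| = \sup_{g \in \mathcal{G}} \left| \mathbb{E}_Q[(\mathcal{T} g)(X)] - \mathbb{E}_P[(\mathcal{T} g)(Z)] \right| = d_{\mathcal{T}\mathcal{G}}(Q, P)$$

an IPM quality measure with no explicit integration under $P$.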

- Lower bound the Stein discrepancy by a familiar convergence-determining IPM $d_{\mathcal{H}}$
  - Can be performed once, in advance, for large classes of target distributions, and ensures that, for any sequence of probability measures $(\mu_m)_{m \ge 1}$, $S(\mu_m, \mathcal{T}, \mathcal{G})$ converges to zero only if $d_{\mathcal{H}}(\mu_m, P)$ does

- Upper bound the Stein discrepancy by any means necessary to demonstrate convergence to zero under suitable conditions.

### Challenges

- Constructing a **Stein operator** $\mathcal{T}$ which produces mean-zero functions under $P$

### Identifying a Stein operator

If we let:

- $\partial \mathcal{X}$ denote the boundary of $\mathcal{X}$ (an empty set when $\mathcal{X} = \mathbb{R}^d$)
- $n(x)$ represent the outward unit normal vector to the boundary at $x$

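then we may define the **classical Stein set**

$$\mathcal{G}_{\| \cdot \|} = \left\{ g : \mathcal{X} \to \mathbb{R}^d \;\middle|\; \sup_{x \ne y \in \mathcal{X}} \max\left( \| g(x) \|^*, \; \| \nabla g(x) \|^*, \; \frac{\| \nabla g(x) - \nabla g(y) \|^*}{\| x - y \|} \right) \le 1 \ \text{and} \ \langle g(x), n(x) \rangle = 0 \ \ \forall x \in \partial \mathcal{X} \text{ with } n(x) \text{ defined} \right\}$$

of sufficiently smooth functions satisfying a Neumann-type boundary condition (the inner product $\langle g(x), n(x) \rangle$ of the function and the outward unit normal vector vanishes on the boundary).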

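From this they get the following proposition for the Langevin Stein operator $(\mathcal{T}_P g)(x) = \langle g(x), \nabla \log p(x) \rangle + \langle \nabla, g(x) \rangle$:

If $\mathbb{E}_P[\| \nabla \log p(Z) \|] < \infty$, then $\mathbb{E}_P[(\mathcal{T}_P g)(Z)] = 0$ for all $g \in \mathcal{G}_{\| \cdot \|}$.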
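
A one-dimensional sketch of why the mean-zero property holds, assuming $\mathcal{X} = \mathbb{R}$ (so there is no boundary term) and writing the Langevin Stein operator as $(\mathcal{T}_P g)(x) = g(x) \, \frac{d}{dx} \log p(x) + g'(x)$. This is an illustration under those assumptions, not the paper's proof:

$$
\begin{aligned}
\mathbb{E}_P[(\mathcal{T}_P g)(Z)]
  &= \int_{-\infty}^{\infty} \left( g(x) \, \frac{p'(x)}{p(x)} + g'(x) \right) p(x) \, dx \\
  &= \int_{-\infty}^{\infty} \big( g(x) \, p(x) \big)' \, dx \\
  &= \lim_{b \to \infty} g(b) \, p(b) - \lim_{a \to -\infty} g(a) \, p(a) = 0
\end{aligned}
$$

The last step uses two facts. First, $g$ is bounded, since $|g(x)| \le 1$ in the classical Stein set. Second, $p$ vanishes at infinity: the integrability condition reads $\int |p'(x)| \, dx < \infty$ in one dimension, so $p$ has limits at $\pm \infty$, and these must be zero because $p$ integrates to one.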

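Together, $\mathcal{T}_P$ and $\mathcal{G}_{\| \cdot \|}$ form the **classical Stein discrepancy** $S(Q, \mathcal{T}_P, \mathcal{G}_{\| \cdot \|})$, which is the *main object of study of the paper*.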
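
A minimal Python sketch of the idea in one dimension, assuming the target $P = N(0, 1)$ (so $\frac{d}{dx} \log p(x) = -x$) and the Langevin form of the operator, $(\mathcal{T}_P g)(x) = g(x) \, \frac{d}{dx} \log p(x) + g'(x)$. The single test function $g = \sin$ and the helper name `langevin_stein_1d` are hypothetical choices for illustration. The sketch does not compute the supremum over the Stein set, so $|\mathbb{E}_Q[(\mathcal{T}_P g)(X)]|$ is only a lower bound on the classical Stein discrepancy.

```python
import numpy as np

def langevin_stein_1d(g, dg, score):
    """Return x -> g(x) * score(x) + g'(x), a 1-D Langevin Stein operator."""
    return lambda x: g(x) * score(x) + dg(x)

# Assumed target P = N(0, 1): score(x) = d/dx log p(x) = -x.
score = lambda x: -x

# Hypothetical test function g = sin: |g|, |g'| <= 1 and g' is 1-Lipschitz,
# so g satisfies the smoothness bounds of the classical Stein set on R.
Tg = langevin_stein_1d(np.sin, np.cos, score)

rng = np.random.default_rng(0)
n = 20000
on_target = rng.normal(0.0, 1.0, size=n)   # sample drawn from P
off_target = rng.normal(1.0, 1.0, size=n)  # sample with a shifted mean

# Sample averages of T_P g; no integration under P is needed.
mean_on = float(np.mean(Tg(on_target)))
mean_off = float(np.mean(Tg(off_target)))
print(f"on-target  E_Q[T g]: {mean_on:+.3f}")
print(f"off-target E_Q[T g]: {mean_off:+.3f}")
```

Only the score $\frac{d}{dx} \log p$ enters the computation, so the normalizing constant of $p$ is never needed. For this particular $g$, the exact expectation under $N(1, 1)$ is $-\sin(1) \, e^{-1/2} \approx -0.51$, while under $P$ it is zero.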