If we can sample i.i.d. from $\Gamma$ and $f \in L_2(\Gamma)$, then by Chebyshev's inequality \[ P(|\Gamma_n f - \Gamma f | > \epsilon) \le \frac{E(\Gamma_n f - \Gamma f)^2}{\epsilon^2} = \frac{Var(f(X_0))}{\epsilon^2 n}. \] With this bound, in order to get \[ |\Gamma_n f - \Gamma f | \le .1 \] with probability at least $.5$, we need at least $200 \cdot Var(f(X_0))$ samples. We can do this for a normal distribution easily using PyTorch.
import torch

# Two independent Monte Carlo estimates of the mean of N(0, 1) with n = 200:
# here f is the identity, so Var(f(X_0)) = 1 and 200 * Var(f(X_0)) = 200.
torch.zeros(200).normal_(0, 1).mean()
torch.zeros(200).normal_(0, 1).mean()
tensor(0.0156)
tensor(-0.0987)

Thus, the Monte Carlo error is controlled by $.1$ at least half of the time. If we have correlated samples, the variance is \[ Var(\Gamma_n f) = \frac{1}{n^2} \sum_{i = 1}^n \sum_{j = 1}^n Cov(f(X_i), f(X_j)) \] and this bound may tell us nothing useful. Chebyshev and convexity give us concentration for $f \in L_2(\Gamma)$ by \[ P( |\Gamma_n f - \Gamma f| \ge \epsilon) \le \epsilon^{-2} E |\Gamma_n f - \Gamma f|^2 \le \epsilon^{-2} \frac{1}{n} \sum_{k = 1}^n E |f(X_k) - \Gamma f|^2, \] but this is still difficult to deal with.
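To see concretely how correlation inflates this variance, here is a minimal sketch (an illustration, not part of the notes above): it compares the variance of $\Gamma_n f$ over many repetitions for i.i.d. standard normal samples against a stationary AR(1) chain with the same $N(0, 1)$ marginal. The correlation parameter $\rho = 0.9$ is an assumption chosen for effect.

import torch

torch.manual_seed(0)
n, n_trials, rho = 200, 2000, 0.9  # rho is assumed, chosen for illustration

# i.i.d. N(0, 1) samples: Var(Gamma_n f) = 1 / n.
iid = torch.randn(n_trials, n)
print(iid.mean(dim=1).var())  # close to 1/200 = 0.005

# Stationary AR(1) chain X_{k+1} = rho X_k + sqrt(1 - rho^2) Z_k: each X_k is
# N(0, 1), but Cov(f(X_i), f(X_j)) = rho^{|i-j|} for f the identity, so the
# double sum in Var(Gamma_n f) is much larger than in the i.i.d. case.
x = torch.randn(n_trials)  # start in stationarity
means = torch.zeros(n_trials)
for k in range(n):
    means += x / n
    x = rho * x + (1 - rho**2) ** 0.5 * torch.randn(n_trials)
print(means.var())  # noticeably larger than 0.005

With $\rho = 0.9$, summing the covariance terms inflates the variance of the empirical mean by roughly a factor of $(1 + \rho)/(1 - \rho) \approx 19$ over the i.i.d. case.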
If we are using Langevin MC starting at $x$ and sampling $X_k$ from $\delta_x P^k$, then $E_x f(X_k) \not= \Gamma f$ in general. Instead, we try to control the bias. The total variation distance \[ \norm{\mu - \nu}_{TV} = \sup_{B \in \mathcal{B}(\mathbb{R}^d)} |\mu(B) - \nu(B)| \] controls the bias for bounded measurable functions. In particular, it applies to the indicator functions used to approximate probabilities: $\Gamma(B) \approx \frac{1}{n} \sum_{k = 1}^n I(X_k \in B)$. Let $f : \mathbb{R}^d \to \mathbb{R}$ with $|f| \le R$. Then \[ |E_x\Gamma_n f - \Gamma f | \le \frac{1}{n} \sum_{k = 1}^n |E_x f(X_k) - \Gamma f| \le \frac{R}{n} \sum_{k = 1}^n \int_{\mathbb{R}^d} | p^k(x, y) - \gamma(y)| dy = \frac{2R}{n} \sum_{k = 1}^n \norm{\delta_x P^k - \Gamma}_{TV}. \] The Wasserstein $p$ distance is \[ W_p(\mu, \nu) = \left( \inf_{C \in \mathcal{C}} \int_{\mathbb{R}^{2d}} \norm{u - v}^p \, dC(u, v) \right)^{\frac{1}{p}}, \] where $\mathcal{C}$ is the set of couplings, that is, joint distributions on $\mathbb{R}^{2d}$ whose marginals are $\mu$ and $\nu$. These distances control the bias for Lipschitz functions. Let $f : \mathbb{R}^d \to \mathbb{R}$ be $M$-Lipschitz, and let $C_k$ be an optimal coupling of $\delta_x P^k$ and $\Gamma$. Then \[ | E_x \Gamma_n f - \Gamma f | \le \frac{1}{n} \sum_{k = 1}^n | E_x f(X_k) - \Gamma f | \le \frac{M}{n} \sum_{k = 1}^n \int_{\mathbb{R}^{2d}} \norm{u - v} \, dC_k(u, v) = \frac{M}{n} \sum_{k = 1}^n W_1(\delta_x P^k, \Gamma). \] There are, of course, many other ways to control the bias.
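As a concrete sketch (again an illustration, not from the notes: the target, step size $h$, starting point $x$, and chain length are all assumed), the unadjusted Langevin algorithm for a standard normal target uses $\nabla U(y) = y$, so the update is $X_{k+1} = X_k - h X_k + \sqrt{2h}\, Z_k$. Started at $x = 10$, it makes both effects visible: the bias of $E_x \Gamma_n f$ decays as the chain forgets its start, and in one dimension $W_1$ can even be estimated empirically from sorted samples.

import torch

torch.manual_seed(0)
h, n, x0, chains = 0.1, 1000, 10.0, 5000  # all assumed for illustration

# Unadjusted Langevin for gamma = N(0, 1): run many chains from delta_{x0}.
x = torch.full((chains,), x0)
running_sum = torch.zeros(chains)
for k in range(n):
    x = x - h * x + (2 * h) ** 0.5 * torch.randn(chains)
    running_sum += x

# Bias of Gamma_n f for f(y) = y: the target mean Gamma f is 0, but the
# transient from x0 = 10 leaves a bias of order 1/n.
print((running_sum / n).mean())  # roughly 0.09, not 0

# In one dimension the optimal coupling is the quantile coupling, so sorting
# gives an empirical estimate of W_1(delta_x P^n, Gamma).
exact = torch.randn(chains)
print((x.sort().values - exact.sort().values).abs().mean())

The averaged bias is about $0.09$ here because $E_x X_k = (1 - h)^k x$ decays geometrically, so its average over $n$ steps is of order $1/n$, matching the $\frac{1}{n} \sum_k W_1(\delta_x P^k, \Gamma)$ bound above.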