In this MH implementation, the only place where $p(x)$ comes into play is in the acceptance probability.
As long as we make sure to start the sampling at a point within the support of the distribution, `p(x)` will be nonzero.
If the proposal step generates an `x_proposal` that is outside the support, `p(x_proposal)` will be zero, and the acceptance probability (`p(x_proposal)/p(x)`) will be zero.
So such a step will never be accepted, and the sampler will continue to stay within the support of the distribution.
This may give us a higher rejection rate than usual, and thus less efficient sampling, but at least it does not cause the algorithm to become unstable or crash.
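To make this concrete, here is a minimal sketch of a single MH step, assuming a symmetric Gaussian proposal and a density function `p` that returns zero outside its support (the names here are illustrative, not from any particular library):

```{julia}
# One Metropolis-Hastings step with a symmetric Gaussian proposal.
# `p` is the (unnormalised) target density, zero outside its support.
function mh_step(p, x; proposal_width = 1.0)
    x_proposal = x + proposal_width * randn()
    # If x_proposal lies outside the support, p(x_proposal) == 0, so the
    # acceptance probability is zero and the proposal is always rejected.
    acceptance_prob = min(1.0, p(x_proposal) / p(x))
    return rand() < acceptance_prob ? x_proposal : x
end

# An illustrative target supported only on x > 0: the Exponential(1) density.
p(x) = x > 0 ? exp(-x) : 0.0

# Starting inside the support, the chain can never leave it.
x = foldl((xᵢ, _) -> mh_step(p, xᵢ), 1:1_000; init = 1.0)
```

Because `p(x)` stays strictly positive along the chain, the ratio `p(x_proposal)/p(x)` is always well-defined; the worst that happens is a rejected step.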
### Hamiltonian Monte Carlo: not so fine
The _real_ problem comes with gradient-based methods like Hamiltonian Monte Carlo (HMC).
Here's an equally barebones implementation of HMC.
```{julia}
# ... (setup and sampler loop elided; one leapfrog integration step shown)
r -= (timestep / 2) * dEdz(z)   # (1)
z += timestep * r               # (2)
r -= (timestep / 2) * dEdz(z)   # (3)
```
Here, `z` is the position and `r` the momentum.
Since we start our sampler inside the support of the distribution (by supplying a good initial point), `dEdz(z)` will start off being well-defined on line (1).
However, after `r` is updated on line (1), `z` is updated again on line (2), and _this_ value of `z` may well be outside of the support.
At this point, `dEdz(z)` will be `NaN`, and the final update to `r` on line (3) will also cause it to be `NaN`.
Even if we're lucky enough for an individual integration step to not move `z` outside the support, there are many integration steps per sampler step, and many sampler steps, and so the chances of this happening at some point are quite high.
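To see this failure mode concretely, take a target supported on $z > 0$, say the Exponential(1) distribution, whose energy is $E(z) = -\log p(z) = z$ up to a constant. The timestep and starting values below are illustrative, not from the original code:

```{julia}
# Gradient of the energy for Exponential(1): undefined outside the support.
dEdz(z) = z > 0 ? 1.0 : NaN

timestep = 0.5
z, r = 0.1, -1.0               # position near the boundary, momentum pointing outwards

r -= (timestep / 2) * dEdz(z)  # (1) fine: z = 0.1 is inside the support
z += timestep * r              # (2) z = 0.1 + 0.5 * (-1.25) = -0.525, outside!
z_outside = z < 0              # the position has left the support
r -= (timestep / 2) * dEdz(z)  # (3) dEdz(-0.525) is NaN, so r becomes NaN
```

Once `r` is `NaN`, every subsequent position and momentum update is `NaN` too, and the sampler is irrecoverably broken.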
It's possible to choose our integration parameters carefully to reduce the risk of this happening.
For example, we could set the integration timestep to be _really_ small, thus reducing the chance of making a move outside the support.
But that will just lead to a very slow exploration of parameter space, and in general, we would like to avoid this problem altogether.
595
596
596
597
### Rescuing HMC
Perhaps unsurprisingly, the answer to this is to transform the underlying distribution to an unconstrained one and sample from that instead.
However, we have to make sure that we include the pesky Jacobian term when sampling from the transformed distribution.
That's where Bijectors.jl can come in.
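As a sketch of what this looks like for a log-normal target, assuming Bijectors.jl's `bijector` and `transformed` interface:

```{julia}
using Distributions, Bijectors

p = LogNormal()          # constrained target: support is x > 0
b = bijector(p)          # the map x ↦ log(x), sending (0, ∞) to all of ℝ
q = transformed(p, b)    # distribution of y = b(x), supported on all of ℝ

# logpdf(q, y) is finite for every real y and already includes the
# log-Jacobian term, so it is safe to hand to a gradient-based sampler.
logpdf(q, -3.0)
```

For a log-normal target, $y = \log x$ is standard normal, which gives an easy way to check that the Jacobian bookkeeping is correct.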
The main change we need to make is to pass a modified version of the function `p` to our HMC sampler.
Recall back at the very start, we transformed $p(x)$ into $q(y)$, and said that
We can also check that the mean and variance of the samples are what we expect them to be.
From [Wikipedia](https://en.wikipedia.org/wiki/Log-normal_distribution), the mean and variance of a log-normal distribution are respectively $\exp(\mu + \sigma^2/2)$ and $[\exp(\sigma^2) - 1]\exp(2\mu + \sigma^2)$.
For our log-normal distribution, we set $\mu = 0$ and $\sigma = 1$, so the mean and variance should be $1.6487$ and $4.6707$ respectively.
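Plugging the numbers in, as a quick arithmetic check (nothing here is library-specific):

```{julia}
μ, σ = 0.0, 1.0
theoretical_mean = exp(μ + σ^2 / 2)                  # exp(0.5) ≈ 1.6487
theoretical_var  = (exp(σ^2) - 1) * exp(2μ + σ^2)    # (e - 1) * e ≈ 4.6707
```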
```{julia}
println(" mean    : $(mean(samples_with_hmc_untransformed))")
println(" variance: $(var(samples_with_hmc_untransformed))")
```