* ot background page
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* links
* formatting
* link edits
* unbalanced formula fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update ot background
* update ot background
* fix link

---------

Co-authored-by: Arina Danilina <danilina@cip.ifi.lmu.de>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Dominik Klein <dominik.klein@helmholtz-munich.de>
1 parent 2acb484, commit a205dcf
Showing 6 changed files with 117 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
# Optimal Transport (OT) in a nutshell | ||
|
||
Optimal transport ({term}`OT`) is a mathematical framework that finds the most efficient way to transform one probability distribution into another by minimizing a cost function that depends on the distance between points. In the context of single-cell genomics, the distributions are sets of cells, which we want to map onto each other. Realigning sets of cells is a common task in single-cell genomics because sequencing is destructive: the same cell cannot be measured twice. | ||
The solution of an {term}`OT` problem is given by a {term}`transport matrix` $\mathbf{P} \in \mathbb{R}_{+}^{n \times m}$, where $\mathbf{P}_{i,j}$ describes the amount of mass transported from cell $x_i$ in the source distribution to cell $y_j$ in the target distribution. | ||
|
||
In practice, we solve a regularized formulation of {term}`OT` ({term}`entropic regularization`) for computational and statistical reasons. The first kind of {term}`OT` problem we consider is the {term}`linear problem`, which covers the case where both cell distributions live in the same space: | ||
|
||
```{math} | ||
\begin{align*} | ||
\mathbf{L_C^{\varepsilon}(a,b) \overset{\mathrm{def.}}{=} \min_{P\in U(a,b)} \left\langle P, C \right\rangle - \varepsilon H(P).} | ||
\end{align*} | ||
``` | ||
|
||
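Here, $\varepsilon$ is the strength of the {term}`entropic regularization`, $C$ is the cost matrix between source and target cells, $U(a,b)$ is the set of non-negative matrices whose rows sum to $a$ and whose columns sum to $b$, and $\mathbf{H(P) \overset{\mathrm{def.}}{=} - \sum_\mathnormal{i,j} P_\mathnormal{i,j} \left( \log (P_\mathnormal{i,j}) - 1 \right)}$ is the discrete entropy of a coupling matrix. | ||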
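
To make the regularized linear problem concrete, the following dependency-free sketch runs Sinkhorn iterations, a standard way to solve entropic OT, on a toy example. The function name `sinkhorn`, the toy points, and the values of `epsilon` and `n_iters` are assumptions chosen for illustration and are not taken from this page; a practical solver would be vectorized and numerically stabilized.

```python
import math


def sinkhorn(a, b, C, epsilon, n_iters=1000):
    """Entropic OT via plain Sinkhorn iterations (illustrative sketch).

    Returns the coupling P as nested lists, with P = diag(u) K diag(v).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / epsilon)
    K = [[math.exp(-C[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so that P 1_m = a, then columns so that P^T 1_n = b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example (hypothetical data): 3 source and 2 target cells on a line,
# squared-distance cost, uniform marginals.
x = [0.0, 0.5, 1.0]
y = [0.1, 0.9]
C = [[(xi - yj) ** 2 for yj in y] for xi in x]
a = [1 / 3] * 3
b = [1 / 2] * 2

P = sinkhorn(a, b, C, epsilon=0.2)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(3)) for j in range(2)]
```

After convergence, `row_sums` and `col_sums` match `a` and `b`, i.e. `P` lies in $U(a,b)$. A larger `epsilon` spreads mass more evenly, while a smaller one concentrates it on low-cost pairs.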
|
||
:::{figure} figures/Kantorovich_couplings_sol.jpeg | ||
:align: center | ||
:alt: Kantorovich couplings. | ||
:class: img-fluid | ||
|
||
Continuous and discrete couplings between measures $\alpha, \beta$. Figure from {cite}`peyre:19`. | ||
::: | ||
|
||
## Gromov-Wasserstein (GW) | ||
|
||
When the two cell distributions lie in different spaces, we are concerned with the {term}`quadratic problem`. | ||
Here, we assume that two matrices $\mathbf{D \in \mathbb{R}^\mathnormal{n \times n}}$ and $\mathbf{D' \in \mathbb{R}^\mathnormal{m \times m}}$ | ||
quantify similarity relationships between cells within the respective distribution. | ||
|
||
:::{figure} figures/GWapproach.jpeg | ||
:align: center | ||
:alt: Gromov-Wasserstein approach. | ||
:class: img-fluid | ||
|
||
Gromov-Wasserstein approach to comparing two metric measure spaces. Figure from {cite}`peyre:19`. | ||
::: | ||
|
||
The {term}`Gromov-Wasserstein` problem reads | ||
|
||
```{math} | ||
\begin{align*} | ||
\mathrm{GW}\mathbf{((a,D), (b,D'))^\mathrm{2} \overset{\mathrm{def.}}{=} \min_{P \in U(a,b)} \mathcal{E}_{D,D'}(P)} | ||
\\ | ||
\textrm{where} \quad \mathbf{\mathcal{E}_{D,D'}(P) \overset{\mathrm{def.}}{=} \sum_{\mathnormal{i,j,i',j'}} \left| D_{\mathnormal{i,i'}} - D'_{\mathnormal{j,j'}} \right|^\mathrm{2} P_{\mathnormal{i,j}}P_{\mathnormal{i',j'}}}. | ||
\end{align*} | ||
``` | ||
|
||
In practice, we solve a formulation incorporating {term}`entropic regularization`. | ||
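
As a small illustration of the energy $\mathcal{E}_{D,D'}(P)$, the sketch below evaluates it by brute force, pairing each source pair $(i,i')$ with each target pair $(j,j')$ through the coupling. The function name `gw_energy` and the toy points are assumptions made for this example; the four nested sums are only practical for tiny inputs.

```python
def gw_energy(D, D_prime, P):
    """Brute-force sum of |D[i][k] - D'[j][l]|^2 * P[i][j] * P[k][l]."""
    n, m = len(D), len(D_prime)
    return sum(
        (D[i][k] - D_prime[j][l]) ** 2 * P[i][j] * P[k][l]
        for i in range(n) for j in range(m)
        for k in range(n) for l in range(m)
    )


# Hypothetical data: the target is the source shifted by 10 and relabeled,
# so coordinates differ but all pairwise distances are preserved.
x = [0.0, 1.0, 3.0]
perm = [2, 0, 1]                      # target cell j is source cell perm[j]
y = [x[p] + 10.0 for p in perm]

D = [[abs(p - q) for q in x] for p in x]
D_prime = [[abs(p - q) for q in y] for p in y]

# Coupling that follows the relabeling, and the independent coupling a b^T.
P_match = [[1 / 3 if perm[j] == i else 0.0 for j in range(3)] for i in range(3)]
P_indep = [[1 / 9] * 3 for _ in range(3)]

energy_match = gw_energy(D, D_prime, P_match)
energy_indep = gw_energy(D, D_prime, P_indep)
```

The matching coupling preserves every pairwise distance and therefore has zero energy, whereas the independent coupling does not. Only the distance matrices enter the computation, never the coordinates themselves.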
|
||
## Fused Gromov-Wasserstein (FGW) | ||
|
||
{term}`Fused Gromov-Wasserstein` is needed when a data point, e.g. a cell, has both features | ||
that live in a space shared by the source and target distributions ({term}`linear term`) and features | ||
that live in a space specific to its own distribution ({term}`quadratic term`). | ||
|
||
:::{figure} figures/FGWadapted.jpg | ||
:align: center | ||
:alt: Fused Gromov-Wasserstein distance. | ||
:class: img-fluid | ||
|
||
Fused Gromov-Wasserstein distance incorporates both feature and structure aspects of the source and target measures. | ||
Figure adapted from {cite}`vayer:20`. | ||
::: | ||
|
||
The FGW problem is defined as | ||
|
||
```{math} | ||
\begin{align*} | ||
\mathrm{FGW}\mathbf{(a,b,D,D',C) \overset{\mathrm{def.}}{=} \min_{P \in U(a,b)} E_{D,D',C}(P)} | ||
\\ | ||
\textrm{where} \quad \mathbf{E_{D,D',C}(P) \overset{\mathrm{def.}}{=} \sum_{\mathnormal{i,j,i',j'}} \left( (1-\alpha)C_\mathnormal{i,j} + \alpha \left| D_{\mathnormal{i,i'}} - D'_{\mathnormal{j,j'}} \right|^\mathrm{2} \right) P_{\mathnormal{i,j}}P_{\mathnormal{i',j'}}} | ||
\end{align*} | ||
``` | ||
|
||
Here, $D$ and $D'$ are distances defined on the incomparable part of the source space and target space, respectively. $C$ quantifies the distance in the shared space. $\alpha \in [0,1]$ balances the two terms: $\alpha = 0$ keeps only the {term}`linear term`, while $\alpha = 1$ keeps only the {term}`quadratic term`. | ||
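
The role of $\alpha$ can be checked numerically with a brute-force evaluation of $E_{D,D',C}(P)$. This is a sketch under assumptions: the function name `fgw_energy` and all matrices below are made up for illustration, and the coupling is taken to have total mass one.

```python
def fgw_energy(D, D_prime, C, P, alpha):
    """Brute-force FGW energy; (k, l) play the role of (i', j')."""
    n, m = len(C), len(C[0])
    return sum(
        ((1 - alpha) * C[i][j] + alpha * (D[i][k] - D_prime[j][l]) ** 2)
        * P[i][j] * P[k][l]
        for i in range(n) for j in range(m)
        for k in range(n) for l in range(m)
    )


# Hypothetical toy inputs with two source and two target points.
D = [[0.0, 1.0], [1.0, 0.0]]        # distances within the source
D_prime = [[0.0, 2.0], [2.0, 0.0]]  # distances within the target
C = [[0.0, 1.0], [1.0, 0.2]]        # cost in the shared space
P = [[0.5, 0.0], [0.0, 0.5]]        # coupling with total mass 1

linear_only = fgw_energy(D, D_prime, C, P, alpha=0.0)
quadratic_only = fgw_energy(D, D_prime, C, P, alpha=1.0)
mixed = fgw_energy(D, D_prime, C, P, alpha=0.5)
```

With a coupling of total mass one, `linear_only` equals $\langle C, P \rangle$, `quadratic_only` equals the Gromov-Wasserstein energy, and `mixed` is their average.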
|
||
## Unbalanced OT | ||
|
||
When we would like to automatically discard cells (e.g. due to apoptosis or sequencing biases) or increase the influence of cells (e.g. due to proliferation), | ||
we can add a penalty on the amount of mass variation using the Kullback-Leibler divergence, defined as | ||
|
||
```{math} | ||
\begin{align*} | ||
\mathrm{KL}\mathbf{(P|K) \overset{\mathrm{def.}}{=} \sum_\mathnormal{i,j} P_\mathnormal{i,j} \log \left( \frac{P_\mathnormal{i,j}}{K_\mathnormal{i,j}} \right) - P_\mathnormal{i,j} + K_\mathnormal{i,j}}. | ||
\end{align*} | ||
``` | ||
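
A minimal sketch of this divergence for matrices stored as nested lists is shown next. The function name `generalized_kl` and the convention $0 \log 0 = 0$ are assumptions of this example. Unlike the KL divergence between probability vectors, the extra terms $- P_{i,j} + K_{i,j}$ keep the value non-negative even when the two arguments carry different total mass.

```python
import math


def generalized_kl(P, K):
    """KL(P|K) = sum_ij P_ij * log(P_ij / K_ij) - P_ij + K_ij, with 0 log 0 = 0.

    Assumes all entries of K are strictly positive.
    """
    total = 0.0
    for row_p, row_k in zip(P, K):
        for p, k in zip(row_p, row_k):
            if p > 0.0:
                total += p * math.log(p / k)
            total += k - p
    return total


# Hypothetical inputs: P has total mass 2, K has total mass 2 spread evenly.
P = [[1.0, 0.0], [0.0, 1.0]]
K = [[0.5, 0.5], [0.5, 0.5]]

kl_same = generalized_kl(P, P)
kl_diff = generalized_kl(P, K)
```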
|
||
In the {term}`linear problem`, this results in the minimisation | ||
|
||
```{math} | ||
\begin{align*} | ||
\mathbf{L_C^{\lambda}(a,b) = \min_{\tilde{a},\tilde{b}} L_C(\tilde{a},\tilde{b}) + \lambda_1 KL(\tilde{a}|a) + \lambda_2 KL(\tilde{b}|b)} \\ | ||
\mathbf{= \min_{P\in \mathbb{R}_+^\mathnormal{n\times m}} \left\langle C,P \right\rangle + \lambda_1 KL(P\mathbb{1}_\mathnormal{m}|a) + \lambda_2 KL(P^\top\mathbb{1}_\mathnormal{n}|b)} | ||
\end{align*} | ||
``` | ||
|
||
where $(\lambda_1, \lambda_2)$ control how strongly mass variations are penalized relative to the transportation of mass. Because $\lambda \in [0, \infty]$ is unbounded, we instead use the parameter | ||
|
||
$\tau = \frac{\lambda}{\lambda + \varepsilon} \in [0,1]$ | ||
|
||
with one value per marginal ($\tau_a$ from $\lambda_1$ and $\tau_b$ from $\lambda_2$), such that $\tau_a=\tau_b=1$ corresponds to the balanced setting, while a smaller $\tau$ allows for more deviation from the initial distribution. For the {term}`quadratic problem`, the objective is adapted analogously. | ||
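
The reparametrization can be sketched with two small helpers. The names `tau_from_lambda` and `lambda_from_tau` are hypothetical, and treating $\lambda = \infty$ as $\tau = 1$ is the limiting case handled explicitly here; the inverse map follows from solving $\tau = \lambda / (\lambda + \varepsilon)$ for $\lambda$.

```python
import math


def tau_from_lambda(lam, epsilon):
    """Map lambda in [0, inf] to tau = lambda / (lambda + epsilon) in [0, 1]."""
    if math.isinf(lam):
        return 1.0  # limit lambda -> inf: balanced setting
    return lam / (lam + epsilon)


def lambda_from_tau(tau, epsilon):
    """Inverse map: lambda = tau * epsilon / (1 - tau)."""
    if tau == 1.0:
        return math.inf
    return tau * epsilon / (1.0 - tau)


epsilon = 0.1  # illustrative regularization strength
taus = [tau_from_lambda(lam, epsilon) for lam in (0.0, 0.1, 1.0, math.inf)]
```

Here `taus` increases from 0 (mass variation is not penalized) to 1 (marginals are enforced exactly), which is why a bounded $\tau$ is more convenient to tune than $\lambda$.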
|
||
Now you are set to explore use cases in our {doc}`/notebooks/tutorials/index`. |