We introduce SG-t-SNE-Π, a high-performance software for swift embedding of a large, sparse, stochastic graph into a $d$-dimensional space ($d = 1, 2, 3$) on a shared-memory computer. The algorithm SG-t-SNE and the software SG-t-SNE-Π were first described in Reference [1].
The algorithm is built upon precursors for embedding a $k$-nearest neighbor ($k$NN) graph, which is distance-based and regular with constant degree $k$. In practice, the precursor algorithms are also limited to 2D embedding or suffer from overly long latency in 3D embedding. SG-t-SNE removes the algorithmic restrictions and enables $d$-dimensional embedding of arbitrary stochastic graphs, including, but not restricted to, $k$NN graphs. SG-t-SNE-Π expedites the computation with high-performance functions and materializes 3D embedding in shorter time than 2D embedding with any precursor algorithm on modern laptop/desktop computers.
The original t-SNE [2] has given rise to several variants. Two of the variants, t-SNE-BH [3] and FIt-SNE [4], are distinctive and representative in their approximation approaches to reducing algorithmic complexity. They are, however, limited to $k$NN graph embedding. Specifically, at the user interface, a set of $n$ data points, $\{x_i\}_{i=1}^{n}$, is provided in terms of their feature vectors in an $L$-dimensional vector space equipped with a metric/distance function. The input parameters include $d$ for the embedding dimension, $k$ for the number of near-neighbors, and $u$ for the perplexity. A t-SNE algorithm maps the data points $\{x_i\}$ to data points $\{y_i\}$ in a $d$-dimensional space.
There are two basic algorithmic stages in a conventional t-SNE algorithm. In the preprocessing stage, the $k$NN graph is generated from the feature vectors $\{x_i\}$ according to the metric function $d(\cdot, \cdot)$ and input parameter $k$. Each data point is associated with a graph vertex. Next, the $k$NN graph is cast into a stochastic one, $P_c = [p_{j|i}]$, and symmetrized to $P = (P_c + P_c^{\mathsf T})/(2n)$, with

$$p_{j|i} = \frac{a_{ij}\,\exp\!\big(-d^2(x_i, x_j)/2\sigma_i^2\big)}{\sum_{l} a_{il}\,\exp\!\big(-d^2(x_i, x_l)/2\sigma_i^2\big)}, \qquad (1)$$

where $A = [a_{ij}]$ is the binary-valued adjacency matrix of the $k$NN graph, with zero diagonal elements (i.e., the graph has no self-loops), and $d(x_i, x_j)$ is the distance between $x_i$ and $x_j$. The Gaussian parameters $\sigma_i$ are determined by the $n$ point-wise equations related to the same perplexity value $u$:

$$-\sum_{j} p_{j|i} \log_2 p_{j|i} = \log_2 u, \qquad i = 1, \dots, n. \qquad (2)$$
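To make (2) concrete, the following sketch finds one $\sigma_i$ by bisection so that the entropy of the conditional distribution $p_{\cdot|i}$ equals $\log_2 u$ (equivalently, its perplexity equals $u$). It is a minimal illustration with our own naming (`d2` holds the squared distances from point $i$ to its $k$ neighbors), not the routine used in SG-t-SNE-Π.

```cpp
#include <cmath>
#include <vector>

// Base-2 entropy of the conditional distribution p_{.|i} induced by the
// Gaussian weights exp(-d2[j] / (2 sigma^2)) over the neighbors of i.
static double entropyAtSigma(const std::vector<double>& d2, double sigma) {
  double Z = 0.0;
  std::vector<double> w(d2.size());
  for (std::size_t j = 0; j < d2.size(); ++j) {
    w[j] = std::exp(-d2[j] / (2.0 * sigma * sigma));
    Z += w[j];
  }
  double H = 0.0;
  for (std::size_t j = 0; j < d2.size(); ++j) {
    double p = w[j] / Z;
    if (p > 0.0) H -= p * std::log2(p);
  }
  return H;
}

// Bisection on sigma so that the entropy equals log2(u); cf. equation (2).
double sigmaForPerplexity(const std::vector<double>& d2, double u) {
  double lo = 1e-10, hi = 1e10, target = std::log2(u);
  for (int it = 0; it < 100; ++it) {
    double mid = 0.5 * (lo + hi);
    if (entropyAtSigma(d2, mid) > target) hi = mid;  // too flat: shrink sigma
    else                                  lo = mid;  // too peaked: grow sigma
  }
  return 0.5 * (lo + hi);
}
```

Bisection is safe here because the entropy increases monotonically with $\sigma_i$: a larger bandwidth flattens the conditional distribution.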
The next stage is to determine and locate the embedding coordinates $\{y_i\}$ by minimizing the Kullback-Leibler divergence

$$\mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}, \qquad (3)$$

where matrix $Q = [q_{ij}]$ is made of the ensemble $\{q_{ij}\}$ regulated by the Student t-distribution,

$$q_{ij} = \frac{\big(1 + \|y_i - y_j\|^2\big)^{-1}}{\sum_{k \neq l} \big(1 + \|y_k - y_l\|^2\big)^{-1}}. \qquad (4)$$
In other words, the objective of (3) is to find the optimal stochastic matching between $P$ and $Q$ defined, respectively, over the feature vector set $\{x_i\}$ and the embedding coordinate set $\{y_i\}$. The optimal matching is obtained numerically by applying the gradient descent method. A main difference among the precursor algorithms lies in how the gradient of the objective function is computed.
The computation per iteration step is dominated by the calculation of the gradient. Van der Maaten reformulated the gradient into two terms [3]:

$$\frac{\partial\,\mathrm{KL}}{\partial y_i} = 4 \sum_{j} p_{ij}\, q_{ij} Z\, (y_i - y_j) \;-\; 4 \sum_{j} q_{ij}^2 Z\, (y_i - y_j), \qquad Z = \sum_{k \neq l} \big(1 + \|y_k - y_l\|^2\big)^{-1}. \qquad (5)$$
The attractive interaction term can be cast as the sum of $d$ matrix-vector products with the sparse matrix $P \circ Q$, the Hadamard (element-wise) product of $P$ and $Q$, up to the scalar $Z$. The vectors are composed of the embedding coordinates, one in each dimension. The repulsive interaction term can be cast as the sum of $d + 2$ matrix-vector products with the dense matrix $Q$. For clarity, we simply refer to the two terms as the $P \circ Q$ term and the $Q^2$ term, respectively.
The $Q^2$ (repulsion) term is in fact a broad-support, dense convolution with the Student t-distribution kernel on non-equispaced, scattered data points. As the matrix $Q$ is dense, a naive method for calculating the $Q^2$ term takes $O(n^2)$ arithmetic operations. The quadratic complexity limits the practical use of t-SNE to small graphs. Two types of existing approaches reduce the quadratic complexity to $O(n \log n)$ or $O(n)$; they are typified by t-SNE-BH and FIt-SNE. The algorithm t-SNE-BH, introduced by van der Maaten [3], is based on the Barnes-Hut algorithm. The broad-support convolution is factored into convolutions of narrow support, at multiple spatial levels; each narrowly supported convolution takes $O(n \log n)$ operations. FIt-SNE, presented by Linderman et al. [4], may be viewed as based on non-uniform fast Fourier transforms. The execution time of each approximate algorithm becomes dominated by the $P \circ Q$ (attraction) term computation. The execution time also faces a steep rise from 2D to 3D embedding.
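For reference, the direct $O(n^2)$ evaluation of the repulsive term of (5) that these methods approximate can be sketched as follows (2D case; all names are ours):

```cpp
#include <vector>

// Direct O(n^2) evaluation of the repulsive term of (5) for d = 2.
// Y is n-by-2 (row-major); Frep receives sum_j q_ij^2 Z (y_i - y_j).
void repulsionNaive(const std::vector<double>& Y, int n,
                    std::vector<double>& Frep) {
  Frep.assign(2 * n, 0.0);
  double Z = 0.0;
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j) {
      if (i == j) continue;
      double dx = Y[2*i] - Y[2*j], dy = Y[2*i+1] - Y[2*j+1];
      double k = 1.0 / (1.0 + dx*dx + dy*dy);  // Student t kernel = q_ij * Z
      Z += k;                                   // accumulates sum over k != l
      Frep[2*i]     += k * k * dx;              // k^2 = (q_ij Z)^2
      Frep[2*i + 1] += k * k * dy;
    }
  }
  // One final division by Z turns the accumulated k^2 into q_ij^2 Z.
  for (int i = 0; i < 2 * n; ++i) Frep[i] /= Z;
}
```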
With the algorithm SG-t-SNE we extend the use of t-SNE to any sparse stochastic graph. The key input is the stochastic matrix $P = [p_{j|i}]$, $\sum_j p_{j|i} = 1$, associated with the graph, where $P$ is not restricted to the form of (1).
We introduce a parametrized, non-linear rescaling mechanism to explore the graph sparsity. We determine rescaling parameters $\gamma_i$ by

$$\sum_{j} f\!\big(\gamma_i\, p_{j|i}\big) = \lambda, \qquad i = 1, \dots, n, \qquad (6)$$

where $\lambda > 0$ is an input parameter and $f$ is a monotonically increasing function. We fix a particular choice of $f$ in the present version of SG-t-SNE-Π. Unlike (2), the rescaling mechanism (6) imposes no constraint on the graph; its solution exists unconditionally. For the conventional t-SNE as a special case, we set $\lambda = 1$ by default. One may still make use of $k$NN graphs and exploit the benefit of rescaling ($\lambda \neq 1$).
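Because the left-hand side of (6) increases monotonically with $\gamma_i$, each parameter can be found independently by bisection. The sketch below is illustrative only: it assumes $f(x) = \min\{x, 1\}$, one admissible monotone choice, which is not necessarily the $f$ shipped in SG-t-SNE-Π.

```cpp
#include <algorithm>
#include <vector>

// Solve (6) for one gamma_i by bisection, assuming f(x) = min{x, 1}.
// p holds the nonzero p_{j|i} of column i; requires lambda <= p.size()
// so that a solution exists for this particular f.
double rescaleColumn(const std::vector<double>& p, double lambda) {
  auto lhs = [&](double gamma) {              // left-hand side of (6)
    double s = 0.0;
    for (double pj : p) s += std::min(gamma * pj, 1.0);
    return s;
  };
  double lo = 0.0, hi = 1.0;
  while (lhs(hi) < lambda) hi *= 2.0;         // bracket the root
  for (int it = 0; it < 100; ++it) {          // bisection
    double mid = 0.5 * (lo + hi);
    if (lhs(mid) < lambda) lo = mid; else hi = mid;
  }
  return 0.5 * (lo + hi);
}
```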
With the implementation SG-t-SNE-Π, we accelerate the entire gradient calculation of SG-t-SNE and enable practical 3D embedding of large sparse graphs on modern desktop and laptop computers. We accelerate the computation of both the $P \circ Q$ and $Q^2$ terms by utilizing the matrix structures and the memory architecture in tandem.
The matrix $P \circ Q$ in the attractive interaction term of (5) has the same sparsity pattern as matrix $P$, regardless of iterative changes in $Q$. Sparsity patterns are generally irregular. Matrix-vector products with an irregular sparse matrix invoke irregular memory accesses and incur non-equal, prolonged access latencies on hierarchical memories. We moderate memory accesses by permuting the rows and columns of matrix $P$ such that rows and columns with similar nonzero patterns are placed closer together. The permuted matrix becomes block-sparse with denser blocks, resulting in better data locality in memory reads and writes.
The permuted matrix is stored in the Compressed Sparse Blocks (CSB) storage format [5]. We utilize the CSB routines for accessing the matrix and calculating the matrix-vector products with the sparse matrix $P \circ Q$. The elements of the matrix $Q$ are formed on the fly during the calculation of the attractive interaction term.
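The following sketch conveys how the needed elements of $Q$ can be formed on the fly while traversing the nonzeros of $P$; for readability it uses a plain CSR layout (2D case, our own naming) rather than the CSB format actually employed.

```cpp
#include <vector>

// Attraction term of (5) in 2D with the q_ij Z factors formed on the fly.
// P is given in CSR form (rowPtr, colInd, val = p_ij); Y is n-by-2,
// row-major; Fattr receives sum_j p_ij q_ij Z (y_i - y_j).
void attractionOnTheFly(const std::vector<int>& rowPtr,
                        const std::vector<int>& colInd,
                        const std::vector<double>& val,
                        const std::vector<double>& Y,
                        std::vector<double>& Fattr) {
  int n = static_cast<int>(rowPtr.size()) - 1;
  Fattr.assign(2 * n, 0.0);
  for (int i = 0; i < n; ++i) {
    for (int idx = rowPtr[i]; idx < rowPtr[i + 1]; ++idx) {
      int j = colInd[idx];
      double dx = Y[2*i] - Y[2*j], dy = Y[2*i+1] - Y[2*j+1];
      double k = 1.0 / (1.0 + dx*dx + dy*dy);  // q_ij Z, never stored
      Fattr[2*i]     += val[idx] * k * dx;     // p_ij q_ij Z (y_i - y_j)
      Fattr[2*i + 1] += val[idx] * k * dy;
    }
  }
}
```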
We factor the convolution in the repulsive interaction term of (5) into three consecutive convolutional operations. We introduce an internal equispaced grid within the spatial domain of the embedding points at each iteration, similar to the approach used in FIt-SNE [4]. The three convolutional operations are:

- **S2G**: local translation of the scattered (embedding) points to their neighboring grid points.
- **G2G**: convolution across the grid with the same t-distribution kernel function, which is symmetric, of broad support, and aperiodic.
- **G2S**: local translation of the gridded data to the scattered points.
The **G2S** operation is a gridded interpolation and **S2G** is its transpose; the arithmetic complexity of each is $O(n w^d)$, where $w$ is the interpolation window size per side. Convolution on the grid takes $O(m \log m)$ arithmetic operations, where $m$ is the number of grid points, i.e., the grid size. The grid size is determined by the range of the embedding points at each iteration, with respect to the error tolerance set by default or specified by the user. In the current implementation, the local interpolation method employed by SG-t-SNE-Π is accurate up to cubic polynomials in $d$ separable variables ($d = 1, 2, 3$).
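A drastically simplified 1D rendition of the three-step factorization may help fix ideas. For brevity it uses linear (2-point) interpolation instead of the cubic scheme of SG-t-SNE-Π, and a direct $O(m^2)$ grid convolution instead of a fast one; only the S2G/G2G/G2S structure carries over.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Simplified 1D sketch of the S2G -> G2G -> G2S factorization.
// y: scattered points in [0, 1]; q: values carried by the points;
// m: number of grid points (m >= 2).
std::vector<double> s2g_g2g_g2s(const std::vector<double>& y,
                                const std::vector<double>& q, int m) {
  double h = 1.0 / (m - 1);                      // grid spacing
  std::vector<double> grid(m, 0.0);
  // S2G: spread each point's value to its two neighboring grid points.
  for (std::size_t i = 0; i < y.size(); ++i) {
    int g = std::min(static_cast<int>(y[i] / h), m - 2);
    double t = y[i] / h - g;                     // offset within the cell
    grid[g]     += (1.0 - t) * q[i];
    grid[g + 1] += t * q[i];
  }
  // G2G: aperiodic convolution with the Student t kernel on the grid.
  std::vector<double> conv(m, 0.0);
  for (int a = 0; a < m; ++a)
    for (int b = 0; b < m; ++b) {
      double r = (a - b) * h;
      conv[a] += grid[b] / (1.0 + r * r);
    }
  // G2S: interpolate the gridded result back to the scattered points.
  std::vector<double> out(y.size());
  for (std::size_t i = 0; i < y.size(); ++i) {
    int g = std::min(static_cast<int>(y[i] / h), m - 2);
    double t = y[i] / h - g;
    out[i] = (1.0 - t) * conv[g] + t * conv[g + 1];
  }
  return out;
}
```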
Although the arithmetic complexity is substantially reduced in comparison to the quadratic complexity of the direct method, the factored operations suffer from either memory access latency or memory capacity issues, which were not recognized or resolved in existing t-SNE software. The scattered translation incurs high memory access latency. The aperiodic convolution on the grid suffers from excessive use of memory when the grid is periodically extended on all sides at once by zero padding. The exponential memory growth with $d$ limits the embedding dimension or the graph size.
We resolve these memory latency and capacity issues in SG-t-SNE-Π. Prior to **S2G**, we relocate the scattered data points to the grid bins. This binning process has two immediate benefits. It improves data locality in the subsequent interpolation. It also establishes a data partition for parallel, multi-threaded execution of the scattered interpolation. We omit the parallelization details. For **G2G**, we implement the aperiodic convolution by operator splitting, without using extra memory.
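The binning step can be pictured as a counting sort on grid-bin indices, so that points falling in the same bin become contiguous in memory (1D sketch with our own naming; the actual implementation bins in $d$ dimensions and in parallel):

```cpp
#include <vector>

// Reorder scattered 1D points so that points in the same grid bin are
// contiguous in memory: a counting sort on bin indices.
std::vector<double> binPoints(const std::vector<double>& y, int nbins) {
  std::vector<int> count(nbins + 1, 0);
  auto binOf = [&](double v) {
    int b = static_cast<int>(v * nbins);        // assumes v in [0, 1]
    return b < nbins ? b : nbins - 1;
  };
  for (double v : y) ++count[binOf(v) + 1];     // histogram of bin sizes
  for (int b = 0; b < nbins; ++b) count[b + 1] += count[b];  // prefix sums
  std::vector<double> sorted(y.size());
  for (double v : y) sorted[count[binOf(v)]++] = v;  // scatter into place
  return sorted;
}
```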
In sparse or structured matrix computation of low arithmetic complexity, the execution time is dominated by memory accesses. We have described in the previous sections how we use intra-term permutations to improve data locality and reduce memory access latency in computing the attraction and repulsion terms of (5). In addition, we permute and relocate in memory the embedding data points between the two terms, at every iteration step. The inter-term data relocation is carried out at multiple layers, exploiting block-wise memory hierarchy. The data permutation overhead is well paid off by the much shortened time for arithmetic calculation with the permuted data. We use Π in the software name SG-t-SNE-Π to signify the importance and the role of the permutations in accelerating t-SNE algorithms, including the conventional one, and in enabling 3D embeddings.
Supplementary material and performance plots are found at http://t-sne-pi.cs.duke.edu.
[1] N. Pitsianis, A.-S. Iliopoulos, D. Floros, and X. Sun. Spaceland embedding of sparse stochastic graphs. In IEEE High Performance Extreme Computing Conference, 2019.
[2] L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research 9(Nov):2579–2605, 2008.
[3] L. van der Maaten. Accelerating t-SNE using tree-based algorithms. Journal of Machine Learning Research 15(Oct):3221–3245, 2014.
[4] G. C. Linderman, M. Rachh, J. G. Hoskins, S. Steinerberger, and Y. Kluger. Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nature Methods 16(3):243–245, 2019.
[5] A. Buluç, J. T. Fineman, M. Frigo, J. R. Gilbert, and C. E. Leiserson. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks. In Proceedings of Annual Symposium on Parallelism in Algorithms and Architectures, pp. 233–244, 2009.
[6] N. Pitsianis, D. Floros, A.-S. Iliopoulos, and X. Sun. SG-t-SNE-Π: Swift neighbor embedding of sparse stochastic graphs. Journal of Open Source Software 4(39):1577, 2019.
SG-t-SNE-Π is developed for shared-memory computers with multi-threading, running the Linux or macOS operating system. The source code must be compiled with a C++ compiler that supports Cilk. The current release is tested with OpenCilk 1.0 (based on LLVM/Tapir `clang++` 10.0.1) and Intel Cilk Plus (GNU `g++` 7.4.0 and Intel `icpc` 19.0.4.233).

**Warning:** Intel Cilk Plus is deprecated and not supported in newer versions of GNU `g++` and Intel `icpc`.
SG-t-SNE-Π uses the following open-source software: FLANN, Intel TBB, METIS, FFTW3, LZ4, and Doxygen.

On Ubuntu:

```bash
sudo apt-get install libtbb-dev libflann-dev libmetis-dev libfftw3-dev liblz4-dev doxygen
```

On macOS (with MacPorts):

```bash
sudo port install flann tbb metis fftw-3 lz4 doxygen
```
We use the Meson build system to configure, build, and install the SG-t-SNE-Π library. By default, Meson uses Ninja as its build backend. Both Meson and Ninja can be installed via the Python Package Index (PyPI):

```bash
pip3 install --user meson
pip3 install --user ninja
```

For more information and alternative methods for installing Meson, see Getting meson.
To configure the SG-t-SNE-Π library, demos, conventional t-SNE interface, and documentation, issue the following:

```bash
CXX=<c++-compiler> meson <build-options> <build-path>
```

This will create and configure the build directory at `<build-path>`. The `CXX` environment variable specifies the C++ compiler to use. The `<build-options>` flags are optional; to check the available Meson build options, see `meson_options.txt`.
To compile SG-t-SNE-Π within the build directory, issue:

```bash
meson compile -C <build-path>
```

To install all relevant targets, issue:

```bash
meson install -C <build-path>
```
By default, this will install targets in the sub-directories `lib`, `bin`, `include`, and `doc` within the SG-t-SNE-Π project directory. To change the installation prefix, specify the `-Dprefix=<install-prefix>` option when configuring with Meson.
You may test the installation with:

```bash
<install-prefix>/bin/test_modules
```
To compile the SG-t-SNE-Π MATLAB wrappers, specify `-Denable_matlab=true` and `-Dmatlabroot=<path-to-matlab-installation>` when configuring. If building with a compiler that uses an Intel Cilk Plus implementation, you may also need to set `-Ddir_libcilkrts=<path-to-libcilkrts.so-parent-dir>`. This is not necessary when building with OpenCilk.
We provide two data sets of modest size for demonstrating stochastic graph embedding with SG-t-SNE-Π:

```bash
tar -xvzf data/mobius-graph.tar.gz
bin/demo_stochastic_matrix mobius-graph.mtx

tar -xvzf data/pbmc-graph.tar.gz
bin/demo_stochastic_matrix pbmc-graph.mtx
```

The MNIST data set can be tested using existing wrappers provided by van der Maaten [3].
If you use our package, please cite the following papers:
```bibtex
@inproceedings{pitsianis2019hpec,
  title     = {Spaceland Embedding of Sparse Stochastic Graphs},
  booktitle = {{{IEEE High Performance Extreme Computing Conference}}},
  author    = {Pitsianis, Nikos and Iliopoulos, Alexandros-Stavros and Floros, Dimitris and Sun, Xiaobai},
  date      = {2019},
  doi       = {10.1109/HPEC.2019.8916505}
}

@article{pitsianis2019joss,
  title        = {{{SG-t-SNE-Π}}: Swift Neighbor Embedding of Sparse Stochastic Graphs},
  author       = {Pitsianis, Nikos and Floros, Dimitris and Iliopoulos, Alexandros-Stavros and Sun, Xiaobai},
  date         = {2019},
  journaltitle = {Journal of Open Source Software},
  volume       = {4},
  number       = {39},
  pages        = {1577},
  issn         = {2475-9066},
  doi          = {10.21105/joss.01577},
  url          = {http://dx.doi.org/10.21105/joss.01577}
}
```
The SG-t-SNE-Π library is licensed under the GNU General Public License v3.0. To contribute to SG-t-SNE-Π or report any problem, follow our contribution guidelines and code of conduct.
Design and development:

Nikos Pitsianis<sup>1,2</sup>, Dimitris Floros<sup>1</sup>, Alexandros-Stavros Iliopoulos<sup>2</sup>, Xiaobai Sun<sup>2</sup>

<sup>1</sup> Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece

<sup>2</sup> Department of Computer Science, Duke University, Durham, NC 27708, USA