https://www.arxiv.org/abs/1805.09949
Propose labeled complexes to perform persistent homology inference of decision boundaries in classification tasks, and provide theoretical conditions and analysis for recovering the homology of a decision boundary from samples.
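A minimal sketch of the general idea (not the paper's labeled-complex construction): approximate the decision boundary of a toy 2D classifier by the samples whose class scores are nearly tied, then read off the boundary's homology from Vietoris-Rips persistence. Assumes the ripser package; thresholds are illustrative.

    import numpy as np
    from ripser import ripser

    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(4000, 2))

    def score(x):  # toy classifier: inside vs. outside the unit circle
        return np.linalg.norm(x, axis=1) - 1.0

    boundary = X[np.abs(score(X)) < 0.05]        # near-boundary samples
    dgms = ripser(boundary, maxdim=1)["dgms"]

    # Long-lived H1 bars indicate loops in the decision boundary (here: one circle)
    h1 = dgms[1]
    print("persistent H1 bars:", h1[(h1[:, 1] - h1[:, 0]) > 0.5])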
https://www.arxiv.org/abs/1910.07617
Characterize two types of directed homology for MLPs, and show that the directed flag homology reduces to computing the simplicial homology of the underlying undirected graph. This makes it possible to investigate homological differences between NN architectures and their realized structures.
https://www.arxiv.org/abs/1901.09496
Introduce a method for computing persistent homology over the graphical activation structure of a NN, which provides access to the task-relevant substructures activated during the feedforward pass. Using this approach, show that adversarial examples correspond to alterations of the dominant activation structures, suggesting that the representations are sparse on the input space.
https://www.arxiv.org/abs/2106.03016
Construct clique complexes on trained DNNs and compute their one-dimensional persistent homology. This reveals the combinatorial effects of multiple neurons in DNNs at different resolutions.
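A minimal sketch, not the paper's exact pipeline: assign each pair of neurons the filtration value 1 - |corr| of their recorded activations, build the clique (flag) complex over the resulting weighted graph, and compute one-dimensional persistent homology. Assumes the gudhi package; acts is a hypothetical (batch, n_neurons) activation matrix.

    import numpy as np
    import gudhi

    rng = np.random.default_rng(0)
    acts = rng.normal(size=(256, 32))            # stand-in for recorded activations
    D = 1.0 - np.abs(np.corrcoef(acts.T))        # dissimilarity between neurons

    rips = gudhi.RipsComplex(distance_matrix=D, max_edge_length=1.0)
    st = rips.create_simplex_tree(max_dimension=2)   # clique complex up to triangles
    st.compute_persistence()
    print("H1 intervals:", st.persistence_intervals_in_dimension(1))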
https://www.arxiv.org/abs/1812.09764
Propose neural persistence, a complexity measure on weighted stratified graphs, and show that neural persistence reflects best practices such as dropout and batch normalization.
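A minimal union-find sketch of a neural-persistence-style score for a single layer, following my reading of the construction rather than a reference implementation: normalize |weights| to [0, 1], add edges in decreasing order of weight, and record a 0-dimensional persistence pair (1, w) whenever an edge merges two components; the score is the p-norm of the lifetimes.

    import numpy as np

    def neural_persistence(W, p=2):
        n_in, n_out = W.shape
        w = np.abs(W) / np.abs(W).max()          # normalized edge weights in [0, 1]
        edges = sorted(
            ((w[i, j], i, n_in + j) for i in range(n_in) for j in range(n_out)),
            reverse=True,
        )
        parent = list(range(n_in + n_out))
        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]    # path compression
                a = parent[a]
            return a
        lifetimes = []
        for weight, a, b in edges:
            ra, rb = find(a), find(b)
            if ra != rb:                         # edge merges two components:
                parent[ra] = rb                  # one component dies at `weight`
                lifetimes.append(1.0 - weight)
        return np.linalg.norm(lifetimes, ord=p)

    rng = np.random.default_rng(0)
    print(neural_persistence(rng.normal(size=(64, 32))))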
https://www.arxiv.org/abs/204.02881
Use persistent homology to investigate topological invariants of the input space, then derive a decomposition of the underlying space into pieces with well-known topology.
https://www.arxiv.org/abs/2109.01461
Derive upper bounds on the Betti numbers of each layer within the network, so that the problem of architecture selection for a fully connected network reduces to choosing a suitable network size.
https://www.arxiv.org/abs/1802.04443
Show that the power of a network to express the topological complexity of a dataset in its decision region is a strictly limiting factor in its ability to generalize.
https://www.arxiv.org/abs/2111.13171
By making a novel connection between learning theory and topological data analysis, show that the generalization error can be bounded in terms of a notion called the persistent homology dimension (PHD), which requires no assumptions on the training dynamics. Also develop an efficient algorithm to estimate PHD, along with visualization tools.
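A minimal sketch of the standard PH0-dimension estimator (my reading of the usual construction, with weight exponent alpha = 1): the total 0-dimensional persistence of a point set equals the total edge length of its Euclidean minimum spanning tree, which scales as n^((d-1)/d) in intrinsic dimension d, so fitting the log-log slope m and inverting d = 1/(1 - m) gives the estimate. Assumes scipy; a 3D Gaussian stands in for e.g. training-trajectory weight iterates.

    import numpy as np
    from scipy.spatial.distance import squareform, pdist
    from scipy.sparse.csgraph import minimum_spanning_tree

    def total_persistence(X):
        # total 0-dim persistence = total MST edge length
        return minimum_spanning_tree(squareform(pdist(X))).sum()

    rng = np.random.default_rng(0)
    points = rng.normal(size=(4096, 3))

    sizes = np.array([256, 512, 1024, 2048, 4096])
    E = [np.mean([total_persistence(points[rng.choice(len(points), n, replace=False)])
                  for _ in range(5)]) for n in sizes]
    m = np.polyfit(np.log(sizes), np.log(E), 1)[0]
    print("estimated PH dimension:", 1.0 / (1.0 - m))   # ~3 for a 3D Gaussian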
https://www.arxiv.org/abs/1903.08519
Prove that the accuracy of a neural network trained on a representative dataset is similar to its accuracy on the original dataset, where representativeness is measured using persistence diagrams.
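A minimal sketch of one way to compare a subset's topology with the full dataset's, not necessarily the paper's measure: compute H1 persistence diagrams for both and take the bottleneck distance between them. Assumes the gudhi and ripser packages.

    import numpy as np
    import gudhi
    from ripser import ripser

    rng = np.random.default_rng(0)
    full = rng.normal(size=(1000, 2))
    subset = full[rng.choice(1000, 200, replace=False)]

    def h1_diagram(X):
        return ripser(X, maxdim=1)["dgms"][1]     # finite H1 bars

    d = gudhi.bottleneck_distance(h1_diagram(full), h1_diagram(subset))
    print("bottleneck distance (small => representative):", d)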
https://www.ieeexplore.ieee.org/abstract/document/8746812
Define a method for calculating Riemann and Ricci curvature tensors for a trained neural net.
https://www.arxiv.org/abs/2201.09656
Study a sequence of maps between manifolds and investigate the structures induced through pullbacks of the Riemannian metric, showing that the pullback is a degenerate Riemannian metric that induces a pseudometric space structure, and that the Kolmogorov quotient yields a smooth metric.
https://www.arxiv.org/abs/2111.15651
Define a new class of topological features that accurately characterize the learning process and are amenable to backpropagation. Show that they can predict the performance of a DNN without a testing set or high-performance computing.
https://www.arxiv.org/abs/2012.15834
Use the Morse complex of the loss function to relate the local behavior of gradient descent trajectories to global properties, and define a Topological Obstruction score in terms of barcodes of the loss function, quantifying how much local minima obstruct gradient-based optimization.
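A minimal one-dimensional illustration, not the paper's Morse-complex machinery: sublevel-set persistence of a loss curve pairs each local minimum with the level at which its basin merges into a deeper one, so the bar length measures the barrier a descent trajectory must overcome. Assumes the gudhi package.

    import numpy as np
    import gudhi

    t = np.linspace(0, 4 * np.pi, 1000)
    loss = np.sin(t) + 0.3 * t                    # toy loss with several local minima

    cc = gudhi.CubicalComplex(top_dimensional_cells=loss)
    cc.compute_persistence()
    for birth, death in cc.persistence_intervals_in_dimension(0):
        print(f"local minimum at level {birth:.2f}, basin escaped at {death:.2f}")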
https://www.arxiv.org/abs/2104.08894
Apply dimension estimation tools to popular datasets and find that common natural image datasets have very low intrinsic dimension relative to the high number of pixels in the images. Also find that low-dimensional datasets are easier for NNs to learn, and that models solving these tasks generalize better from training to test data.
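A minimal sketch of one standard intrinsic-dimension estimator (TwoNN), chosen for brevity; the paper evaluates several such tools. For each point, the ratio mu of second- to first-nearest-neighbor distance follows a Pareto law with exponent equal to the intrinsic dimension, giving the MLE below. Assumes scikit-learn.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def twonn_dimension(X):
        dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
        mu = dists[:, 2] / dists[:, 1]            # 2nd-NN over 1st-NN distance
        return len(mu) / np.sum(np.log(mu))       # MLE of the Pareto exponent

    rng = np.random.default_rng(0)
    ambient = np.hstack([rng.normal(size=(2000, 5)),
                         np.zeros((2000, 45))])   # 5D data in 50D ambient space
    print(twonn_dimension(ambient))               # ~5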
https://www.arxiv.org/abs/2205.08518
It was shown previously that ANN-based compressors can achieve the optimal entropy-distortion curve for some low-dimensional manifolds in high-dimensional ambient spaces. Derive the optimal entropy-distortion tradeoffs for two low-dimensional manifolds with circular structure, and show that SOTA ANN-based compressors fail to compress them optimally.
https://www.arxiv.org/abs/2210.15058
Define convolution over tangent bundles via the connection Laplacian operator, yielding tangent bundle neural networks (TNNs). Show that discretized TNNs are sheaf neural networks, and that this discrete architecture converges to the underlying continuous TNN.
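A minimal numpy sketch of the discrete picture as I read it: assemble a block connection Laplacian from orthogonal transport maps on edges, then apply a polynomial filter in that Laplacian to vector-valued node features. The graph, transports, and filter taps are all hypothetical stand-ins.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 6, 2                                   # nodes, tangent-space dimension
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]

    def random_rotation(rng):
        theta = rng.uniform(0, 2 * np.pi)
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]])

    # Block connection Laplacian: L[ii] = deg(i)*I_k, L[ij] = -O_ij
    L = np.zeros((n * k, n * k))
    deg = np.zeros(n)
    for i, j in edges:
        O = random_rotation(rng)                  # transport between tangent spaces
        L[i*k:(i+1)*k, j*k:(j+1)*k] = -O
        L[j*k:(j+1)*k, i*k:(i+1)*k] = -O.T
        deg[i] += 1; deg[j] += 1
    for i in range(n):
        L[i*k:(i+1)*k, i*k:(i+1)*k] = deg[i] * np.eye(k)

    # "Tangent bundle convolution": polynomial filter in L on vector features
    x = rng.normal(size=n * k)                    # one tangent vector per node
    h = [0.5, -0.2, 0.05]                         # hypothetical filter taps
    y = sum(c * np.linalg.matrix_power(L, p) @ x for p, c in enumerate(h))
    print(y.reshape(n, k))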
https://www.arxiv.org/abs/2301.11375
Show that the random parameters of an infinite-width neural network induce a metric on the input space; applying this to finite Bayesian neural networks, empirically show that volume magnifies along decision boundaries.
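A minimal Monte Carlo sketch of my reading of the construction: with parameters theta drawn at random, the expected pullback g(x) = E_theta[J_theta(x)^T J_theta(x)] of the output metric defines a metric on input space, and sqrt(det g(x)) is the local volume element. Finite differences stand in for autodiff; the toy network and initialization scales are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def net(x, theta):                            # one-hidden-layer scalar net
        W1, b1, w2 = theta
        return np.tanh(W1 @ x + b1) @ w2

    def sample_theta(width=64, d_in=2):
        return (rng.normal(size=(width, d_in)) / np.sqrt(d_in),
                rng.normal(size=width),
                rng.normal(size=width) / np.sqrt(width))

    def volume_element(x, n_samples=200, eps=1e-4):
        d = len(x)
        g = np.zeros((d, d))
        for _ in range(n_samples):
            theta = sample_theta(d_in=d)
            J = np.array([(net(x + eps * e, theta) - net(x - eps * e, theta))
                          / (2 * eps) for e in np.eye(d)])
            g += np.outer(J, J) / n_samples       # J^T J for a scalar output
        return np.sqrt(np.linalg.det(g))

    print(volume_element(np.array([0.0, 0.0])))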
https://www.arxiv.org/abs/2301.12651
Give an improved bound on the number of complex critical points of the loss function of a shallow linear neural network with a single data point. Also show that, for any number of hidden layers, complex critical points with zero coordinates arise in certain patterns, which are classified for one hidden layer.