

Revealing Neurocognitive and Behavioral Patterns by Unsupervised Manifold Learning from Dynamic Brain Data

Zhou, Zixia, Liu, Junyan, Wu, Wei Emma, Fang, Ruogu, Liu, Sheng, Wei, Qingyue, Yan, Rui, Guo, Yi, Tao, Qian, Wang, Yuanyuan, Islam, Md Tauhidul, Xing, Lei

arXiv.org Artificial Intelligence

Dynamic brain data, teeming with biological and functional insights, are becoming increasingly accessible through advanced measurements, providing a gateway to understanding the inner workings of the brain in living subjects. However, the vast size and intricate complexity of the data also pose a daunting challenge in reliably extracting meaningful information across various data sources. This paper introduces a generalizable unsupervised deep manifold learning method for the exploration of neurocognitive and behavioral patterns. Unlike existing methods that extract patterns directly from the input data, the proposed Brain-dynamic Convolutional-Network-based Embedding (BCNE) seeks to capture the brain-state trajectories by deciphering the temporospatial correlations within the data and subsequently applying manifold learning to this correlative representation. The performance of BCNE is showcased through the analysis of several important dynamic brain datasets. The results, both visual and quantitative, reveal a diverse array of intriguing and interpretable patterns. BCNE effectively delineates scene transitions, underscores the involvement of different brain regions in memory and narrative processing, distinguishes various stages of dynamic learning processes, and identifies differences between active and passive behaviors. BCNE provides an effective tool for exploring general neuroscience inquiries or individual-specific patterns.
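The core idea the abstract describes, embedding a correlative representation of the dynamics rather than the raw signal, can be sketched in a few lines. The function below is a hypothetical illustration, not the BCNE architecture itself (which is convolutional): it turns a region-by-time recording into a trajectory of sliding-window correlation matrices that a manifold learner such as PHATE could then embed.

```python
import numpy as np

def correlation_trajectory(data, window=20, step=5):
    """Sliding-window correlation representation of dynamic brain data.

    data : (T, R) array of T time points for R regions/voxels.
    Returns an (n_windows, R*R) array; each row is the flattened
    region-by-region correlation matrix of one time window.
    """
    T, R = data.shape
    frames = []
    for start in range(0, T - window + 1, step):
        seg = data[start:start + window]          # (window, R) slice in time
        corr = np.corrcoef(seg, rowvar=False)     # (R, R) temporospatial correlations
        frames.append(corr.ravel())
    return np.asarray(frames)

# Toy example: 200 time points over 10 synthetic "regions".
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
C = correlation_trajectory(X)
```

Each row of `C` is one point of a brain-state trajectory; feeding the rows to a nonlinear embedding method would give a 2D view of how the correlation structure evolves over time.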


Reviews: Visualizing the PHATE of Neural Networks

Neural Information Processing Systems

Update after author response: Taking on faith the results the authors report in their author response (namely ability to identify generalization performance using only the training set, results on CIFAR10 and white noise datasets, and the quantitative evaluation of the task-switching), I would raise my score to a 6 (actually if they did achieve everything they claimed in the author response, I would be inclined to give it a 7, but I'd need to see all the results for that). Originality: I think the originality is fairly high. Although the PHATE algorithm exists in the literature, the Multislice kernel is novel, and the idea of visualizing the learning dynamics of the hidden neurons to ascertain things like catastrophic forgetting or poor generalization is (to my knowledge) novel. Quality: I think the Experiments sections could be substantially improved: (1) For the experiments on continual learning, from looking at Figure 3 it is not obvious to me that Adagrad does better than Rehearsal for the "Domain" learning setting, or that Adagrad outperforms Adam at class learning. Adam apparently does the best at task learning, but again, I wouldn't have guessed from the trajectories.


Reviews: Visualizing the PHATE of Neural Networks

Neural Information Processing Systems

The reviewers are all positive if not wildly so, and as the response suggests I would like to not put too much weight on the specific scores. This is a good submission that has a small number of clearly defined improvements outlined in the extensive and helpful reviews.


Exploring the Manifold of Neural Networks Using Diffusion Geometry

Abel, Elliott, Steindl, Andrew J., Mazioud, Selma, Schueler, Ellie, Ogundipe, Folu, Zhang, Ellen, Grinspan, Yvan, Reimann, Kristof, Crevasse, Peyton, Bhaskar, Dhananjay, Viswanath, Siddharth, Zhang, Yanlei, Rudner, Tim G. J., Adelstein, Ian, Krishnaswamy, Smita

arXiv.org Artificial Intelligence

Drawing motivation from the manifold hypothesis, which posits that most high-dimensional data lies on or near low-dimensional manifolds, we apply manifold learning to the space of neural networks. We learn manifolds where datapoints are neural networks by introducing a distance between the hidden layer representations of the neural networks. These distances are then fed to the non-linear dimensionality reduction algorithm PHATE to create a manifold of neural networks. We characterize this manifold using features of the representation, including class separation, hierarchical cluster structure, spectral entropy, and topological structure. Our analysis reveals that high-performing networks cluster together in the manifold, displaying consistent embedding patterns across all these features. Finally, we demonstrate the utility of this approach for guiding hyperparameter optimization and neural architecture search by sampling from the manifold.
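The pipeline the abstract outlines (a distance between hidden-layer representations, then nonlinear dimensionality reduction on the precomputed distances) can be sketched roughly as follows. The Gram-matrix comparison here is a simple stand-in for the paper's actual representation distance, and all names are illustrative.

```python
import numpy as np

def representation_distance(reps):
    """Pairwise distances between networks from their hidden-layer activations.

    reps : list of (n_probes, d) arrays, the hidden representation each
    network produces on the same probe inputs. As a simple stand-in for the
    paper's representation distance, we compare the probe-by-probe Gram
    matrices, which is invariant to rotations of the hidden units.
    """
    grams = [r @ r.T for r in reps]               # one (n_probes, n_probes) Gram per network
    n = len(grams)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = np.linalg.norm(grams[i] - grams[j])
    return D

# Toy "networks": random hidden representations of 50 shared probe inputs.
rng = np.random.default_rng(1)
nets = [rng.standard_normal((50, 16)) for _ in range(4)]
D = representation_distance(nets)
```

The resulting matrix `D`, where each datapoint is a whole network, could then be handed to a dimensionality reduction method that accepts precomputed distances to produce the manifold of networks.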


Model agnostic local variable importance for locally dependent relationships

Bladen, Kelvyn K., Cutler, Adele, Cutler, D. Richard, Moon, Kevin R.

arXiv.org Machine Learning

Global variable importance measures are commonly used to interpret machine learning model results. Local variable importance techniques assess how variables contribute to individual observations rather than the entire dataset. Current methods typically fail to accurately reflect locally dependent relationships between variables and instead focus on marginal importance values. Additionally, they are not natively adapted for multi-class classification problems. We propose a new model-agnostic method for calculating local variable importance, CLIQUE, that captures locally dependent relationships, contains improvements over permutation-based methods, and can be directly applied to multi-class classification problems. Simulated and real-world examples show that CLIQUE emphasizes locally dependent information and properly reduces bias in regions where variables do not affect the response.
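For context, the marginal permutation-based baseline that CLIQUE improves on can be sketched as follows. This is not CLIQUE itself; the function name and the nearest-neighbor neighborhood scheme are assumptions for illustration.

```python
import numpy as np

def local_permutation_importance(model, X, x0, k=30, n_rep=20, rng=None):
    """Baseline local variable importance by permutation within a neighborhood.

    Restrict to the k nearest neighbors of the query point x0, then measure
    how much shuffling each feature inside that neighborhood changes the
    model's predictions. (This is the marginal-permutation baseline; CLIQUE
    additionally accounts for locally dependent features.)
    """
    rng = rng or np.random.default_rng(0)
    idx = np.argsort(np.linalg.norm(X - x0, axis=1))[:k]
    local = X[idx]
    base = model(local)
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_rep):
            Xp = local.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break feature j's association
            imp[j] += np.mean((model(Xp) - base) ** 2)
    return imp / n_rep

# Toy model: the response depends only on the first feature.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 3))
model = lambda A: A[:, 0] ** 2
imp = local_permutation_importance(model, X, X[0])
```

On this toy model only the first feature matters, so its importance dominates; the weakness the paper targets appears when features are locally correlated, since marginal shuffling then creates unrealistic points.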


Exploring higher-order neural network node interactions with total correlation

Kerby, Thomas, White, Teresa, Moon, Kevin

arXiv.org Artificial Intelligence

In complex systems such as the human brain, the variables interact in complex ways, and existing methods require either an input of the variables of interest or the class labels and are thus supervised. Accurately characterizing higher-order variable interactions (HOIs) is a difficult problem that is further exacerbated when the HOIs change across the data. In response to the challenges posed by understanding neural networks and analyzing HOIs, we present Local Correlation Explanation (Local CorEx), a novel post hoc method for exploring model weights, nodes, subnetworks, and latent representations in an unsupervised manner. Local CorEx captures HOIs at a local scale by first clustering data points based on their proximity on the data manifold; within each cluster, it then uses a multivariate version of the mutual information, called the total correlation, to construct a latent factor representation of the data and learn the local HOIs. Here we focus our attention on analyzing groups of hidden nodes and latent representations. To the best of our knowledge, our work marks the first post hoc method to do so in an unsupervised manner, and it includes the option to easily incorporate label information. Additionally, our approach extends to analyzing HOIs within the data.
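The total correlation the abstract refers to, a multivariate generalization of mutual information, has a simple empirical estimator for discrete variables: TC(X_1, ..., X_k) = sum_i H(X_i) - H(X_1, ..., X_k), which is zero exactly when the variables are independent. A minimal sketch (not the Local CorEx estimator, which works through latent factors):

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Shannon entropy (bits) of an empirical discrete distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())

def total_correlation(columns):
    """TC(X_1..X_k) = sum_i H(X_i) - H(X_1,...,X_k), estimated from samples.

    `columns` is a list of equal-length sequences, one per variable.
    """
    joint = list(zip(*columns))                  # one tuple per observation
    return sum(entropy(col) for col in columns) - entropy(joint)

x = [0, 1, 0, 1, 0, 1, 0, 1]
tc_dependent = total_correlation([x, x])         # y is a copy of x: TC = 1 bit
tc_independent = total_correlation([[0, 0, 1, 1], [0, 1, 0, 1]])  # TC = 0
```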


Visualizing DNA reaction trajectories with deep graph embedding approaches

Zhang, Chenwei, Duc, Khanh Dao, Condon, Anne

arXiv.org Artificial Intelligence

Synthetic biologists and molecular programmers design novel nucleic acid reactions, with many potential applications. Good visualization tools are needed to help domain experts make sense of the complex outputs of folding pathway simulations of such reactions. Here we present ViDa, a new approach for visualizing DNA reaction folding trajectories over the energy landscape of secondary structures. We integrate a deep graph embedding model with common dimensionality reduction approaches, to map high-dimensional data onto 2D Euclidean space. We assess ViDa on two well-studied and contrasting DNA hybridization reactions. Our preliminary results suggest that ViDa's visualization successfully separates trajectories with different folding mechanisms, thereby providing useful insight to users, and is a big improvement over the current state-of-the-art in DNA kinetics visualization.


Neural FIM for learning Fisher Information Metrics from point cloud data

Fasina, Oluwadamilola, Huguet, Guillaume, Tong, Alexander, Zhang, Yanlei, Wolf, Guy, Nickel, Maximilian, Adelstein, Ian, Krishnaswamy, Smita

arXiv.org Artificial Intelligence

Although data diffusion embeddings are ubiquitous in unsupervised learning and have proven to be a viable technique for uncovering the underlying intrinsic geometry of data, diffusion embeddings are inherently limited due to their discrete nature. To this end, we propose neural FIM, a method for computing the Fisher information metric (FIM) from point cloud data, allowing for a continuous manifold model for the data. Neural FIM creates an extensible metric space from discrete point cloud data such that information from the metric can inform us of manifold characteristics such as volume and geodesics. We demonstrate neural FIM's utility in selecting parameters for the PHATE visualization method, as well as its ability to obtain information pertaining to local volume, illuminating branching points and cluster centers in embeddings of a toy dataset and two single-cell datasets of iPSC reprogramming and PBMCs (immune cells).
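For reference, the Fisher information metric that neural FIM makes continuous is the expected outer product of the score, FIM(theta) = E[grad_theta log p(x|theta) grad_theta log p(x|theta)^T]. A minimal Monte Carlo check of this definition for a one-dimensional Gaussian (unrelated to the paper's neural parametrization) looks like this:

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    """Score (gradient of log density) of N(mu, sigma^2) w.r.t. (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    return np.stack([d_mu, d_sigma], axis=1)

def monte_carlo_fim(mu, sigma, n=200_000, seed=0):
    """Estimate FIM(theta) = E[score score^T] by sampling from the model."""
    x = np.random.default_rng(seed).normal(mu, sigma, size=n)
    s = gaussian_score(x, mu, sigma)
    return s.T @ s / n

F = monte_carlo_fim(0.0, 1.0)
# Closed form for N(mu, sigma^2) is diag(1/sigma^2, 2/sigma^2),
# so the estimate should be close to diag(1, 2) here.
```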


Why you should be using PHATE for dimensionality reduction

#artificialintelligence

As data scientists, we often work with high-dimensional data with more than 3 features, or dimensions, of interest. In supervised machine learning, we may use this data for training and classification, for example, and may reduce the dimensions to speed up training. In unsupervised learning, we use this type of data for visualization and clustering. In single-cell RNA sequencing (scRNA-seq), for example, we accumulate measurements of tens of thousands of genes per cell for upwards of a million cells. That's a lot of data that provides a window into each cell's identity, state, and other properties.


Top 5 Single-cell Genomics Papers of 2021

#artificialintelligence

In the age of Big Data in biology, data science and machine learning have flourished and benefited from their interdisciplinary application to biology. As a graduate student in this discipline, I read a lot of papers to stay up to date on the literature (and still have a large reading list to catch up on!), and thought I would share some of the best papers I've read this year. In about 80–90% of the single-cell papers you'll encounter, depending on the research question, there will be at least one or two tSNE or UMAP plots to visualize the data collected, usually single-cell RNA-sequencing (scRNA-seq) data, where individual cells are profiled for their RNA abundance across the genome. These unsupervised dimensionality reduction methods have been more or less accepted as the status quo for data visualization in the world of single-cell genomics, so it took Academic Twitter by storm this summer when a new preprint boldly challenged that norm, arguing that these methods do little to preserve the latent structure the data conveys to our 3D minds. Using the extreme example of perfectly equidistant cells in high-dimensional space, and later relaxing it to near-equidistance, the authors show how tSNE and UMAP distort the orientation of groups of cells with near-equidistant spacing in the original space, clustering them with groups of cells that are evenly spread further apart.
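The near-equidistance phenomenon the preprint leans on is easy to reproduce numerically: pairwise distances between random high-dimensional points concentrate around a common value, so their relative spread shrinks as the dimension grows. A small sketch (the preprint's actual construction differs):

```python
import numpy as np

def pairwise_distances(X):
    """All pairwise Euclidean distances via the Gram-matrix identity."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))          # clamp tiny negative round-off

def distance_spread(dim, n_points=200, seed=0):
    """Relative spread (std/mean) of pairwise distances among random points."""
    X = np.random.default_rng(seed).standard_normal((n_points, dim))
    d = pairwise_distances(X)
    upper = d[np.triu_indices(n_points, k=1)]    # each unordered pair once
    return upper.std() / upper.mean()

spread_low = distance_spread(2)       # low dimension: distances vary widely
spread_high = distance_spread(1000)   # high dimension: nearly equidistant
```

When most pairwise distances are nearly identical, any 2D layout must break ties somehow, which is exactly where tSNE and UMAP can invent structure that is not in the data.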