
LP-3DGS: Learning to Prune 3D Gaussian Splatting

Neural Information Processing Systems

Recently, 3D Gaussian Splatting (3DGS) has become one of the mainstream methodologies for novel view synthesis (NVS) due to its high quality and fast rendering speed. However, as a point-based scene representation, 3DGS can generate a large number of Gaussians to fit the scene, leading to high memory usage. Previously proposed improvements require either an empirically preset pruning ratio or an importance-score threshold to prune the point cloud, and such hyperparameters demand multiple rounds of training to optimize per scene in order to achieve the maximum pruning ratio while maintaining rendering quality. In this work, we propose learning-to-prune 3DGS (LP-3DGS), where a trainable binary mask is applied to the importance score to automatically find a favorable pruning ratio. Instead of using the traditional straight-through estimator (STE) method to approximate the binary mask gradient, we redesign the masking function to leverage the Gumbel-Sigmoid method, making it differentiable and compatible with the existing training process of 3DGS. Extensive experiments show that LP-3DGS consistently achieves a good balance between efficiency and rendering quality.
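
To make the masking idea concrete, here is a minimal PyTorch sketch of a Gumbel-Sigmoid relaxation applied to per-Gaussian importance scores. The function name, tensor shapes, temperature, and final threshold are illustrative assumptions, not the authors' implementation:

```python
import torch

def gumbel_sigmoid(logits: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Soft-binary mask via the Gumbel-Sigmoid relaxation.

    Logistic noise is added to the logits and squashed by a
    temperature-scaled sigmoid, so the mask stays differentiable
    end-to-end (no straight-through estimator needed).
    """
    u = torch.rand_like(logits)
    noise = torch.log(u + 1e-10) - torch.log(1.0 - u + 1e-10)
    return torch.sigmoid((logits + noise) / tau)

# Hypothetical usage: modulate per-Gaussian importance scores.
num_gaussians = 100_000
mask_logits = torch.nn.Parameter(torch.zeros(num_gaussians))  # trained jointly
importance = torch.rand(num_gaussians)                        # per-point scores
masked_importance = importance * gumbel_sigmoid(mask_logits)
keep = masked_importance > 0.5  # hard pruning applied only after training
```

Because the relaxation is differentiable everywhere, the mask logits can be optimized with the same gradient-based loop that trains the Gaussians themselves, which is consistent with the compatibility the abstract describes.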


Learning to Prove Theorems by Learning to Generate Theorems

Neural Information Processing Systems

We consider the task of automated theorem proving, a key AI task. Deep learning has shown promise for training theorem provers, but there are limited human-written theorems and proofs available for supervised learning. To address this limitation, we propose to learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover. Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover and advances the state of the art of automated theorem proving in Metamath.


Prior-itizing Privacy: A Bayesian Approach to Setting the Privacy Budget in Differential Privacy, Jerome P. Reiter, Department of Statistical Science, Duke University

Neural Information Processing Systems

When releasing outputs from confidential data, agencies need to balance the analytical usefulness of the released data with the obligation to protect data subjects' confidentiality. For releases satisfying differential privacy, this balance is reflected by the privacy budget, ε. We provide a framework for setting ε based on its relationship with Bayesian posterior probabilities of disclosure. The agency responsible for the data release decides how much posterior risk it is willing to accept at various levels of prior risk, which implies a unique ε. Agencies can evaluate different risk profiles to determine one that leads to an acceptable trade-off in risk and utility.
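
To make the prior-to-posterior relationship concrete, the sketch below computes the ε implied by one acceptable (prior risk, posterior risk) pair, using the standard ε-DP bound that an adversary's posterior disclosure odds can exceed the prior odds by at most a factor of exp(ε). This is a worked illustration of that bound under our reading of the framework, not code from the paper:

```python
import math

def epsilon_from_risk(prior: float, posterior: float) -> float:
    """Privacy budget implied by a (prior, posterior) disclosure-risk pair.

    Under epsilon-DP, posterior disclosure odds are bounded by
    exp(epsilon) times the prior odds, so the largest epsilon keeping
    posterior risk at or below `posterior` given prior risk `prior`
    is the log odds ratio.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return math.log(posterior_odds / prior_odds)

# Example: the agency tolerates at most 25% posterior disclosure risk
# for a data subject whose prior disclosure risk is 10%.
print(epsilon_from_risk(prior=0.10, posterior=0.25))  # ~1.10
```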


NeuroPath: A Neural Pathway Transformer for Joining the Dots of Human Connectomes

Neural Information Processing Systems

Although modern imaging technologies allow us to study connectivity between two distinct brain regions in vivo, an in-depth understanding of how anatomical structure supports brain function, and of how spontaneous functional fluctuations give rise to remarkable cognition, remains elusive. Meanwhile, tremendous efforts have been made in machine learning to establish the nonlinear mapping between neuroimaging data and phenotypic traits. However, the absence of neuroscience insight in current approaches poses significant challenges for understanding cognitive behavior from transient neural activities. To address this challenge, we put the spotlight on the coupling mechanism of structural connectivity (SC) and functional connectivity (FC) by formulating this network neuroscience question as an expressive graph representation learning problem for high-order topology. Specifically, we introduce the concept of a topological detour to characterize how a ubiquitous instance of FC (a direct link) is supported by neural pathways (detours) physically wired by SC, forming a cyclic loop in which brain structure and function interact. In machine learning terms, the multi-hop detour pathway underlying SC-FC coupling allows us to devise a novel multi-head self-attention mechanism within a Transformer to capture multi-modal feature representations from paired graphs of SC and FC. Taken together, we propose a biologically inspired deep model, coined NeuroPath, to find putative connectomic feature representations from vast amounts of neuroimages, which can be plugged into various downstream applications such as task recognition and disease diagnosis. We have evaluated NeuroPath on large-scale public datasets including the Human Connectome Project (HCP) and UK Biobank (UKB) under both supervised and zero-shot learning settings, where the state-of-the-art performance of NeuroPath indicates its great potential in network neuroscience.
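
As a rough sketch of the paired-graph attention idea (not the authors' released code), the PyTorch module below lets FC node features attend to SC node features in one multi-head attention block, loosely mirroring the notion that direct FC links are supported by SC detour pathways; all dimensions and the fusion scheme are illustrative assumptions:

```python
import torch
import torch.nn as nn

class PairedGraphAttention(nn.Module):
    """Cross-modal attention between functional-connectivity (FC) and
    structural-connectivity (SC) node embeddings: FC queries attend to
    SC keys/values, and the result is fused back with a residual."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, fc_feats: torch.Tensor, sc_feats: torch.Tensor) -> torch.Tensor:
        # fc_feats, sc_feats: (batch, num_regions, dim) node embeddings
        fused, _ = self.attn(query=fc_feats, key=sc_feats, value=sc_feats)
        return self.norm(fc_feats + fused)  # residual fusion of SC evidence

# Hypothetical usage: 2 subjects, 90 brain regions, 64-d node features.
fc = torch.randn(2, 90, 64)
sc = torch.randn(2, 90, 64)
out = PairedGraphAttention()(fc, sc)  # (2, 90, 64)
```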


Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function

Neural Information Processing Systems

In this supplementary document, we provide some useful properties of the smoothed function φ and then prove Lemma 3.1 of the main text. A.1 Properties of the smoothed function φ: statement (a) can be found in [3, Corollary 17.19], and applying [3, Corollary 17.19] again proves (b). Statement (c) holds due to the well-known Baillon-Haddad theorem [3, Corollary 18.17]. Under Assumptions 2.1 and 2.2, for given x and η > 0, combining the resulting inequality with (44) yields (42).
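
For reference, statement (c) rests on the Baillon-Haddad theorem; its standard form, stated here independently of the paper's notation, is:

```latex
% Baillon--Haddad theorem, standard form (cf. [3, Corollary 18.17]):
% for convex, differentiable f with an L-Lipschitz gradient,
% \nabla f is (1/L)-cocoercive (and conversely).
\[
\langle \nabla f(x) - \nabla f(y),\, x - y \rangle
  \;\ge\; \frac{1}{L}\,\bigl\|\nabla f(x) - \nabla f(y)\bigr\|^{2}
  \qquad \text{for all } x, y .
\]
```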


Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function

Neural Information Processing Systems

We develop a novel single-loop variance-reduced algorithm to solve a class of stochastic nonconvex-convex minimax problems involving a nonconvex-linear objective function, which has various applications in fields such as machine learning and robust optimization. This problem class poses several computational challenges due to the nonsmoothness, nonconvexity, nonlinearity, and non-separability of its objective functions. Our approach relies on a new combination of recent ideas, including smoothing and hybrid biased variance-reduced techniques.
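
To give a flavor of the hybrid biased variance-reduced idea, here is a small NumPy sketch of the kind of gradient estimator used in this line of work: a convex combination of a SARAH-style recursive term and a plain stochastic gradient. The oracle interface, the mixing weight, and the toy problem are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def hybrid_estimator(v_prev, grad_fn, x_t, x_prev, beta, rng, n_samples):
    """One step of a hybrid biased variance-reduced gradient estimator:

        v_t = beta * [v_{t-1} + g(x_t; xi) - g(x_{t-1}; xi)]
              + (1 - beta) * g(x_t; zeta)

    The SARAH-like first term recycles the previous estimate for low
    variance; the unbiased SGD term keeps the accumulated bias in check.
    """
    xi, zeta = rng.integers(0, n_samples, size=2)  # two independent samples
    sarah_term = v_prev + grad_fn(x_t, xi) - grad_fn(x_prev, xi)
    sgd_term = grad_fn(x_t, zeta)
    return beta * sarah_term + (1.0 - beta) * sgd_term

# Toy usage on f(x) = (1/2n) * sum_i ||x - a_i||^2 with synthetic data.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 5))
grad_fn = lambda x, i: x - data[i]           # per-sample gradient oracle
x_prev, x_t = np.zeros(5), 0.1 * np.ones(5)
v = grad_fn(x_prev, 0)                       # initialize from one sample
v = hybrid_estimator(v, grad_fn, x_t, x_prev, beta=0.9, rng=rng, n_samples=1000)
```

A single-loop method then updates x_t with v in place of the exact gradient, avoiding the nested epoch structure of double-loop variance-reduced schemes.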


Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification, Nan Huang

Neural Information Processing Systems

Medical time series (MedTS) data, such as electroencephalography (EEG) and electrocardiography (ECG), play a crucial role in healthcare tasks such as diagnosing brain and heart diseases. Existing methods for MedTS classification primarily rely on handcrafted biomarker extraction and CNN-based models, with limited exploration of transformer-based models. In this paper, we introduce Medformer, a multi-granularity patching transformer tailored specifically for MedTS classification. Our method incorporates three novel mechanisms to leverage the unique characteristics of MedTS: cross-channel patching to exploit inter-channel correlations, multi-granularity embedding to capture features at different scales, and two-stage (intra- and inter-granularity) multi-granularity self-attention to learn features and correlations within and among granularities. We conduct extensive experiments on five public datasets under both subject-dependent and the more challenging subject-independent setups. Results demonstrate Medformer's superiority over 10 baselines, achieving the top average ranking across the five datasets on all six evaluation metrics. These findings underscore the significant potential impact of our method on healthcare applications, such as diagnosing myocardial infarction, Alzheimer's disease, and Parkinson's disease.
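
To illustrate how cross-channel, multi-granularity patching might look, the sketch below embeds a multichannel series with several patch lengths, flattening all channels of each patch into one token so inter-channel correlations stay inside a single embedding. Shapes, patch lengths, and module names are assumptions for illustration, not the released Medformer code:

```python
import torch
import torch.nn as nn

class MultiGranularityPatching(nn.Module):
    """Embed a multichannel series at several granularities by patching
    across all channels jointly (one linear projection per patch length)."""

    def __init__(self, n_channels: int, d_model: int, patch_lens=(2, 4, 8)):
        super().__init__()
        self.patch_lens = patch_lens
        self.proj = nn.ModuleList(
            nn.Linear(n_channels * p, d_model) for p in patch_lens
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len, n_channels)
        tokens = []
        for p, proj in zip(self.patch_lens, self.proj):
            b, t, c = x.shape
            x_p = x[:, : t - t % p].reshape(b, t // p, p * c)  # cross-channel patches
            tokens.append(proj(x_p))  # (batch, t // p, d_model) per granularity
        return tokens  # one token sequence per granularity

# Hypothetical EEG batch: 8 samples, 256 time steps, 19 channels.
emb = MultiGranularityPatching(n_channels=19, d_model=128)
outs = emb(torch.randn(8, 256, 19))  # list of 3 token sequences
```

The two-stage self-attention described in the abstract would then operate first within each returned sequence (intra-granularity) and afterwards across the sequences (inter-granularity).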


Bayesian Optimization of Function Networks: Supplementary Material

Neural Information Processing Systems

In this section, we provide a formal statement and proof of Proposition 1. We begin by proving an auxiliary result, after which we are in position to show Proposition 1, which can be seen as a simple generalization of Theorem 1 in Balandat et al. (2020), under the assumption that the functions f_k, k = 1, ..., K, are Lipschitz continuous. The desired result is then a direct consequence of Proposition 2 in the supplement of Balandat et al. (2020), which is in turn a consequence of Theorem 2.3 in Homem-de-Mello (2008).


A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression

Neural Information Processing Systems

This paper conducts a comprehensive study of the learning curves of kernel ridge regression (KRR) under minimal assumptions. Our contributions are three-fold: 1) we analyze the role of key properties of the kernel, such as its spectral eigen-decay, the characteristics of the eigenfunctions, and the smoothness of the kernel; 2) we demonstrate the validity of the Gaussian Equivalent Property (GEP), which states that the generalization performance of KRR remains the same when the whitened features are replaced by standard Gaussian vectors, thereby shedding light on the success of previous analyses under the Gaussian Design Assumption; 3) we derive novel bounds that improve over existing bounds across a broad range of settings, such as (in)dependent feature vectors and various combinations of eigen-decay rates in the over- and underparameterized regimes.
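
As a quick numerical illustration of what the GEP asserts, the sketch below compares ridge regression risk with whitened non-Gaussian (Rademacher) features against standard Gaussian features sharing the same assumed eigen-decay. The decay rate, ridge parameter, and dimensions are arbitrary choices for illustration, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_test, p, lam = 400, 2000, 800, 1e-3
decay = np.arange(1, p + 1) ** -1.0       # assumed spectral eigen-decay
w_star = rng.normal(size=p) * decay       # target aligned with the spectrum

def risk(features):
    """Test risk of ridge regression on eigen-decay-scaled features."""
    X = features(n) * decay
    X_test = features(n_test) * decay
    y = X @ w_star + 0.1 * rng.normal(size=n)
    w = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    return np.mean((X_test @ w - X_test @ w_star) ** 2)

rademacher = lambda m: rng.choice([-1.0, 1.0], size=(m, p))  # whitened, non-Gaussian
gaussian = lambda m: rng.normal(size=(m, p))                 # standard Gaussian
print(risk(rademacher), risk(gaussian))  # GEP predicts comparable values
```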