Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction

Neural Information Processing Systems

Decoding non-invasive brain recordings is pivotal for advancing our understanding of human cognition but faces challenges due to individual differences and complex neural signal representations. Traditional methods often require customized models and extensive trials, lacking interpretability in visual reconstruction tasks.


GSDF: 3DGS Meets SDF for Improved Neural Rendering and Reconstruction

Neural Information Processing Systems

Representing 3D scenes from multiview images remains a core challenge in computer vision and graphics, requiring both reliable rendering and reconstruction, which often conflict because image quality is prioritized over precise underlying scene geometry. Although both neural implicit surfaces and explicit Gaussian primitives have advanced with neural rendering techniques, current methods impose strict constraints on density fields or primitive shapes, which improves geometric reconstruction at the expense of rendering quality. To address this dilemma, we introduce GSDF, a dual-branch architecture combining 3D Gaussian Splatting (3DGS) and neural Signed Distance Fields (SDF). Our approach leverages mutual guidance and joint supervision during training to enhance both reconstruction and rendering. Specifically, our method guides the Gaussian primitives to locate near potential surfaces and accelerates the SDF convergence. This implicit mutual guidance ensures robustness and accuracy in both synthetic and real-world scenarios. Experimental results demonstrate that our method boosts the SDF optimization process to reconstruct more detailed geometry, while reducing floaters and blurry edge artifacts in rendering by aligning Gaussian primitives with the underlying geometry.


Andrea Banino

Neural Information Processing Systems

Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty using semantically meaningful state abstractions, which can be found in learned representations shaped by natural language. In particular, we evaluate vision-language representations pretrained on natural image captioning datasets. We show that these pretrained representations drive meaningful, task-relevant exploration and improve performance on 3D simulated environments. We also characterize why and how language provides useful abstractions for exploration by considering the impacts of using representations from a pretrained model, a language oracle, and several ablations. We demonstrate the benefits of our approach with on- and off-policy RL algorithms and in two very different task domains: one that stresses the identification and manipulation of everyday objects, and one that requires navigational exploration in an expansive world. Our results suggest that using language-shaped representations could improve exploration for various algorithms and agents in challenging environments.
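As a rough illustration of what "novelty over state abstractions" can mean, the sketch below computes a simple count-based bonus over abstract state keys rather than raw observations. It is not the method used in the paper: the class `AbstractionNoveltyBonus`, the `toy_caption` function, and the 1/sqrt(count) bonus are all assumptions chosen for brevity, with the caption standing in for a language-shaped representation.

```python
from collections import Counter
from math import sqrt


class AbstractionNoveltyBonus:
    """Count-based novelty bonus over abstract state keys (illustrative only)."""

    def __init__(self, abstraction):
        # `abstraction` maps a raw observation to a hashable key,
        # e.g. a caption string produced by some captioning model.
        self.abstraction = abstraction
        self.counts = Counter()

    def __call__(self, observation):
        key = self.abstraction(observation)
        self.counts[key] += 1
        # The bonus shrinks each time the same abstract state is revisited.
        return 1.0 / sqrt(self.counts[key])


def toy_caption(observation):
    # Hypothetical stand-in for a language abstraction:
    # keep the object name, ignore the continuous position.
    return observation["object"]


bonus = AbstractionNoveltyBonus(toy_caption)
observations = [
    {"object": "red ball", "xyz": (0.1, 0.2, 0.3)},
    {"object": "red ball", "xyz": (0.4, 0.1, 0.9)},  # new position, same abstraction
    {"object": "blue cup", "xyz": (0.4, 0.1, 0.9)},  # same position, new abstraction
]
rewards = [bonus(o) for o in observations]
print(rewards)
```

In this toy setting, moving to a new position while looking at the same object earns a smaller bonus than encountering a new object, which is the intuition behind abstracting away task-irrelevant detail in high-dimensional state spaces.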



ENS10_Paper_Neurips22

Neural Information Processing Systems

Post-processing ensemble prediction systems can improve the reliability of weather forecasting, especially for extreme event prediction. In recent years, different machine learning models have been developed to improve the quality of weather post-processing. However, these models require a comprehensive dataset of weather simulations to produce high-accuracy results, which comes at a high computational cost to generate.


Characterization of Excess Risk for Locally Strongly Convex Population Risk (Mingyang Yi, Zhi-Ming Ma; University of Chinese Academy of Sciences)

Neural Information Processing Systems

We establish upper bounds for the expected excess risk of models trained by proper iterative algorithms which approximate the local minima. Unlike results built upon global strong convexity or global growth conditions, e.g., the PL-inequality, we only require the population risk to be locally strongly convex around its local minima. Concretely, our bound under convex problems is of order Õ(1/n). For non-convex problems with d model parameters such that d/n is smaller than a threshold independent of n, the order of Õ(1/n) can be maintained if the empirical risk has no spurious local minima with high probability. Moreover, the bound for non-convex problems becomes Õ(1/√n) without such an assumption. Our results are derived via algorithmic stability and a characterization of the empirical risk's landscape. Compared with existing algorithmic-stability-based results, our bounds are dimension-insensitive and place no restrictions on the algorithm's implementation, learning rate, or number of iterations. Our bounds underscore that with locally strongly convex population risk, the models trained by any proper iterative algorithm can generalize well, even for non-convex problems and when d is large.
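For readers unfamiliar with the term, local strong convexity can be written as follows. This is the standard textbook form restricted to a ball around a local minimum; the symbols R, w^*, r, and λ are notation assumed here, and the paper's exact condition may differ.

```latex
% Standard definition, restricted to a neighbourhood of a local minimum w^*.
% R: population risk, r > 0: radius of the neighbourhood, \lambda > 0: convexity modulus.
R(w') \;\ge\; R(w) + \langle \nabla R(w),\, w' - w \rangle
      + \frac{\lambda}{2}\,\lVert w' - w \rVert^{2}
\quad \text{for all } w, w' \in \{\, v : \lVert v - w^{*} \rVert \le r \,\}.
```

Global strong convexity would require the same inequality for all pairs of parameters, which is why the local version is a weaker assumption.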


Scenario Diffusion: Controllable Driving Scenario Generation With Diffusion

Neural Information Processing Systems

Automated creation of synthetic traffic scenarios is a key part of validating the safety of autonomous vehicles (AVs). In this paper, we propose Scenario Diffusion, a novel diffusion-based architecture for generating traffic scenarios that enables controllable scenario generation. We combine latent diffusion, object detection and trajectory regression to generate distributions of synthetic agent poses, orientations and trajectories simultaneously. To provide additional control over the generated scenario, this distribution is conditioned on a map and sets of tokens describing the desired scenario. We show that our approach has sufficient expressive capacity to model diverse traffic patterns and generalizes to different geographical regions.


5c8cb735a1ce65dac514233cbd5576d6-AuthorFeedback.pdf

Neural Information Processing Systems

First of all, we want to thank every reviewer for their valuable notes and comments. In particular, we will discuss the tuning time of the algorithms. Our paper is based on a standard GBDT score function (as, e.g., in [21]). The algorithm is easy to derive from our paper, when you replace a leaf size in Eq. 6 with sum Performance of this hessian-based sampling is even better (see Table 1), and we will add these results to the paper. We will add this to the paper.