Neural Information Processing Systems
DiffuBox: Refining 3D Object Detection with Point Diffusion
Katie Z Luo
Ensuring robust 3D object detection and localization is crucial for many applications in robotics and autonomous driving. Recent models, however, face difficulties in maintaining high performance when applied to domains with differing sensor setups or geographic locations, often resulting in poor localization accuracy due to domain shift. To overcome this challenge, we introduce a novel diffusion-based box refinement approach. This method employs a domain-agnostic diffusion model, conditioned on the LiDAR points surrounding a coarse bounding box, to simultaneously refine the box's location, size, and orientation. We evaluate this approach under various domain adaptation settings, and our results reveal significant improvements across different datasets, object classes and detectors.
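As a rough, illustrative sketch of this kind of LiDAR-conditioned box refinement (not the authors' implementation), the minimal PyTorch example below denoises a 7-D box parameterization (x, y, z, l, w, h, yaw) conditioned on pooled features of nearby points; the network, timestep embedding, and noise schedule are placeholder assumptions.

```python
# Hypothetical sketch of diffusion-based box refinement (not the DiffuBox code).
# A noise-prediction network, conditioned on LiDAR points near a coarse box,
# iteratively denoises the 7-D box parameters (x, y, z, l, w, h, yaw).
import torch
import torch.nn as nn

class BoxDenoiser(nn.Module):
    def __init__(self, point_dim=3, box_dim=7, hidden=128):
        super().__init__()
        self.point_enc = nn.Sequential(nn.Linear(point_dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, hidden))
        self.head = nn.Sequential(nn.Linear(hidden + box_dim + 1, hidden), nn.ReLU(),
                                  nn.Linear(hidden, box_dim))

    def forward(self, noisy_box, points, t):
        # Pool a permutation-invariant feature from the points around the box.
        ctx = self.point_enc(points).max(dim=1).values        # (B, hidden)
        t_feat = t.float().unsqueeze(-1) / 1000.0             # crude timestep embedding
        return self.head(torch.cat([ctx, noisy_box, t_feat], dim=-1))  # predicted noise

@torch.no_grad()
def refine_box(model, coarse_box, points, steps=50):
    """DDPM-style reverse pass, started from the coarse box estimate.
    The stochastic noise term is omitted for a deterministic sketch."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = torch.cumprod(1.0 - betas, dim=0)
    box = coarse_box.clone()
    for t in reversed(range(steps)):
        eps = model(box, points, torch.full((box.shape[0],), t))
        box = (box - betas[t] / (1 - alphas[t]).sqrt() * eps) / (1 - betas[t]).sqrt()
    return box

# Usage (untrained toy model, interface only): refine one coarse detection
# given 256 nearby LiDAR points.
model = BoxDenoiser()
refined = refine_box(model, torch.randn(1, 7), torch.randn(1, 256, 3))
```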
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators
Allen Nie, Christina J. Yuan
Offline policy evaluation (OPE) allows us to evaluate and estimate a new sequential decision-making policy's performance by leveraging historical interaction data collected from other policies. Evaluating a new policy online without a confident estimate of its performance can lead to costly, unsafe, or hazardous outcomes, especially in education and healthcare. Several OPE estimators have been proposed in the last decade, many of which have hyperparameters and require training. Unfortunately, it remains unclear how to choose the best OPE algorithm for each task and domain. In this paper, we propose a new algorithm that adaptively blends a set of OPE estimators for a given dataset using a statistical procedure, without relying on an explicit selection step. We prove that our estimator is consistent and satisfies several desirable properties for policy evaluation. Additionally, we demonstrate that, compared to alternative approaches, our estimator can be used to select higher-performing policies in healthcare and robotics. Our work improves the ease of use of a general-purpose, estimator-agnostic, off-policy evaluation framework for offline RL.
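The following is a hedged sketch of one way to re-weight and aggregate several OPE estimates from bootstrap replicates; the MSE proxy and closed-form weights are illustrative assumptions and not necessarily OPERA's exact statistical procedure.

```python
# Hypothetical sketch of re-weighted aggregation of OPE estimators (not OPERA's exact method).
# Given bootstrap replicates of each estimator, pick weights summing to one that minimize
# the estimated mean-squared error of the weighted combination.
import numpy as np

def blend_estimators(bootstrap_estimates, point_estimates):
    """bootstrap_estimates: (n_boot, n_estimators); point_estimates: (n_estimators,)."""
    # Approximate each estimator's error by its deviation from the consensus mean.
    consensus = bootstrap_estimates.mean()
    errors = bootstrap_estimates - consensus
    C = errors.T @ errors / len(bootstrap_estimates)   # estimated error second-moment matrix
    # Closed-form minimizer of w' C w subject to sum(w) = 1 (weights may be negative here;
    # a projection onto the simplex could be added if nonnegativity is desired).
    ones = np.ones(C.shape[0])
    w = np.linalg.solve(C + 1e-6 * np.eye(C.shape[0]), ones)
    w /= w.sum()
    return w, float(w @ point_estimates)

# Usage with three toy estimators (e.g., IS, weighted IS, doubly robust) and 200 bootstraps.
rng = np.random.default_rng(0)
boot = rng.normal(loc=[1.0, 1.1, 0.9], scale=[0.5, 0.2, 0.3], size=(200, 3))
weights, blended_value = blend_estimators(boot, boot.mean(axis=0))
```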
Private Attribute Inference from Images with Vision-Language Models
As large language models (LLMs) become ubiquitous in our daily tasks and digital interactions, associated privacy risks are increasingly in focus. While LLM privacy research has primarily focused on the leakage of model training data, it has recently been shown that LLMs can make accurate privacy-infringing inferences from previously unseen texts. With the rise of vision-language models (VLMs), capable of understanding both images and text, a key question is whether this concern transfers to the previously unexplored domain of benign images posted online. To answer this question, we compile an image dataset with human-annotated labels of the image owner's personal attributes. In order to understand the privacy risks posed by VLMs beyond traditional human attribute recognition, our dataset consists of images where the inferable private attributes do not stem from direct depictions of humans. On this dataset, we evaluate 7 state-of-the-art VLMs, finding that they can infer various personal attributes at up to 77.6% accuracy. Concerningly, we observe that accuracy scales with the general capabilities of the models, implying that future models can be misused as stronger inferential adversaries, establishing an imperative for the development of adequate defenses.
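A minimal sketch of the kind of evaluation loop such a study implies is shown below; `query_vlm` and the dataset fields are hypothetical stand-ins for illustration, not an API or schema from the paper.

```python
# Hedged sketch of measuring attribute-inference accuracy of a VLM on an annotated image set.
from typing import Callable, Dict, List

def evaluate_attribute_inference(
    samples: List[Dict],                      # each: {"image_path": ..., "attribute": ..., "label": ...}
    query_vlm: Callable[[str, str], str],     # hypothetical: (image_path, attribute) -> predicted value
) -> float:
    correct = 0
    for s in samples:
        prediction = query_vlm(s["image_path"], s["attribute"])
        correct += int(prediction.strip().lower() == s["label"].strip().lower())
    return correct / len(samples)             # top-1 accuracy against human-annotated labels

# Usage: accuracy = evaluate_attribute_inference(dataset, query_vlm=my_model_adapter)
```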
SF-V: Single Forward Video Generation Model
Diffusion-based video generation models have demonstrated remarkable success in obtaining high-fidelity videos through the iterative denoising process. However, these models require multiple denoising steps during sampling, resulting in high computational costs. In this work, we propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune pretrained video diffusion models. We show that, through adversarial training, the multi-step video diffusion model, i.e., Stable Video Diffusion (SVD), can be trained to perform a single forward pass to synthesize high-quality videos, capturing both temporal and spatial dependencies in the video data. Extensive experiments demonstrate that our method achieves competitive generation quality with significantly reduced computational overhead for the denoising process (i.e., around 23× speedup compared with SVD and 6× speedup compared with existing works, with even better generation quality), paving the way for real-time video synthesis and editing. (Work done during an internship at Snap Inc.)
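A hedged sketch of adversarial one-step fine-tuning in this spirit is shown below; the toy generator and discriminator stand in for the pretrained SVD backbone and a spatio-temporal critic, and the hinge-loss setup is an illustrative assumption rather than the paper's training recipe.

```python
# Hedged sketch of adversarial single-step distillation on toy video latents (not SF-V's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

B, T, C, H, W = 2, 8, 4, 16, 16                             # toy latent video shape
generator = nn.Conv3d(C, C, kernel_size=3, padding=1)       # stand-in for the diffusion UNet
discriminator = nn.Sequential(nn.Conv3d(C, 32, 3, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 1))
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

for step in range(2):                                       # a couple of toy iterations
    real = torch.randn(B, C, T, H, W)                       # latents of real videos
    noise = torch.randn_like(real)

    # One forward pass of the generator produces the "single-step" sample.
    fake = generator(noise)

    # Discriminator update: hinge loss on real vs. generated videos.
    d_loss = (F.relu(1 - discriminator(real)).mean()
              + F.relu(1 + discriminator(fake.detach())).mean())
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: fool the discriminator with its single-step output.
    g_loss = -discriminator(generator(noise)).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```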
Coordinated hippocampal-entorhinal replay as structural inference
Constructing and maintaining useful representations of sensory experience is essential for reasoning about one's environment. High-level associative (topological) maps can be useful for efficient planning and are easily constructed from experience. Conversely, embedding new experiences within a metric structure allows them to be integrated with existing ones and novel associations to be implicitly inferred. Neurobiologically, the synaptic associations between hippocampal place cells and entorhinal grid cells are thought to represent associative and metric structures, respectively. Learning the place-grid cell associations can therefore be interpreted as learning a mapping between these two spaces. Here, we show how this map could be constructed by probabilistic message passing through the hippocampal-entorhinal system, where messages are scheduled to reduce the propagation of redundant information. We propose that this offline inference corresponds to coordinated hippocampal-entorhinal replay during sharp wave ripples. Our results also suggest that the metric map will contain local distortions that reflect the inferred structure of the environment according to associative experience, explaining observed grid deformations.
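As a loose computational analogy (an assumption-laden toy, not the paper's model), the sketch below infers a metric layout from purely associative links via locally scheduled message passing: the most informative updates propagate first and near-redundant messages are deferred.

```python
# Hedged toy illustration: recover metric coordinates from an associative (adjacency) graph
# using prioritized, scheduled local updates. The update rule and schedule are assumptions.
import heapq
import numpy as np

def infer_metric_map(adjacency, dim=2, iters=200, tol=1e-4, seed=0):
    """adjacency: (N, N) matrix of associative links between 'place' states."""
    rng = np.random.default_rng(seed)
    n = adjacency.shape[0]
    pos = rng.normal(size=(n, dim))                    # initial metric estimates
    # Priority queue keyed by (negative) last observed change, so the most-changed
    # node propagates first and redundant messages are deferred.
    queue = [(-np.inf, i) for i in range(n)]
    heapq.heapify(queue)
    for _ in range(iters):
        if not queue:
            break
        _, i = heapq.heappop(queue)
        nbrs = np.flatnonzero(adjacency[i])
        if len(nbrs) == 0:
            continue
        # Message from each neighbour: "sit at unit distance from me, along your current direction".
        diffs = pos[i] - pos[nbrs]
        dists = np.linalg.norm(diffs, axis=1, keepdims=True) + 1e-9
        new = (pos[nbrs] + diffs / dists).mean(axis=0)
        change = np.linalg.norm(new - pos[i])
        pos[i] = new
        if change > tol:                               # only reschedule neighbours if informative
            for j in nbrs:
                heapq.heappush(queue, (-change, j))
    return pos

# Usage: a ring of 6 associatively linked states relaxes toward a consistent metric layout.
A = np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
coords = infer_metric_map(A)
```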
Author Feedback (aa36c88c27650af3b9868b723ae15dfc-AuthorFeedback.pdf)
We thank all reviewers for their time and valuable comments. We thank this reviewer for the positive feedback!
"The theoretical sample complexity is not significantly improved over previously-known methods." The main contribution of our paper is to show that an existing and popular algorithm (i.e., group-sparse regularized [...]). We view the sample complexity improvement in the dependence on k as a side benefit of our analysis.
"It would be interesting to see a more thorough empirical evaluation, to compare with the interaction screening [...]" The main contribution of our paper is theoretical. Our graph has a diamond shape (Figure 1 of our paper), 10 variables, and edge weight 0.2. This observation is actually the starting point of our paper. We will include this discussion in our paper.
"The presentation is quite technical... the Ising case seems to be enough to introduce the main idea... but a lot of [...]" For learning non-binary graphical models, we see a benefit of using the group-sparse (i.e., the l [...]).
"Experiments are only presented for rather small examples (up to 14 variables, up to k = 6)."
DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection
Domain adaptive object detection (DAOD) aims to generalize detectors trained on an annotated source domain to an unlabelled target domain. Since visual-language models (VLMs) can provide essential general knowledge on unseen images, freezing the visual encoder and inserting a domain-agnostic adapter can learn domain-invariant knowledge for DAOD. However, the domain-agnostic adapter is inevitably biased toward the source domain.
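The snippet below is a minimal, generic sketch of inserting a trainable bottleneck adapter alongside a frozen encoder block; the module names and bottleneck design are illustrative assumptions, not DA-Ada's architecture.

```python
# Hedged sketch: keep the VLM's visual encoder frozen and train only a small residual adapter.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, dim, reduction=4):
        super().__init__()
        self.down = nn.Linear(dim, dim // reduction)
        self.up = nn.Linear(dim // reduction, dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))     # residual, so the frozen path is preserved

class FrozenEncoderWithAdapter(nn.Module):
    def __init__(self, encoder_block, dim):
        super().__init__()
        self.block = encoder_block
        for p in self.block.parameters():
            p.requires_grad = False                    # the VLM encoder stays frozen
        self.adapter = BottleneckAdapter(dim)          # only the adapter receives gradients

    def forward(self, x):
        return self.adapter(self.block(x))

# Usage with a toy "encoder block" of width 256 applied to token features.
layer = FrozenEncoderWithAdapter(nn.Linear(256, 256), dim=256)
out = layer(torch.randn(8, 196, 256))
```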
From Dictionary to Tensor: A Scalable Multi-View Subspace Clustering Framework with Triple Information Enhancement
Zhibin Gu (College of Computer and Cyber Security, Hebei Normal University, China)
While Tensor-based Multi-view Subspace Clustering (TMSC) has garnered significant attention for its capacity to effectively capture high-order correlations among multiple views, current TMSC methods suffer from three notable limitations: 1) high computational complexity and reliance on dictionary completeness resulting from using the observed data as the dictionary, 2) inaccurate subspace representation stemming from overlooking local geometric information, and 3) under-penalization of noise-related singular values within tensor data caused by treating all singular values equally. To address these limitations, this paper presents a Scalable TMSC framework with Triple infOrmatioN Enhancement (STONE). Notably, an enhanced anchor dictionary learning mechanism is utilized to recover the low-rank anchor structure, resulting in reduced computational complexity and increased resilience, especially in scenarios with inadequate dictionaries. Additionally, we introduce an anchor hypergraph Laplacian regularizer to preserve the inherent geometry of the data within the subspace representation. Simultaneously, an improved hyperbolic tangent function is employed as a precise approximation of tensor rank, effectively capturing the significant variations in singular values. Extensive experiments on a variety of datasets show that STONE outperforms state-of-the-art approaches in both effectiveness and efficiency.
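The following sketch illustrates a tanh-based rank surrogate and the corresponding reweighted singular-value shrinkage on a single matrix; the exact surrogate, weighting, and solver in STONE may differ, and `gamma` and `tau` are illustrative hyperparameters.

```python
# Hedged sketch of a hyperbolic-tangent rank surrogate and reweighted singular-value shrinkage.
import numpy as np

def tanh_rank_surrogate(X, gamma=5.0):
    """Approximate rank(X) by sum(tanh(gamma * sigma_i)): close to 1 for large singular
    values and close to 0 for noise-level ones, unlike the nuclear norm which treats
    all singular values equally."""
    s = np.linalg.svd(X, compute_uv=False)
    return np.tanh(gamma * s).sum()

def reweighted_svt(X, tau=0.5, gamma=5.0):
    """One step of weighted singular-value thresholding: the weight is the surrogate's
    derivative, so large (signal) singular values are shrunk less than small (noise) ones."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    weights = gamma * (1.0 - np.tanh(gamma * s) ** 2)
    s_shrunk = np.maximum(s - tau * weights, 0.0)
    return (U * s_shrunk) @ Vt

# Usage: a noisy low-rank matrix is denoised while its dominant singular values survive.
rng = np.random.default_rng(0)
L = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 50))
X_hat = reweighted_svt(L + 0.1 * rng.normal(size=(50, 50)))
```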