AITopics

UAV3D: A Large-scale 3D Perception Benchmark for Unmanned Aerial Vehicles

Neural Information Processing Systems, May-29-2025, 19:14:10 GMT

Unmanned Aerial Vehicles (UAVs), equipped with cameras, are employed in numerous applications, including aerial photography, surveillance, and agriculture. In these applications, robust object detection and tracking are essential for the effective deployment of UAVs. However, existing benchmarks for UAV applications are mainly designed for traditional 2D perception tasks, restricting the development of real-world applications that require a 3D understanding of the environment. Furthermore, despite recent advancements in single-UAV perception, limited views of a single UAV platform significantly constrain its perception capabilities over long distances or in occluded areas. To address these challenges, we introduce UAV3D - a benchmark designed to advance research in both 3D and collaborative 3D perception tasks with UAVs. UAV3D comprises 1,000 scenes, each of which has 20 frames with fully annotated 3D bounding boxes on vehicles. We provide the benchmark for four 3D perception tasks: single-UAV 3D object detection, single-UAV object tracking, collaborative-UAV 3D object detection, and collaborative-UAV object tracking.

artificial intelligence, dataset, detection, (16 more...)

Country:

North America > United States > Connecticut (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Industry:

Media > Photography (0.88)
Information Technology > Robotics & Automation (0.72)
Aerospace & Defense > Aircraft (0.61)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Deep Shells: Unsupervised Shape Correspondence with Optimal Transport

Aysim Toker, Technical University of Munich

Neural Information Processing Systems, May-29-2025, 19:13:35 GMT

We propose a novel unsupervised learning approach to 3D shape correspondence that builds a multiscale matching pipeline into a deep neural network. This approach is based on smooth shells, the current state-of-the-art axiomatic correspondence method, which requires an a priori stochastic search over the space of initial poses. Our goal is to replace this costly preprocessing step by directly learning good initializations from the input surfaces. To that end, we systematically derive a fully differentiable, hierarchical matching pipeline from entropy regularized optimal transport. This allows us to combine it with a local feature extractor based on smooth, truncated spectral convolution filters. Finally, we show that the proposed unsupervised method significantly improves over the state-of-the-art on multiple datasets, even in comparison to the most recent supervised methods. Moreover, we demonstrate compelling generalization results by applying our learned filters to examples that significantly deviate from the training set.

artificial intelligence, correspondence, machine learning, (15 more...)

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

769c3bce651ce5feaa01ce3b75986420-AuthorFeedback.pdf

Neural Information Processing Systems, May-29-2025, 19:13:22 GMT

First of all, we would like to thank the reviewers for their constructive feedback and thorough reviews. In comparison to prior work, our network is more accurate and generalizes better across benchmarks. Table 1 proves that our method is robust to this type of input noise. The remaining "3 of 5" experiments in [7, Figure 1] that R3 frequently refers to are based on Surreal which is We will state this point more clearly in the paper and thank the reviewer for the insight. According to [7, Appendix C] this leads to problems for humans in "bent over poses", see also Figure 2 of our Appendix.

artificial intelligence, machine learning, table 1, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

ULNeF: Untangled Layered Neural Fields for Mix-and-Match Virtual Try-On

Miguel A. Otaduy, Universidad Rey Juan Carlos, Madrid, Spain

Neural Information Processing Systems, May-29-2025, 19:12:51 GMT

Recent advances in neural models have shown great results for virtual try-on (VTO) problems, where a 3D representation of a garment is deformed to fit a target body shape. However, current solutions are limited to a single garment layer, and cannot address the combinatorial complexity of mixing different garments. Motivated by this limitation, we investigate the use of neural fields for mix-and-match VTO, and identify and solve a fundamental challenge that existing neural-field methods cannot address: the interaction between layered neural fields. To this end, we propose a neural model that untangles layered neural fields to represent collision-free garment surfaces. The key ingredient is a neural untangling projection operator that works directly on the layered neural fields, not on explicit surface representations. Algorithms to resolve object-object interaction are inherently limited by the use of explicit geometric representations, and we show how methods that work directly on neural implicit representations could bring a change of paradigm and open the door to radically different approaches.

artificial intelligence, computer vision, machine learning, (13 more...)

Country: Europe > Spain > Galicia > Madrid (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots (0.68)

Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models

Neural Information Processing Systems, May-29-2025, 19:12:39 GMT

Large language models are probabilistic models, and the process of generating content is essentially sampling from the output distribution of the language model. Existing watermarking techniques inject watermarks into the generated content without altering the output quality. On the other hand, existing acceleration techniques, specifically speculative sampling, leverage a draft model to speed up the sampling process while preserving the output distribution. However, there is no known method to simultaneously accelerate the sampling process and inject watermarks into the generated content. In this paper, we investigate this direction and find that the integration of watermarking and acceleration is non-trivial. We prove a no-go theorem, which states that it is impossible to simultaneously maintain the highest watermark strength and the highest sampling efficiency. Furthermore, we propose two methods that maintain either the sampling efficiency or the watermark strength, but not both. Our work provides a rigorous theoretical foundation for understanding the inherent trade-off between watermark strength and sampling efficiency in accelerating the generation of watermarked tokens for large language models. We also conduct numerical experiments to validate our theoretical findings and demonstrate the effectiveness of the proposed methods.
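To make "preserving the output distribution" concrete, here is a minimal sketch of the standard accept/reject step used in speculative sampling, for a single token and with no watermarking. It is not the method proposed in the paper. The function names and the toy distributions `p` (target model) and `q` (draft model) are assumptions chosen for illustration.

```python
import random

def speculative_step(p, q, rng=random):
    """One accept/reject step: p is the target distribution, q the draft distribution."""
    tokens = range(len(p))
    # The draft model proposes a token.
    x = rng.choices(tokens, weights=q)[0]
    # Accept the proposal with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    # On rejection, resample from the normalized residual max(p - q, 0).
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    return rng.choices(tokens, weights=residual)[0], False

def output_distribution(p, q):
    """Exact marginal distribution of the token returned by speculative_step."""
    # Probability of proposing x and accepting it: q[x] * min(1, p[x] / q[x]).
    accept = [min(pi, qi) for pi, qi in zip(p, q)]
    reject_mass = 1.0 - sum(accept)
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    total = sum(residual)
    if total == 0.0:  # p == q, so every proposal is accepted
        return accept
    return [a + reject_mass * r / total for a, r in zip(accept, residual)]

p = [0.5, 0.3, 0.2]  # hypothetical target distribution
q = [0.2, 0.5, 0.3]  # hypothetical draft distribution
print(output_distribution(p, q))
```

The exact marginal computed by `output_distribution` coincides with `p` up to floating-point error, which is the sense in which the draft model speeds up sampling without changing what is sampled. The acceptance probability `sum(min(p, q))` is one natural measure of sampling efficiency in this toy setting.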

large language model, machine learning, natural language, (16 more...)

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

BRP-NAS: Prediction-based NAS using GCNs

Mohamed S. Abdelfattah, Royson Lee

Neural Information Processing Systems, May-29-2025, 19:12:03 GMT

Neural architecture search (NAS) enables researchers to automatically explore broad design spaces in order to improve the efficiency of neural networks. This efficiency is especially important in the case of on-device deployment, where improvements in accuracy should be balanced against the computational demands of a model. In practice, performance metrics of a model are computationally expensive to obtain. Previous work uses a proxy (e.g., number of operations) or a layer-wise measurement of neural network layers to estimate end-to-end hardware performance, but the imprecise prediction diminishes the quality of NAS. To address this problem, we propose BRP-NAS, an efficient hardware-aware NAS enabled by an accurate performance predictor based on a graph convolutional network (GCN). What is more, we investigate prediction quality on different metrics and show that sample efficiency of the predictor-based NAS can be improved by considering binary relations of models and an iterative data selection strategy. We show that our proposed method outperforms all prior methods on NAS-Bench-101 and NAS-Bench-201, and that our predictor can consistently learn to extract useful features from the DARTS search space, improving upon the second-order baseline. Finally, to raise awareness of the fact that accurate latency estimation is not a trivial task, we release LatBench - a latency dataset of NAS-Bench-201 models running on a broad range of devices.
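One way to read "binary relations of models" is that a predictor only needs to say which of two models is better, not estimate each model's exact score. The sketch below illustrates that idea under stated assumptions: the model names and accuracies are invented, and a plain comparison of measured values stands in for the learned GCN predictor, whose details the abstract does not give.

```python
from itertools import combinations

def make_pairs(measured):
    """Turn n measured models into n*(n-1)/2 labeled pairs (label 1 if the first is better)."""
    return [(x, y, 1 if measured[x] > measured[y] else 0)
            for x, y in combinations(sorted(measured), 2)]

def rank_by_wins(candidates, better):
    """Order candidates by how many pairwise comparisons they win."""
    wins = {c: 0 for c in candidates}
    for x, y in combinations(candidates, 2):
        wins[x if better(x, y) else y] += 1
    return sorted(candidates, key=lambda c: wins[c], reverse=True)

# Hypothetical measurements: model name -> validation accuracy.
measured = {"net_a": 0.91, "net_b": 0.94, "net_c": 0.89}

pairs = make_pairs(measured)
# Stand-in for a learned binary predictor: compare the measured values directly.
ranking = rank_by_wins(list(measured), lambda x, y: measured[x] > measured[y])
print(pairs)
print(ranking)
```

Three measurements already yield three training pairs, and the number of pairs grows quadratically with the number of measured models, which gives some intuition for why a pairwise formulation can be more sample-efficient.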

artificial intelligence, machine learning, predictor, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Texas (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

63ba665e01f39233674426ba36d6e177-Paper-Conference.pdf

Neural Information Processing Systems, May-29-2025, 19:09:27 GMT

Humans judge perceptual similarity according to diverse visual attributes, including scene layout, subject location, and camera pose. Existing vision models understand a wide range of semantic abstractions but improperly weigh these attributes and thus make inferences misaligned with human perception. While vision representations have previously benefited from alignment in contexts like image generation, the utility of perceptually aligned representations in general-purpose settings remains unclear. Here, we investigate how aligning vision representations to human perceptual judgments impacts their usability across diverse vision tasks. We finetune state-of-the-art models on human similarity judgments for image triplets and evaluate them across standard benchmarks. We find that perceptual alignment yields representations that improve upon the original backbones across many tasks, including counting, segmentation, depth estimation, instance retrieval, and retrieval-augmented generation, while deteriorating performance on natural classification. Performance is widely preserved on other tasks, including specialized out-of-distribution domains such as medical imaging and 3D environment frames. Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
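The abstract does not state the training objective, so the following is only a sketch of one common way to learn from triplet judgments: a hinge loss that is zero when the embedding distances agree with the human choice by at least a margin. The cosine distance, the margin value, and the toy two-dimensional embeddings are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def triplet_hinge_loss(ref, a, b, human_choice, margin=0.05):
    """human_choice is 0 if annotators judged `a` closer to `ref`, 1 if `b`."""
    d_a = 1.0 - cosine(ref, a)
    d_b = 1.0 - cosine(ref, b)
    # Signed gap: positive when the embedding agrees with the human choice.
    gap = (d_b - d_a) if human_choice == 0 else (d_a - d_b)
    return max(0.0, margin - gap)

# Toy embeddings: `a` points almost the same way as `ref`, `b` is orthogonal to it.
ref, a, b = [1.0, 0.0], [1.0, 0.1], [0.0, 1.0]
loss_agree = triplet_hinge_loss(ref, a, b, human_choice=0)
loss_disagree = triplet_hinge_loss(ref, a, b, human_choice=1)
print(loss_agree, loss_disagree)
```

When the embedding already matches the human judgment the loss vanishes, and when it contradicts the judgment the loss is positive, so finetuning on it would pull the representation toward the human ordering.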

artificial intelligence, machine learning, natural language, (20 more...)

Country:

Europe > Netherlands (0.14)
Europe > France (0.14)
Africa > Rwanda (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form

Neural Information Processing Systems, May-29-2025, 19:08:58 GMT

Although optimal transport (OT) problems admit closed form solutions in very few notable cases, e.g. in 1D or between Gaussians, these closed forms have proved extremely fecund for practitioners to define tools inspired by the OT geometry. On the other hand, the numerical resolution of OT problems using entropic regularization has given rise to many applications, but because there are no known closed-form solutions for entropic regularized OT problems, these approaches are mostly algorithmic, not informed by elegant closed forms. In this paper, we propose to fill the void at the intersection between these two schools of thought in OT by proving that the entropy-regularized optimal transport problem between two Gaussian measures admits a closed form. Contrary to the unregularized case, for which the explicit form is given by the Wasserstein-Bures distance, the closed form we obtain is differentiable everywhere, even for Gaussians with degenerate covariance matrices. We obtain this closed form solution by solving the fixed-point equation behind Sinkhorn's algorithm, the default method for computing entropic regularized OT. Remarkably, this approach extends to the generalized unbalanced case -- where Gaussian measures are scaled by positive constants. This extension leads to a closed form expression for unbalanced Gaussians as well, and highlights the mass transportation / destruction trade-off seen in unbalanced optimal transport. Moreover, in both settings, we show that the optimal transportation plans are (scaled) Gaussians and provide analytical formulas of their parameters. These formulas constitute the first non-trivial closed forms for entropy-regularized optimal transport, thus providing a ground truth for the analysis of entropic OT and Sinkhorn's algorithm.
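For readers unfamiliar with the fixed-point equation mentioned above, here is a minimal sketch of Sinkhorn's algorithm for two small discrete measures in the balanced case. It does not reproduce the paper's Gaussian closed form. The support points, weights, squared-distance cost, and the regularization value `eps` are assumptions picked for illustration.

```python
import math

def sinkhorn(a, b, cost, eps, n_iters=500):
    """Entropic OT between discrete measures a and b; returns the transport plan."""
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so that the plan's row sums match a.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the plan's column sums match b.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

xs = [0.0, 1.0, 2.0]   # shared support points (hypothetical)
a = [0.5, 0.3, 0.2]    # source weights
b = [0.2, 0.3, 0.5]    # target weights
cost = [[(x - y) ** 2 for y in xs] for x in xs]

plan = sinkhorn(a, b, cost, eps=1.0)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

At convergence the scalings `u` and `v` no longer change, and the plan's marginals match `a` and `b`. That stationarity condition is the fixed-point equation that the abstract says is solved analytically for Gaussian measures.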

artificial intelligence, formula, machine learning, (15 more...)

Country:

North America (0.46)
Europe > France (0.28)
Asia > Japan (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

766e428d1e232bbdd58664b41346196c-AuthorFeedback.pdf

Neural Information Processing Systems, May-29-2025, 19:08:40 GMT

We thank the reviewers for their appreciative and thoughtful feedback. Reviewer 1. "However, the authors fail to bring the result to their impact of the current state OT, or any novel stochastic optimization algorithm designed to compute it faster. We will further emphasize these aspects. A measure with 0 mean. Reviewer 2. "If the paper could show the formula for that case [TV] that would be Reviewer 3. ""Figure 1 illustrates the convergence"... the convergence of what?" "Figure 2 is also difficult to understand. "[On the proof of prop 2] Could Which proves the uniform bound.

algorithm, artificial intelligence, eigenvalue, (8 more...)

Technology: Information Technology > Artificial Intelligence (0.56)
