AITopics

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsFeb-10-2026, 22:28:20 GMT

e92e1b476bb5262d793fd40931e0ed53-Paper.pdf

computer vision, proc, representation, (11 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > California (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Genre: Research Report (0.69)

Industry: Information Technology (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsFeb-10-2026, 21:04:01 GMT

NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Benchmarking vision-based driving policies is challenging. On one hand, open-loop evaluation with real data is easy, but these results do not reflect closed-loop performance.

machine learning, natural language, simulation, (17 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (0.70)
Automobiles & Trucks (0.70)
Information Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.85)

Neural Information Processing SystemsOct-9-2025, 22:46:46 GMT

32768f7faf1995026ef9821c696f3404-Paper-Datasets_and_Benchmarks_Track.pdf

autonomous driving, computer vision, simulation, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (0.70)
Automobiles & Trucks (0.70)
Information Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Vision (0.70)
(2 more...)

Neural Information Processing SystemsOct-9-2025, 16:18:21 GMT

9f0b1220028dfa2ee82ca0a0e0fc52d1-Paper-Conference.pdf

artificial intelligence, computer vision, machine learning, (14 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsAug-17-2025, 02:18:52 GMT

GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis

While 2D generative adversarial networks have enabled high-resolution image synthesis, they largely lack an understanding of the 3D world and the image formation process.

computer vision, proc, representation, (12 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > California (0.04)
North America > Canada (0.04)
(2 more...)

Genre: Research Report (0.69)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Li, Songlin, Paschalidou, Despoina, Guibas, Leonidas

PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers

arXiv.org Artificial IntelligenceJul-18-2024

The increased demand for tools that automate the 3D content creation process led to tremendous progress in deep generative models that can generate diverse 3D objects of high fidelity. In this paper, we present PASTA, an autoregressive transformer architecture for generating high quality 3D shapes. PASTA comprises two main components: An autoregressive transformer that generates objects as a sequence of cuboidal primitives and a blending network, implemented with a transformer decoder that composes the sequences of cuboids and synthesizes high quality meshes for each object. Our model is trained in two stages: First we train our autoregressive generative model using only annotated cuboidal parts as supervision and next, we train our blending network using explicit 3D supervision, in the form of watertight meshes. Evaluations on various ShapeNet objects showcase the ability of our model to perform shape generation from diverse inputs \eg from scratch, from a partial object, from text and images, as well size-guided generation, by explicitly conditioning on a bounding box that defines the object's boundaries. Moreover, as our model considers the underlying part-based structure of a 3D object, we are able to select a specific part and produce shapes with meaningful variations of this part. As evidenced by our experiments, our model generates 3D shapes that are both more realistic and diverse than existing part-based and non part-based methods, while at the same time is simpler to implement and train.

computer vision, conf, proc, (13 more...)

2407.13677

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

arXiv.org Artificial IntelligenceJun-21-2024

NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Dauner, Daniel, Hallgarten, Marcel, Li, Tianyu, Weng, Xinshuo, Huang, Zhiyu, Yang, Zetong, Li, Hongyang, Gilitschenski, Igor, Ivanovic, Boris, Pavone, Marco, Geiger, Andreas, Chitta, Kashyap

Benchmarking vision-based driving policies is challenging. On one hand, open-loop evaluation with real data is easy, but these results do not reflect closed-loop performance. On the other, closed-loop evaluation is possible in simulation, but is hard to scale due to its significant computational demands. Further, the simulators available today exhibit a large domain gap to real data. This has resulted in an inability to draw clear conclusions from the rapidly growing body of research on end-to-end autonomous driving. In this paper, we present NAVSIM, a middle ground between these evaluation paradigms, where we use large datasets in combination with a non-reactive simulator to enable large-scale real-world benchmarking. Specifically, we gather simulation-based metrics, such as progress and time to collision, by unrolling bird's eye view abstractions of the test scenes for a short simulation horizon. Our simulation is non-reactive, i.e., the evaluated policy and environment do not influence each other. As we demonstrate empirically, this decoupling allows open-loop metric computation while being better aligned with closed-loop evaluations than traditional displacement errors. NAVSIM enabled a new competition held at CVPR 2024, where 143 teams submitted 463 entries, resulting in several new insights. On a large set of challenging scenarios, we observe that simple methods with moderate compute requirements such as TransFuser can match recent large-scale end-to-end driving architectures such as UniAD. Our modular framework can potentially be extended with new datasets, data curation strategies, and metrics, and will be continually maintained to host future challenges. Our code is available at https://github.com/autonomousvision/navsim.

autonomous driving, computer vision, simulation, (14 more...)

2406.15349

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.65)

Industry:

Transportation > Ground > Road (0.91)
Automobiles & Trucks (0.91)
Energy (0.77)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Dauner, Daniel, Hallgarten, Marcel, Geiger, Andreas, Chitta, Kashyap

Parting with Misconceptions about Learning-based Vehicle Motion Planning

arXiv.org Artificial IntelligenceNov-2-2023

The release of nuPlan marks a new era in vehicle motion planning research, offering the first large-scale real-world dataset and evaluation schemes requiring both precise short-term planning and long-horizon ego-forecasting. Existing systems struggle to simultaneously meet both requirements. Indeed, we find that these tasks are fundamentally misaligned and should be addressed independently. We further assess the current state of closed-loop planning in the field, revealing the limitations of learning-based methods in complex real-world scenarios and the value of simple rule-based priors such as centerline selection through lane graph search algorithms. More surprisingly, for the open-loop sub-task, we observe that the best results are achieved when using only this centerline as scene context (i.e., ignoring all information regarding the map and other agents). Combining these insights, we propose an extremely simple and efficient planner which outperforms an extensive set of competitors, winning the nuPlan planning challenge 2023.

autonomous driving, ieee international conf, proc, (12 more...)

2306.07962

Country:

North America > United States (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Energy (0.52)
Transportation (0.49)
Automobiles & Trucks (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.72)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.61)

arXiv.org Artificial IntelligenceJul-23-2022

A Visual Navigation Perspective for Category-Level Object Pose Estimation

Guo, Jiaxin, Zhong, Fangxun, Xiong, Rong, Liu, Yunhui, Wang, Yue, Liao, Yiyi

This paper studies category-level object pose estimation based on a single monocular image. Recent advances in pose-aware generative models have paved the way for addressing this challenging task using analysis-by-synthesis. The idea is to sequentially update a set of latent variables, e.g., pose, shape, and appearance, of the generative model until the generated image best agrees with the observation. However, convergence and efficiency are two challenges of this inference procedure. In this paper, we take a deeper look at the inference of analysis-by-synthesis from the perspective of visual navigation, and investigate what is a good navigation policy for this specific task. We evaluate three different strategies, including gradient descent, reinforcement learning and imitation learning, via thorough comparisons in terms of convergence, robustness and efficiency. Moreover, we show that a simple hybrid approach leads to an effective and efficient solution. We further compare these strategies to state-of-the-art methods, and demonstrate superior performance on synthetic and real-world datasets leveraging off-the-shelf pose-aware generative models.

computer vision, pose estimation, proc, (12 more...)

2203.13572

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)