AITopics | bird

Collaborating Authors

bird

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator

Xiaolong Wang

Neural Information Processing Systems | Feb-7-2026, 23:53:14 GMT

In this paper, we introduce a novel approach to fine-grained cross-view geo-localization.

artificial intelligence, machine learning, satellite image, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.66)

Industry: Information Technology (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.70)
Information Technology > Geographic Information Systems (0.68)
(2 more...)

CRT-Fusion: Camera, Radar, Temporal Fusion Using Motion Information for 3D Object Detection

Neural Information Processing Systems | Dec-27-2025, 06:18:01 GMT

Accurate and robust 3D object detection is a critical component in autonomous vehicles and robotics. While recent radar-camera fusion methods have made significant progress by fusing information in the bird's-eye view (BEV) representation, they often struggle to effectively capture the motion of dynamic objects, leading to limited performance in real-world scenarios. In this paper, we introduce CRT-Fusion, a novel framework that integrates temporal information into radar-camera fusion to address this challenge. Our approach comprises three key modules: Multi-View Fusion (MVF), Motion Feature Estimator (MFE), and Motion Guided Temporal Fusion (MGTF).

artificial intelligence, name change, proceedings, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Robots (0.59)

VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization

Neural Information Processing Systems | Dec-26-2025, 11:31:35 GMT

Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics of the environmental elements around the ego car to make the results coherent and realistic. Due to the challenges posed by occlusion, unfavourable imaging conditions and low resolution, generating the BEV semantic maps corresponding to corrupted or invalid areas in the perspective view (PV) has recently become appealing.

artificial intelligence, name change, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.37)

Asynchrony-Robust Collaborative Perception via Bird's Eye View Flow

Neural Information Processing Systems | Dec-25-2025, 11:21:23 GMT

Collaborative perception can substantially boost each agent's perception ability by facilitating communication among multiple agents. However, temporal asynchrony among agents is inevitable in the real world due to communication delays, interruptions, and clock misalignments.

asynchrony-robust collaborative perception, cobevflow, name change, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.78)

Blind Image Restoration via Fast Diffusion Inversion

Neural Information Processing Systems | Dec-25-2025, 03:53:14 GMT

Image Restoration (IR) methods based on a pre-trained diffusion model have demonstrated state-of-the-art performance. However, they have two fundamental limitations: 1) they often assume that the degradation operator is completely known and 2) they alter the diffusion sampling process, which may result in restored images that do not lie on the data manifold. To address these issues, we propose Blind Image Restoration via fast Diffusion inversion (BIRD), a blind IR method that jointly optimizes for the degradation model parameters and the restored image. To ensure that the restored images lie on the data manifold, we propose a novel sampling technique on a pre-trained diffusion model. A key idea in our method is not to modify the reverse sampling, i.e., not to alter all the intermediate latents, once an initial noise is sampled. This is ultimately equivalent to casting the IR task as an optimization problem in the space of the input noise. Moreover, to mitigate the computational cost associated with inverting a fully unrolled diffusion model, we leverage the inherent capability of these models to skip ahead in the forward diffusion process using large time steps. We experimentally validate BIRD on several image restoration tasks and show that it achieves state-of-the-art performance.
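The idea of treating restoration as an optimization over the input noise, jointly with the degradation parameters, can be illustrated with a toy sketch. This is not the BIRD implementation: the fixed linear map W is a hypothetical stand-in for a frozen, deterministic diffusion sampler, the scalar gain k is a hypothetical one-parameter degradation model, and the function name and step sizes are chosen only for illustration.

```python
import numpy as np


def restore_blind(y, W, z0, k0=1.0, steps=2000, lr=0.005):
    """Toy blind restoration: minimise ||k * G(z) - y||^2 over z and k.

    G(z) = W @ z is a hypothetical stand-in for a frozen deterministic
    sampler; only the input noise z and the degradation gain k are updated.
    """
    z, k = z0.copy(), float(k0)
    losses = []
    for _ in range(steps):
        x = W @ z                    # "generated" image from the current noise
        r = k * x - y                # residual against the degraded observation
        losses.append(float(r @ r))
        grad_z = 2.0 * k * (W.T @ r) # gradient of the loss w.r.t. the noise
        grad_k = 2.0 * float(x @ r)  # gradient w.r.t. the degradation gain
        z -= lr * grad_z
        k -= lr * grad_k
    return W @ z, k, losses


# Synthetic demo (all values are made up for illustration).
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.standard_normal((8, 4)))  # generator with orthonormal columns
z_true = rng.standard_normal(4)
y = 0.5 * (W @ z_true)                            # observation degraded by an unknown gain

x_hat, k_hat, losses = restore_blind(y, W, z0=rng.standard_normal(4))
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
```

Because the generator itself is never modified, every candidate x stays in its range, which is the toy analogue of keeping restored images on the data manifold. Note that in this toy setup k and z are only identifiable up to a shared scale, so the recovered gain need not match the true one even when the loss is small.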

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.86)

Invertible Tabular GANs: Killing Two Birds with One Stone for Tabular Data Synthesis

Neural Information Processing Systems | Dec-23-2025, 21:16:40 GMT

Tabular data synthesis has received wide attention in the literature. This is because available data is often limited, incomplete, or cannot be obtained easily, and data privacy is becoming increasingly important. In this work, we present a generalized GAN framework for tabular synthesis, which combines the adversarial training of GANs and the negative log-density regularization of invertible neural networks. The proposed framework can be used for two distinct objectives. First, we can further improve the synthesis quality by decreasing the negative log-density of real records in the process of adversarial training. Second, by increasing the negative log-density of real records, realistic fake records can be synthesized in a way that they are not too close to real records, reducing the chance of potential information leakage. We conduct experiments with real-world datasets for classification, regression, and privacy attacks. In general, the proposed method demonstrates the best synthesis quality (in terms of task-oriented evaluation metrics, e.g., F1) when decreasing the negative log-density during the adversarial training. When increasing the negative log-density, our experimental results show that the distance between real and fake records increases, enhancing robustness against privacy attacks.
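One way to read the two objectives is as a single generator loss with a signed regularization weight. The form below is an illustrative assumption, not the paper's notation: \mathcal{L}_{\mathrm{adv}} stands for the usual adversarial generator loss, \hat{p} for the density the invertible network assigns to a record, and \gamma for a hypothetical weight.

```latex
\mathcal{L}_G \;=\; \mathcal{L}_{\mathrm{adv}} \;+\; \gamma \, \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[ -\log \hat{p}(x) \right]
```

Under this reading, minimizing \mathcal{L}_G with \gamma > 0 decreases the negative log-density of real records (the synthesis-quality objective), while \gamma < 0 increases it (the privacy objective), and \gamma = 0 recovers plain adversarial training.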

invertible tabular gan, synthesis, tabular data synthesis, (9 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

112d8e0c7563de6e3408b49a09b4d8a3-Paper-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 03:37:31 GMT

artificial intelligence, machine learning, satellite image, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(5 more...)

Genre: Research Report (0.68)

Industry: Information Technology (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.70)
Information Technology > Geographic Information Systems (0.68)
(2 more...)

DriveMind: A Dual-VLM based Reinforcement Learning Framework for Autonomous Driving

Wasif, Dawood, Moore, Terrence J, Reddy, Chandan K, Cho, Jin-Hee

arXiv.org Artificial Intelligence | Jun-3-2025

Recent advances in autonomous vehicles have shifted development from rigid pipelines to end-to-end neural policies mapping raw sensor streams directly to control commands [1-3]. While these models offer streamlined architectures and strong benchmark performance, they raise critical deployment concerns. Their internal logic is opaque, complicating validation in safety-critical settings. They struggle to generalize to rare events like severe weather or infrastructure damage and lack formal guarantees on kinematic properties such as speed limits and lane-keeping. Further, they provide no natural interface for human oversight or explanation. These challenges motivate frameworks that combine deep network expressiveness with transparency, robustness, and provable safety. Meanwhile, Large Language Models (LLMs) and Vision Language Models (VLMs) have demonstrated human-level reasoning and visual grounding [4-6]. Recent works like VLM-SR (Shaped Rewards) [7], VLM-RM (Reward Models) [8], and RoboCLIP (Language-Conditioned Robot Learning via Contrastive Language-Image Pretraining) [9] inject semantic feedback into Reinforcement Learning (RL), but rely on static prompts unsuited to evolving road conditions and overlook vehicle dynamics.

large language model, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2506.00819

Country: North America > United States > Virginia (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)
