AITopics | fov

fov

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FOVA: Offline Federated Reinforcement Learning with Mixed-Quality Data

Qiao, Nan, Yue, Sheng, Ren, Ju, Zhang, Yaoxue

arXiv.org Artificial Intelligence, Dec-3-2025

Offline Federated Reinforcement Learning (FRL), a marriage of federated learning and offline reinforcement learning, has attracted increasing interest recently. Despite some progress, we find that the performance of most existing offline FRL methods drops dramatically when provided with mixed-quality data, that is, when the logging behaviors (offline data) are collected by policies of varying quality across clients. To overcome this limitation, this paper introduces a new vote-based offline FRL framework, named FOVA. It exploits a vote mechanism to identify high-return actions during local policy evaluation, alleviating the negative effect of low-quality behaviors from diverse local learning policies. Besides, building on advantage-weighted regression (AWR), we construct consistent local and global training objectives, significantly enhancing the efficiency and stability of FOVA. Further, we conduct an extensive theoretical analysis and rigorously show that the policy learned by FOVA enjoys strict policy improvement over the behavioral policy. Extensive experiments corroborate the significant performance gains of our proposed algorithm over existing baselines on widely used benchmarks.
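
The abstract above names advantage-weighted regression (AWR) as a building block. As a rough illustration of generic AWR only (not FOVA's local or global objective, which the abstract does not spell out), the sketch below computes exponentiated-advantage weights and the weighted log-likelihood they are applied to. The function names, the temperature `beta`, and the clipping value `max_weight` are assumptions made for this example.

```python
import math


def awr_weights(returns, values, beta=1.0, max_weight=20.0):
    """Exponentiated-advantage weights in the style of generic AWR.

    returns, values: per-sample return estimates and value baselines.
    beta: temperature (assumed hyperparameter); smaller values sharpen the weights.
    max_weight: clipping bound (assumed) to keep the weights numerically stable.
    """
    weights = []
    for ret, val in zip(returns, values):
        advantage = ret - val
        weights.append(min(math.exp(advantage / beta), max_weight))
    return weights


def weighted_log_likelihood(log_probs, weights):
    """Mean of weight * log pi(a|s); an AWR-style policy update maximizes this."""
    return sum(w * lp for w, lp in zip(weights, log_probs)) / len(log_probs)
```

Actions whose return exceeds the baseline receive weights above 1 and therefore dominate the regression, which is the sense in which AWR favors high-advantage behavior in the logged data.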

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2512.0235

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.67)
Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

HAVEN: Hierarchical Adversary-aware Visibility-Enabled Navigation with Cover Utilization using Deep Transformer Q-Networks

Chauhan, Mihir, Conover, Damon, Bera, Aniket

arXiv.org Artificial Intelligence, Dec-2-2025

Autonomous navigation in partially observable environments requires agents to reason beyond immediate sensor input, exploit occlusion, and ensure safety while progressing toward a goal. These challenges arise in many robotics domains, from urban driving and warehouse automation to defense and surveillance. Classical path planning approaches and memoryless reinforcement learning often fail under limited fields of view (FoVs) and occlusions, committing to unsafe or inefficient maneuvers. We propose a hierarchical navigation framework that integrates a Deep Transformer Q-Network (DTQN) as a high-level subgoal selector with a modular low-level controller for waypoint execution. The DTQN consumes short histories of task-aware features, encoding odometry, goal direction, obstacle proximity, and visibility cues, and outputs Q-values to rank candidate subgoals. Visibility-aware candidate generation introduces masking and exposure penalties, rewarding the use of cover and anticipatory safety. A low-level potential field controller then tracks the selected subgoal, ensuring smooth short-horizon obstacle avoidance. We validate our approach in 2D simulation and extend it directly to a 3D Unity-ROS environment by projecting point-cloud perception into the same feature schema, enabling transfer without architectural changes. Results show consistent improvements over classical planners and RL baselines in success rate, safety margins, and time to goal, with ablations confirming the value of temporal memory and visibility-aware candidate design. These findings highlight a generalizable framework for safe navigation under uncertainty, with broad relevance across robotic platforms.
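
The low-level controller mentioned above is described only as a potential field controller. A minimal 2D sketch of the classical attractive/repulsive formulation is given here for orientation; it is not the HAVEN implementation, and the gains `k_att` and `k_rep`, the `influence` radius, and the fixed `step` length are all assumptions.

```python
import math


def potential_field_step(pos, goal, obstacles, k_att=1.0, k_rep=1.0,
                         influence=2.0, step=0.1):
    """Move one fixed-length step along the combined potential-field force.

    pos, goal: (x, y) tuples. obstacles: iterable of (x, y) point obstacles.
    All gains and radii are illustrative assumptions.
    """
    # Attractive force: pulls linearly toward the goal (the selected subgoal).
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])

    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            # Negative gradient of 0.5 * k_rep * (1/d - 1/influence)^2,
            # which pushes the agent directly away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / influence) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d

    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return pos  # zero net force: at the goal or in a local minimum
    return (pos[0] + step * fx / norm, pos[1] + step * fy / norm)
```

Calling this repeatedly with the current subgoal as `goal` yields a short-horizon path that bends around nearby obstacles. Plain potential fields can stall in local minima, which is one reason to pair them with a higher-level subgoal selector.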

exposure, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2512.00592

Country:

North America > United States > Maryland > Prince George's County > Adelphi (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.89)
(2 more...)

Visibility-aware Cooperative Aerial Tracking with Decentralized LiDAR-based Swarms

Yin, Longji, Ren, Yunfan, Zhu, Fangcheng, Shi, Liuyu, Kong, Fanze, Tang, Benxu, Liu, Wenyi, Lyu, Ximin, Zhang, Fu

arXiv.org Artificial Intelligence, Dec-2-2025

Autonomous aerial tracking with drones offers vast potential for surveillance, cinematography, and industrial inspection applications. While single-drone tracking systems have been extensively studied, swarm-based target tracking remains underexplored, despite its unique advantages of distributed perception, fault-tolerant redundancy, and multidirectional target coverage. To bridge this gap, we propose a novel decentralized LiDAR-based swarm tracking framework that enables visibility-aware, cooperative target tracking in complex environments, while fully harnessing the unique capabilities of swarm systems. To address visibility, we introduce a novel Spherical Signed Distance Field (SSDF)-based metric for 3-D environmental occlusion representation, coupled with an efficient algorithm that enables real-time onboard SSDF updating. A general Field-of-View (FOV) alignment cost supporting heterogeneous LiDAR configurations is proposed for consistent target observation. These innovations are integrated into a hierarchical planner, combining a kinodynamic front-end searcher with a spatiotemporal SE(3) back-end optimizer to generate collision-free, visibility-optimized trajectories. The proposed approach undergoes thorough evaluation through comprehensive benchmark comparisons and ablation studies. Deployed on heterogeneous LiDAR swarms, our fully decentralized implementation features collaborative perception, distributed planning, and dynamic swarm reconfigurability. Validated through rigorous real-world experiments in cluttered outdoor environments, the proposed system demonstrates robust cooperative tracking of agile targets (drones, humans) while achieving superior visibility maintenance. This work establishes a systematic solution for swarm-based target tracking, and its source code will be released to benefit the community.
Recent studies highlight the unique suitability of UAVs for tracking dynamic targets in complex environments, owing to their highly agile three-dimensional (3-D) maneuverability. While substantial progress has been made in single-UAV tracking, the swarm-based aerial tracking remains underexplored. The authors are with the Department of Mechanical Engineering, The University of Hong Kong, Hong Kong. X. Lyu is with the School of Intelligent System Engineering, Sun Yat-sen University, Shenzhen, China. A swarm of four autonomous drones is cooperatively tracking a human runner using heterogeneous LiDAR configurations. The LiDAR setup consists of one upward-facing Mid360 LiDAR (marked by blue dashed lines), one downward-facing Mid360 LiDAR (green dashed lines), and two Avia LiDARs (red dashed lines). The swarm forms a 3-D distribution to track the target, with each tracker positioned optimally to suit its FOV settings. Effective agile aerial tracking with autonomous swarms primarily relies on three criteria: visibility, coordination, and portability.

artificial intelligence, drone, swarm, (17 more...)

arXiv.org Artificial Intelligence

2512.0128

Country:

Asia > China > Hong Kong (0.44)
Asia > China > Guangdong Province > Shenzhen (0.24)
Europe > Norway > Norwegian Sea (0.04)

Genre: Research Report (1.00)

Industry:

Transportation (0.67)
Aerospace & Defense (0.67)
Information Technology > Robotics & Automation (0.46)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

26d01e5ed42d8dcedd6aa0e3e99cffc4-Paper-Conference.pdf

Neural Information Processing Systems, Nov-15-2025, 01:16:55 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe > Italy (0.04)
Europe > France (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)

MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation

Winter, Dominik, Bui, Mai, Gavaldon, Monica Azqueta, Triltsch, Nicolas, Rosati, Marco, Brieu, Nicolas

arXiv.org Artificial Intelligence, Oct-21-2025

Scarcity of annotated data, particularly for rare or atypical morphologies, presents a significant challenge for cell and nuclei segmentation in computational pathology. While manual annotation is labor-intensive and costly, synthetic data offers a cost-effective alternative. We introduce a Multimodal Semantic Diffusion Model (MSDM) for generating realistic pixel-precise image-mask pairs for cell and nuclei segmentation. By conditioning the generative process with cellular/nuclear morphologies (using horizontal and vertical maps), RGB color characteristics, and BERT-encoded assay/indication metadata, MSDM generates datasets with desired morphological properties. These heterogeneous modalities are integrated via multi-head cross-attention, enabling fine-grained control over the generated images. Quantitative analysis demonstrates that synthetic images closely match real data, with low Wasserstein distances between embeddings of generated and real images under matching biological conditions. The incorporation of these synthetic samples, exemplified by columnar cells, significantly improves segmentation model accuracy on columnar cells. This strategy systematically enriches data sets, directly targeting model deficiencies. We highlight the effectiveness of multimodal diffusion-based augmentation for advancing the robustness and generalizability of cell and nuclei segmentation models. Thereby, we pave the way for broader application of generative models in computational pathology.
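
To give a feel for the Wasserstein comparison mentioned above, here is a deliberately simplified sketch: the Wasserstein-1 distance between two equal-size one-dimensional samples, which reduces to the mean absolute difference of the sorted values. The abstract refers to distances between image embeddings, which are high-dimensional, so this 1-D version and the function name `wasserstein_1d` are assumptions for illustration, not the paper's evaluation procedure.

```python
def wasserstein_1d(sample_a, sample_b):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal-size samples with uniform weights, the optimal transport plan
    matches sorted values in order, so the distance is the mean absolute
    difference between the sorted samples.
    """
    if len(sample_a) != len(sample_b) or not sample_a:
        raise ValueError("samples must be non-empty and of equal size")
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    return sum(abs(x - y) for x, y in zip(a_sorted, b_sorted)) / len(a_sorted)
```

A value near zero means the two samples are distributed almost identically along that dimension; larger values mean more mass has to be moved to turn one distribution into the other.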

diffusion model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.09121

Country:

Europe > United Kingdom (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

26d01e5ed42d8dcedd6aa0e3e99cffc4-Paper-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 21:22:41 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe > Italy (0.04)
Europe > France (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)

Object-Reconstruction-Aware Whole-body Control of Mobile Manipulators

Dursun, Fatih, Adorno, Bruno Vilhena, Watson, Simon, Pan, Wei

arXiv.org Artificial Intelligence, Sep-5-2025

Object reconstruction and inspection tasks play a crucial role in various robotics applications. Identifying paths that reveal the most unknown areas of the object is paramount in this context because it directly affects efficiency; this is known as the view path planning problem. Current methods often use sampling-based path planning techniques, evaluating potential views along the path to enhance reconstruction performance. However, these methods are computationally expensive as they require evaluating several candidate views on the path. To address this, we propose a computationally efficient solution that relies on calculating a focus point in the most informative (unknown) region and having the robot maintain this point in the camera field of view along the path. We incorporated this strategy into the whole-body control of a mobile manipulator employing a visibility constraint without the need for an additional path planner. We conducted comprehensive and realistic simulations using a large dataset of 114 diverse objects of varying sizes from 57 categories to compare our method with a sampling-based planning strategy using Bayesian data analysis. Furthermore, we performed real-world experiments with an 8-DoF mobile manipulator to demonstrate the proposed method's performance in practice. Our results suggest that there is no significant difference in object coverage and entropy. In contrast, our method is approximately nine times faster than the baseline sampling-based method in terms of the average time the robot spends between views.
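
Keeping a focus point inside the camera field of view, as described above, comes down to a geometric test. The sketch below checks whether a 3D point lies within a cone around the camera's optical axis. The conical FOV model, the function name `in_field_of_view`, and its parameters are assumptions for illustration; the paper's visibility constraint inside the whole-body controller is not reproduced here, and real cameras have a rectangular frustum rather than a cone.

```python
import math


def in_field_of_view(cam_pos, optical_axis, point, half_angle_deg):
    """Return True if `point` lies within a cone of `half_angle_deg` degrees
    around `optical_axis`, with the cone apex at `cam_pos`.

    All arguments are 3-element sequences except the angle (in degrees).
    """
    to_point = [p - c for p, c in zip(point, cam_pos)]
    dist = math.sqrt(sum(x * x for x in to_point))
    axis_len = math.sqrt(sum(x * x for x in optical_axis))
    if dist == 0.0 or axis_len == 0.0:
        raise ValueError("point must differ from cam_pos and axis must be non-zero")

    # Angle between the optical axis and the ray to the point, via the dot product.
    cos_angle = sum(a * b for a, b in zip(to_point, optical_axis)) / (dist * axis_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= half_angle_deg
```

A controller could evaluate this test (or a smooth version of the same angle) at each step and treat a violation as a constraint to correct, rather than sampling and scoring many candidate views.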

artificial intelligence, planning & scheduling, robot, (16 more...)

arXiv.org Artificial Intelligence

2509.04094

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Disentangled representations of microscopy images

Dapueto, Jacopo, Pastore, Vito Paolo, Noceti, Nicoletta, Odone, Francesca

arXiv.org Artificial Intelligence, Jun-26-2025

Microscopy image analysis is fundamental for different applications, from diagnosis to synthetic engineering and environmental monitoring. Modern acquisition systems make it possible to acquire a rapidly growing number of images, which in turn requires the development of a large collection of deep learning-based automatic image analysis methods. Although deep neural networks have demonstrated great performance in this field, interpretability, an essential requirement for microscopy image analysis, remains an open challenge. This work proposes a Disentangled Representation Learning (DRL) methodology to enhance model interpretability for microscopy image classification. Exploiting benchmark datasets from three different microscopic image domains (plankton, yeast vacuoles, and human cells), we show how a DRL framework, based on transferring a representation learnt from synthetic data, can provide a good trade-off between accuracy and interpretability in this domain.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Artificial Intelligence

2506.20649

Country:

Europe > North Macedonia (0.04)
Europe > Italy (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Systematic Comparison of Projection Methods for Monocular 3D Human Pose Estimation on Fisheye Images

Käs, Stephanie, Peter, Sven, Thillmann, Henrik, Burenko, Anton, Adrian, David Benjamin, Mack, Dennis, Linder, Timm, Leibe, Bastian

arXiv.org Artificial Intelligence, Jun-25-2025

Fisheye cameras offer robots the ability to capture human movements across a wider field of view (FOV) than standard pinhole cameras, making them particularly useful for applications in human-robot interaction and automotive contexts. However, accurately detecting human poses in fisheye images is challenging due to the curved distortions inherent to fisheye optics. While various methods for undistorting fisheye images have been proposed, their effectiveness and limitations for poses that cover a wide FOV have not been systematically evaluated in the context of absolute human pose estimation from monocular fisheye images. To address this gap, we evaluate the impact of pinhole, equidistant and double sphere camera models, as well as cylindrical projection methods, on 3D human pose estimation accuracy. We find that in close-up scenarios, pinhole projection is inadequate, and the optimal projection method varies with the FOV covered by the human pose. The usage of advanced fisheye models like the double sphere model significantly enhances 3D human pose estimation accuracy. We propose a heuristic for selecting the appropriate projection model based on the detection bounding box to enhance prediction quality. Additionally, we introduce and evaluate on our novel dataset FISHnCHIPS, which features 3D human skeleton annotations in fisheye images, including images from unconventional angles, such as extreme close-ups, ground-mounted cameras, and wide-FOV poses, available at: https://www.vision.rwth-aachen.de/fishnchips
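
One way to see why pinhole projection struggles at wide FOV is to compare the textbook radial mappings of the pinhole and equidistant models: a ray at angle theta from the optical axis lands at image radius f * tan(theta) under the pinhole model and f * theta under the equidistant model. The sketch below encodes just these two standard formulas; the function names are assumptions, and the double sphere and cylindrical models evaluated in the paper are not covered.

```python
import math


def pinhole_radius(theta, focal_length):
    """Image radius for a ray at angle `theta` (radians) under the pinhole model."""
    if not 0.0 <= theta < math.pi / 2:
        raise ValueError("pinhole projection is only defined for 0 <= theta < pi/2")
    return focal_length * math.tan(theta)


def equidistant_radius(theta, focal_length):
    """Image radius for a ray at angle `theta` (radians) under the equidistant model."""
    if theta < 0.0:
        raise ValueError("theta must be non-negative")
    return focal_length * theta


def stretch_ratio(theta, focal_length=1.0):
    """How much farther from the image center the pinhole model places a ray
    than the equidistant model does, at the same incidence angle."""
    return pinhole_radius(theta, focal_length) / equidistant_radius(theta, focal_length)
```

The ratio stays close to 1 near the optical axis and grows without bound as theta approaches 90 degrees, so content near the edge of a wide FOV is stretched heavily when reprojected to a pinhole image, and rays at 90 degrees or more cannot be represented at all.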

artificial intelligence, pose estimation, video understanding, (16 more...)

arXiv.org Artificial Intelligence

2506.19747

Country: Europe > Germany (0.04)

Genre: Research Report (0.82)

Industry: Transportation (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (1.00)