AITopics | target detection

Collaborating Authors

target detection

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Rethinking Evaluation of Infrared Small Target Detection

Neural Information Processing SystemsJun-21-2026, 10:32:35 GMT

As an essential vision task, infrared small target detection (IRSTD)1 has seen significant advancements through deep learning. However, critical limitations in current evaluation protocols impede further progress. First, existing methods rely on fragmented pixel-and target-level specific metrics, which fails to provide a comprehensive view of model capabilities. Second, an excessive emphasis on overall performance scores obscures crucial error analysis, which is vital for identifying failure modes and improving real-world system performance. Third, the field predominantly adopts dataset-specific training-testing paradigms, hindering the understanding of model robustness and generalization across diverse infrared scenarios.

detection, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?

Arturo Deza, Miguel Eckstein

Neural Information Processing SystemsApr-22-2026, 13:45:27 GMT

Previous studies have proposed image-based clutter measures that correlate with human search times and/or eye movements. However, most models do not take into account the fact that the effects of clutter interact with the foveated nature of the human visual system: visual clutter further from the fovea has an increasing detrimental influence on perception. Here, we introduce a new foveated clutter model to predict the detrimental effects in target search utilizing a forced fixation search task. We use Feature Congestion (Rosenholtz et al.) as our non foveated clutter model, and we stack a peripheral architecture on top of Feature Congestion for our foveated model. We introduce the Peripheral Integration Feature Congestion (PIFC) coefficient, as a fundamental ingredient of our model that modulates clutter as a non-linear gain contingent on eccentricity. We show that Foveated Feature Congestion (FFC) clutter scores (r(44) = 0.82 0.04,p < 0.0001) correlate better with target detection (hit rate) than regular Feature Congestion (r(44) = 0.19 0.13,p= 0.0774) in forced fixation search; and we extend foveation to other clutter models showing stronger correlations in all cases. Thus, our model allows us to enrich clutter perception research by computing fixation specific clutter maps. Code for building peripheral representations is available1.

artificial intelligence, eccentricity, feature congestion, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.55)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Vision (0.94)

Add feedback

LCB-CV-UNet: Enhanced Detector for High Dynamic Range Radar Signals

Wang, Yanbin, Chen, Xingyu, Wang, Yumiao, Wang, Xiang, Zang, Chuanfei, Cui, Guolong, Liu, Jiahuan

arXiv.org Artificial IntelligenceNov-27-2025

We propose the LCB-CV-UNet to tackle performance degradation caused by High Dynamic Range (HDR) radar signals. Initially, a hardware-efficient, plug-and-play module named Logarithmic Connect Block (LCB) is proposed as a phase coherence preserving solution to address the inherent challenges in handling HDR features. Then, we propose the Dual Hybrid Dataset Construction method to generate a semi-synthetic dataset, approximating typical HDR signal scenarios with adjustable target distributions. Simulation results show about 1% total detection probability improvement with under 0.9% computational complexity added compared with the baseline. Furthermore, it excels 5% over the baseline at the range in 11-13 dB signal-to-noise ratio typical for urban targets. Finally, the real experiment validates the practicality of our model.

artificial intelligence, hdr signal, machine learning, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IGARSS55030.2025.11243251

2505.23454

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Vision (0.69)

Add feedback

Toward Gaze Target Detection of Young Autistic Children

Deng, Shijian, Kosloski, Erin E., Vasireddy, Siva Sai Nagender, Li, Jia, Sherwood, Randi Sierra, Hatha, Feroz Mohamed, Patel, Siddhi, Rollins, Pamela R, Tian, Yapeng

arXiv.org Artificial IntelligenceNov-17-2025

The automatic detection of gaze targets in autistic children through artificial intelligence can be impactful, especially for those who lack access to a sufficient number of professionals to improve their quality of life. This paper introduces a new, real-world AI application for gaze target detection in autistic children, which predicts a child's point of gaze from an activity image. This task is foundational for building automated systems that can measure joint attention--a core challenge in Autism Spectrum Disorder (ASD). To facilitate the study of this challenging application, we collected the first-ever Autism Gaze Target (AGT) dataset. We further propose a novel Socially A ware Coarse-to-Fine (SACF) gaze detection framework that explicitly leverages the social context of a scene to overcome the class imbalance common in autism datasets--a consequence of autistic children's tendency to show reduced gaze to faces. It utilizes a two-pathway architecture with expert models specialized in social and nonsocial gaze, guided by a context-awareness gate module. The results of our comprehensive experiments demonstrate that our framework achieves new state-of-the-art performance for gaze target detection in this population, significantly outperforming existing methods, especially on the critical minority class of face-directed gaze.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.11244

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.34)

Add feedback

GazeVLM: A Vision-Language Model for Multi-Task Gaze Understanding

Mathew, Athul M., Hermassi, Haithem, Khalid, Thariq, Khan, Arshad Ali, Souissi, Riad

arXiv.org Artificial IntelligenceNov-11-2025

Gaze understanding unifies the detection of people, their gaze targets, and objects of interest into a single framework, offering critical insight into visual attention and intent estimation. Although prior research has modelled gaze cues in visual scenes, a unified system is still needed for gaze understanding using both visual and language prompts. This paper introduces GazeVLM, a novel Vision-Language Model (VLM) for multi-task gaze understanding in images, addressing person detection, gaze target detection, and gaze object identification. While other transformer-based methods exist for gaze analysis, GazeVLM represents, to our knowledge, the first application of a VLM to these combined tasks, allowing for selective execution of each task. Through the integration of visual (RGB and depth) and textual modalities, our ablation study on visual input combinations revealed that a fusion of RGB images with HHA-encoded depth maps, guided by text prompts, yields superior performance. We also introduce an object-level gaze detection metric for gaze object identification ($AP_{ob}$). Through experiments, GazeVLM demonstrates significant improvements, notably achieving state-of-the-art evaluation scores on GazeFollow and VideoAttentionTarget datasets.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.06348

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

RL-Aided Cognitive ISAC: Robust Detection and Sensing-Communication Trade-offs

Umra, Adam, Ahmed, Aya M., Sezgin, Aydin

arXiv.org Artificial IntelligenceNov-5-2025

This paper proposes a reinforcement learning (RL)-aided cognitive framework for massive MIMO-based integrated sensing and communication (ISAC) systems employing a uniform planar array (UPA). The focus is on enhancing radar sensing performance in environments with unknown and dynamic disturbance characteristics. A Wald-type detector is employed for robust target detection under non-Gaussian clutter, while a SARSA-based RL algorithm enables adaptive estimation of target positions without prior environmental knowledge. Based on the RL-derived sensing information, a joint waveform optimization strategy is formulated to balance radar sensing accuracy and downlink communication throughput. The resulting design provides an adaptive trade-off between detection performance and achievable sum rate through an analytically derived closed-form solution. Monte Carlo simulations demonstrate that the proposed cognitive ISAC framework achieves significantly improved detection probability compared to orthogonal and non-learning adaptive baselines, while maintaining competitive communication performance. These results underline the potential of RL-assisted sensing for robust and spectrum-efficient ISAC in next-generation wireless networks.

detection, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2511.02672

Genre: Research Report (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

SignalLLM: A General-Purpose LLM Agent Framework for Automated Signal Processing

Ke, Junlong, Hu, Qiying, Yuan, Shenghai, Xu, Yuecong, Yang, Jianfei

arXiv.org Artificial IntelligenceOct-31-2025

Modern signal processing (SP) pipelines, whether model-based or data-driven, often constrained by complex and fragmented workflow, rely heavily on expert knowledge and manual engineering, and struggle with adaptability and generalization under limited data. In contrast, Large Language Models (LLMs) offer strong reasoning capabilities, broad general-purpose knowledge, in-context learning, and cross-modal transfer abilities, positioning them as powerful tools for automating and generalizing SP workflows. Motivated by these potentials, we introduce SignalLLM, the first general-purpose LLM-based agent framework for general SP tasks. Unlike prior LLM-based SP approaches that are limited to narrow applications or tricky prompting, SignalLLM introduces a principled, modular architecture. It decomposes high-level SP goals into structured subtasks via in-context learning and domain-specific retrieval, followed by hierarchical planning through adaptive retrieval-augmented generation (RAG) and refinement; these subtasks are then executed through prompt-based reasoning, cross-modal reasoning, code synthesis, model invocation, or data-driven LLM-assisted modeling. Its generalizable design enables the flexible selection of problem solving strategies across different signal modalities, task types, and data conditions. We demonstrate the versatility and effectiveness of SignalLLM through five representative tasks in communication and sensing, such as radar target detection, human activity recognition, and text compression. Experimental results show superior performance over traditional and existing LLM-based methods, particularly in few-shot and zero-shot settings.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2509.17197

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

TY-RIST: Tactical YOLO Tricks for Real-time Infrared Small Target Detection

Atrash, Abdulkarim, Moured, Omar, Chen, Yufan, Zhang, Jiaming, Ertekin, Seyda, Ugur, Omur

arXiv.org Artificial IntelligenceSep-30-2025

Infrared small target detection (IRSTD) is critical for defense and surveillance but remains challenging due to (1) target loss from minimal features, (2) false alarms in cluttered environments, (3) missed detections from low saliency, and (4) high computational costs. To address these issues, we propose TY-RIST, an optimized YOLOv12n architecture that integrates (1) a stride-aware backbone with fine-grained receptive fields, (2) a high-resolution detection head, (3) cascaded coordinate attention blocks, and (4) a branch pruning strategy that reduces computational cost by about 25.5% while marginally improving accuracy and enabling real-time inference. We also incorporate the Normalized Gaussian Wasserstein Distance (NWD) to enhance regression stability. Extensive experiments on four benchmarks and across 20 different models demonstrate state-of-the-art performance, improving mAP at 0.5 IoU by +7.9%, Precision by +3%, and Recall by +10.2%, while achieving up to 123 FPS on a single GPU. Cross-dataset validation on a fifth dataset further confirms strong generalization capability. Additional results and resources are available at https://www.github.com/moured/TY-RIST

detection, machine learning, real time system, (20 more...)

arXiv.org Artificial Intelligence

2509.22909

Country: Europe > Switzerland (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Architecture > Real Time Systems (0.72)
Information Technology > Artificial Intelligence > Vision (0.70)
(2 more...)

Add feedback

When marine radar target detection meets pretrained large language models

Hu, Qiying, Zhang, Linping, Wang, Xueqian, Li, Gang, Liu, Yu, Zhang, Xiao-Ping

arXiv.org Artificial IntelligenceSep-16-2025

Deep learning (DL) methods are widely used to extract high-dimensional patterns from the sequence features of radar echo signals. However, conventional DL algorithms face challenges such as redundant feature segments, and constraints from restricted model sizes. To address these issues, we propose a framework that integrates feature preprocessing with large language models (LLMs). Our preprocessing module tokenizes radar sequence features, applies a patch selection algorithm to filter out uninformative segments, and projects the selected patches into embeddings compatible with the feature space of pre-trained LLMs. Leveraging these refined embeddings, we incorporate a pre-trained LLM, fine-tuning only the normalization layers to reduce training burdens while enhancing performance. Experiments on measured datasets demonstrate that the proposed method significantly outperforms the state-of-the-art baselines on supervised learning tests.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.1211

Country: Asia > China (0.30)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

NS-FPN: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective

Yuan, Maoxun, Meng, Duanni, Xi, Ziteng, Zhao, Tianyi, Zhao, Shiji, Dai, Yimian, Wei, Xingxing

arXiv.org Artificial IntelligenceAug-12-2025

Infrared small target detection and segmentation (IRSTDS) is a critical yet challenging task in defense and civilian applications, owing to the dim, shapeless appearance of targets and severe background clutter. Recent CNN-based methods have achieved promising target perception results, but they only focus on enhancing feature representation to offset the impact of noise, which results in the increased false alarms problem. In this paper, through analyzing the problem from the frequency domain, we pioneer in improving performance from noise suppression perspective and propose a novel noise-suppression feature pyramid network (NS-FPN), which integrates a low-frequency guided feature purification (LFP) module and a spiral-aware feature sampling (SFS) module into the original FPN structure. The LFP module suppresses the noise features by purifying high-frequency components to achieve feature enhancement devoid of noise interference, while the SFS module further adopts spiral sampling to fuse target-relevant features in feature fusion process. Our NS-FPN is designed to be lightweight yet effective and can be easily plugged into existing IRSTDS frameworks. Extensive experiments on the public IRSTDS datasets demonstrate that our method significantly reduces false alarms and achieves superior performance on IRSTDS tasks.

artificial intelligence, detection, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.06878

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback