AITopics | feature descriptor

Collaborating Authors

feature descriptor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration

Jianchun Chen, Lingjing Wang, Xiang Li, Yi Fang

Neural Information Processing SystemsFeb-12-2026, 06:01:47 GMT

Recent methods introduce deep neural networks to predict the controlling parameters of hand-crafted geometric transformation models (e.g.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > UAE (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Surfel-based 3D Registration with Equivariant SE(3) Features

Kang, Xueyang, Zhao, Hang, Khoshelham, Kourosh, Vandewalle, Patrick

arXiv.org Artificial IntelligenceAug-29-2025

Point cloud registration is crucial for ensuring 3D alignment consistency of multiple local point clouds in 3D reconstruction for remote sensing or digital heritage. While various point cloud-based registration methods exist, both non-learning and learning-based, they ignore point orientations and point uncertainties, making the model susceptible to noisy input and aggressive rotations of the input point cloud like orthogonal transformation; thus, it necessitates extensive training point clouds with transformation augmentations. To address these issues, we propose a novel surfel-based pose learning regression approach. Our method can initialize surfels from Lidar point cloud using virtual perspective camera parameters, and learns explicit $\mathbf{SE(3)}$ equivariant features, including both position and rotation through $\mathbf{SE(3)}$ equivariant convolutional kernels to predict relative transformation between source and target scans. The model comprises an equivariant convolutional encoder, a cross-attention mechanism for similarity computation, a fully-connected decoder, and a non-linear Huber loss. Experimental results on indoor and outdoor datasets demonstrate our model superiority and robust performance on real point-cloud scans compared to state-of-the-art methods.

artificial intelligence, machine learning, registration, (18 more...)

arXiv.org Artificial Intelligence

2508.20789

Country: Europe > Belgium (0.14)

Genre: Research Report (0.70)

Industry:

Education (0.46)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model

Azhari, Maulana Bisyir, Shim, David Hyunchul

arXiv.org Artificial IntelligenceJul-18-2025

Learning-based monocular visual odometry (VO) poses robustness, generalization, and efficiency challenges in robotics. Recent advances in visual foundation models, such as DINOv2, have improved robustness and generalization in various vision tasks, yet their integration in VO remains limited due to coarse feature granularity. In this paper, we present DINO-VO, a feature-based VO system leveraging DINOv2 visual foundation model for its sparse feature matching. To address the integration challenge, we propose a salient keypoints detector tailored to DINOv2's coarse features. Furthermore, we complement DINOv2's robust-semantic features with fine-grained geometric features, resulting in more localizable representations. Finally, a transformer-based matcher and differentiable pose estimation layer enable precise camera motion estimation by learning good matches. Against prior detector-descriptor networks like SuperPoint, DINO-VO demonstrates greater robustness in challenging environments. Furthermore, we show superior accuracy and generalization of the proposed feature descriptors against standalone DINOv2 coarse features. DINO-VO outperforms prior frame-to-frame VO methods on the TartanAir and KITTI datasets and is competitive on EuRoC dataset, while running efficiently at 72 FPS with less than 1GB of memory usage on a single GPU. Moreover, it performs competitively against Visual SLAM systems on outdoor driving scenarios, showcasing its generalization capabilities.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.13145

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dance Style Recognition Using Laban Movement Analysis

Turab, Muhammad, Colantoni, Philippe, Muselet, Damien, Tremeau, Alain

arXiv.org Artificial IntelligenceMay-1-2025

The growing interest in automated movement analysis has presented new challenges in recognition of complex human activities including dance. This study focuses on dance style recognition using features extracted using Laban Movement Analysis. Previous studies for dance style recognition often focus on cross-frame movement analysis, which limits the ability to capture temporal context and dynamic transitions between movements. This gap highlights the need for a method that can add temporal context to LMA features. For this, we introduce a novel pipeline which combines 3D pose estimation, 3D human mesh reconstruction, and floor aware body modeling to effectively extract LMA features. To address the temporal limitation, we propose a sliding window approach that captures movement evolution across time in features. These features are then used to train various machine learning methods for classification, and their explainability explainable AI methods to evaluate the contribution of each feature to classification performance. Our proposed method achieves a highest classification accuracy of 99.18\% which shows that the addition of temporal context significantly improves dance style recognition performance.

artificial intelligence, machine learning, recognition, (13 more...)

arXiv.org Artificial Intelligence

2504.21166

Country: Europe (0.68)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.31)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Emotion Recognition in Contemporary Dance Performances Using Laban Movement Analysis

Turab, Muhammad, Colantoni, Philippe, Muselet, Damien, Tremeau, Alain

arXiv.org Artificial IntelligenceMay-1-2025

This paper presents a novel framework for emotion recognition in contemporary dance by improving existing Laban Movement Analysis (LMA) feature descriptors and introducing robust, novel descriptors that capture both quantitative and qualitative aspects of the movement. Our approach extracts expressive characteristics from 3D keypoints data of professional dancers performing contemporary dance under various emotional states, and train multiple classifiers, including Random Forests and Support Vector Machines. Additionally, provide in-depth explanation of features and their impact on model predictions using explainable machine learning methods. Overall, our study improves emotion recognition in contemporary dance and offers promising applications in performance analysis, dance training, and human-computer interaction with highest accuracy of 96.85%. Keywords: Laban Movement Analysis Emotion Recognition Explainable AI 3D Body Pose Estimation 1 Introduction In recent years, human emotion recognition has become an important research direction in the field of motion analysis and human-computer interaction.

artificial intelligence, deep learning, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2504.21154

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Add feedback

Real-Time Sleepiness Detection for Driver State Monitoring System

Ghimire, Deepak, Jeong, Sunghwan, Yoon, Sunhong, Park, Sanghyun, Choi, Juhwan

arXiv.org Artificial IntelligenceApr-22-2025

Driver face monitoring system can detect driver fatigue, which is an important factor in a large number of accidents, using computer vision techniques. In this paper we present a real-time technique for driver eye state detection. At first face is detected and the eyes are searched inside face region for tracking. A normalized cross correlation based online dynamic template matching technique with combination of Kalman filter tracking is proposed to track the detected eye positions in the subsequent image frames. Support vector machine with histogram of orientation gradient features is used for classification of state of the eyes as open or closed. If the eye(s) state is detected as closed for a specified amount of time the driver is considered to be sleeping and an alarm will be generated.

artificial intelligence, detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.14807

Country:

Asia (0.95)
North America > United States (0.48)

Genre: Research Report (1.00)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)

Add feedback

Efficiently Closing Loops in LiDAR-Based SLAM Using Point Cloud Density Maps

Gupta, Saurabh, Guadagnino, Tiziano, Mersch, Benedikt, Trekel, Niklas, Malladi, Meher V. R., Stachniss, Cyrill

arXiv.org Artificial IntelligenceJan-13-2025

Consistent maps are key for most autonomous mobile robots. They often use SLAM approaches to build such maps. Loop closures via place recognition help maintain accurate pose estimates by mitigating global drift. This paper presents a robust loop closure detection pipeline for outdoor SLAM with LiDAR-equipped robots. The method handles various LiDAR sensors with different scanning patterns, field of views and resolutions. It generates local maps from LiDAR scans and aligns them using a ground alignment module to handle both planar and non-planar motion of the LiDAR, ensuring applicability across platforms. The method uses density-preserving bird's eye view projections of these local maps and extracts ORB feature descriptors from them for place recognition. It stores the feature descriptors in a binary search tree for efficient retrieval, and self-similarity pruning addresses perceptual aliasing in repetitive environments. Extensive experiments on public and self-recorded datasets demonstrate accurate loop closure detection, long-term localization, and cross-platform multi-map alignment, agnostic to the LiDAR scanning patterns, fields of view, and motion profiles.

artificial intelligence, closure, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.07399

Country: Europe (1.00)

Genre: Research Report (0.63)

Industry:

Transportation > Ground > Road (0.46)
Automobiles & Trucks (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

GIMS: Image Matching System Based on Adaptive Graph Construction and Graph Neural Network

Song, Xianfeng, Zou, Yi, Shi, Zheng, Liu, Zheng

arXiv.org Artificial IntelligenceDec-24-2024

Feature-based image matching has extensive applications in computer vision. Keypoints detected in images can be naturally represented as graph structures, and Graph Neural Networks (GNNs) have been shown to outperform traditional deep learning techniques. Consequently, the paradigm of image matching via GNNs has gained significant prominence in recent academic research. In this paper, we first introduce an innovative adaptive graph construction method that utilizes a filtering mechanism based on distance and dynamic threshold similarity. This method dynamically adjusts the criteria for incorporating new vertices based on the characteristics of existing vertices, allowing for the construction of more precise and robust graph structures while avoiding redundancy. We further combine the vertex processing capabilities of GNNs with the global awareness capabilities of Transformers to enhance the model's representation of spatial and feature information within graph structures. This hybrid model provides a deeper understanding of the interrelationships between vertices and their contributions to the matching process. Additionally, we employ the Sinkhorn algorithm to iteratively solve for optimal matching results. Finally, we validate our system using extensive image datasets and conduct comprehensive comparative experiments. Experimental results demonstrate that our system achieves an average improvement of 3.8x-40.3x in overall matching performance. Additionally, the number of vertices and edges significantly impacts training efficiency and memory usage; therefore, we employ multi-GPU technology to accelerate the training process. Our code is available at https://github.com/songxf1024/GIMS.

artificial intelligence, machine learning, vertex, (18 more...)

arXiv.org Artificial Intelligence

2412.18221

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States (0.04)
North America > Canada > British Columbia > Regional District of Central Okanagan > Kelowna (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration

Kang, Xueyang, Luan, Zhaoliang, Khoshelham, Kourosh, Wang, Bing

arXiv.org Artificial IntelligenceOct-8-2024

Point cloud registration is a foundational task for 3D alignment and reconstruction applications. While both traditional and learning-based registration approaches have succeeded, leveraging the intrinsic symmetry of point cloud data, including rotation equivariance, has received insufficient attention. This prohibits the model from learning effectively, resulting in a requirement for more training data and increased model complexity. To address these challenges, we propose a graph neural network model embedded with a local Spherical Euclidean 3D equivariance property through SE(3) message passing based propagation. Our model is composed mainly of a descriptor module, equivariant graph layers, match similarity, and the final regression layers. Such modular design enables us to utilize sparsely sampled input points and initialize the descriptor by self-trained or pre-trained geometric feature descriptors easily. Experiments conducted on the 3DMatch and KITTI datasets exhibit the compelling and robust performance of our model compared to state-of-the-art approaches, while the model complexity remains relatively low at the same time.

descriptor, point cloud registration, registration, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-73235-5_9

2410.05729

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > China > Hong Kong (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique

Pala, Tej Deep, Toh, Vernon Y. H., Bhardwaj, Rishabh, Poria, Soujanya

arXiv.org Artificial IntelligenceAug-20-2024

In today's era, where large language models (LLMs) are integrated into numerous real-world applications, ensuring their safety and robustness is crucial for responsible AI usage. Automated red-teaming methods play a key role in this process by generating adversarial attacks to identify and mitigate potential vulnerabilities in these models. However, existing methods often struggle with slow performance, limited categorical diversity, and high resource demands. While Rainbow Teaming, a recent approach, addresses the diversity challenge by framing adversarial prompt generation as a quality-diversity search, it remains slow and requires a large fine-tuned mutator for optimal performance. To overcome these limitations, we propose Ferret, a novel approach that builds upon Rainbow Teaming by generating multiple adversarial prompt mutations per iteration and using a scoring function to rank and select the most effective adversarial prompt. We explore various scoring functions, including reward models, Llama Guard, and LLM-as-a-judge, to rank adversarial mutations based on their potential harm to improve the efficiency of the search for harmful mutations. Our results demonstrate that Ferret, utilizing a reward model as a scoring function, improves the overall attack success rate (ASR) to 95%, which is 46% higher than Rainbow Teaming. Additionally, Ferret reduces the time needed to achieve a 90% ASR by 15.2% compared to the baseline and generates adversarial prompts that are transferable i.e. effective on other LLMs of larger size. Our codes are available at https://github.com/declare-lab/ferret.

category, llama guard 2, mutation, (14 more...)

arXiv.org Artificial Intelligence

2408.10701

Country: Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law > Criminal Law (0.93)
Health & Medicine > Therapeutic Area (0.68)
Government > Military (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback