AITopics

Country: Asia > Singapore (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)

arXiv.org Artificial Intelligence, Aug-27-2025

Clustering-based Feature Representation Learning for Oracle Bone Inscriptions Detection

Tao, Ye, Fu, Xinran, Pang, Honglin, Yang, Xi, Li, Chuntao

Oracle Bone Inscriptions (OBIs) play a crucial role in understanding ancient Chinese civilization. The automated detection of OBIs from rubbing images is a fundamental yet challenging task in digital archaeology, primarily because degradation factors such as noise and cracks limit the effectiveness of conventional detection networks. To address these challenges, we propose a novel clustering-based feature space representation learning method. Our approach uses the Oracle Bones Character (OBC) font library dataset as prior knowledge to enhance feature extraction in the detection network through clustering-based representation learning. The method incorporates a specialized loss function derived from clustering results to optimize feature representation, which is then integrated into the total network loss. We validate the effectiveness of our method by conducting experiments on two OBI detection datasets using three mainstream detection frameworks: Faster R-CNN, DETR, and Sparse R-CNN. Through extensive experimentation, all frameworks demonstrate significant performance improvements.
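The abstract does not give the form of the clustering-derived loss, so the following is only a minimal sketch of one plausible shape for the idea: feature vectors are pulled toward the nearest of a set of precomputed cluster centroids, and that term is added to the detection loss with a weight. The function names, the nearest-centroid squared-distance form, and the `weight` parameter are all assumptions made for illustration, not the authors' formulation.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def clustering_loss(features, centroids):
    """Mean squared distance from each feature vector to its nearest centroid.

    Assumption: the centroids come from clustering features of a reference
    set (for example a font library) before detector training.
    """
    nearest = [min(squared_distance(f, c) for c in centroids) for f in features]
    return sum(nearest) / len(nearest)


def total_loss(detection_loss, features, centroids, weight=0.1):
    """Detection loss plus a weighted clustering term (weight is hypothetical)."""
    return detection_loss + weight * clustering_loss(features, centroids)


# Toy data: two centroids and two feature vectors, each one unit from a centroid.
centroids = [(0.0, 0.0), (4.0, 4.0)]
features = [(1.0, 0.0), (4.0, 3.0)]
print(clustering_loss(features, centroids))
print(total_loss(2.0, features, centroids, weight=0.5))
```

In a real detector the features would be backbone activations and the loss would be computed on tensors with gradients; plain Python lists are used here only to keep the arithmetic visible.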

artificial intelligence, knowledge, machine learning, (19 more...)

2508.18641

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Neural Information Processing Systems, Aug-20-2025, 04:10:24 GMT

Cascade RPN: Delving into High-Quality Region Proposal Network with Adaptive Convolution

Thang Vu, Hyunjun Jang, Trung X. Pham, Chang Yoo

Cascade RPN relies on a single anchor per location and performs multi-stage refinement.
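As a rough illustration of what refining a single anchor over several stages can look like, the sketch below applies box regression offsets stage by stage, each stage starting from the box produced by the previous one. It uses the common R-CNN style (dx, dy, dw, dh) parameterization; the offsets are fixed, made-up numbers, whereas in an actual detector they would be predicted by the network at each stage. The function names are hypothetical.

```python
import math


def apply_deltas(box, deltas):
    """Decode (dx, dy, dw, dh) offsets against a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    # Shift the center proportionally to the box size, scale width and height.
    cx, cy = cx + dx * w, cy + dy * h
    w, h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def refine(anchor, stage_deltas):
    """Apply one set of offsets per stage, feeding each result to the next stage."""
    box = anchor
    for deltas in stage_deltas:
        box = apply_deltas(box, deltas)
    return box


# One anchor, two refinement stages with hypothetical offsets.
anchor = (0.0, 0.0, 10.0, 10.0)
stages = [(0.1, 0.0, 0.0, 0.0), (0.0, 0.2, 0.0, 0.0)]
print(refine(anchor, stages))
```

The point of the design is that later stages regress from an already improved box rather than from the original anchor, so a single starting anchor can still reach objects of varied shape.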

anchor, artificial intelligence, machine learning, (15 more...)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.46)

Behravan, Majid, Haghani, Maryam, Gracanin, Denis

Transcending Dimensions using Generative AI: Real-Time 3D Model Generation in Augmented Reality

arXiv.org Artificial Intelligence, May-1-2025

Traditional 3D modeling requires technical expertise, specialized software, and time-intensive processes, making it inaccessible for many users. Our research aims to lower these barriers by combining generative AI and augmented reality (AR) into a cohesive system that allows users to easily generate, manipulate, and interact with 3D models in real time, directly within AR environments. Utilizing cutting-edge AI models like Shap-E, we address the complex challenges of transforming 2D images into 3D representations in AR environments. Key challenges such as object isolation, handling intricate backgrounds, and achieving seamless user interaction are tackled through advanced object detection methods, such as Mask R-CNN. Evaluation results from 35 participants reveal an overall System Usability Scale (SUS) score of 69.64, with participants who engaged with AR/VR technologies more frequently rating the system significantly higher, at 80.71. This research is particularly relevant for applications in gaming, education, and AR-based e-commerce, offering intuitive model creation for users without specialized skills.
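For readers unfamiliar with how a SUS figure such as 69.64 is obtained, the sketch below follows the standard SUS scoring procedure: ten items answered on a 1 to 5 scale, odd-numbered items contributing (response - 1), even-numbered items contributing (5 - response), and the sum multiplied by 2.5 to give a 0 to 100 score. A study-level score is the mean over participants. The abstract does not describe the scoring itself, and the responses below are invented for illustration, not data from the study.

```python
def sus_score(responses):
    """Standard SUS score (0 to 100) for one participant's ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("SUS uses exactly ten items")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5


def mean_sus(all_responses):
    """Average SUS score over several participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical questionnaires from two participants.
participants = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
print([sus_score(p) for p in participants])
print(mean_sus(participants))
```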

ar environment, machine learning, natural language, (17 more...)

2504.21033

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

arXiv.org Artificial Intelligence, Feb-5-2025

An Empirical Study of Methods for Small Object Detection from Satellite Imagery

Yuan, Xiaohui, Chakravarty, Aniv, Gu, Lichuan, Wei, Zhenchun, Lichtenberg, Elinor, Chen, Tian

This paper reviews object detection methods for finding small objects from remote sensing imagery and provides an empirical evaluation of four state-of-the-art methods to gain insights into method performance and technical challenges. In particular, we use car detection from urban satellite images and bee box detection from satellite images of agricultural lands as application scenarios. Drawing from the existing surveys and literature, we identify several top-performing methods for the empirical study. Public, high-resolution satellite image datasets are used in our experiments.

artificial intelligence, detection, machine learning, (18 more...)

2502.03674

Country: North America > United States > Michigan > Van Buren County (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.73)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Jan-23-2025, 07:24:28 GMT

Reviews: FreeAnchor: Learning to Match Anchors for Visual Object Detection

I am raising my score to seven. The authors begin by noting that many existing object detection pipelines include an 'anchor assignment' step: from a large set of candidate bounding boxes (or "anchors") in a generic image frame, the one that best matches the ground truth bounding box, as measured by IoU, is chosen for training, i.e. the object detection and bounding box regression outputs for that anchor are pushed towards the ground truth. The authors note that for objects which don't fill the anchor well (slim objects oriented diagonally, objects with holes, or occluded objects) the best anchor according to this IoU comparison may be actively bad for training as a whole. The authors propose "learning to match", i.e. producing a custom likelihood which promotes both precision and recall of the final result (making reference to terms from the traditional loss function). For each ground truth bounding box, a 'bag of anchors' is selected by ranking IoU and picking the best n. During training, a different bounding box is selected from this bag for each object, for each backwards pass.
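The two selection rules the review contrasts can be sketched in a few lines: the conventional rule keeps the single anchor with the highest IoU, while the bag rule keeps the n highest-IoU anchors as candidates. This sketch covers only the IoU ranking; how one anchor is then chosen from the bag during training is the learned part of the method and is not shown. Boxes are assumed to be (x1, y1, x2, y2) tuples, and the function names and sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_anchor(gt, anchors):
    """Conventional assignment: index of the single highest-IoU anchor."""
    return max(range(len(anchors)), key=lambda i: iou(gt, anchors[i]))


def anchor_bag(gt, anchors, n):
    """Indices of the n anchors with the highest IoU against gt, best first."""
    ranked = sorted(range(len(anchors)),
                    key=lambda i: iou(gt, anchors[i]), reverse=True)
    return ranked[:n]


# One ground truth box and four made-up candidate anchors.
gt_box = (10, 10, 50, 50)
anchors = [(0, 0, 40, 40), (12, 8, 52, 48), (30, 30, 90, 90), (10, 10, 30, 50)]
print(best_anchor(gt_box, anchors))
print(anchor_bag(gt_box, anchors, 2))
```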

anchor, ground truth, visual object detection, (10 more...)

Genre: Summary/Review (0.36)

Technology: Information Technology > Artificial Intelligence > Vision (0.82)

Neural Information Processing Systems, Jan-20-2025, 05:38:01 GMT

Reviews: Integrated perception with recurrent multi-task neural networks

This paper is crystal clear and the main points are easily accessible. The key idea of integrated learning of representation sharing and output correlation is sound and well executed in the new architecture comprising CNNs, R-CNNs, RNNs and autoencoders. My main concern is regarding the experimental evaluation. There is clear room for improvement: (1) the authors are encouraged to use the standard VOC 2012 dataset instead of the more obsolete VOC 2010/2007 datasets--this makes direct comparison of different methods possible; (2) the baseline methods (Independent and Multi-task in Table 1) are too simple to justify the effectiveness of the proposed method, and more recent work on multi-task deep learning should be compared. Note that, although this paper contrasts itself clearly from the literature, it does not mean that it is enough to evaluate the proposed method only against simple baselines.

integrated perception, perception, recurrent multi-task neural network, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neha, Fnu, Bhati, Deepshikha, Shukla, Deepak Kumar, Amiruzzaman, Md

From classical techniques to convolution-based models: A review of object detection algorithms

arXiv.org Artificial Intelligence, Dec-6-2024

Object detection is a fundamental task in computer vision and image understanding, with the goal of identifying and localizing objects of interest within an image while assigning them corresponding class labels. Traditional methods, which relied on handcrafted features and shallow models, struggled with complex visual data and showed limited performance. These methods combined low-level features with contextual information and lacked the ability to capture high-level semantics. Deep learning, especially Convolutional Neural Networks (CNNs), addressed these limitations by automatically learning rich, hierarchical features directly from data. These features include both semantic and high-level representations essential for accurate object detection. This paper reviews object detection frameworks, starting with classical computer vision methods. We categorize object detection approaches into two groups: (1) classical computer vision techniques and (2) CNN-based detectors. We compare major CNN models, discussing their strengths and limitations. In conclusion, this review highlights the significant advancements in object detection through deep learning and identifies key areas for further research to improve performance.

artificial intelligence, detection, machine learning, (19 more...)