AITopics | Ren, Botao

Ren, Botao

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Yu, Yi, Ren, Botao, Zhang, Peiyuan, Liu, Mingxin, Luo, Junwei, Zhang, Shaofeng, Da, Feipeng, Yan, Junchi, Yang, Xue

arXiv.org Artificial Intelligence, Feb-6-2025

With the rapidly increasing demand for oriented object detection (OOD), recent research involving weakly-supervised detectors for learning OOD from point annotations has gained great attention. In this paper, we rethink this challenging task setting with the layout among instances and present Point2RBox-v2. At the core are three principles: 1) Gaussian overlap loss. It learns an upper bound for each instance by treating objects as 2D Gaussian distributions and minimizing their overlap. 2) Voronoi watershed loss. It learns a lower bound for each instance through watershed on Voronoi tessellation. 3) Consistency loss. It learns the size/rotation variation between two output sets with respect to an input image and its augmented view. Supplemented by a few devised techniques, e.g. edge loss and copy-paste, the detector is further enhanced. To our best knowledge, Point2RBox-v2 is the first approach to explore the spatial layout among instances for learning point-supervised OOD. Our solution is elegant and lightweight, yet it is expected to give a competitive performance especially in densely packed scenes: 62.61%/86.15%/34.71% on DOTA/HRSC/FAIR1M. Code is available at https://github.com/VisionXLab/point2rbox-v2.
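The idea of treating objects as 2D Gaussians and penalizing their overlap can be illustrated with a minimal sketch. The abstract does not say which overlap measure is used, so the Bhattacharyya coefficient below is an assumption, as are the box-to-Gaussian convention (covariance built from half the width and height) and the function names `box_to_gaussian`, `bhattacharyya_coefficient`, and `overlap_loss`. This is not the authors' implementation.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to a 2D Gaussian (assumed convention).

    Returns the mean and the covariance R diag((w/2)^2, (h/2)^2) R^T,
    stored as the tuple (sxx, sxy, syy).
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = (w / 2) ** 2, (h / 2) ** 2
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), (sxx, sxy, syy)


def bhattacharyya_coefficient(g1, g2):
    """Overlap in (0, 1] between two 2D Gaussians; 1 means identical."""
    (mx1, my1), (a1, b1, d1) = g1
    (mx2, my2), (a2, b2, d2) = g2
    # Averaged covariance of the two Gaussians.
    a, b, d = (a1 + a2) / 2, (b1 + b2) / 2, (d1 + d2) / 2
    det = a * d - b * b
    det1 = a1 * d1 - b1 * b1
    det2 = a2 * d2 - b2 * b2
    dx, dy = mx2 - mx1, my2 - my1
    # Mahalanobis term, using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    dist = maha / 8 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    return math.exp(-dist)


def overlap_loss(gaussians):
    """Sum of pairwise overlaps; smaller means instances are better separated."""
    total = 0.0
    for i in range(len(gaussians)):
        for j in range(i + 1, len(gaussians)):
            total += bhattacharyya_coefficient(gaussians[i], gaussians[j])
    return total
```

Minimizing such a loss pushes predicted boxes of neighboring instances to stop overlapping, which is one way to read "an upper bound for each instance" in densely packed scenes.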

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.04268

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Sports (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

ArtFormer: Controllable Generation of Diverse 3D Articulated Objects

Su, Jiayi, Feng, Youhe, Li, Zheng, Song, Jinhua, He, Yangfan, Ren, Botao, Xu, Botian

arXiv.org Artificial Intelligence, Dec-10-2024

This paper presents a novel framework for modeling and conditional generation of 3D articulated objects. Troubled by flexibility-quality tradeoffs, existing methods are often limited to using predefined structures or retrieving shapes from static datasets. To address these challenges, we parameterize an articulated object as a tree of tokens and employ a transformer to generate both the object's high-level geometry code and its kinematic relations. Subsequently, each sub-part's geometry is further decoded using a signed-distance-function (SDF) shape prior, facilitating the synthesis of high-quality 3D shapes. Our approach enables the generation of diverse objects with high-quality geometry and varying number of parts. Comprehensive experiments on conditional generation from text descriptions demonstrate the effectiveness and flexibility of our method.
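As a rough illustration of "a tree of tokens", the sketch below flattens a part tree into a depth-first token sequence in which each token records its parent's index. The `PartNode` class, the `flatten` helper, and the joint labels are hypothetical; the abstract does not describe the actual token layout, and real tokens would also carry a geometry code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PartNode:
    """One part of an articulated object (hypothetical structure)."""
    name: str
    joint: str  # assumed label for the kinematic relation to the parent
    children: List["PartNode"] = field(default_factory=list)


def flatten(root: PartNode) -> List[Tuple[str, str, int]]:
    """Depth-first flattening into (name, joint, parent_index) tokens.

    The root's parent index is -1.
    """
    tokens: List[Tuple[str, str, int]] = []

    def visit(node: PartNode, parent: int) -> None:
        index = len(tokens)
        tokens.append((node.name, node.joint, parent))
        for child in node.children:
            visit(child, index)

    visit(root, -1)
    return tokens


cabinet = PartNode("cabinet", "fixed", [
    PartNode("door", "revolute"),
    PartNode("drawer", "prismatic", [PartNode("handle", "fixed")]),
])
tokens = flatten(cabinet)
```

Because the parent index is stored in each token, the tree can be rebuilt from the sequence, which lets a sequence model emit objects with a varying number of parts.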

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.07237

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Ren, Botao, Yang, Xue, Yu, Yi, Luo, Junwei, Deng, Zhidong

arXiv.org Artificial Intelligence, Oct-10-2024

Single point supervised oriented object detection has gained attention and made initial progress within the community. Unlike approaches that rely on one-shot samples or powerful pretrained models (e.g. SAM), PointOBB has shown promise due to its prior-free feature. In this paper, we propose PointOBB-v2, a simpler, faster, and stronger method to generate pseudo rotated boxes from points without relying on any other prior. Specifically, we first generate a Class Probability Map (CPM) by training the network with non-uniform positive and negative sampling. We show that the CPM is able to learn the approximate object regions and their contours. Then, Principal Component Analysis (PCA) is applied to accurately estimate the orientation and the boundary of objects. By further incorporating a separation mechanism, we resolve the confusion caused by overlapping on the CPM, enabling its operation in high-density scenarios. Extensive comparisons demonstrate that our method achieves a training speed 15.58× faster and an accuracy improvement of 11.60%/25.15%/21.19% on the DOTA-v1.0/v1.5/v2.0 datasets. This significantly advances the cutting edge of single point supervised oriented detection in the modular track. Oriented object detection is essential for accurately labeling small and densely packed objects, especially in scenarios like remote sensing imagery, retail analysis, and scene text detection, where Oriented Bounding Boxes (OBBs) provide precise annotations.
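The PCA step can be sketched on a small probability map: treat each pixel as a point weighted by its probability, compute the weighted covariance of the coordinates, and read the orientation from the principal axis. The function name `principal_orientation` and the plain nested-list input are assumptions made for illustration; this is not the authors' code, and the angle is measured in image coordinates (x to the right, y downward).

```python
import math


def principal_orientation(prob_map):
    """Weighted PCA over pixel coordinates of a 2D probability map.

    Returns (angle, major_variance, minor_variance), where angle is the
    direction of the principal axis in radians.
    """
    total = mean_x = mean_y = 0.0
    for y, row in enumerate(prob_map):
        for x, p in enumerate(row):
            total += p
            mean_x += p * x
            mean_y += p * y
    if total <= 0:
        raise ValueError("probability map has no mass")
    mean_x /= total
    mean_y /= total

    sxx = sxy = syy = 0.0
    for y, row in enumerate(prob_map):
        for x, p in enumerate(row):
            dx, dy = x - mean_x, y - mean_y
            sxx += p * dx * dx
            sxy += p * dx * dy
            syy += p * dy * dy
    sxx, sxy, syy = sxx / total, sxy / total, syy / total

    # Closed-form eigen-decomposition of the symmetric 2x2 covariance.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    center = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    return angle, center + spread, center - spread
```

The two variances give a rough sense of the object's extent along and across the principal axis, which is the kind of information needed to place the sides of a rotated box.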

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.0821

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Energy (0.35)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Improving Detection in Aerial Images by Capturing Inter-Object Relationships

Ren, Botao, Xu, Botian, Pu, Yifan, Wang, Jingyi, Deng, Zhidong

arXiv.org Artificial Intelligence, Apr-5-2024

In many image domains, the spatial distribution of objects in a scene exhibits meaningful patterns governed by their semantic relationships. In most modern detection pipelines, however, the detection proposals are processed independently, overlooking the underlying relationships between objects. In this work, we introduce a transformer-based approach to capture these inter-object relationships to refine classification and regression outcomes for detected objects. Building on two-stage detectors, we tokenize the region of interest (RoI) proposals to be processed by a transformer encoder. Specific spatial and geometric relations are incorporated into the attention weights and adaptively modulated and regularized. Experimental results demonstrate that the proposed method achieves consistent performance improvement on three benchmarks including DOTA-v1.0, DOTA-v1.5, and HRSC 2016, especially ranking first on both DOTA-v1.5 and HRSC 2016. Specifically, our new method has an increase of 1.59 mAP on DOTA-v1.0, 4.88 mAP on DOTA-v1.5, and 2.1 mAP on HRSC 2016, respectively, compared to the baselines.
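One simple way to incorporate spatial relations into attention weights is to add a distance-based bias to the attention logits before the softmax. The sketch below does exactly that for a set of RoI centers. The additive-bias form, the `scale` parameter, and the function names are assumptions; the abstract does not specify how the relations are encoded, modulated, or regularized.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def spatially_biased_attention(scores, centers, scale=1.0):
    """Attention weights with a penalty on the distance between RoI centers.

    scores:  n x n content-based attention logits (e.g. query-key products).
    centers: n (x, y) RoI centers.
    scale:   hypothetical strength of the spatial bias.
    """
    weights = []
    for i, row in enumerate(scores):
        biased = [
            row[j] - scale * math.dist(centers[i], centers[j])
            for j in range(len(row))
        ]
        weights.append(softmax(biased))
    return weights
```

With equal content scores, each RoI attends more strongly to nearby proposals than to distant ones, so the spatial layout of the scene shapes how proposals refine each other.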

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2404.0414

Country:

Asia > China (0.14)
Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Feedback RoI Features Improve Aerial Object Detection

Ren, Botao, Xu, Botian, Liu, Tengyu, Wang, Jingyi, Deng, Zhidong

arXiv.org Artificial Intelligence, Nov-28-2023

Neuroscience studies have shown that the human visual system utilizes high-level feedback information to guide lower-level perception, enabling adaptation to signals of different characteristics. In light of this, we propose Feedback multi-Level feature Extractor (Flex) to incorporate a similar mechanism for object detection. Flex refines feature selection based on image-wise and instance-level feedback information in response to image quality variation and classification uncertainty. Experimental results show that Flex offers consistent improvement to a range of existing SOTA methods on the challenging aerial object detection datasets including DOTA-v1.0, DOTA-v1.5, and HRSC2016. Although the design originates in aerial image detection, further experiments on MS COCO also reveal our module's efficacy in general detection models. Quantitative and qualitative analyses indicate that the improvements are closely related to image qualities, which match our motivation.

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2311.17129

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
