AITopics | Ma, Bowen

Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP

Lin, Shuyang, Jia, Tong, Wang, Hao, Ma, Bowen, Li, Mingyuan, Chen, Dongyue

arXiv.org Artificial Intelligence, Jun-16-2024

X-ray prohibited item detection is an essential component of security checks, and the categories of prohibited items are continuously increasing in accordance with the latest laws. Previous works all focus on closed-set scenarios: they can only recognize the known categories used for training and often require time-consuming, labor-intensive annotation to learn novel categories, which limits their real-world applicability. Although the success of vision-language models (e.g., CLIP) provides a new perspective for open-set X-ray prohibited item detection, directly applying CLIP to the X-ray domain leads to a sharp performance drop because of the domain shift between X-ray data and the general data used to pre-train CLIP. To address these challenges, we introduce the distillation-based open-vocabulary object detection (OVOD) task into the X-ray security inspection domain by extending CLIP to learn visual representations in the X-ray domain, aiming to detect novel prohibited item categories beyond the base categories on which the detector is trained. Specifically, we propose an X-ray feature adapter and apply it to CLIP within the OVOD framework to develop the OVXD model. The X-ray feature adapter contains three adapter submodules with a bottleneck architecture; it is simple but efficiently integrates new X-ray domain knowledge with the original knowledge, further bridging the domain gap and promoting alignment between X-ray images and textual concepts. Extensive experiments on the PIXray and PIDray datasets demonstrate that the proposed method performs favorably against other baseline OVOD methods in detecting novel categories in the X-ray scenario. It outperforms the previous best results by 15.2 AP50 and 1.5 AP50 on PIXray and PIDray, achieving 21.0 AP50 and 27.8 AP50, respectively.
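The abstract describes adapter submodules with a bottleneck architecture but gives no implementation details. The following is a minimal sketch of one common bottleneck adapter design, not the authors' code: the class name BottleneckAdapter, the ReLU activation, the residual connection, the zero-initialised up-projection, and the toy dimensions are all assumptions made for illustration.

```python
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class BottleneckAdapter:
    """Hypothetical adapter: down-project, ReLU, up-project, residual add."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection dim -> bottleneck with small random weights (assumption).
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                     for _ in range(bottleneck)]
        # Up-projection bottleneck -> dim, zero-initialised (assumption) so the
        # adapter starts as the identity and leaves pre-trained features intact.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.down, x)]  # ReLU
        delta = matvec(self.up, hidden)
        # Residual connection: original feature plus the learned correction.
        return [xi + di for xi, di in zip(x, delta)]


features = [0.5, -1.0, 2.0, 0.25]  # stand-in for a frozen CLIP feature vector
adapter = BottleneckAdapter(dim=4, bottleneck=2)
adapted = adapter(features)  # equals `features` until the up-projection is trained
```

In this sketch only the two small projection matrices would be trained, which is one plausible reading of how new X-ray knowledge can be added while the original pre-trained knowledge is preserved.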

category, machine learning, natural language

arXiv:2406.10961

Country:

Asia > China > Liaoning Province (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

AO-DETR: Anti-Overlapping DETR for X-Ray Prohibited Items Detection

Li, Mingyuan, Jia, Tong, Wang, Hao, Ma, Bowen, Lin, Shuyang, Cai, Da, Chen, Dongyue

arXiv.org Artificial Intelligence, Mar-7-2024

Prohibited item detection in X-ray images is one of the most essential and highly effective methods widely employed in various security inspection scenarios. Considering the significant overlapping phenomenon in X-ray prohibited item images, we propose an Anti-Overlapping DETR (AO-DETR) based on one of the state-of-the-art general object detectors, DINO. Specifically, to address the feature coupling issue caused by overlapping phenomena, we introduce the Category-Specific One-to-One Assignment (CSA) strategy to constrain category-specific object queries in predicting prohibited items of fixed categories, which can enhance their ability to extract features specific to prohibited items of a particular category from the overlapping foreground-background features. To address the edge blurring problem caused by overlapping phenomena, we propose the Look Forward Densely (LFD) scheme, which improves the localization accuracy of reference boxes in mid-to-high-level decoder layers and enhances the ability to locate blurry edges of the final layer. Similar to DINO, our AO-DETR provides two different versions with distinct backbones, tailored to meet diverse application requirements. Extensive experiments on the PIXray and OPIXray datasets demonstrate that the proposed method surpasses the state-of-the-art object detectors, indicating its potential applications in the field of prohibited item detection. The source code will be released at https://github.com/Limingyuan001/AO-DETR-test.
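The idea of restricting object queries to fixed categories can be illustrated with a toy matcher. This is a sketch of one possible reading of the CSA strategy, not the released AO-DETR code: the function name category_specific_assignment, the cost values, and the category labels are hypothetical, and an exhaustive search over permutations stands in for the Hungarian matching that DETR-style detectors normally use.

```python
from itertools import permutations


def category_specific_assignment(cost, query_category, gt_category):
    """Match each ground-truth box to one query of the same category.

    cost[q][g] is the matching cost between query q and ground truth g.
    Returns a dict mapping ground-truth index -> query index.
    """
    assignment = {}
    for cat in set(gt_category):
        queries = [q for q, c in enumerate(query_category) if c == cat]
        gts = [g for g, c in enumerate(gt_category) if c == cat]
        if len(gts) > len(queries):
            raise ValueError(f"not enough queries for category {cat!r}")
        # Exhaustive search over one-to-one matchings inside the category group.
        # Fine for a toy example; a real detector would use Hungarian matching.
        best = min(
            permutations(queries, len(gts)),
            key=lambda perm: sum(cost[q][g] for q, g in zip(perm, gts)),
        )
        assignment.update(zip(gts, best))
    return assignment


# Four queries, two reserved per category (hypothetical labels).
query_category = ["knife", "knife", "gun", "gun"]
gt_category = ["knife", "gun"]
cost = [
    [0.9, 0.1],  # query 0 (knife)
    [0.2, 0.8],  # query 1 (knife)
    [0.1, 0.7],  # query 2 (gun): cheapest for the knife box, but wrong group
    [0.6, 0.3],  # query 3 (gun)
]
matches = category_specific_assignment(cost, query_category, gt_category)
# Ground truth 0 (knife) -> query 1, ground truth 1 (gun) -> query 3.
```

An unconstrained matcher would pair the knife box with query 2, the globally cheapest option; the category restriction forces each query group to specialise in its own class.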

artificial intelligence, detection, machine learning

arXiv:2403.04309

Country:

Asia > China (0.70)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

FlowFace: Semantic Flow-guided Shape-aware Face Swapping

Zeng, Hao, Zhang, Wei, Fan, Changjie, Lv, Tangjie, Wang, Suzhen, Zhang, Zhimeng, Ma, Bowen, Li, Lincheng, Ding, Yu, Yu, Xin

arXiv.org Artificial Intelligence, Dec-6-2022

In this work, we propose a semantic flow-guided two-stage framework for shape-aware face swapping, namely FlowFace. Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping. Concretely, our FlowFace consists of a face reshaping network and a face swapping network. The face reshaping network addresses the shape outline differences between the source and target faces. It first estimates a semantic flow (i.e., face shape differences) between the source and the target face, and then explicitly warps the target face shape with the estimated semantic flow. After reshaping, the face swapping network generates inner facial features that exhibit the identity of the source face. We employ a pre-trained face masked autoencoder (MAE) to extract facial features from both the source face and the target face. In contrast to previous methods that use identity embedding to preserve identity information, the features extracted by our encoder can better capture facial appearances and identity information. Then, we develop a cross-attention fusion module to adaptively fuse inner facial features from the source face with the target facial attributes, thus leading to better identity preservation. Extensive quantitative and qualitative experiments on in-the-wild faces demonstrate that our FlowFace outperforms the state-of-the-art significantly.
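Warping with an estimated flow field, as the reshaping step describes, can be sketched on a tiny grid. This is only an illustration under stated assumptions, not the FlowFace implementation: the function name warp, the backward-warping convention, integer offsets with nearest-neighbour sampling, and border clamping are all choices made here for simplicity.

```python
def warp(image, flow):
    """Backward-warp a 2D grid with a per-pixel flow field.

    flow[y][x] is an integer (dx, dy) offset, so that
    output[y][x] = image[y + dy][x + dx], clamped to the image border.
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            src_x = min(max(x + dx, 0), width - 1)   # clamp horizontally
            src_y = min(max(y + dy, 0), height - 1)  # clamp vertically
            output[y][x] = image[src_y][src_x]
    return output


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Every output pixel samples its right-hand neighbour, shifting content left.
flow = [[(1, 0)] * 3 for _ in range(3)]
warped = warp(image, flow)
# warped == [[2, 3, 3], [5, 6, 6], [8, 9, 9]]
```

A learned model would predict a dense, sub-pixel flow and sample with interpolation; the integer version above only shows how a flow field moves content from one location to another.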

artificial intelligence, machine learning, target face

arXiv:2212.02797

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)