Xiao, Jimin
Image Fusion for Cross-Domain Sequential Recommendation
Wu, Wangyu, Song, Siqi, Qiu, Xianglin, Huang, Xiaowei, Ma, Fei, Xiao, Jimin
Cross-Domain Sequential Recommendation (CDSR) aims to predict future user interactions based on historical interactions across multiple domains. The key challenge in CDSR is effectively capturing cross-domain user preferences by fully leveraging both intra-sequence and inter-sequence item interactions. In this paper, we propose a novel method, Image Fusion for Cross-Domain Sequential Recommendation (IFCDSR), which incorporates item image information to better capture visual preferences. Our approach integrates a frozen CLIP model to generate image embeddings, enriching the original item embeddings with visual data from both intra-sequence and inter-sequence interactions. Additionally, we employ multiple attention layers to capture cross-domain interests, enabling joint learning of single-domain and cross-domain user preferences. To validate the effectiveness of IFCDSR, we re-partitioned four e-commerce datasets and conducted extensive experiments. Results demonstrate that IFCDSR significantly outperforms existing methods.
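The fusion step described above can be pictured as follows; this is a minimal sketch, assuming a frozen CLIP image encoder whose features are precomputed, with a gated combination of ID and visual embeddings (the gate and all dimensions are illustrative, not the paper's exact design):

```python
import torch
import torch.nn as nn

class ImageFusedItemEmbedding(nn.Module):
    """Enrich ID-based item embeddings with frozen CLIP image features (illustrative sketch)."""

    def __init__(self, num_items: int, dim: int = 512, clip_dim: int = 512):
        super().__init__()
        self.id_emb = nn.Embedding(num_items, dim)   # learnable item-ID embeddings
        self.proj = nn.Linear(clip_dim, dim)         # map CLIP space to item space
        self.gate = nn.Linear(2 * dim, dim)          # hypothetical fusion gate

    def forward(self, item_ids: torch.Tensor, clip_feats: torch.Tensor) -> torch.Tensor:
        # clip_feats: precomputed with a frozen CLIP image encoder, shape (B, L, clip_dim)
        e_id = self.id_emb(item_ids)                 # (B, L, dim)
        e_img = self.proj(clip_feats)                # (B, L, dim)
        g = torch.sigmoid(self.gate(torch.cat([e_id, e_img], dim=-1)))
        return g * e_id + (1 - g) * e_img            # gated fusion of ID and visual cues

# usage on dummy data
fuser = ImageFusedItemEmbedding(num_items=1000)
ids = torch.randint(0, 1000, (2, 10))
feats = torch.randn(2, 10, 512)
print(fuser(ids, feats).shape)  # torch.Size([2, 10, 512])
```

The fused sequence embeddings would then feed the attention layers that model single-domain and cross-domain preferences.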
CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection
Wang, Xiaolei, Wang, Xiaoyang, Bai, Huihui, Lim, Eng Gee, Xiao, Jimin
Existing unsupervised distillation-based methods rely on the differences between encoded and decoded features to locate abnormal regions in test images. However, a decoder trained only on normal samples still reconstructs abnormal patch features well, degrading performance. This issue is particularly pronounced in unsupervised multi-class anomaly detection tasks. We attribute this behavior to over-generalization (OG) of the decoder: the significantly increased diversity of patch patterns in multi-class training enhances the model's generalization on normal patches, but also inadvertently broadens its generalization to abnormal patches. To mitigate OG, we propose a novel approach that leverages class-agnostic learnable prompts to capture common textual normality across various visual patterns, and then applies them to guide the decoded features towards a normal textual representation, suppressing the decoder's over-generalization on abnormal patterns. To further improve performance, we also introduce a gated mixture-of-experts module that specializes in handling diverse patch patterns and reduces their mutual interference in multi-class training. Our method achieves competitive performance on the MVTec AD and VisA datasets, demonstrating its effectiveness.
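The normality constraint can be illustrated as a cosine-alignment loss that pulls decoded patch features toward a prompt-derived text embedding; in this minimal sketch a learnable vector stands in for the CLIP-encoded class-agnostic prompts, and all names are illustrative:

```python
import torch
import torch.nn.functional as F

def normality_constraint_loss(decoded: torch.Tensor, text_normality: torch.Tensor) -> torch.Tensor:
    """Pull decoded patch features toward a shared textual normality direction.

    decoded:        (B, N, D) patch features from the distillation decoder
    text_normality: (D,) embedding of class-agnostic learnable prompts
                    (in the paper this would come from a text encoder; here
                    it is just a learnable vector for illustration)
    """
    decoded = F.normalize(decoded, dim=-1)
    target = F.normalize(text_normality, dim=-1)
    # 1 - cosine similarity, averaged over all patches
    return (1.0 - decoded @ target).mean()

# dummy usage
feats = torch.randn(4, 196, 256)
prompt_vec = torch.randn(256, requires_grad=True)  # learnable normality embedding
loss = normality_constraint_loss(feats, prompt_vec)
loss.backward()
```

Because the target direction is shared across classes, abnormal patches that drift away from it yield larger encoder-decoder differences at test time.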
Event USKT: U-State Space Model in Knowledge Transfer for Event Cameras
Lin, Yuhui, Zhang, Jiahao, Li, Siyuan, Xiao, Jimin, Xu, Ding, Wu, Wenjun, Lu, Jiaxuan
Event cameras, as an emerging imaging technology, offer distinct advantages over traditional RGB cameras, including reduced energy consumption and higher frame rates. However, the limited quantity of available event data presents a significant challenge, hindering their broader development. To alleviate this issue, we introduce a tailored U-shaped State Space Model Knowledge Transfer (USKT) framework for Event-to-RGB knowledge transfer. This framework generates inputs compatible with RGB frames, enabling event data to effectively reuse pre-trained RGB models and achieve competitive performance with minimal parameter tuning. Within the USKT architecture, we also propose a Bidirectional Reverse State Space Model (BiR-SSM). Unlike conventional bidirectional scanning mechanisms, BiR-SSM leverages a shared-weight strategy, which facilitates efficient modeling while conserving computational resources. In terms of effectiveness, integrating USKT with a ResNet50 backbone improves model performance by 0.95%, 3.57%, and 2.9% on the DVS128 Gesture, N-Caltech101, and CIFAR-10-DVS datasets, respectively, underscoring USKT's adaptability and effectiveness. The code will be made available upon acceptance.
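The shared-weight bidirectional idea can be sketched independently of the exact SSM block: one sequence module processes the tokens forward and, with the same parameters, the reversed tokens, and the two passes are merged. In this illustrative sketch a GRU stands in for the state space block; the real BiR-SSM uses a state space model:

```python
import torch
import torch.nn as nn

class SharedWeightBiScan(nn.Module):
    """Bidirectional scan with one shared sequence module (stand-in for BiR-SSM)."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.scan = nn.GRU(dim, dim, batch_first=True)  # stand-in for an SSM block
        self.out = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fwd, _ = self.scan(x)                           # forward pass over the sequence
        bwd, _ = self.scan(torch.flip(x, dims=[1]))     # same weights on the reversed sequence
        bwd = torch.flip(bwd, dims=[1])                 # re-align reversed outputs
        return self.out(torch.cat([fwd, bwd], dim=-1))  # merge both directions

x = torch.randn(2, 16, 64)   # (batch, tokens, dim), e.g. event-frame tokens
print(SharedWeightBiScan()(x).shape)  # torch.Size([2, 16, 64])
```

Sharing one set of scan weights for both directions roughly halves the scan parameters relative to two independent directional modules.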
High-Frequency Enhanced Hybrid Neural Representation for Video Compression
Yu, Li, Li, Zhihui, Xiao, Jimin, Gabbouj, Moncef
In 2023, more than 65% of total Internet traffic was video content (Corporation, 2023), and this percentage is expected to keep increasing. In the past, video compression was usually achieved by traditional codecs such as H.264/AVC (Wiegand et al., 2003), H.265/HEVC (Sullivan et al., 2012), H.266/VVC (Bross et al., 2021), and AVS (Zhang et al., 2019). However, the handcrafted algorithms in these traditional codecs limit their compression efficiency. With the rise of deep learning, many neural video codec (NVC) technologies have been proposed (Lu et al., 2019; Li et al., 2021; Agustsson et al., 2020; Wang et al., 2024b). These approaches replace handcrafted components with deep learning modules, achieving impressive rate-distortion performance. However, NVC approaches have not yet achieved widespread adoption in practical applications. One reason is that they often require a large network to achieve generalized compression over the entire data distribution, which is computationally intensive and frequently leads to slower decoding than traditional codecs. Moreover, the generalization capability of the network depends on the dataset used for training, leading to poor performance on out-of-distribution (OOD) data from different domains (Zhang et al., 2021a), or even when the resolution changes. To overcome these challenges associated with NVCs, researchers have turned to implicit neural representations (INRs) as a promising alternative.
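The INR alternative sidesteps the generalization problem by overfitting a small network to one specific video and treating the network weights as the bitstream; decoding is then a single forward pass. A minimal NeRV-style sketch, with illustrative architecture and sizes:

```python
import torch
import torch.nn as nn

class TinyVideoINR(nn.Module):
    """Map a frame index t to an RGB frame; compressing the video = compressing these weights."""

    def __init__(self, num_frames: int, h: int = 32, w: int = 32):
        super().__init__()
        self.num_frames, self.h, self.w = num_frames, h, w
        self.net = nn.Sequential(
            nn.Linear(1, 256), nn.GELU(),
            nn.Linear(256, 256), nn.GELU(),
            nn.Linear(256, 3 * h * w),
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        t = t.float().unsqueeze(-1) / self.num_frames   # normalized frame index
        return self.net(t).view(-1, 3, self.h, self.w)  # decoded frame

# overfit to one specific clip (8 random frames as a stand-in for a video)
video = torch.rand(8, 3, 32, 32)
model = TinyVideoINR(num_frames=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    pred = model(torch.arange(8))
    loss = ((pred - video) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

Since the network is fit to a single video, there is no training-set distribution to fall outside of, which is exactly the OOD weakness of generalized NVCs.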
SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation
Zhao, Xinqiao, Tang, Feilong, Wang, Xiaoyang, Xiao, Jimin
Image-level weakly supervised semantic segmentation has received increasing attention due to its low annotation cost. Existing methods mainly rely on Class Activation Mapping (CAM) to obtain pseudo-labels for training semantic segmentation models. In this work, we are the first to demonstrate that a long-tailed distribution in the training data can cause the CAM calculated through classifier weights to be over-activated for head classes and under-activated for tail classes, due to features shared between head and tail classes. This degrades pseudo-label quality and in turn hurts final semantic segmentation performance. To address this issue, we propose a Shared Feature Calibration (SFC) method for CAM generation. Specifically, we leverage the class prototypes that carry positive shared features and propose a Multi-Scaled Distribution-Weighted (MSDW) consistency loss that narrows the gap between the CAMs generated through classifier weights and those generated through class prototypes during training. The MSDW loss counterbalances over-activation and under-activation by calibrating the shared features in head-/tail-class classifier weights. Experimental results show that SFC significantly improves CAM boundaries and achieves new state-of-the-art performance. The project is available at https://github.com/Barrett-python/SFC.
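The core consistency idea can be sketched as follows: one CAM is computed from classifier weights, another from class prototypes, and a weighted L1 term narrows the gap between them. This is a simplification in which the multi-scaled, per-class distribution weighting of the actual MSDW loss is reduced to a single scalar; shapes and names are illustrative:

```python
import torch
import torch.nn.functional as F

def cam(features: torch.Tensor, class_vecs: torch.Tensor) -> torch.Tensor:
    """Class activation maps from feature maps and per-class vectors.

    features:   (B, D, H, W) backbone feature maps
    class_vecs: (C, D) classifier weights or class prototypes
    """
    return F.relu(torch.einsum('bdhw,cd->bchw', features, class_vecs))

def consistency_loss(features, classifier_w, prototypes, weight=1.0):
    """Weighted L1 gap between classifier-CAM and prototype-CAM (simplified MSDW-style term)."""
    cam_w = cam(features, classifier_w)
    cam_p = cam(features, prototypes)
    return weight * (cam_w - cam_p).abs().mean()

feats = torch.randn(2, 128, 28, 28)
w = torch.randn(20, 128)       # classifier weights
protos = torch.randn(20, 128)  # class prototypes
print(consistency_loss(feats, w, protos))
```

Minimizing this gap pushes the shared-feature components of head-/tail-class classifier weights toward the prototype-defined activations, which is the calibration effect the abstract describes.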
Trajectory Poisson multi-Bernoulli mixture filter for traffic monitoring using a drone
García-Fernández, Ángel F., Xiao, Jimin
This paper proposes a multi-object tracking (MOT) algorithm for traffic monitoring using a drone equipped with optical and thermal cameras. Object detections on the images are obtained using a neural network for each type of camera. The cameras are modelled as direction-of-arrival (DOA) sensors. Each DOA detection follows a von Mises-Fisher distribution, whose mean direction is obtained by projecting a vehicle's position on the ground to the camera. We then use the trajectory Poisson multi-Bernoulli mixture (TPMBM) filter, a Bayesian MOT algorithm, to optimally estimate the set of vehicle trajectories. We have also developed a parameter estimation algorithm for the measurement model. We have tested the accuracy of the resulting TPMBM filter on synthetic and experimental data sets.
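The measurement model can be sketched directly: a vehicle's ground-plane position is projected into a unit direction from the camera, and a detection is scored with a von Mises-Fisher log-likelihood around that direction. A minimal sketch with illustrative numbers; the full method adds the TPMBM filtering machinery on top:

```python
import numpy as np

def doa_mean_direction(vehicle_xy: np.ndarray, camera_pos: np.ndarray) -> np.ndarray:
    """Unit direction from the camera to a vehicle on the ground plane (z = 0)."""
    target = np.array([vehicle_xy[0], vehicle_xy[1], 0.0])
    d = target - camera_pos
    return d / np.linalg.norm(d)

def vmf_logpdf(x: np.ndarray, mu: np.ndarray, kappa: float) -> float:
    """von Mises-Fisher log-density on the unit sphere (3-D case)."""
    # log C_3(kappa) with a numerically stable log(sinh(kappa))
    log_sinh = kappa + np.log1p(-np.exp(-2.0 * kappa)) - np.log(2.0)
    log_c = np.log(kappa) - np.log(4.0 * np.pi) - log_sinh
    return log_c + kappa * float(mu @ x)

camera = np.array([0.0, 0.0, 50.0])           # drone camera at 50 m altitude
mu = doa_mean_direction(np.array([10.0, 5.0]), camera)
z = mu + 0.01 * np.random.randn(3)            # noisy detection direction
z /= np.linalg.norm(z)
print(vmf_logpdf(z, mu, kappa=500.0))         # larger kappa = more concentrated sensor
```

The concentration parameter kappa is one of the quantities the paper's parameter estimation algorithm would fit from data.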
Synchronize Feature Extracting and Matching: A Single Branch Framework for 3D Object Tracking
Ma, Teli, Wang, Mengmeng, Xiao, Jimin, Wu, Huifeng, Liu, Yong
The Siamese network has been the de facto framework for 3D LiDAR object tracking, with a shared-parameter encoder extracting features from the template and the search region separately. This paradigm relies heavily on an additional matching network to model the cross-correlation/similarity between the template and the search region. In this paper, we forsake the conventional Siamese paradigm and propose a novel single-branch framework, SyncTrack, which synchronizes feature extraction and matching, avoiding both a second encoder pass for the template and search region and the extra parameters of a matching network. The synchronization mechanism is based on the dynamic affinity of the Transformer, and we provide an in-depth theoretical analysis of its relevance. Moreover, building on this synchronization, we introduce a novel Attentive Points-Sampling strategy into the Transformer layers (APST), replacing random/Farthest Points Sampling (FPS) with sampling supervised by the attentive relations between the template and the search region. This ties point-wise sampling to feature learning, which helps aggregate more distinctive and geometric features for tracking with sparse points. Extensive experiments on two benchmark datasets (KITTI and NuScenes) show that SyncTrack achieves state-of-the-art performance in real-time tracking.
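The single-branch idea can be pictured as joint attention over the concatenated template and search tokens, so feature extraction and template-search matching happen in the same layers. A minimal sketch with standard multi-head attention; the attentive points-sampling and all dimensions are illustrative and omitted or simplified here:

```python
import torch
import torch.nn as nn

class JointExtractMatchLayer(nn.Module):
    """One Transformer layer over concatenated template+search tokens (single-branch sketch)."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.n1, self.n2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, template: torch.Tensor, search: torch.Tensor):
        x = torch.cat([template, search], dim=1)   # one forward pass, no Siamese twin
        h = self.n1(x)
        x = x + self.attn(h, h, h)[0]              # cross-attention terms act as matching
        x = x + self.ffn(self.n2(x))
        return x[:, :template.size(1)], x[:, template.size(1):]

tmpl = torch.randn(2, 128, 64)   # template point tokens
srch = torch.randn(2, 512, 64)   # search-region point tokens
t_out, s_out = JointExtractMatchLayer()(tmpl, srch)
print(t_out.shape, s_out.shape)
```

Because the attention matrix already contains template-to-search affinities, no separate cross-correlation head is needed, which is the parameter saving the abstract refers to.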
Generative Adversarial Classifier for Handwriting Characters Super-Resolution
Qian, Zhuang, Huang, Kaizhu, Wang, Qiufeng, Xiao, Jimin, Zhang, Rui
Generative Adversarial Networks (GANs) have recently received great attention due to their excellent performance in image generation, transformation, and super-resolution. However, GANs have rarely been studied and trained for classification, so the generated images may not be appropriate for classification. In this paper, we propose a novel Generative Adversarial Classifier (GAC) specifically for low-resolution handwriting character recognition. By additionally involving a classifier in the training process of a normal GAN, GAC is calibrated to learn suitable structures and restore character images that benefit classification. Experimental results show that our proposed method achieves remarkable performance in 8x super-resolution of handwritten characters, approximately 10% and 20% higher than the present state-of-the-art methods on the CASIA-HWDB1.1 and MNIST benchmarks, respectively.
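The extra classification signal can be sketched as a third loss term on the generator: besides fooling the discriminator and matching the high-resolution target, the restored image must be correctly classified. A minimal sketch with toy networks and illustrative loss weights; none of this is the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# toy 8x super-resolution generator, discriminator, and classifier
G = nn.Sequential(nn.Upsample(scale_factor=8), nn.Conv2d(1, 16, 3, padding=1),
                  nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1))
D = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2), nn.ReLU(), nn.Flatten(),
                  nn.LazyLinear(1))
C = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2), nn.ReLU(), nn.Flatten(),
                  nn.LazyLinear(10))

lr_img = torch.rand(4, 1, 4, 4)          # 4x4 low-resolution characters
labels = torch.randint(0, 10, (4,))      # character classes
hr_img = torch.rand(4, 1, 32, 32)        # stand-in for ground-truth HR characters

sr = G(lr_img)                           # 32x32 restored characters
adv = F.binary_cross_entropy_with_logits(D(sr), torch.ones(4, 1))  # fool discriminator
cls = F.cross_entropy(C(sr), labels)     # restored image must stay classifiable
rec = F.mse_loss(sr, hr_img)             # stay close to the HR target
g_loss = rec + 0.01 * adv + 0.1 * cls    # loss weights are illustrative
g_loss.backward()
```

The classification term is what pushes the generator toward restorations with structures that a recognizer can exploit, rather than merely photorealistic ones.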