AITopics | auxiliary branch

Collaborating Authors

auxiliary branch

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning

Hou, Yaxin, Han, Bo, Jia, Yuheng, Liu, Hui, Hou, Junhui

arXiv.org Artificial IntelligenceDec-12-2025

Current long-tailed semi-supervised learning methods assume that labeled data exhibit a long-tailed distribution, and unlabeled data adhere to a typical predefined distribution (i.e., long-tailed, uniform, or inverse long-tailed). However, the distribution of the unlabeled data is generally unknown and may follow an arbitrary distribution. To tackle this challenge, we propose a Controllable Pseudo-label Generation (CPG) framework, expanding the labeled dataset with the progressively identified reliable pseudo-labels from the unlabeled dataset and training the model on the updated labeled dataset with a known distribution, making it unaffected by the unlabeled data distribution. Specifically, CPG operates through a controllable self-reinforcing optimization cycle: (i) at each training step, our dynamic controllable filtering mechanism selectively incorporates reliable pseudo-labels from the unlabeled dataset into the labeled dataset, ensuring that the updated labeled dataset follows a known distribution; (ii) we then construct a Bayes-optimal classifier using logit adjustment based on the updated labeled data distribution; (iii) this improved classifier subsequently helps identify more reliable pseudo-labels in the next training step. We further theoretically prove that this optimization cycle can significantly reduce the generalization error under some conditions. Additionally, we propose a class-aware adaptive augmentation module to further improve the representation of minority classes, and an auxiliary branch to maximize data utilization by leveraging all labeled and unlabeled samples. Comprehensive evaluations on various commonly used benchmark datasets show that CPG achieves consistent improvements, surpassing state-of-the-art methods by up to $\textbf{15.97%}$ in accuracy. The code is available at https://github.com/yaxinhou/CPG.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.03993

Country:

Asia > China > Hong Kong (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Seeing through Unclear Glass: Occlusion Removal with One Shot

Li, Qiang, Cao, Yuanming

arXiv.org Artificial IntelligenceSep-3-2025

Images taken through window glass are often degraded by contaminants adhered to the glass surfaces. Such contaminants cause occlusions that attenuate the incoming light and scatter stray light towards the camera. Most of existing deep learning methods for neutralizing the effects of contaminated glasses relied on synthetic training data. Few researchers used real degraded and clean image pairs, but they only considered removing or alleviating the effects of rain drops on glasses. This paper is concerned with the more challenging task of learning the restoration of images taken through glasses contaminated by a wide range of oc-cluders, including muddy water, dirt and other small foreign particles found in reality. T o facilitate the learning task we have gone to a great length to acquire real paired images with and without glass contaminants. More importantly, we propose an all-in-one model to neutralize contaminants of different types by utilizing the one-shot test-time adaptation mechanism. It involves a self-supervised auxiliary learning task to update the trained model for the unique occlusion type of each test image. Experimental results show that the proposed method outperforms the state-of-the-art methods quantitatively and qualitatively in cleaning realistic contaminated images, especially the unseen ones.

artificial intelligence, machine learning, occlusion, (15 more...)

arXiv.org Artificial Intelligence

2509.01033

Country:

North America > Canada > Ontario > Hamilton (0.40)
Asia (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Clinical Inspired MRI Lesion Segmentation

Yan, Lijun, Wang, Churan, Zhong, Fangwei, Wang, Yizhou

arXiv.org Artificial IntelligenceFeb-21-2025

Magnetic resonance imaging (MRI) is a potent diagnostic tool for detecting pathological tissues in various diseases. Different MRI sequences have different contrast mechanisms and sensitivities for different types of lesions, which pose challenges to accurate and consistent lesion segmentation. In clinical practice, radiologists commonly use the sub-sequence feature, i.e. the difference between post contrast-enhanced T1-weighted (post) and pre-contrast-enhanced (pre) sequences, to locate lesions. Inspired by this, we propose a residual fusion method to learn subsequence representation for MRI lesion segmentation. Specifically, we iteratively and adaptively fuse features from pre- and post-contrast sequences at multiple resolutions, using dynamic weights to achieve optimal fusion and address diverse lesion enhancement patterns. Our method achieves state-of-the-art performances on BraTS2023 dataset for brain tumor segmentation and our in-house breast MRI dataset for breast lesion segmentation. Our method is clinically inspired and has the potential to facilitate lesion segmentation in various applications.

auxiliary branch, dataset, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2502.16032

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Gated-Attention Feature-Fusion Based Framework for Poverty Prediction

Ramzan, Muhammad Umer, Khaddim, Wahab, Rana, Muhammad Ehsan, Ali, Usman, Ali, Manohar, Hassan, Fiaz ul, Mehmood, Fatima

arXiv.org Artificial IntelligenceNov-29-2024

This research paper addresses the significant challenge of accurately estimating poverty levels using deep learning, particularly in developing regions where traditional methods like household surveys are often costly, infrequent, and quickly become outdated. To address these issues, we propose a state-of-the-art Convolutional Neural Network (CNN) architecture, extending the ResNet50 model by incorporating a Gated-Attention Feature-Fusion Module (GAFM). Our architecture is designed to improve the model's ability to capture and combine both global and local features from satellite images, leading to more accurate poverty estimates. The model achieves a 75% R2 score, significantly outperforming existing leading methods in poverty mapping. This improvement is due to the model's capacity to focus on and refine the most relevant features, filtering out unnecessary data, which makes it a powerful tool for remote sensing and poverty estimation.

artificial intelligence, machine learning, satellite imagery, (13 more...)

arXiv.org Artificial Intelligence

2411.1969

Country:

Africa > Nigeria (0.05)
Asia > Thailand (0.04)
South America > Colombia > Antioquia Department > Medellín (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Banking & Finance > Economy (0.70)
Government > Regional Government (0.47)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection

Zhao, Yun, Gong, Zhan, Zheng, Peiru, Zhu, Hong, Wu, Shaohua

arXiv.org Artificial IntelligenceNov-7-2024

More and more research works fuse the LiDAR and camera information to improve the 3D object detection of the autonomous driving system. Recently, a simple yet effective fusion framework has achieved an excellent detection performance, fusing the LiDAR and camera features in a unified bird's-eye-view (BEV) space. In this paper, we propose a LiDAR-camera fusion framework, named SimpleBEV, for accurate 3D object detection, which follows the BEV-based fusion framework and improves the camera and LiDAR encoders, respectively. Specifically, we perform the camera-based depth estimation using a cascade network and rectify the depth results with the depth information derived from the LiDAR points. Meanwhile, an auxiliary branch that implements the 3D object detection using only the camera-BEV features is introduced to exploit the camera information during the training phase. Besides, we improve the LiDAR feature extractor by fusing the multi-scaled sparse convolutional features. Experimental results demonstrate the effectiveness of our proposed method. Our method achieves 77.6\% NDS accuracy on the nuScenes dataset, showcasing superior performance in the 3D object detection track.

artificial intelligence, detection, information, (16 more...)

arXiv.org Artificial Intelligence

2411.05292

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Transportation (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.35)

Add feedback

Leveraging Recurrent Neural Networks for Predicting Motor Movements from Primate Motor Cortex Neural Recordings

Wang, Yuanxi, Wang, Zuowen, Liu, Shih-Chii

arXiv.org Artificial IntelligenceNov-1-2024

This paper presents an efficient deep learning solution for decoding motor movements from neural recordings in non-human primates. An Autoencoder Gated Recurrent Unit (AEGRU) model was adopted as the model architecture for this task. The autoencoder is only used during the training stage to achieve better generalization. Together with the preprocessing techniques, our model achieved 0.71 $R^2$ score, surpassing the baseline models in Neurobench and is ranked first for $R^2$ in the IEEE BioCAS 2024 Grand Challenge on Neural Decoding. Model pruning is also applied leading to a reduction of 41.4% of the multiply-accumulate (MAC) operations with little change in the $R^2$ score compared to the unpruned model.

artificial intelligence, gru layer, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.22283

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Utah (0.05)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Ada-DF: An Adaptive Label Distribution Fusion Network For Facial Expression Recognition

Liu, Shu, Xu, Yan, Wan, Tongming, Kui, Xiaoyan

arXiv.org Artificial IntelligenceApr-24-2024

Facial expression recognition (FER) plays a significant role in our daily life. However, annotation ambiguity in the datasets could greatly hinder the performance. In this paper, we address FER task via label distribution learning paradigm, and develop a dual-branch Adaptive Distribution Fusion (Ada-DF) framework. One auxiliary branch is constructed to obtain the label distributions of samples. The class distributions of emotions are then computed through the label distributions of each emotion. Finally, those two distributions are adaptively fused according to the attention weights to train the target branch. Extensive experiments are conducted on three real-world datasets, RAF-DB, AffectNet and SFEW, where our Ada-DF shows advantages over the state-of-the-art works.

auxiliary branch, class distribution, label distribution, (12 more...)

arXiv.org Artificial Intelligence

2404.15714

Country:

North America > United States > Nebraska (0.04)
Asia > China > Hunan Province (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multi-level Asymmetric Contrastive Learning for Medical Image Segmentation Pre-training

Zeng, Shuang, Zhu, Lei, Zhang, Xinliang, Tian, Zifeng, Chen, Qian, Jin, Lujia, Wang, Jiayi, Lu, Yanye

arXiv.org Artificial IntelligenceSep-21-2023

Contrastive learning, which is a powerful technique for learning image-level representations from unlabeled data, leads a promising direction to dealing with the dilemma between large-scale pre-training and limited labeled data. However, most existing contrastive learning strategies are designed mainly for downstream tasks of natural images, therefore they are sub-optimal and even worse than learning from scratch when directly applied to medical images whose downstream tasks are usually segmentation. In this work, we propose a novel asymmetric contrastive learning framework named JCL for medical image segmentation with self-supervised pre-training. Specifically, (1) A novel asymmetric contrastive learning strategy is proposed to pre-train both encoder and decoder simultaneously in one-stage to provide better initialization for segmentation models. (2) A multi-level contrastive loss is designed to take the correspondence among feature-level, image-level and pixel-level projections, respectively into account to make sure multi-level representations can be learned by the encoder and decoder during pre-training. (3) Experiments on multiple medical image datasets indicate our JCL framework outperforms existing SOTA contrastive learning strategies.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Artificial Intelligence

2309.11876

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Heterogeneous-Branch Collaborative Learning for Dialogue Generation

Li, Yiwei, Feng, Shaoxiong, Sun, Bin, Li, Kan

arXiv.org Artificial IntelligenceMar-21-2023

With the development of deep learning, advanced dialogue generation methods usually require a greater amount of computational resources. One promising approach to obtaining a high-performance and lightweight model is knowledge distillation, which relies heavily on the pre-trained powerful teacher. Collaborative learning, also known as online knowledge distillation, is an effective way to conduct one-stage group distillation in the absence of a well-trained large teacher model. However, previous work has a severe branch homogeneity problem due to the same training objective and the independent identical training sets. To alleviate this problem, we consider the dialogue attributes in the training of network branches. Each branch learns the attribute-related features based on the selected subset. Furthermore, we propose a dual group-based knowledge distillation method, consisting of positive distillation and negative distillation, to further diversify the features of different branches in a steadily and interpretable way. The proposed approach significantly improves branch heterogeneity and outperforms state-of-the-art collaborative learning methods on two widely used open-domain dialogue datasets.

distillation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2303.11621

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.70)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Towards Using Count-level Weak Supervision for Crowd Counting

Lei, Yinjie, Liu, Yan, Zhang, Pingping, Liu, Lingqiao

arXiv.org Artificial IntelligenceFeb-28-2020

Most existing crowd counting methods require object location-level annotation, i.e., placing a dot at the center of an object. While being simpler than the bounding-box or pixel-level annotation, obtaining this annotation is still labor-intensive and time-consuming especially for images with highly crowded scenes. On the other hand, weaker annotations that only know the total count of objects can be almost effortless in many practical scenarios. Thus, it is desirable to develop a learning method that can effectively train models from count-level annotations. To this end, this paper studies the problem of weakly-supervised crowd counting which learns a model from only a small amount of location-level annotations (fully-supervised) but a large amount of count-level annotations (weakly-supervised). To perform effective training in this scenario, we observe that the direct solution of regressing the integral of density map to the object count is not sufficient and it is beneficial to introduce stronger regularizations on the predicted density map of weakly-annotated images. We devise a simple-yet-effective training strategy, namely Multiple Auxiliary Tasks Training (MATT), to construct regularizes for restricting the freedom of the generated density maps. Through extensive experiments on existing datasets and a newly proposed dataset, we validate the effectiveness of the proposed weakly-supervised method and demonstrate its superior performance over existing solutions.

annotation, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2003.00164

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Liaoning Province > Dalian (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback