Uni[MASK]: Unified Inference in Sequential Decision Problems

Neural Information Processing Systems

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the UniMASK framework, which provides a unified way to specify models that can be trained on many different sequential decision-making tasks. We show that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models. Additionally, after fine-tuning, our UniMASK models consistently outperform comparable single-task models.
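The core idea above, that each task is just a visibility mask over (state, action, return) tokens, can be sketched as follows. This is our own minimal illustration of the masking pattern, not the paper's code; the task names and the three-column token layout are assumptions for exposition.

```python
import numpy as np

def make_mask(task, T):
    """Visibility mask over [state, action, return] columns for T timesteps:
    1 = token shown to the model, 0 = token masked out and predicted."""
    mask = np.ones((T, 3), dtype=int)
    if task == "behavior_cloning":
        # See states only; predict actions (returns unused).
        mask[:, 1] = 0
        mask[:, 2] = 0
    elif task == "offline_rl":
        # See states and returns-to-go; predict actions.
        mask[:, 1] = 0
    elif task == "waypoint":
        # See only the first and last state; infer everything in between.
        mask[:] = 0
        mask[0, 0] = 1
        mask[-1, 0] = 1
    else:
        raise ValueError(f"unknown task: {task}")
    return mask
```

A single model trained on randomly sampled masks can then be queried with any of these patterns at inference time.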


Semi-Supervised Multi-Task Learning for Interpretable Quality Assessment of Fundus Images

Telesco, Lucas Gabriel, Nejamkin, Danila, Mata, Estefanía, Filizzola, Francisco, Wignall, Kevin, Troilo, Lucía Franco, Cenoz, María de los Angeles, Thompson, Melissa, Leguía, Mercedes, Larrabide, Ignacio, Orlando, José Ignacio

arXiv.org Artificial Intelligence

Retinal image quality assessment (RIQA) supports computer-aided diagnosis of eye diseases. However, most tools classify only overall image quality, without indicating acquisition defects to guide recapture. This gap is mainly due to the high cost of detailed annotations. In this paper, we aim to mitigate this limitation by introducing a hybrid semi-supervised learning approach that combines manual labels for overall quality with pseudo-labels of quality details within a multi-task framework. Our objective is to obtain more interpretable RIQA models without requiring extensive manual labeling. Pseudo-labels are generated by a Teacher model trained on a small dataset and then used to fine-tune a pre-trained model in a multi-task setting. Using a ResNet-18 backbone, we show that these weak annotations improve quality assessment over single-task baselines (F1: 0.875 vs. 0.863 on EyeQ, and 0.778 vs. 0.763 on DeepDRiD), matching or surpassing existing methods. The multi-task model achieved performance statistically comparable to the Teacher for most detail prediction tasks (p > 0.05). In a newly annotated EyeQ subset released with this paper, our model performed similarly to experts, suggesting that pseudo-label noise aligns with expert variability. Our main finding is that the proposed semi-supervised approach not only improves overall quality assessment but also provides interpretable feedback on capture conditions (illumination, clarity, contrast). This enhances interpretability at no extra manual labeling cost and offers clinically actionable outputs to guide image recapture.
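The Teacher-to-student step above follows the generic pseudo-labeling pattern: a Teacher trained on the small detail-labeled set annotates the large set that only has overall-quality labels. A minimal sketch of that pattern, assuming a confidence threshold and a -1 ignore label (both our assumptions, not details from the abstract):

```python
import numpy as np

def pseudo_label(teacher_probs, threshold=0.8):
    """Turn Teacher class probabilities for one quality detail (e.g. illumination)
    into pseudo-labels; low-confidence rows get -1 and are skipped by that
    detail task's loss during multi-task fine-tuning."""
    conf = teacher_probs.max(axis=1)        # Teacher's confidence per image
    labels = teacher_probs.argmax(axis=1)   # Teacher's predicted class per image
    labels[conf < threshold] = -1           # mark uncertain rows as "ignore"
    return labels
```

The student then optimizes the manually labeled overall-quality head jointly with the pseudo-labeled detail heads.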


Neural Multivariate Regression: Qualitative Insights from the Unconstrained Feature Model

Andriopoulos, George, Basnet, Soyuj Jung, Guevara, Juan, Guo, Li, Ross, Keith

arXiv.org Artificial Intelligence

The Unconstrained Feature Model (UFM) is a mathematical framework that enables closed-form approximations for minimal training loss and related performance measures in deep neural networks (DNNs). This paper leverages the UFM to provide qualitative insights into neural multivariate regression, a critical task in imitation learning, robotics, and reinforcement learning. Specifically, we address two key questions: (1) How do multi-task models compare to multiple single-task models in terms of training performance? (2) Can whitening and normalizing regression targets improve training performance? The UFM theory predicts that multi-task models achieve strictly smaller training MSE than multiple single-task models when the same or stronger regularization is applied to the latter, and our empirical results confirm these findings. Regarding whitening and normalizing regression targets, the UFM theory predicts that they reduce training MSE when the average variance across the target dimensions is less than one, and our empirical results once again confirm these findings. These findings highlight the UFM as a powerful framework for deriving actionable insights into DNN design and data pre-processing strategies.
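The target-normalization condition above is easy to check in practice: compute the average per-dimension variance of the targets and compare it to one. A small sketch (our illustration; we show per-dimension normalization, while full whitening would additionally decorrelate the target dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic multivariate regression targets: 1000 samples, 4 target dimensions.
Y = rng.normal(loc=5.0, scale=0.5, size=(1000, 4))

# The UFM condition checks whether the average variance across target
# dimensions is below one before deciding to normalize.
avg_var = Y.var(axis=0).mean()

# Per-dimension normalization: zero mean, unit variance in each target dim.
mu, sigma = Y.mean(axis=0), Y.std(axis=0)
Y_norm = (Y - mu) / sigma
```

After this transform every target dimension has unit variance, so the diagnostic is only meaningful on the raw targets.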



Effective Multi-Task Learning for Biomedical Named Entity Recognition

Ruano, João, Correia, Gonçalo M., Barreiros, Leonor, Mendes, Afonso

arXiv.org Artificial Intelligence

Biomedical Named Entity Recognition presents significant challenges due to the complexity of biomedical terminology and inconsistencies in annotation across datasets. This paper introduces SRU-NER (Slot-based Recurrent Unit NER), a novel approach designed to handle nested named entities while integrating multiple datasets through an effective multi-task learning strategy. SRU-NER mitigates annotation gaps by dynamically adjusting loss computation to avoid penalizing predictions of entity types absent in a given dataset. Through extensive experiments, including a cross-corpus evaluation and human assessment of the model's predictions, SRU-NER achieves competitive performance in biomedical and general-domain NER tasks, while improving cross-domain generalization.
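The loss adjustment described above, not penalizing predictions of entity types that a given dataset never annotates, can be sketched as a per-type mask on the loss. This is our simplified illustration with a binary cross-entropy over types, not the authors' actual SRU-NER objective:

```python
import numpy as np

def masked_type_loss(logits, targets, annotated):
    """Mean binary cross-entropy over entity types, where `annotated` is a
    0/1 vector of shape (num_types,) marking which types the example's
    source dataset actually labels; unannotated types contribute no loss."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    bce = -(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))
    w = annotated.astype(float)
    # Average over annotated types and batch only.
    return float((bce * w).sum() / max(w.sum() * bce.shape[0], 1.0))
```

This way a model trained jointly on, say, a gene corpus and a disease corpus is never punished for predicting diseases on the gene corpus.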


Efficient Multi-Task Modeling through Automated Fusion of Trained Models

Zhou, Jingxuan, Bao, Weidong, Wang, Ji, Zhong, Zhengyi, Zhang, Dayu

arXiv.org Artificial Intelligence

Although multi-task learning is widely applied in intelligent services, traditional multi-task modeling methods often require customized designs based on specific task combinations, resulting in a cumbersome modeling process. Inspired by the rapid development and excellent performance of single-task models, this paper proposes an efficient multi-task modeling method that can automatically fuse trained single-task models with different structures and tasks to form a multi-task model. As a general framework, this method allows modelers to simply prepare trained models for the required tasks, simplifying the modeling process while fully utilizing the knowledge contained in the trained models. This eliminates the need for excessive focus on task relationships and model structure design. To achieve this goal, we consider the structural differences among various trained models and employ model decomposition techniques to hierarchically decompose them into multiple operable model components. Furthermore, we have designed an Adaptive Knowledge Fusion (AKF) module based on Transformer, which adaptively integrates intra-task and inter-task knowledge based on model components. Through the proposed method, we achieve efficient and automated construction of multi-task models, and its effectiveness is verified through extensive experiments on three datasets.


Towards a Real-Time Simulation of Elastoplastic Deformation Using Multi-Task Neural Networks

Schmeitz, Ruben, Remmers, Joris, Mula, Olga, van der Sluis, Olaf

arXiv.org Artificial Intelligence

This study introduces a surrogate modeling framework merging proper orthogonal decomposition, long short-term memory networks, and multi-task learning to accurately predict elastoplastic deformations in real time. Outperforming single-task neural networks, this approach achieves a mean absolute error below 0.40% across various state variables, with the multi-task model showing enhanced generalization by mitigating overfitting through shared layers. Moreover, in our use cases, a pre-trained multi-task model can effectively train additional variables with as few as 20 samples, demonstrating its deep understanding of complex scenarios. This is notably efficient compared to single-task models, which typically require around 100 samples. Significantly faster than traditional finite element analysis, our model accelerates computations by approximately a million times, making it a substantial advancement for real-time predictive modeling. While it necessitates further testing on more intricate models, this framework shows substantial promise in elevating both efficiency and accuracy, particularly for real-time engineering scenarios.
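The proper orthogonal decomposition step in the pipeline above is standard and worth making concrete: field snapshots are compressed via SVD into a few modal coefficients, and it is those coefficients that a recurrent network (here, the LSTM) learns to advance in time. A minimal sketch on synthetic data (the snapshot matrix and mode count are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic snapshot matrix: 200 spatial points x 50 time snapshots of a
# field variable, constructed to have exact rank 5.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = 5
basis = U[:, :r]            # POD spatial modes
coeffs = basis.T @ X        # reduced coefficients over time (LSTM's targets)
X_rec = basis @ coeffs      # lift reduced coefficients back to full field
rel_err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
```

Since the surrogate only ever manipulates the r-dimensional coefficients, evaluation cost is decoupled from the finite element mesh size, which is where the large speedups come from.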


Multi-label Annotation for Visual Multi-Task Learning Models

Sharma, G., Angleraud, A., Pieters, R.

arXiv.org Artificial Intelligence

Deep learning requires large amounts of data and a well-defined pipeline for labeling and augmentation. Current solutions support numerous computer vision tasks with dedicated annotation types and formats, such as bounding boxes, polygons, and key points. These annotations can be combined into a single data format to benefit approaches such as multi-task models. However, to our knowledge, no available labeling tool supports export to such a combined benchmark format, and no augmentation library supports transformations over all annotation types at once. In this work, these functionalities are presented, with visual data annotation and augmentation to train a multi-task model (object detection, segmentation, and key point extraction). The tools are demonstrated in two robot perception use cases.
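A combined format of the kind described above essentially attaches every annotation type to the same object record, so one sample can supervise detection, segmentation, and keypoint heads simultaneously. A hypothetical record in a COCO-like shape (our sketch of the idea, not the tool's exact schema):

```python
# One image's combined annotations: each object carries all three annotation
# types, so a single loader can feed a multi-task model.
record = {
    "image": "frame_0001.png",
    "objects": [
        {
            "category": "gear",
            "bbox": [34, 50, 120, 140],  # x, y, width, height
            "segmentation": [[34, 50, 154, 50, 154, 190, 34, 190]],  # polygon
            "keypoints": [94, 120, 2],   # x, y, visibility (COCO convention)
        }
    ],
}
```

Augmentations then only need to be written once per geometric transform, applied consistently to the box, the polygon, and the keypoints of each object.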