AITopics | mtl

Collaborating Authors

mtl

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

f8b932c70d0b2e6bf071729a4fa68dfc-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-15-2026, 05:08:15 GMT

ablation study, complex-n3, relation, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.41)

Add feedback

ByzantineResilientDistributedMulti-TaskLearning

Neural Information Processing SystemsFeb-10-2026, 13:15:58 GMT

Distributed multi-task learning provides significant advantages in multi-agent networkswithheterogeneous datasources where agents aimtolearndistinctbut correlated models simultaneously. However, distributed algorithms for learning relatedness among tasks arenotresilient inthepresence ofByzantine agents. In this paper, we present an approach for Byzantine resilient distributed multi-task learning. We propose an efficient online weight assignment rule by measuring the accumulated loss using an agent's data and its neighbors' models. A small accumulated loss indicates a large similarity between the two tasks.

agent, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
(12 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

Add feedback

1b69ebedb522700034547abc5652ffac-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-7-2026, 16:25:23 GMT

asml, prediction, target task, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.31)

Add feedback

Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse

Neural Information Processing SystemsDec-27-2025, 09:17:19 GMT

Neural networks are often trained on multiple tasks, either simultaneously (multi-task learning, MTL) or sequentially (pretraining and subsequent finetuning, PT+FT). In particular, it is common practice to pretrain neural networks on a large auxiliary task before finetuning on a downstream task with fewer samples. Despite the prevalence of this approach, the inductive biases that arise from learning multiple tasks are poorly characterized. In this work, we address this gap. We describe novel implicit regularization penalties associated with MTL and PT+FT in diagonal linear networks and single-hidden-layer ReLU networks. These penalties indicate that MTL and PT+FT induce the network to reuse features in different ways. 1) Both MTL and PT+FT exhibit biases towards feature reuse between tasks, and towards sparsity in the set of learned features. We show a conservation law that implies a direct tradeoff between these two biases.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design

Neural Information Processing SystemsDec-25-2025, 02:01:52 GMT

Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly. Multi-tasking models have become successful and often essential for many sophisticated systems such as autonomous driving and indoor robots. However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks, and the challenge is amplified when a growing number of tasks have to be squeezed into one compact model; (ii) at inference, current MTL regimes have to activate nearly the entire model even to just execute a single task. Yet most real systems demand only one or two tasks at each moment, while flexibly switching between tasks per need: therefore such "all tasks activated" inference is also highly inefficient and non-scalable in practice. In this paper, we present a model-accelerator co-design framework to enable efficient on-device MTL, that tackles both training and inference bottlenecks.

efficient multi-task learning, mixture-of-expert vision transformer, model-accelerator co-design, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.63)
Information Technology > Artificial Intelligence > Robots (0.57)

Add feedback

Consistent Multitask Learning with Nonlinear Output Relations

Carlo Ciliberto, Alessandro Rudi, Lorenzo Rosasco, Massimiliano Pontil

Neural Information Processing SystemsNov-21-2025, 12:27:34 GMT

Improving the efficiency of learning from human supervision is one of the great challenges in machine learning. Multitask learning is one of the key approaches in this sense and it is based on the assumption that different learning problems (i.e.

artificial intelligence, estimator, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > Italy (0.04)
North America > United States > Massachusetts (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)

Add feedback

Collaborative Management for Chronic Diseases and Depression: A Double Heterogeneity-based Multi-Task Learning Method

Chai, Yidong, Liu, Haoxin, Xie, Jiaheng, Wang, Chaopeng, Fang, Xiao

arXiv.org Artificial IntelligenceNov-21-2025

Wearable sensor technologies and deep learning are transforming healthcare management. Yet, most health sensing studies focus narrowly on physical chronic diseases. This overlooks the critical need for joint assessment of comorbid physical chronic diseases and depression, which is essential for collaborative chronic care. We conceptualize multi-disease assessment, including both physical diseases and depression, as a multi-task learning (MTL) problem, where each disease assessment is modeled as a task. This joint formulation leverages inter-disease relationships to improve accuracy, but it also introduces the challenge of double heterogeneity: chronic diseases differ in their manifestation (disease heterogeneity), and patients with the same disease show varied patterns (patient heterogeneity). To address these issues, we first adopt existing techniques and propose a base method. Given the limitations of the base method, we further propose an Advanced Double Heterogeneity-based Multi-Task Learning (ADH-MTL) method that improves the base method through three innovations: (1) group-level modeling to support new patient predictions, (2) a decomposition strategy to reduce model complexity, and (3) a Bayesian network that explicitly captures dependencies while balancing similarities and differences across model components. Empirical evaluations on real-world wearable sensor data demonstrate that ADH-MTL significantly outperforms existing baselines, and each of its innovations is shown to be effective. This study contributes to health information systems by offering a computational solution for integrated physical and mental healthcare and provides design principles for advancing collaborative chronic disease management across the pre-treatment, treatment, and post-treatment phases.

artificial intelligence, heterogeneity, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.16398

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Add feedback

1b69ebedb522700034547abc5652ffac-AuthorFeedback.pdf

Neural Information Processing SystemsOct-2-2025, 08:17:05 GMT

artificial intelligence, machine learning, tasml, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.31)

Add feedback

Merge Now, Regret Later: The Hidden Cost of Model Merging is Adversarial Transferability

Gangwal, Ankit, Sharma, Aaryan Ajay

arXiv.org Artificial IntelligenceSep-30-2025

Model Merging (MM) has emerged as a promising alternative to multi-task learning, where multiple fine-tuned models are combined, without access to tasks' training data, into a single model that maintains performance across tasks. Recent works have explored the impact of MM on adversarial attacks, particularly backdoor attacks. However, none of them have sufficiently explored its impact on transfer attacks using adversarial examples, i.e., a black-box adversarial attack where examples generated for a surrogate model successfully mislead a target model. In this work, we study the effect of MM on the transferability of adversarial examples. We perform comprehensive evaluations and statistical analysis consisting of 8 MM methods, 7 datasets, and 6 attack methods, sweeping over 336 distinct attack settings. Through it, we first challenge the prevailing notion of MM conferring free adversarial robustness, and show MM cannot reliably defend against transfer attacks, with over 95% relative transfer attack success rate. Moreover, we reveal 3 key insights for machine-learning practitioners regarding MM and transferability for a robust system design: (1) stronger MM methods increase vulnerability to transfer attacks; (2) mitigating representation bias increases vulnerability to transfer attacks; and (3) weight averaging, despite being the weakest MM method, is the most vulnerable MM method to transfer attacks. Finally, we analyze the underlying reasons for this increased vulnerability, and provide potential solutions to the problem. Our findings offer critical insights for designing more secure systems employing MM.

artificial intelligence, machine learning, transfer attack, (15 more...)

arXiv.org Artificial Intelligence

2509.23689

Country: Asia > India > Telangana > Hyderabad (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Learning to Route: Per-Sample Adaptive Routing for Multimodal Multitask Prediction

Ajirak, Marzieh, Bein, Oded, Bowen, Ellen Rose, Kanellopoulos, Dora, Falk, Avital, Gunning, Faith M., Solomonov, Nili, Grosenick, Logan

arXiv.org Artificial IntelligenceSep-30-2025

We propose a unified framework for adaptive routing in multitask, multimodal prediction settings where data heterogeneity and task interactions vary across samples. Motivated by applications in psychotherapy where structured assessments and unstructured clinician notes coexist with partially missing data and correlated outcomes, we introduce a routing-based architecture that dynamically selects modality processing pathways and task-sharing strategies on a per-sample basis. Our model defines multiple modality paths, including raw and fused representations of text and numeric features and learns to route each input through the most informative expert combination. Task-specific predictions are produced by shared or independent heads depending on the routing decision, and the entire system is trained end-to-end. We evaluate the model on both synthetic data and real-world psychotherapy notes predicting depression and anxiety outcomes. Our experiments show that our method consistently outperforms fixed multitask or single-task baselines, and that the learned routing policy provides interpretable insights into modality relevance and task structure. This addresses critical challenges in personalized healthcare by enabling per-subject adaptive information processing that accounts for data heterogeneity and task correlations. Applied to psychotherapy, this framework could improve mental health outcomes, enhance treatment assignment precision, and increase clinical cost-effectiveness through personalized intervention strategies.

bioinformatics, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.12227

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Industry: