
Multimodal Representation Learning Conditioned on Semantic Relations

Qiao, Yang, Hu, Yuntong, Zhao, Liang

arXiv.org Artificial Intelligence

Multimodal representation learning has advanced rapidly with contrastive models such as CLIP, which align image-text pairs in a shared embedding space. However, these models face limitations: (1) they typically focus on individual image-text pairs, underutilizing the semantic relations across different pairs; (2) they directly match global embeddings without contextualization, overlooking the need for semantic alignment along specific subspaces or relational dimensions; and (3) they emphasize cross-modal contrast, with limited support for intra-modal consistency. To address these issues, we propose Relation-Conditioned Multimodal Learning (RCML), a framework that learns multimodal representations under natural-language relation descriptions that guide both feature extraction and alignment. Our approach constructs many-to-many training pairs linked by semantic relations and introduces a relation-guided cross-attention mechanism that modulates multimodal representations under each relation context. The training objective combines inter-modal and intra-modal contrastive losses, encouraging consistency across both modalities and semantically related samples. Experiments on several datasets show that RCML consistently outperforms strong baselines on both retrieval and classification tasks, highlighting the effectiveness of leveraging semantic relations to guide multimodal representation learning.
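To make the contrastive objective behind CLIP-style models concrete, here is a minimal numpy sketch of the symmetric InfoNCE loss over a batch of paired image/text embeddings. This illustrates the generic cross-modal contrastive loss the abstract builds on, not RCML's relation-conditioned variant; the function and parameter names are my own.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere so dot products become cosine similarity.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def clip_style_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/text embeddings.

    Matching pairs sit on the diagonal of the similarity matrix; the loss
    pulls each image toward its own caption and pushes it away from the
    other captions in the batch, and vice versa.
    """
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    logits = img @ txt.T / temperature      # (B, B) similarity matrix
    labels = np.arange(len(logits))         # pair i matches pair i

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the image->text and text->image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

A correctly aligned batch (each image paired with its own caption) yields a lower loss than a mismatched one, which is the signal that shapes the shared embedding space.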


Cooperative Meta-Learning with Gradient Augmentation

Shin, Jongyun, Han, Seunjin, Kim, Jangho

arXiv.org Artificial Intelligence

Model-agnostic meta-learning (MAML) is one of the most widely used gradient-based meta-learning methods, consisting of two optimization loops: an inner loop and an outer loop. MAML learns a new task from the meta-initialization parameters with an inner update and finds the meta-initialization parameters in the outer loop. In general, injecting noise into the model's gradient to augment it is a widely used regularization method. In this work, we propose a novel cooperative meta-learning framework dubbed CML, which leverages gradient-level regularization with gradient augmentation. We inject learnable noise into the gradient of the model to improve generalization. The key idea of CML is introducing a co-learner that has no inner update but only an outer-loop update, augmenting gradients to find better meta-initialization parameters. Since the co-learner is not updated in the inner loop, it can easily be removed after meta-training. Therefore, CML performs inference with only the meta-learner, at no additional cost and with no performance degradation. We demonstrate that CML is easily applicable to gradient-based meta-learning methods and leads to increased performance on few-shot regression, few-shot image classification, and few-shot node classification tasks. Our code is at https://github.com/JJongyn/CML.
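The two-loop structure the abstract describes can be sketched on a toy problem. Below is a minimal first-order MAML loop on scalar quadratic tasks (each task's loss is minimized at a task-specific optimum): the inner loop adapts to a sampled task, and the outer loop moves the meta-initialization along the gradient of the post-adaptation loss. This is plain MAML for illustration; it does not include CML's co-learner or learnable gradient noise, and all names are my own.

```python
import numpy as np

def task_loss(theta, c):
    # Toy task: loss is minimized at the task-specific optimum c.
    return (theta - c) ** 2

def task_grad(theta, c):
    return 2.0 * (theta - c)

def maml_meta_train(task_optima, inner_lr=0.1, outer_lr=0.05, steps=200):
    """First-order MAML on scalar quadratic tasks.

    Inner loop: one gradient step adapts theta to the sampled task.
    Outer loop: theta follows the gradient of the *post-adaptation*
    loss, so the meta-initialization becomes easy to fine-tune.
    """
    theta = 5.0  # meta-initialization, deliberately far from all task optima
    rng = np.random.default_rng(0)
    for _ in range(steps):
        c = rng.choice(task_optima)                          # sample a task
        adapted = theta - inner_lr * task_grad(theta, c)     # inner update
        theta -= outer_lr * task_grad(adapted, c)            # outer (first-order) update
    return theta
```

With task optima at -1 and +1, the meta-initialization settles near 0, the point from which a single inner step makes the most progress on either task.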


Deploying clinical machine learning? Consider the following...

Lu, Charles, Chang, Ken, Singh, Praveer, Pomerantz, Stuart, Doyle, Sean, Kakarmath, Sujay, Bridge, Christopher, Kalpathy-Cramer, Jayashree

arXiv.org Artificial Intelligence

Despite the intense attention and considerable investment in clinical machine learning research, relatively few applications have been deployed at large scale in real-world clinical environments. While research is important in advancing the state of the art, translation is equally important in bringing these techniques and technologies into a position to ultimately impact healthcare. We believe a lack of appreciation for several key considerations is a major cause of this discrepancy between expectation and reality. To better characterize a holistic perspective among researchers and practitioners, we survey several practitioners with commercial experience in developing clinical machine learning (CML) applications for deployment. Using these insights, we identify several main categories of challenges to inform the better design and development of clinical machine learning applications.


Early prediction of the risk of ICU mortality with Deep Federated Learning

Randl, Korbinian, Armengol, Núria Lladós, Mondrejevski, Lena, Miliou, Ioanna

arXiv.org Artificial Intelligence

Intensive Care Units (ICUs) typically treat patients at serious risk of mortality. Recent research has shown the ability of Machine Learning to indicate patients' mortality risk and point physicians toward individuals with a heightened need for care. Nevertheless, healthcare data is often subject to privacy regulations and therefore cannot be easily shared to build Centralized Machine Learning models that use the combined data of multiple hospitals. Federated Learning is a Machine Learning framework designed for data privacy that can be used to circumvent this problem. In this study, we evaluate the ability of deep Federated Learning to predict the risk of ICU mortality at an early stage. We compare the predictive performance of Federated, Centralized, and Local Machine Learning in terms of AUPRC, F1-score, and AUROC. Our results show that Federated Learning performs as well as the centralized approach and substantially better than the local approach, thus providing a viable solution for early ICU mortality prediction. In addition, we show that prediction performance is higher when the patient history window is closer to discharge or death. Finally, we show that using the F1-score as an early-stopping metric can stabilize and increase the performance of our approach for the task at hand.
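The core mechanism that lets Federated Learning sidestep data sharing can be sketched with federated averaging (FedAvg), the canonical FL algorithm: each hospital trains locally on its private data, and only model weights are aggregated. This is a minimal linear-regression sketch for illustration, not the paper's deep ICU model; function names and the toy setup are my own.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    """One client's local training: plain gradient descent on a
    least-squares loss, using only that client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_averaging(clients, rounds=30, dim=2):
    """FedAvg sketch: each round, every client trains locally and only
    the resulting weights (never the raw data) are averaged at the
    server, weighted by each client's sample count."""
    w = np.zeros(dim)
    total = sum(len(y) for _, y in clients)
    for _ in range(rounds):
        local_ws = [local_update(w, X, y) for X, y in clients]
        w = sum(len(y) / total * lw for (X, y), lw in zip(clients, local_ws))
    return w
```

When the clients' data comes from the same underlying model, the averaged weights converge to essentially the same solution a centralized model would find, which mirrors the paper's finding that federated and centralized performance match.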


Continual Few-Shot Learning with Adversarial Class Storage

Wu, Kun, Yin, Chengxiang, Tang, Jian, Xu, Zhiyuan, Wang, Yanzhi, Yang, Dejun

arXiv.org Artificial Intelligence

Humans have a remarkable ability to quickly and effectively learn new concepts in a continuous manner without forgetting old knowledge. Though deep learning has achieved tremendous success on various computer vision tasks, it faces challenges in reaching such human-level intelligence. In this paper, we define a new problem called continual few-shot learning, in which tasks arrive sequentially and each task is associated with only a few training samples. We propose the Continual Meta-Learner (CML) to solve this problem. CML integrates metric-based classification and a memory-based mechanism, along with adversarial learning, into a meta-learning framework, which leads to several desirable properties: 1) it can quickly and effectively learn to handle a new task; 2) it overcomes catastrophic forgetting; and 3) it is model-agnostic. We conduct extensive experiments on two image datasets, MiniImageNet and CIFAR100. Experimental results show that CML delivers state-of-the-art classification accuracy on few-shot learning tasks without catastrophic forgetting.
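The metric-based classification component mentioned above can be sketched in a few lines, in the style of prototypical networks: each class is represented by the mean of its few support embeddings, and queries are assigned to the nearest prototype. This illustrates only the metric-based piece of CML (not its memory or adversarial mechanisms), and the names here are my own.

```python
import numpy as np

def prototypes(support_x, support_y):
    """Compute one prototype per class: the mean embedding of that
    class's (few) support samples."""
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_x, classes, protos):
    """Assign each query embedding to the nearest prototype
    (squared Euclidean distance)."""
    d = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d.argmin(axis=1)]
```

Because classification reduces to a nearest-prototype lookup, a new class can be handled from a handful of samples with no gradient steps, which is what makes metric-based methods attractive for few-shot settings.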


Manage ML Automation Workflow with DagsHub, GitHub Action, and CML

#artificialintelligence

Originally published on Towards AI, the world's leading AI and technology news and media company.


MLOps for Conversational AI with Rasa, DVC, and CML (Part I)

#artificialintelligence

This is the first part of a series of blog posts describing how to use Data Version Control (DVC) and Continuous Machine Learning (CML) when developing conversational AI assistants with the Rasa framework. This post is mostly an introduction to these three components; in the next post I'll delve into the code and how to get everything connected for Rasa MLOps bliss. If you've not heard of Data Version Control (DVC), you've been missing out. DVC is an exciting tool from iterative.ai. DVC extends git's functionality to cover your data wherever you want to store it, whether that is locally, on a cloud platform like AWS S3, or on a Hadoop File System. Like git, DVC is language-agnostic.


Council Post: The Pandemic And Its Implications On Industrial Machine Learning

#artificialintelligence

Charlie Burgoyne is the founder and CEO of Valkyrie. For a moment, let's set aside the abject tragedy of the Covid-19 pandemic and the demoralizing conditions through which the world continues to persevere. Instead, let's examine the state of affairs from a dispassionate and scientific position. Seismic changes in behavior are erupting as the burden of the pandemic forces transformation. Crippling inefficiencies in industry and volatile projections of markets have led to unprecedented uncertainty.


Curriculum-Meta Learning for Order-Robust Continual Relation Extraction

Wu, Tongtong, Li, Xuekai, Li, Yuan-Fang, Haffari, Reza, Qi, Guilin, Zhu, Yujin, Xu, Guoqiang

arXiv.org Artificial Intelligence

Continual relation extraction is an important task that focuses on extracting new facts incrementally from unstructured text. Given the sequential arrival order of the relations, this task is prone to two serious challenges, namely catastrophic forgetting and order-sensitivity. We propose a novel curriculum-meta learning method to tackle these two challenges in continual relation extraction. We combine meta learning and curriculum learning to quickly adapt model parameters to a new task and to reduce interference from previously seen tasks on the current task. We design a novel relation representation learning method based on the distribution of the domain and range types of relations. Such representations are used to quantify the difficulty of tasks for the construction of curricula. Moreover, we present novel difficulty-based metrics to quantitatively measure the extent of a given model's order-sensitivity, suggesting new ways to evaluate model robustness. Our comprehensive experiments on three benchmark datasets show that our proposed method outperforms state-of-the-art techniques. The code is available at: https://github.com/wutong8023/AAAI_CML.


Using Continuous Machine Learning to Run Your ML Pipeline

#artificialintelligence

CI/CD is a key concept that is becoming increasingly popular and widely adopted in the software industry. Incorporating continuous integration and deployment for a software project that doesn't contain a machine learning component is fairly straightforward, because the stages of the pipeline are somewhat standard and the CI/CD pipeline is unlikely to change much over the course of development. But when the project involves a machine learning component, this may not be true. As opposed to traditional software development, building a pipeline for a machine learning component may involve many changes over time, mostly in response to observations made during past iterations of development. Therefore, for ML projects, notebooks are widely used to get started, and once a stable foundation (base code for the different stages of the ML pipeline) is available to build upon, the code is pushed to a version control system and the pipeline is migrated to a CI/CD tool such as Jenkins or TravisCI.