Inductive Learning
Label-Guided Prompt for Multi-label Few-shot Aspect Category Detection
Guan, ChaoFeng, Zhu, YaoHui, Bai, Yu, Wang, LingYun
Multi-label few-shot aspect category detection aims at identifying multiple aspect categories from sentences with a limited number of training instances. The representation of sentences and categories is a key issue in this task. Most of current methods extract keywords for the sentence representations and the category representations. Sentences often contain many category-independent words, which leads to suboptimal performance of keyword-based methods. Instead of directly extracting keywords, we propose a label-guided prompt method to represent sentences and categories. To be specific, we design label-specific prompts to represent sentences by combining crucial contextual and semantic information. Further, the label is introduced into a prompt to obtain category descriptions by utilizing a large language model. This kind of category descriptions contain the characteristics of the aspect categories, guiding the construction of discriminative category prototypes. Experimental results on two public datasets show that our method outperforms current state-of-the-art methods with a 3.86% - 4.75% improvement in the Macro-F1 score.
Dense Self-Supervised Learning for Medical Image Segmentation
Seince, Maxime, Folgoc, Loic Le, de Souza, Luiz Augusto Facury, Angelini, Elsa
Deep learning has revolutionized medical image segmentation, but it relies heavily on high-quality annotations. The time, cost and expertise required to label images at the pixel-level for each new task has slowed down widespread adoption of the paradigm. We propose Pix2Rep, a self-supervised learning (SSL) approach for few-shot segmentation, that reduces the manual annotation burden by learning powerful pixel-level representations directly from unlabeled images. Pix2Rep is a novel pixel-level loss and pre-training paradigm for contrastive SSL on whole images. It is applied to generic encoder-decoder deep learning backbones (e.g., U-Net). Whereas most SSL methods enforce invariance of the learned image-level representations under intensity and spatial image augmentations, Pix2Rep enforces equivariance of the pixel-level representations. We demonstrate the framework on a task of cardiac MRI segmentation. Results show improved performance compared to existing semi- and self-supervised approaches; and a 5-fold reduction in the annotation burden for equivalent performance versus a fully supervised U-Net baseline. This includes a 30% (resp. 31%) DICE improvement for one-shot segmentation under linear-probing (resp. fine-tuning). Finally, we also integrate the novel Pix2Rep concept with the Barlow Twins non-contrastive SSL, which leads to even better segmentation performance.
Learning to Select the Best Forecasting Tasks for Clinical Outcome Prediction
Xue, Yuan, Du, Nan, Mottram, Anne, Seneviratne, Martin, Dai, Andrew M.
The paradigm of'pretraining' from a set of relevant auxiliary tasks and then'finetuning' on a target task has been successfully applied in many different domains. However, when the auxiliary tasks are abundant, with complex relationships to the target task, using domain knowledge or searching over all possible pretraining setups is inefficient and suboptimal. To address this challenge, we propose a method to automatically select from a large set of auxiliary tasks, which yields a representation most useful to the target task. In particular, we develop an efficient algorithm that uses automatic auxiliary task selection within a nested-loop metalearning process. We have applied this algorithm to the task of clinical outcome predictions in electronic medical records, learning from a large number of selfsupervised tasks related to forecasting patient trajectories. Experiments on a real clinical dataset demonstrate the superior predictive performance of our method compared to direct supervised learning, naive pretraining and simple multitask learning, in particular in low-data scenarios when the primary task has very few examples. With detailed ablation analysis, we further show that the selection rules are interpretable and able to generalize to unseen target tasks with new data.
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning
Xiao, Jia-Hao, Xie, Ming-Kun, Fan, Heng-Bo, Niu, Gang, Sugiyama, Masashi, Huang, Sheng-Jun
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations. Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance. To solve this problem, the mainstream method developed an effective thresholding strategy to generate accurate pseudo-labels. Unfortunately, the method neglected the quality of model predictions and its potential impact on pseudo-labeling performance. In this paper, we propose a dual-perspective method to generate high-quality pseudo-labels. To improve the quality of model predictions, we perform dual-decoupling to boost the learning of correlative and discriminative features, while refining the generation and utilization of pseudo-labels. To obtain proper class-wise thresholds, we propose the metric-adaptive thresholding strategy to estimate the thresholds, which maximize the pseudo-label performance for a given metric on labeled data. Experiments on multiple benchmark datasets show the proposed method can achieve the state-of-the-art performance and outperform the comparative methods with a significant margin.
Quantum Supervised Learning
Recent advancements in quantum computing have positioned it as a prospective solution for tackling intricate computational challenges, with supervised learning emerging as a promising domain for its application. Despite this potential, the field of quantum machine learning is still in its early stages, and there persists a level of skepticism regarding a possible near-term quantum advantage. This paper aims to provide a classical perspective on current quantum algorithms for supervised learning, effectively bridging traditional machine learning principles with advancements in quantum machine learning. Specifically, this study charts a research trajectory that diverges from the predominant focus of quantum machine learning literature, originating from the prerequisites of classical methodologies and elucidating the potential impact of quantum approaches. Through this exploration, our objective is to deepen the understanding of the convergence between classical and quantum methods, thereby laying the groundwork for future advancements in both domains and fostering the involvement of classical practitioners in the field of quantum machine learning.
Systematic Reasoning About Relational Domains With Graph Neural Networks
Khalid, Irtaza, Schockaert, Steven
Developing models that can learn to reason is a notoriously challenging problem. We focus on reasoning in relational domains, where the use of Graph Neural Networks (GNNs) seems like a natural choice. However, previous work on reasoning with GNNs has shown that such models tend to fail when presented with test examples that require longer inference chains than those seen during training. This suggests that GNNs lack the ability to generalize from training examples in a systematic way, which would fundamentally limit their reasoning abilities. A common solution is to instead rely on neuro-symbolic methods, which are capable of reasoning in a systematic way by design. Unfortunately, the scalability of such methods is often limited and they tend to rely on overly strong assumptions, e.g.\ that queries can be answered by inspecting a single relational path. In this paper, we revisit the idea of reasoning with GNNs, showing that systematic generalization is possible as long as the right inductive bias is provided. In particular, we argue that node embeddings should be treated as epistemic states and that GNN should be parameterised accordingly. We propose a simple GNN architecture which is based on this view and show that it is capable of achieving state-of-the-art results. We furthermore introduce a benchmark which requires models to aggregate evidence from multiple relational paths. We show that existing neuro-symbolic approaches fail on this benchmark, whereas our considered GNN model learns to reason accurately.
Physics-Informed Weakly Supervised Learning for Interatomic Potentials
Takamoto, Makoto, Zaverkin, Viktor, Niepert, Mathias
Machine learning plays an increasingly important role in computational chemistry and materials science, complementing computationally intensive ab initio and first-principles methods. Despite their utility, machine-learning models often lack generalization capability and robustness during atomistic simulations, yielding unphysical energy and force predictions that hinder their real-world applications. We address this challenge by introducing a physics-informed, weakly supervised approach for training machine-learned interatomic potentials (MLIPs). We introduce two novel loss functions, extrapolating the potential energy via a Taylor expansion and using the concept of conservative forces. Our approach improves the accuracy of MLIPs applied to training tasks with sparse training data sets and reduces the need for pre-training computationally demanding models with large data sets. Particularly, we perform extensive experiments demonstrating reduced energy and force errors -- often lower by a factor of two -- for various baseline models and benchmark data sets. Finally, we show that our approach facilitates MLIPs' training in a setting where the computation of forces is infeasible at the reference level, such as those employing complete-basis-set extrapolation.
Federated Learning for Face Recognition via Intra-subject Self-supervised Learning
Kim, Hansol, Choi, Hoyeol, Kwak, Youngjun
Federated Learning (FL) for face recognition aggregates locally optimized models from individual clients to construct a generalized face recognition model. However, previous studies present two major challenges: insufficient incorporation of self-supervised learning and the necessity for clients to accommodate multiple subjects. To tackle these limitations, we propose FedFS (Federated Learning for personalized Face recognition via intra-subject Self-supervised learning framework), a novel federated learning architecture tailored to train personalized face recognition models without imposing subjects. Our proposed FedFS comprises two crucial components that leverage aggregated features of the local and global models to cooperate with representations of an off-the-shelf model. These components are (1) adaptive soft label construction, utilizing dot product operations to reformat labels within intra-instances, and (2) intra-subject self-supervised learning, employing cosine similarity operations to strengthen robust intra-subject representations. Additionally, we introduce a regularization loss to prevent overfitting and ensure the stability of the optimized model. To assess the effectiveness of FedFS, we conduct comprehensive experiments on the DigiFace-1M and VGGFace datasets, demonstrating superior performance compared to previous methods.
Assessing In-context Learning and Fine-tuning for Topic Classification of German Web Data
Schelb, Julian, Ulloa, Roberto, Spitz, Andreas
Researchers in the political and social sciences often rely on classification models to analyze trends in information consumption by examining browsing histories of millions of webpages. Automated scalable methods are necessary due to the impracticality of manual labeling. In this paper, we model the detection of topic-related content as a binary classification task and compare the accuracy of fine-tuned pre-trained encoder models against in-context learning strategies. Using only a few hundred annotated data points per topic, we detect content related to three German policies in a database of scraped webpages. We compare multilingual and monolingual models, as well as zero and few-shot approaches, and investigate the impact of negative sampling strategies and the combination of URL & content-based features. Our results show that a small sample of annotated data is sufficient to train an effective classifier. Fine-tuning encoder-based models yields better results than in-context learning. Classifiers using both URL & content-based features perform best, while using URLs alone provides adequate results when content is unavailable.
Causal Understanding For Video Question Answering
Guda, Bhanu Prakash Reddy, Kulkarni, Tanmay, Sampath, Adithya, Sathyendra, Swarnashree Mysore
Video Question Answering is a challenging task, which requires the model to reason over multiple frames and understand the interaction between different objects to answer questions based on the context provided within the video, especially in datasets like NExT-QA (Xiao et al., 2021a) which emphasize on causal and temporal questions. Previous approaches leverage either sub-sampled information or causal intervention techniques along with complete video features to tackle the NExT-QA task. In this work we elicit the limitations of these approaches and propose solutions along four novel directions of improvements on theNExT-QA dataset. Our approaches attempts to compensate for the shortcomings in the previous works by systematically attacking each of these problems by smartly sampling frames, explicitly encoding actions and creating interventions that challenge the understanding of the model. Overall, for both single-frame (+6.3%) and complete-video (+1.1%) based approaches, we obtain the state-of-the-art results on NExT-QA dataset.