source concept
MARLINE: Multi-Source Mapping Transfer Learning for Non-Stationary Environments
Du, Honghui, Minku, Leandro, Zhou, Huiyu
Concept drift is a major problem in online learning due to its impact on the predictive performance of data stream mining systems. Recent studies have started exploring data streams from different sources as a strategy to tackle concept drift in a given target domain. These approaches assume that at least one of the source models represents a concept similar to the target concept, an assumption that may not hold in many real-world scenarios. In this paper, we propose a novel approach called Multi-source mApping with tRansfer LearnIng for Non-stationary Environments (MARLINE). MARLINE can benefit from knowledge from multiple data sources in non-stationary environments even when source and target concepts do not match. This is achieved by projecting the target concept to the space of each source concept, enabling multiple source sub-classifiers to contribute towards the prediction of the target concept as part of an ensemble. Experiments on several synthetic and real-world datasets show that MARLINE was more accurate than several state-of-the-art data stream learning approaches.
- Europe > United Kingdom > England > Leicestershire > Leicester (1.00)
- North America > United States > District of Columbia > Washington (0.05)
- Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
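The projection-and-ensemble idea in the MARLINE abstract above can be sketched as follows. Everything here is an illustrative assumption, not the paper's actual algorithm: the `Source` container, the fixed weights, and the toy one-dimensional mappings stand in for learned projections and online weight updates.

```python
# Hypothetical sketch: ensemble prediction over multiple source concepts.
# Each source holds a sub-classifier plus a mapping that projects a target
# instance into that source's concept space; predictions are combined by
# a weighted vote. Names and weights are illustrative.

class Source:
    def __init__(self, mapping, classifier, weight=1.0):
        self.mapping = mapping        # projects a target instance x -> source space
        self.classifier = classifier  # sub-classifier trained on the source concept
        self.weight = weight          # in MARLINE this would be tracked online

def predict_target(sources, x):
    """Weighted vote of source sub-classifiers on a projected target instance."""
    votes = {}
    for s in sources:
        label = s.classifier(s.mapping(x))
        votes[label] = votes.get(label, 0.0) + s.weight
    return max(votes, key=votes.get)

# Toy usage: two sources with opposite decision rules; the higher-weighted wins.
src_a = Source(mapping=lambda x: x, classifier=lambda z: int(z > 0), weight=0.9)
src_b = Source(mapping=lambda x: -x, classifier=lambda z: int(z > 0), weight=0.3)
print(predict_target([src_a, src_b], 2.0))  # src_a votes 1 (0.9) vs src_b votes 0 (0.3) -> 1
```

The key point the sketch captures is that no single source needs to match the target concept: each contributes through its own mapping, and the vote weights arbitrate.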
Interpretable Hierarchical Concept Reasoning through Attention-Guided Graph Learning
Debot, David, Barbiero, Pietro, Dominici, Gabriele, Marra, Giuseppe
Concept-Based Models (CBMs) are a class of deep learning models that provide interpretability by explaining predictions through high-level concepts. These models first predict concepts and then use them to perform a downstream task. However, current CBMs offer interpretability only for the final task prediction, while the concept predictions themselves are typically made via black-box neural networks. To address this limitation, we propose Hierarchical Concept Memory Reasoner (H-CMR), a new CBM that provides interpretability for both concept and task predictions. H-CMR models relationships between concepts using a learned directed acyclic graph, where edges represent logic rules that define concepts in terms of other concepts. During inference, H-CMR employs a neural attention mechanism to select a subset of these rules, which are then applied hierarchically to predict all concepts and the final task. Experimental results demonstrate that H-CMR matches state-of-the-art performance while enabling strong human interaction through concept and model interventions. The former can significantly improve accuracy at inference time, while the latter can enhance data efficiency during training when background knowledge is available.
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
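The H-CMR inference loop described above (attention selects one rule per concept, rules are applied hierarchically over a DAG) can be sketched as follows. The rule set, attention scores, and toy concepts are invented for illustration; the paper learns both the graph and the rules.

```python
# Hypothetical sketch of hierarchical rule-based concept inference on a DAG.
# Each non-root concept has candidate logic rules over other concepts; an
# "attention" score picks one rule, and concepts are evaluated in topological
# order so every rule only reads concepts predicted before it.

def infer_concepts(order, rules, scores, inputs):
    """order: concepts in topological order of the learned DAG.
    rules[c]: list of functions mapping current concept values -> bool.
    scores[c]: attention score per rule (highest wins)."""
    values = dict(inputs)  # root concepts predicted directly from the input
    for c in order:
        if c in values:
            continue
        best = max(range(len(rules[c])), key=lambda i: scores[c][i])
        values[c] = rules[c][best](values)
    return values

# Toy DAG: "striped" and "four_legged" are roots; "zebra" is rule-defined.
order = ["striped", "four_legged", "zebra"]
rules = {"zebra": [lambda v: v["striped"] and v["four_legged"],
                   lambda v: v["striped"]]}
scores = {"zebra": [0.9, 0.1]}  # attention favours the conjunctive rule
print(infer_concepts(order, rules, scores,
                     {"striped": True, "four_legged": True}))
```

A model intervention in this picture amounts to editing `rules`, and a concept intervention to overwriting an entry of `values` before downstream rules fire.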
Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers
Dumas, Clément, Wendler, Chris, Veselovsky, Veniamin, Monea, Giovanni, West, Robert
A central question in multilingual language modeling is whether large language models (LLMs) develop a universal concept representation, disentangled from specific languages. In this paper, we address this question by analyzing latent representations (latents) during a word translation task in transformer-based LLMs. We strategically extract latents from a source translation prompt and insert them into the forward pass on a target translation prompt. By doing so, we find that the output language is encoded in the latent at an earlier layer than the concept to be translated. Building on this insight, we conduct two key experiments. First, we demonstrate that we can change the concept without changing the language and vice versa through activation patching alone. Second, we show that patching with the mean over latents across different languages does not impair and instead improves the models' performance in translating the concept. Our results provide evidence for the existence of language-agnostic concept representations within the investigated models.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Asia > Middle East > Jordan (0.04)
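The patching procedure described above (extract a latent from the source-prompt forward pass, insert it into the target-prompt forward pass) can be sketched on a toy "layered model". The scalar layers below stand in for transformer blocks; nothing here is the paper's actual code.

```python
# Hypothetical sketch of activation patching: run the source input, cache the
# hidden state after a chosen layer, then replay the target input with that
# state swapped in at the same layer.

def forward(layers, x, patch=None):
    """patch: optional (layer_index, cached_state) to insert mid-forward."""
    h = x
    for i, layer in enumerate(layers):
        h = layer(h)
        if patch and patch[0] == i:
            h = patch[1]  # overwrite the latent, as in activation patching
    return h

layers = [lambda h: h + 1, lambda h: h * 2, lambda h: h - 3]

# Cache the latent after layer 1 on the "source" input.
h = 10
for layer in layers[:2]:
    h = layer(h)
cached = h  # (10 + 1) * 2 = 22

# Patch it into the "target" run: layer 2 then sees 22 instead of (0 + 1) * 2.
print(forward(layers, 0, patch=(1, cached)))  # 22 - 3 = 19
```

In a real LLM the same swap is done with forward hooks on a specific layer's residual stream; the layer index at which language versus concept information dominates is exactly what the paper probes.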
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
Zhu, Chenyang, Li, Kai, Ma, Yue, Tang, Longxiang, Fang, Chengyu, Chen, Chubin, Chen, Qifeng, Li, Xiu
Recent advances in Customized Concept Swapping (CCS) enable a text-to-image model to swap a concept in the source image with a customized target concept. However, the existing methods still face the challenges of inconsistency and inefficiency. They struggle to maintain consistency in both the foreground and background during concept swapping, especially when the shape difference is large between objects. Additionally, they either require time-consuming training processes or involve redundant calculations during inference. To tackle these issues, we introduce InstantSwap, a new CCS method that aims to handle sharp shape disparities efficiently. Specifically, we first extract the bounding box (bbox) of the object in the source image automatically based on attention map analysis and leverage the bbox to achieve both foreground and background consistency. For background consistency, we remove the gradient outside the bbox during the swapping process so that the background is left unmodified. For foreground consistency, we employ a cross-attention mechanism to inject semantic information into both source and target concepts inside the box. This helps learn semantically enhanced representations that encourage the swapping process to focus on the foreground objects. To improve swapping speed, we avoid computing gradients at every timestep and instead calculate them periodically to reduce the number of forward passes, which substantially improves efficiency at a small cost in performance. Finally, we establish a benchmark dataset to facilitate comprehensive evaluation. Extensive evaluations demonstrate the superiority and versatility of InstantSwap. Project Page: https://instantswap.github.io/
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
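The periodic-gradient trick mentioned in the InstantSwap abstract above can be sketched in a few lines. The objective, step rule, and refresh interval below are toy stand-ins, not InstantSwap's actual diffusion-time update.

```python
# Hypothetical sketch: instead of recomputing an expensive gradient at every
# timestep, refresh it every k steps and reuse the cached value in between,
# trading a little accuracy for far fewer forward passes.

def optimize(x, grad_fn, steps, k):
    """Refresh the gradient every k steps; reuse the stale one otherwise."""
    cached_grad, evals = None, 0
    for t in range(steps):
        if t % k == 0:          # periodic refresh
            cached_grad = grad_fn(x)
            evals += 1
        x = x - 0.1 * cached_grad
    return x, evals

# Toy objective f(x) = x^2, so grad f(x) = 2x.
x, evals = optimize(5.0, lambda x: 2 * x, steps=12, k=4)
print(evals)  # only 3 gradient evaluations instead of 12
```

The interval `k` controls the efficiency/quality trade-off the abstract alludes to: larger `k` means fewer forward passes but staler gradients.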
Editing Massive Concepts in Text-to-Image Diffusion Models
Xiong, Tianwei, Wu, Yue, Xie, Enze, Wu, Yue, Li, Zhenguo, Liu, Xihui
While previous methods have mitigated the issues on a small scale, it is essential to handle them simultaneously in larger-scale real-world scenarios. We propose a two-stage method, Editing Massive Concepts In Diffusion Models (EMCID). The first stage performs memory optimization for each individual concept with dual self-distillation from text alignment loss and diffusion noise prediction loss. The second stage conducts massive concept editing with multi-layer, closed-form model editing. We further propose a comprehensive benchmark, named ImageNet Concept Editing Benchmark (ICEB), for evaluating massive concept editing for T2I models with two subtasks, free-form prompts, massive concept categories, and extensive evaluation metrics. Extensive experiments conducted on our proposed benchmark and previous benchmarks demonstrate the superior scalability of EMCID for editing up to 1,000 concepts, providing a practical approach for fast adjustment and re-deployment of T2I diffusion models in real-world applications.
- North America > United States (0.97)
- North America > Mexico (0.14)
- Europe > United Kingdom (0.14)
- (3 more...)
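The "closed-form model editing" in the EMCID abstract above can be illustrated with the generic rank-one least-change update to a linear layer: pick the minimal change to a weight matrix W so that a chosen key vector now maps to a new value vector. This is a standard construction in the spirit of closed-form editing, not EMCID's actual multi-layer procedure; the vectors below are toy data.

```python
# Hypothetical sketch of a closed-form (rank-one) weight edit:
#   W' = W + (v_star - W k) k^T / (k^T k),  which guarantees  W' k = v_star.

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rank_one_edit(W, k, v_star):
    """Minimal-norm update making the key k map to the new value v_star."""
    residual = [vs - wv for vs, wv in zip(v_star, matvec(W, k))]
    norm = sum(ki * ki for ki in k)
    return [[w + r * ki / norm for w, ki in zip(row, k)]
            for row, r in zip(W, residual)]

W = [[1.0, 0.0], [0.0, 1.0]]  # identity: the concept key maps to itself
k = [1.0, 0.0]                 # key vector for the concept being edited
v_star = [0.0, 2.0]            # desired new value after the edit
W_edited = rank_one_edit(W, k, v_star)
print(matvec(W_edited, k))  # [0.0, 2.0]
```

Because the update is computed in closed form rather than by gradient descent, many concept edits can be batched and applied quickly, which is what makes editing up to 1,000 concepts tractable.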
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
Bhavya, Bhavya, Xiong, Jinjun, Zhai, Chengxiang
We propose a novel application of prompting Pre-trained Language Models (PLMs) to generate analogies and study how to design effective prompts for two task settings: generating a source concept analogous to a given target concept (aka Analogous Concept Generation or ACG), and generating an explanation of the similarity between a given pair of target concept and source concept (aka Analogous Explanation Generation or AEG). We found that it is feasible to prompt InstructGPT to generate meaningful analogies, and that the best prompts tend to be precise imperative statements, especially with a low temperature setting. We also systematically analyzed the sensitivity of the InstructGPT model to prompt design, temperature, and injected spelling errors, and found that the model is particularly sensitive to certain variations (e.g., questions vs. imperative statements). Further, we conducted human evaluation on 1.4k of the generated analogies and found that the quality of generations varies substantially by model size. The largest InstructGPT model can achieve human-level performance at generating meaningful analogies for a given target, while there is still room for improvement on the AEG task.
- Education (1.00)
- Health & Medicine (0.68)
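The prompt-design contrast the abstract highlights (precise imperative statements versus questions) can be made concrete with template builders for the two task settings. The wording of these templates is illustrative; the paper's exact prompts may differ.

```python
# Hypothetical prompt templates for the two task settings studied:
# ACG (generate an analogous concept) and AEG (explain the analogy).

def acg_prompt(target, style="imperative"):
    """Build an ACG prompt in the imperative or question style."""
    if style == "imperative":
        return f"Generate an analogy for the following concept: {target}."
    return f"What is {target} analogous to?"

def aeg_prompt(target, source):
    """Build an AEG prompt asking for the similarity between the pair."""
    return f"Explain the similarity between {target} and {source}."

print(acg_prompt("electrical current"))
print(acg_prompt("electrical current", style="question"))
print(aeg_prompt("electrical current", "water flow"))
```

The study's finding is that seemingly minor switches between such variants (plus the sampling temperature) can change output quality substantially, which is why prompt design is treated as an experimental variable in its own right.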
How Smart is BERT? Evaluating the Language Model's Commonsense Knowledge
In the new paper Does BERT Solve Commonsense Task via Commonsense Knowledge?, a team of researchers from Westlake University, Fudan University and Microsoft Research Asia dive deep into the large language model to discover how it encodes the structured commonsense knowledge it leverages on downstream commonsense tasks. The proven successes of pretrained language models such as BERT on various downstream tasks have stimulated research investigating the linguistic knowledge inside the model. Previous studies have revealed shallow syntactic, semantic and word sense knowledge in BERT; however, the question of how BERT deals with commonsense tasks has been relatively unexamined. CommonsenseQA is a multiple-choice question answering dataset built upon the CONCEPTNET knowledge graph. The researchers extracted multiple target concepts with the same semantic relation to a single source concept from CONCEPTNET, where each question has one of three target concepts as the correct answer. For example, "bird" is the source concept in the question "Where does a wild bird usually live?" and "countryside" is the correct answer from the possible target concepts "cage," "windowsill," and "countryside."
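The CommonsenseQA construction described above can be shown directly with the article's own "wild bird" example: several target concepts share one relation with a single source concept, and one of them is the gold answer. The triple list and helper below are illustrative data structures, not ConceptNet's actual API.

```python
# Illustrative sketch of how a CommonsenseQA-style item relates to CONCEPTNET:
# target concepts sharing a relation (here AtLocation) with one source concept
# become the answer options, and one of them is the correct answer.

conceptnet = [
    ("bird", "AtLocation", "countryside"),
    ("bird", "AtLocation", "cage"),
    ("bird", "AtLocation", "windowsill"),
]

def candidates(source, relation, triples):
    """Target concepts sharing `relation` with `source` become answer options."""
    return [t for s, r, t in triples if s == source and r == relation]

question = "Where does a wild bird usually live?"
options = candidates("bird", "AtLocation", conceptnet)
answer = "countryside"
print(options)  # ['countryside', 'cage', 'windowsill']
```

Because all options are ConceptNet-valid neighbours of the source concept, surface co-occurrence alone cannot separate them, which is what makes the dataset a probe of commonsense rather than lexical knowledge.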
Teaching Machines to Learn by Metaphors
Levy, Omer (Technion - Israel Institute of Technology) | Markovitch, Shaul (Technion - Israel Institute of Technology)
Humans have an uncanny ability to learn new concepts with very few examples. Cognitive theories have suggested that this is done by utilizing prior experience of related tasks. We propose to emulate this process in machines, by transforming new problems into old ones. These transformations are called metaphors. Obviously, the learner is not given a metaphor, but must acquire one through a learning process. We show that learning metaphors yields better results than existing transfer learning methods. Moreover, we argue that metaphors give a qualitative assessment of task relatedness.
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
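The "transform new problems into old ones" idea from the metaphors abstract above can be sketched as a search over candidate input transformations, keeping the one under which an existing source classifier best fits labelled examples of the new task. The transformation family and toy tasks are invented for illustration; the paper's learning procedure is more general.

```python
# Hypothetical sketch of metaphor learning: pick, from a small family of
# transformations, the one that lets an old (source) classifier solve the
# new (target) task best on a few labelled examples.

def learn_metaphor(source_clf, transforms, examples):
    """Return the transform maximizing source-classifier accuracy on the new task."""
    def acc(t):
        return sum(source_clf(t(x)) == y for x, y in examples) / len(examples)
    return max(transforms, key=acc)

# Old concept: "positive number". New concept: "number above 10".
source_clf = lambda z: z > 0
transforms = [lambda x: x, lambda x: x - 10, lambda x: -x]
examples = [(12, True), (15, True), (3, False), (9, False)]

metaphor = learn_metaphor(source_clf, transforms, examples)
print(metaphor(12))  # the chosen metaphor shifts by 10, so this gives 2
```

The quality of the best-found metaphor also doubles as the qualitative relatedness signal the abstract mentions: tasks with no good transformation between them are, in this sense, unrelated.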
Finding Semantic Inconsistencies in UMLS using Answer Set Programming
Erdogan, Halit (Sabanci University) | Bodenreider, Olivier (National Library of Medicine) | Erdem, Esra (Sabanci University)
The UMLS Metathesaurus was assembled by integrating some 150 source vocabularies; it contains more than 2 million concepts (i.e., clusters of synonymous terms coming from multiple source vocabularies, identified by a Concept Unique Identifier). The UMLS Metathesaurus also contains more than 36 million relations between these concepts. We introduce an inconsistency definition for Metathesaurus concepts based on their hierarchical relations and compute all such inconsistent concepts. After that, we manually review some of the inconsistent concepts to determine the ones that have erroneous relations, such as wrong synonymy.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- North America > United States (0.05)
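The kind of inconsistency the UMLS abstract above targets can be sketched as a simple check: two terms grouped as synonyms under one Concept Unique Identifier while a source vocabulary places one strictly above the other in its hierarchy. The tiny data and the rule itself are illustrative; the paper encodes such rules in Answer Set Programming over the full Metathesaurus.

```python
# Hypothetical sketch of a hierarchical-synonymy consistency check:
# flag any CUI whose "synonyms" are actually ancestor/descendant of each
# other in a source vocabulary's hierarchy.

def ancestors(term, parent_of):
    """Walk the source hierarchy upward from a term."""
    seen = set()
    while term in parent_of:
        term = parent_of[term]
        seen.add(term)
    return seen

def inconsistent_synonyms(synonym_sets, parent_of):
    """Return CUIs whose synonym terms are hierarchically related in a source."""
    flagged = []
    for cui, terms in synonym_sets.items():
        if any(b in ancestors(a, parent_of) for a in terms for b in terms):
            flagged.append(cui)
    return flagged

parent_of = {"viral pneumonia": "pneumonia"}  # toy source hierarchy
synonym_sets = {"C1": {"pneumonia", "viral pneumonia"},   # suspicious synonymy
                "C2": {"headache", "cephalalgia"}}        # unproblematic
print(inconsistent_synonyms(synonym_sets, parent_of))  # ['C1']
```

The declarative ASP formulation buys the same thing this loop does, but scales the "compute all inconsistent concepts" step to millions of concepts and relations, with the flagged concepts then reviewed manually.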