composability
Towards Compositional Model Editing
Model editing has become a de-facto practice to address hallucinations and outdated knowledge of large language models (LLMs). However, existing methods are predominantly evaluated in isolation, i.e., one edit at a time, failing to consider a critical scenario of compositional model editing, where multiple edits must be integrated and jointly utilized to answer real-world multifaceted questions. For instance, in medical domains, if one edit informs LLMs that COVID-19 causes "fever" and another that it causes "loss of taste", a qualified compositional editor should enable LLMs to answer the question "What are the symptoms of COVID-19?" with both "fever" and "loss of taste" (and potentially more). In this work, we define and systematically benchmark this compositional model editing (CME) task, identifying three key undesirable issues that existing methods struggle with: knowledge loss, incorrect preceding and knowledge sinking. To overcome these issues, we propose A3E, a novel compositional editor that (1) adaptively combines and adaptively regularizes pre-trained foundation knowledge in LLMs in the stage of edit training and (2) adaptively merges multiple edits to better meet compositional needs in the stage of edit composing. Extensive experiments demonstrate that A3E improves the composability by at least 22.45% without sacrificing the performance of non-compositional model editing.
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Text-to-Image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions. However, despite their success, text descriptions often struggle to adequately convey detailed controls, even when composed of long and complex texts. Moreover, recent studies have also shown that these models face challenges in understanding such complex texts and generating the corresponding images. Therefore, there is a growing need to enable more control modes beyond text description. In this paper, we introduce Uni-ControlNet, a unified framework that allows for the simultaneous utilization of different local controls (e.g., edge maps, depth map, segmentation masks) and global controls (e.g., CLIP image embeddings) in a flexible and composable manner within one single model. Unlike existing methods, Uni-ControlNet only requires the fine-tuning of two additional adapters upon frozen pre-trained text-to-image diffusion models, eliminating the huge cost of training from scratch. Moreover, thanks to some dedicated adapter designs, Uni-ControlNet only necessitates a constant number (i.e., 2) of adapters, regardless of the number of local or global controls used. This not only reduces the fine-tuning costs and model size, making it more suitable for real-world deployment, but also facilitate composability of different conditions. Through both quantitative and qualitative comparisons, Uni-ControlNet demonstrates its superiority over existing methods in terms of controllability, generation quality and composability.
ContraGAN (R1, R2, R3, R4), the novelty of the proposed 2C loss (R1, R2, R4), composability with modern
We thank the reviewers for the constructive comments. Every experiment and explanation in this rebuttal will be included in the paper. We will introduce the concept of data-to-data relations carefully. Our 2C loss can take advantage of the strengths of both losses. Compared with Eq. 7 loss, 2C loss considers cosine We conduct experiments to compare 2C loss with other losses.
LEGO-Compiler: Enhancing Neural Compilation Through Translation Composability
Zhang, Shuoming, Zhao, Jiacheng, Xia, Chunwei, Wang, Zheng, Chen, Yunji, Feng, Xiaobing, Cui, Huimin
Large language models (LLMs) have the potential to revolutionize how we design and implement compilers and code translation tools. However, existing LLMs struggle to handle long and complex programs. We introduce LEGO-Compiler, a novel neural compilation system that leverages LLMs to translate high-level languages into assembly code. Our approach centers on three key innovations: LEGO translation, which decomposes the input program into manageable blocks; breaking down the complex compilation process into smaller, simpler verifiable steps by organizing it as a verifiable LLM workflow by external tests; and a feedback mechanism for self-correction. Supported by formal proofs of translation composability, LEGO-Compiler demonstrates high accuracy on multiple datasets, including over 99% on ExeBench and 97.9% on industrial-grade AnsiBench. Additionally, LEGO-Compiler has also acheived near one order-of-magnitude improvement on compilable code size scalability. This work opens new avenues for applying LLMs to system-level tasks, complementing traditional compiler technologies.
Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?
Yang, Sohee, Kassner, Nora, Gribovskaya, Elena, Riedel, Sebastian, Geva, Mor
We evaluate how well Large Language Models (LLMs) latently recall and compose facts to answer multi-hop queries like "In the year Scarlett Johansson was born, the Summer Olympics were hosted in the country of". One major challenge in evaluating this ability is that LLMs may have developed shortcuts by encounters of the head entity "Scarlett Johansson" and the answer entity "United States" in the same training sequences or merely guess the answer based on frequency-based priors. To prevent shortcuts, we exclude test queries where the head and answer entities co-appear in pretraining corpora. Through careful selection of relations and facts and systematic removal of cases where models might guess answers or exploit partial matches, we construct an evaluation dataset SOCRATES (ShOrtCut-fRee lATent rEaSoning). We observe that LLMs demonstrate promising latent multi-hop reasoning abilities without exploiting shortcuts, but only for certain types of queries. For queries requiring latent recall of countries as the intermediate answer, the best models achieve 80% latent composability, but this drops to just 5% for the recall of years. Comparisons with Chain-of-Thought composability highlight a significant gap between the ability of models to reason latently versus explicitly. Analysis reveals that latent representations of the intermediate answer are constructed more often in queries with higher latent composability, and shows the emergence of latent multi-hop reasoning during pretraining.
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Text-to-Image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions. However, despite their success, text descriptions often struggle to adequately convey detailed controls, even when composed of long and complex texts. Moreover, recent studies have also shown that these models face challenges in understanding such complex texts and generating the corresponding images. Therefore, there is a growing need to enable more control modes beyond text description. In this paper, we introduce Uni-ControlNet, a unified framework that allows for the simultaneous utilization of different local controls (e.g., edge maps, depth map, segmentation masks) and global controls (e.g., CLIP image embeddings) in a flexible and composable manner within one single model.
A theory of understanding for artificial intelligence: composability, catalysts, and learning
Zhang, Zijian, Aronowitz, Sara, Aspuru-Guzik, Alรกn
Understanding is a crucial yet elusive concept in artificial intelligence (AI). This work proposes a framework for analyzing understanding based on the notion of composability. Given any subject (e.g., a person or an AI), we suggest characterizing its understanding of an object in terms of its ability to process (compose) relevant inputs into satisfactory outputs from the perspective of a verifier. This highly universal framework can readily apply to non-human subjects, such as AIs, non-human animals, and institutions. Further, we propose methods for analyzing the inputs that enhance output quality in compositions, which we call catalysts. We show how the structure of a subject can be revealed by analyzing its components that act as catalysts and argue that a subject's learning ability can be regarded as its ability to compose inputs into its inner catalysts. Finally we examine the importance of learning ability for AIs to attain general intelligence. Our analysis indicates that models capable of generating outputs that can function as their own catalysts, such as language models, establish a foundation for potentially overcoming existing limitations in AI understanding.
Composable Interventions for Language Models
Kolbeinsson, Arinbjorn, O'Brien, Kyle, Huang, Tianjin, Gao, Shanghua, Liu, Shiwei, Schwarz, Jonathan Richard, Vaidya, Anurag, Mahmood, Faisal, Zitnik, Marinka, Chen, Tianlong, Hartvigsen, Thomas
Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining. But despite a flood of new methods, different types of interventions are largely developing independently. In practice, multiple interventions must be applied sequentially to the same model, yet we lack standardized ways to study how interventions interact. We fill this gap by introducing composable interventions, a framework to study the effects of using multiple interventions on the same language models, featuring new metrics and a unified codebase. Using our framework, we conduct extensive experiments and compose popular methods from three emerging intervention categories -- Knowledge Editing, Model Compression, and Machine Unlearning. Our results from 310 different compositions uncover meaningful interactions: compression hinders editing and unlearning, composing interventions hinges on their order of application, and popular general-purpose metrics are inadequate for assessing composability. Taken together, our findings showcase clear gaps in composability, suggesting a need for new multi-objective interventions. All of our code is public: https://github.com/hartvigsen-group/composable-interventions.
Why composability is key to scaling digital twins
Were you unable to attend Transform 2022? Check out all of the summit sessions in our on-demand library now! Digital twins enable enterprises to model and simulate buildings, products, manufacturing lines, facilities and processes. This can improve performance, quickly flag quality errors and support better decision-making. Today, most digital twin projects are one-off efforts.
Composability set to be next big thing for digital experience platforms
The pressure on organisations to offer excellent customer experience to each unique individual is growing. Today's consumers have much lower levels of patience for businesses that fail to meet their exacting expectations, whether around service, availability, delivery or choice. Rises in inflation rates and cost of living are driving a decrease in spending and consumers being more selective about what they buy, and where from. According to the latest EY Future Consumer Index, 42% of people will only buy from brands that align with their values, and 36% will only visit stores that offer great experiences. Against this backdrop, a digital experience platform (DXP) could prove a vital investment for organisations.