 Large Language Model


Fountain -- an intelligent contextual assistant combining knowledge representation and language models for manufacturing risk identification

arXiv.org Artificial Intelligence

Deviations from the approved design or processes during mass production can lead to unforeseen risks. However, these changes are sometimes necessary due to changes in the product design characteristics or an adaptation in the manufacturing process. A major challenge is to identify these risks early in the workflow so that failures leading to warranty claims can be avoided. We developed Fountain as a contextual assistant integrated in the deviation management workflow that helps in identifying the risks based on the description of the existing design and process criteria and the proposed deviation. In the manufacturing context, it is important that the assistant provides recommendations that are explainable and consistent. We achieve this through a combination of the following two components: 1) language models finetuned for domain-specific semantic similarity, and 2) knowledge representation in the form of a property graph derived from the bill of materials, Failure Modes and Effect Analysis (FMEA), and prior failures reported by customers. Here, we present the nuances of selecting and adapting pretrained language models for an engineering domain, continuous model updates based on user interaction with the contextual assistant, and creating the causal chain for explainable recommendations based on the knowledge representation. Additionally, we demonstrate that the model adaptation is feasible using moderate computational infrastructure already available to most engineering teams in manufacturing organizations, and that inference can be performed on standard CPU-only instances for integration with existing applications, making these methods easily deployable.


Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL), where learning a new task from just a few training examples is done without being explicitly pre-trained. However, despite the success of LLMs, there has been little understanding of how ICL learns the knowledge from the given prompts. In this paper, to make progress toward understanding the learning behaviour of ICL, we train the same LLMs with the same demonstration examples via ICL and supervised learning (SL), respectively, and investigate their performance under label perturbations (i.e., noisy labels and label imbalance) on a range of classification tasks. First, via extensive experiments, we find that gold labels have significant impacts on the downstream in-context performance, especially for large language models; however, imbalanced labels matter little to ICL across all model sizes.

However, despite the advantages of ICL, it is still unclear how ICL learns knowledge from the given prompts without updating its model parameters. Preliminary research [1, 11] compared ICL with simple machine learning models, such as logistic regression and shallow neural networks. In this paper, we take a further step and investigate the learning behaviour differences between ICL and supervised learning (SL). Specifically, we train three LLMs with the same training data via in-context learning and supervised learning separately and analyze their generated outputs. While SL is a well-established approach that uses labelled data to train models to make accurate predictions, ICL takes a different approach by leveraging the context of the text to learn from unlabeled data in order to improve the accuracy of the predictions. By comparing the performance of ICL and SL, we gain
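The two label perturbations named in this abstract, noisy labels and label imbalance, can be illustrated with a minimal Python sketch. The function names, the uniform flipping scheme, and the subsampling rule are assumptions made for illustration; the abstract does not describe how the authors construct their perturbed sets.

```python
import random


def add_label_noise(labels, label_set, noise_rate, seed=0):
    """Replace a fraction `noise_rate` of labels with a different label.

    Hypothetical helper: each selected label is swapped for another
    label drawn uniformly from `label_set`.
    """
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    noisy = []
    for i, y in enumerate(labels):
        if i in flip_idx:
            alternatives = [c for c in label_set if c != y]
            noisy.append(rng.choice(alternatives))
        else:
            noisy.append(y)
    return noisy


def make_imbalanced(examples, minority_label, n_minority):
    """Keep only the first `n_minority` examples of `minority_label`.

    Hypothetical helper: examples of every other label are kept as-is.
    """
    kept, seen = [], 0
    for text, y in examples:
        if y == minority_label:
            if seen >= n_minority:
                continue
            seen += 1
        kept.append((text, y))
    return kept
```

The same perturbed examples could then serve either as in-context demonstrations or as a supervised training set, which is the kind of comparison the abstract describes.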


Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks

arXiv.org Artificial Intelligence

The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning skills. Are these skills general and transferable, or specialized to specific tasks seen during pretraining? To disentangle these effects, we propose an evaluation framework based on "counterfactual" task variants that deviate from the default assumptions underlying standard tasks. Across a suite of 11 tasks, we observe nontrivial performance on the counterfactual variants, but nevertheless find that performance substantially and consistently degrades compared to the default conditions. This suggests that while current LMs may possess abstract task-solving skills to a degree, they often also rely on narrow, non-transferable procedures for task-solving. These results motivate a more careful interpretation of language model performance that teases apart these aspects of behavior.


The Current State of Summarization

arXiv.org Artificial Intelligence

Summarization is the process of extracting the most important information from a text and presenting it in a condensed form. With vast amounts of information produced at an unprecedented rate, organizations and individuals alike face unique challenges, heightening the demand for effective summarization systems. For researchers of many fields, it is challenging to keep up with the latest developments in their field including Artificial Intelligence itself as vicariously indicated by the number of journal publications per year which has almost tripled since 2015 (D.


mCPT at SemEval-2023 Task 3: Multilingual Label-Aware Contrastive Pre-Training of Transformers for Few- and Zero-shot Framing Detection

arXiv.org Artificial Intelligence

This paper presents the winning system for the zero-shot Spanish framing detection task, which also achieves competitive places in eight additional languages. The challenge of the framing detection task lies in identifying a set of 14 frames when only a few or zero samples are available, i.e., a multilingual multi-label few- or zero-shot setting. Our developed solution employs a pre-training procedure based on multilingual Transformers using a label-aware contrastive loss function. In addition to describing the system, we perform an embedding space analysis and ablation study to demonstrate how our pre-training procedure supports framing detection to advance computational framing analysis.


In-Context Retrieval-Augmented Language Models

arXiv.org Artificial Intelligence

Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, were shown to significantly improve language modeling performance. In addition, they can mitigate the problem of factually inaccurate text generation and provide a natural source attribution mechanism. Existing RALM approaches focus on modifying the LM architecture in order to facilitate the incorporation of external information, significantly complicating deployment. This paper considers a simple alternative, which we dub In-Context RALM: leaving the LM architecture unchanged and prepending grounding documents to the input, without any further training of the LM. We show that In-Context RALM that builds on off-the-shelf general purpose retrievers provides surprisingly large LM gains across model sizes and diverse corpora. We also demonstrate that the document retrieval and ranking mechanism can be specialized to the RALM setting to further boost performance. We conclude that In-Context RALM has considerable potential to increase the prevalence of LM grounding, particularly in settings where a pretrained LM must be used without modification or even via API access.
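The core idea, prepending retrieved grounding documents to an unchanged LM's input, can be sketched in a few lines of Python. This is a simplified illustration, not the paper's implementation: the word-overlap scorer stands in for an off-the-shelf retriever, and the names `overlap_score`, `retrieve`, and `build_prompt` are hypothetical.

```python
def overlap_score(query, doc):
    """Toy relevance score: number of lowercase words shared by query and doc.

    Assumption: a stand-in for a real retriever (for example, a sparse or
    dense retriever), which the abstract does not specify.
    """
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query, corpus, k=1):
    """Return the k documents from `corpus` with the highest overlap score."""
    ranked = sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, corpus, k=1):
    """Prepend the retrieved documents to the query.

    The resulting string is what would be passed to the LM, which itself
    is left unmodified and receives no further training.
    """
    docs = retrieve(query, corpus, k)
    return "\n\n".join(docs + [query])


corpus = [
    "The Eiffel Tower is in Paris.",
    "Bananas are yellow.",
]
prompt = build_prompt("Where is the Eiffel Tower", corpus, k=1)
```

Because only the input text changes, a prompt built this way can be sent to a model that is reachable only through an API, which is the deployment setting the abstract highlights.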


Parallel Context Windows for Large Language Models

arXiv.org Artificial Intelligence

When applied to processing long text, Large Language Models (LLMs) are limited by their context window. Existing efforts to address this limitation involve training specialized architectures, and cannot be easily applied to off-the-shelf LLMs. We present Parallel Context Windows (PCW), a method that alleviates the context window restriction for any off-the-shelf LLM without further training. The key to the approach is to carve a long context into chunks ("windows"), restrict the attention mechanism to apply only within each window, and re-use the positional embeddings across the windows. Our main results test the PCW approach on in-context learning with models that range in size between 750 million and 178 billion parameters, and show substantial improvements for tasks with diverse input and output spaces. We show additional benefits in other settings where long context windows may be beneficial: multi-hop questions and retrieval-augmented question answering with multiple retrieved documents. Our results highlight Parallel Context Windows as a promising method for applying off-the-shelf LLMs in a range of settings that require long text sequences. We make our code publicly available at https://github.com/ai21labs/parallel-context-windows.
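The three steps named in this abstract (chunking the context, restricting attention to each window, and re-using position indices) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: it assumes fixed-size windows and a causal LM, covers only the context tokens, and uses the hypothetical function name `parallel_windows`.

```python
def parallel_windows(num_tokens, window_size):
    """Build position ids and an attention mask for parallel context windows.

    Assumptions: windows have a fixed size, and attention is causal, so
    token i may attend to token j only if both lie in the same window
    and j does not come after i.
    """
    # Window index of each token: tokens 0..window_size-1 are window 0, etc.
    window_of = [i // window_size for i in range(num_tokens)]
    # Position ids restart at 0 in every window, so embeddings are re-used.
    position_ids = [i % window_size for i in range(num_tokens)]
    # mask[i][j] is True when token i is allowed to attend to token j.
    mask = [
        [window_of[i] == window_of[j] and j <= i for j in range(num_tokens)]
        for i in range(num_tokens)
    ]
    return position_ids, mask


position_ids, mask = parallel_windows(num_tokens=6, window_size=3)
# position_ids is [0, 1, 2, 0, 1, 2]: the second window re-uses positions 0..2.
```

Since no position index exceeds `window_size - 1`, the model never sees a position beyond those it was trained on, even though the combined context is longer than one window.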


L-Space and Large Language Models

Communications of the ACM

It was Sir Terry Pratchett who suggested it first. Not the multiple universes, of course--that idea has been around for ages--but the idea that massive aggregation of data produced uncertainty. Sir Terry called it L-space, the warping of space and time by large numbers of books in the Unseen University's library in his Discworld series. It was a passing fancy, a grace note in a rich and well-constructed fantasy world. That was, up until late 2022, when the public started to have access to large language models.


Large Language Models

Communications of the ACM

I can remember the days when indexing text meant compiling lists of pages on which a word appeared or finding pages in which "keywords" appeared in context. Then came full text search as exemplified by the Google search engine. Pages found in the World Wide Web are indexed word-by-word and the retrieved Web page references are rank ordered by an elaboration of the original "page rank" concept developed by the founders of Google, Larry Page and Sergey Brin. Large language models (LLMs) represent a very different way of performing information retrieval. I am no expert in this field but my cartoon model of the LLM notion follows: A statistical model of the relationship of "tokens" (words or phrases) to each other (for example, likelihood of appearing "near" each other) is built.


ChatGPT better than undergraduates at solving SAT problems, study suggests

The Guardian

ChatGPT can solve problems at a level that matches or surpasses an undergraduate student, according to a new study. Researchers found that the GPT-3 large language model that underpins the chatbot performed about as well as US college undergraduates when asked to solve reasoning problems that appear on intelligence tests or exams such as the American college admission test, the SAT. Psychologists at the University of California, Los Angeles tested GPT-3's ability to predict the next image in a complex array of shapes, after converting the images to a text format that the model could process and also ensuring the model would never have encountered the questions before. The same problems were put to 40 UCLA undergraduates and the researchers found that GPT-3 solved 80% of the problems correctly, well above the average score of just below 60% for the human participants. The researchers also prompted the model to solve some SAT "analogy" questions – selecting pairs of words that are linked in some way – that they believe had not been published on the internet and therefore could not have appeared in the vast amount of data it was trained on.