 Large Language Model


The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference

arXiv.org Artificial Intelligence

With a constant increase in learned parameters, modern neural language models are becoming increasingly powerful. Yet explaining these complex models' behavior remains a largely unsolved problem. In this paper, we discuss the role interactive visualization can play in explaining NLP models (XNLP). We motivate the use of visualization in relation to target users and common NLP pipelines. We also present several use cases to provide concrete examples of XNLP with visualization. Finally, we point out an extensive list of research opportunities in this field.


Topics in Contextualised Attention Embeddings

arXiv.org Artificial Intelligence

Contextualised word vectors obtained via pre-trained language models encode a variety of knowledge that has already been exploited in applications. Complementary to these language models are probabilistic topic models that learn thematic patterns from the text. Recent work has demonstrated that clustering the word-level contextual representations from a language model yields word clusters similar to those discovered in the latent topics of Latent Dirichlet Allocation. The important question is how such topical word clusters are automatically formed, through clustering, in the language model when it has not been explicitly designed to model latent topics. To address this question, we design different probe experiments. Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters. We strongly believe that our work paves the way for further research into the relationships between probabilistic topic models and pre-trained language models.


Diving Deep into Modes of Fact Hallucinations in Dialogue Systems

arXiv.org Artificial Intelligence

Knowledge Graph (KG) grounded conversations often use large pre-trained models and usually suffer from fact hallucination. Frequently, entities with no references in knowledge sources or conversation history are introduced into responses, thus hindering the flow of the conversation. Existing work attempts to overcome this issue by tweaking the training procedure or using a multi-step refining method. However, minimal effort is put into constructing an entity-level hallucination detection system, which would provide fine-grained signals that control fallacious content while generating responses. As a first step to address this issue, we dive deep to identify various modes of hallucination in KG-grounded chatbots through human feedback analysis. Secondly, we propose a series of perturbation strategies to create a synthetic dataset named FADE (FActual Dialogue Hallucination DEtection Dataset). Finally, we conduct comprehensive data analyses and create multiple baseline models for hallucination detection to compare against human-verified data and already established benchmarks.
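
To make the notion of an entity-level signal concrete, here is a minimal sketch that flags response entities appearing in neither the knowledge source nor the conversation history. It is not the FADE authors' method: the function name `flag_unsupported_entities`, the case-insensitive string match, and the toy entities are all assumptions for illustration, and entity extraction is assumed to happen upstream.

```python
from typing import Iterable, List, Set


def flag_unsupported_entities(
    response_entities: Iterable[str],
    kg_entities: Set[str],
    history_entities: Set[str],
) -> List[str]:
    """Return response entities found in neither the KG nor the history.

    Hypothetical helper: matching is a plain case-insensitive string
    comparison, which a real detector would replace with entity linking.
    """
    supported = {e.lower() for e in kg_entities} | {e.lower() for e in history_entities}
    return [e for e in response_entities if e.lower() not in supported]


# Toy data invented for illustration only.
kg = {"Inception", "Christopher Nolan"}
history = {"Leonardo DiCaprio"}
response = ["Inception", "Steven Spielberg"]

flagged = flag_unsupported_entities(response, kg, history)
print(flagged)
```

A lookup like this only catches entities that are entirely absent from both sources; it says nothing about an entity that is present but used in a factually wrong relation.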


ChatGPT is not all you need. A State of the Art Review of large Generative AI models

arXiv.org Artificial Intelligence

During the last two years, a plethora of large generative models, such as ChatGPT and Stable Diffusion, have been published. Concretely, these models can perform tasks such as general question answering or automatically creating artistic images, and they are revolutionizing several sectors. Consequently, the implications of these generative models for industry and society are enormous, as several job positions may be transformed. For example, generative AI can effectively and creatively transform text to images, like the DALLE-2 model; text to 3D images, like the Dreamfusion model; images to text, like the Flamingo model; text to video, like the Phenaki model; text to audio, like the AudioLM model; text to other text, like ChatGPT; text to code, like the Codex model; and text to scientific text, like the Galactica model; it can even create algorithms, like AlphaTensor. This work attempts to describe concisely the main models and sectors affected by generative AI and to provide a taxonomy of the main generative models published recently.


Machine Learning Is Not Your Copilot: AI System Accused of Violating Open Source Copyright Licenses

#artificialintelligence

As previously reported in this space, the Court of Appeals for the Federal Circuit has ruled that an AI machine cannot be an inventor because it is not a "natural person." On November 11, 2022, a group of plaintiffs filed suit in the Northern District of California against several defendants, including GitHub, Inc., Microsoft Corporation, and OpenAI, Inc., along with companies related to OpenAI. The issue stems from a product called Copilot and a product integrated into Copilot called Codex. Some backstory may help provide context.


ChatGPT's insanely powerful searches could be coming to your smartphone soon

#artificialintelligence

Launched in November last year, ChatGPT made global news for the ease with which it answers even complex questions in a conversational manner. The algorithm that powers the chatbot, GPT-3.5, is built by OpenAI and is trained to learn what humans mean when they ask a question. The algorithm uses large language models to predict what words will come next and uses human feedback to follow directions and provide satisfactory responses. It is this ability that makes ChatGPT a threat to Google's search engine business. According to revelations made by Jason Calacanis, an entrepreneur and investor more recently known as one of the people in Elon Musk's inner circle at Twitter, OpenAI's upcoming app currently has a search function and a thread history, both of which can be seen in the samples released so far.
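
The idea of predicting which word comes next can be illustrated with a toy bigram counter. This is only a sketch of the general principle and bears no resemblance in scale or architecture to GPT-3.5; the function names `train_bigrams` and `predict_next` and the sample sentence are assumptions made up for this example.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Optional


def train_bigrams(text: str) -> DefaultDict[str, Counter]:
    """Count, for every word, which words follow it in the text."""
    counts: DefaultDict[str, Counter] = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts: DefaultDict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus invented for illustration only.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

A large language model replaces these raw counts with a neural network that conditions on a long context rather than a single preceding word, but the output is still a choice of the likely next token.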


Wolfram

#artificialintelligence

It happened to us with Wolfram Alpha back in 2009. It happened with our Physics Project in 2020. I've been tracking neural net technology for a long time (about 43 years, actually). And even having watched developments in the past few years I find the performance of ChatGPT thoroughly remarkable. Finally, and suddenly, here's a system that can successfully generate text about almost anything--that's very comparable to what humans might write.


ChatGPT Writes Well Enough to Fool Scientific Reviewers

#artificialintelligence

But in the remaining 32% of cases, the subjects were tricked. And that's despite just 8% of the falsified abstracts meeting the specific formatting and style requirement for the listed journal. Plus, the reviewers falsely identified 14% of the real article abstracts as having been AI-generated. "Reviewers indicated that it was surprisingly difficult to differentiate between the two," wrote the study researchers in the pre-print. While they were sorting the abstracts, the reviewers noted that they thought the generated samples were vaguer and more formulaic.


This AI can spoof your voice after just three seconds

#artificialintelligence

Artificial intelligence (AI) is having a moment right now, and the wind continues to blow in its sails with the news that Microsoft is working on an AI that can imitate anyone's voice after being fed a short three-second sample. The new tool, dubbed VALL-E, has been trained on roughly 60,000 hours of voice data in the English language, which Microsoft says is "hundreds of times larger than existing systems". Using that knowledge, its creators claim it only needs a small smattering of vocal input to understand how to replicate a user's voice. More impressive, VALL-E can reproduce the emotions, vocal tones, and acoustic environment found in each sample, something other voice AI programs have struggled with. That gives it a more realistic aura and brings its results closer to something that could pass as genuine human speech.


Speeding up text generation with non-autoregressive language models

#artificialintelligence

Large Language Models (LLMs) for generating text have recently exploded in popularity. In recent weeks, millions of users have experimented with OpenAI's ChatGPT model for tasks ranging from writing college essays to generating code. These models, however, come with a trade-off -- they are expensive and slow to run. Over the past several months, the team at Unstructured has focused on optimizing Vision Transformers (ViTs) as encoders and transformer decoders for text generation. Our goal is to convert PDFs and images to structured formats, such as JSON, fast enough for industrial use cases.