 Large Language Model


Gabe and Aakesh: An AI Enhanced Story

#artificialintelligence

AI is leaps and bounds ahead of where it was years ago. I remember when Windows 7 introduced voice recognition as a feature. It didn't work great, but over time it did get better as it trained on your voice. Fast-forward to now and we have speech-to-text and voice recognition technologies that are almost perfect, in the palms of our hands. If you know me, you know I've been messing around with Craiyon, an image-generating AI, and OpenAI's GPT-3-based models, which generate text from a prompt.


Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets

arXiv.org Artificial Intelligence

When creating 3D content, highly specialized skills are generally needed to design and generate models of objects and other assets by hand. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images and text. We use CLIP as it provides a bridge to higher-level latent features. We use these features to perform a multi-modality fusion to address the lack of artistic control that affects common data-driven approaches. Our approach allows for multi-modal conditional feature-driven retrieval through a 3D asset database, by utilizing a combination of input latent embeddings. We explore the effects of different combinations of feature embeddings across different input types and weighting methods.
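
The fusion-and-retrieval idea in the abstract above can be illustrated with a minimal sketch. It assumes each input (sketch, image, text) has already been encoded into a vector, and that fusion is a weighted sum of unit-normalized embeddings followed by cosine-similarity ranking. The abstract does not specify the fusion or weighting scheme, so the functions `fuse` and `retrieve`, the toy three-dimensional vectors, and the asset names are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(embeddings, weights):
    """Assumed fusion rule: weighted sum of unit-normalized embeddings."""
    fused = [0.0] * len(embeddings[0])
    for emb, weight in zip(embeddings, weights):
        for i, x in enumerate(normalize(emb)):
            fused[i] += weight * x
    return normalize(fused)


def retrieve(query, database, top_k=3):
    """Rank database entries by cosine similarity to the query embedding."""
    q = normalize(query)
    scored = []
    for name, emb in database.items():
        similarity = sum(a * b for a, b in zip(q, normalize(emb)))
        scored.append((similarity, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy stand-ins for encoder outputs; real embeddings would come from a model.
sketch_emb = [1.0, 0.0, 0.0]  # hypothetical "chair shape" direction
text_emb = [0.0, 1.0, 0.0]    # hypothetical "red" direction
assets = {
    "chair": [0.9, 0.1, 0.0],
    "red_chair": [0.7, 0.7, 0.0],
    "lamp": [0.0, 0.1, 0.9],
}

balanced = retrieve(fuse([sketch_emb, text_emb], [0.5, 0.5]), assets, top_k=2)
sketch_only = retrieve(fuse([sketch_emb, text_emb], [1.0, 0.0]), assets, top_k=1)
```

Changing the weights shifts which asset ranks first, which is one simple way an artist could steer retrieval toward one input modality over another.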


Enhancing Semantic Understanding with Self-supervised Methods for Abstractive Dialogue Summarization

arXiv.org Artificial Intelligence

Contextualized word embeddings can lead to state-of-the-art performance in natural language understanding. Recently, pre-trained deep contextualized text encoders such as BERT have shown their potential in improving natural language tasks, including abstractive summarization. Existing approaches to dialogue summarization focus on incorporating a large language model into a summarization task, with the model trained on large-scale corpora consisting of news articles rather than dialogues between multiple speakers. In this paper, we introduce self-supervised methods that compensate for these shortcomings when training a dialogue summarization model. Our principle is to detect incoherent information flows using pretext dialogue text to enhance BERT's ability to contextualize the dialogue text representations. We build and fine-tune an abstractive dialogue summarization model on a shared encoder-decoder architecture using the enhanced BERT. We empirically evaluate our abstractive dialogue summarizer with the SAMSum corpus, a recently introduced dataset with abstractive dialogue summaries. All of our methods contribute improvements to abstractive summaries as measured by ROUGE scores. Through an extensive ablation study, we also present a sensitivity analysis of critical model hyperparameters: the probabilities of switching utterances and masking interlocutors.
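
The abstract above names two hyperparameters, the probabilities of switching utterances and of masking interlocutors. A rough sketch of how such pretext corruptions could be generated follows. The paper's actual procedure is not given here, so the function names, the `[MASK]` token, the per-dialogue versus per-turn sampling, and the example dialogue are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # assumed placeholder token for a hidden speaker name


def switch_utterances(dialogue, p_switch, rng):
    """With probability p_switch, swap two turns to break the dialogue's flow.

    Returns the (possibly corrupted) turns and a flag a model could be
    trained to predict (True means the order was disturbed).
    """
    turns = list(dialogue)
    if len(turns) >= 2 and rng.random() < p_switch:
        i, j = rng.sample(range(len(turns)), 2)
        turns[i], turns[j] = turns[j], turns[i]
        return turns, True
    return turns, False


def mask_interlocutors(dialogue, p_mask, rng):
    """Hide each speaker name with probability p_mask.

    Returns the masked turns and the original speakers as prediction targets.
    """
    masked, targets = [], []
    for speaker, text in dialogue:
        targets.append(speaker)
        hidden = rng.random() < p_mask
        masked.append((MASK if hidden else speaker, text))
    return masked, targets


# Hypothetical three-turn dialogue as (speaker, utterance) pairs.
dialogue = [
    ("Amanda", "I baked cookies."),
    ("Jerry", "Sure!"),
    ("Amanda", "I'll bring you some tomorrow."),
]
rng = random.Random(0)  # seeded so the sketch is repeatable
switched, was_switched = switch_utterances(dialogue, 1.0, rng)
masked, speakers = mask_interlocutors(dialogue, 1.0, rng)
```

In a setup like this, the flag and the original speaker names would serve as free training labels, which is what makes the tasks self-supervised.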


OpenAI is reducing the price of the GPT-3 API -- here's why it matters

#artificialintelligence

Were you unable to attend Transform 2022? Check out all of the summit sessions in our on-demand library now! OpenAI is slashing the price of its GPT-3 API service by up to two-thirds, according to an announcement on the company's website. The new pricing plan, which is effective September 1, may have a large impact on companies that are building products on top of OpenAI's flagship large language model (LLM). The announcement comes as recent months have seen growing interest in LLMs and their applications in different fields.


DeepMind AI learns to play soccer using decades of match simulations

New Scientist

Artificial intelligence has learned to play soccer. By learning from decades' worth of computer simulations, an AI took digital humanoids from flailing tots to proficient players. Researchers at the AI research company DeepMind taught the AI how to play soccer in a computer simulation through an athletic curriculum resembling a sped-up version of growing a human baby into a football player. The AI was given control over digital humanoids with realistic body mass and joint movements. "We don't put infants in an 11 versus 11 match," says Guy Lever at DeepMind.


What does GPT-3 "know" about me?

#artificialintelligence

I've been paranoid about posting anything about my personal life publicly since a bruising experience about a decade ago. My images and personal information were splashed across an online forum, then dissected and ridiculed by people who didn't like a column I'd written for a Finnish newspaper. Up to that point, like many people, I'd carelessly littered the internet with my data: personal blog posts, embarrassing photo albums from nights out, posts about my location, relationship status, and political preferences, out in the open for anyone to see. OpenAI has provided limited access to its famous large language model, GPT-3, and Meta lets people play around with its model OPT-175B through a publicly available chatbot called BlenderBot 3. I decided to try out both models, starting by asking GPT-3: Who is Melissa Heikkilä? When I read this, I froze.


The Download: AI privacy risks, and cleaning up shipping

MIT Technology Review

One of the biggest stories in tech this year has been the rise of large language models (LLMs). These are AI models that produce text a human might have written--sometimes so convincingly they have tricked people into thinking they are sentient. These models' power comes from troves of publicly available human-created text that has been hoovered from the internet. If you've posted anything even remotely personal in English on the internet, chances are your data might be part of some of the world's most popular LLMs. My colleague Melissa Heikkilä, our AI reporter, recently started to wonder what data these models might have on her--and how it could be misused. A bruising experience a decade ago left her paranoid about sharing personal details online, so she put OpenAI's GPT-3 to the test to see what it "knows" about her.


CoAuthor: Stanford experiments with human-AI collaborative writing

#artificialintelligence

This article is an existential crisis. It is written by a professional writer writing about artificial intelligence that helps writers write. I mean, shouldn't humans write their own content?


Introduction to Neural Transfer Learning with Transformers for Social Science Text Analysis

arXiv.org Artificial Intelligence

Transformer-based models for transfer learning have the potential to achieve high prediction accuracies on text-based supervised learning tasks with relatively few training data instances. These models are thus likely to benefit social scientists that seek to have as accurate as possible text-based measures but only have limited resources for annotating training data. To enable social scientists to leverage these potential benefits for their research, this paper explains how these methods work, why they might be advantageous, and what their limitations are. Additionally, three Transformer-based models for transfer learning, BERT (Devlin et al. 2019), RoBERTa (Liu et al. 2019), and the Longformer (Beltagy et al. 2020), are compared to conventional machine learning algorithms on three applications. Across all evaluated tasks, textual styles, and training data set sizes, the conventional models are consistently outperformed by transfer learning with Transformers, thereby demonstrating the benefits these models can bring to text-based social science research.


Flexible Job Classification with Zero-Shot Learning

arXiv.org Artificial Intelligence

Using a taxonomy to organize information requires classifying objects (documents, images, etc.) with appropriate taxonomic classes. The flexible nature of zero-shot learning is appealing for this task because it allows classifiers to naturally adapt to taxonomy modifications. This work studies zero-shot multi-label document classification with fine-tuned language models under realistic taxonomy expansion scenarios in the human resources domain. Experiments show that zero-shot learning can be highly effective in this setting. When controlling for training data budget, zero-shot classifiers achieve a 12% relative increase in macro-AP when compared to a traditional multi-label classifier trained on all classes. Counterintuitively, these results suggest that in some settings it would be preferable to adopt zero-shot techniques and spend resources annotating more documents with an incomplete set of classes, rather than spreading the labeling budget uniformly over all classes and using traditional classification techniques. Additional experiments demonstrate that adopting the well-known filter/re-rank decomposition from the recommender systems literature can significantly reduce the computational burden of high-performance zero-shot classifiers, empirically resulting in a 98% reduction in computational overhead for only a 2% relative decrease in performance. The evidence presented here demonstrates that zero-shot learning has the potential to significantly increase the flexibility of taxonomies and highlights directions for future research.
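
The filter/re-rank decomposition mentioned in the abstract above can be sketched in a few lines: a cheap scorer shortlists candidate labels, and only that shortlist is passed to the expensive scorer. The scorers below are toy stand-ins (token overlap and a character-level similarity from `difflib`), not the models used in the paper, and the job labels, the function names, and the cutoff `k` are hypothetical.

```python
from difflib import SequenceMatcher


def cheap_score(doc, label):
    """Fast filter stage: fraction of the label's tokens found in the document."""
    doc_tokens = set(doc.lower().split())
    label_tokens = set(label.lower().split())
    return len(doc_tokens & label_tokens) / len(label_tokens) if label_tokens else 0.0


def filter_then_rerank(doc, labels, expensive_score, k=2):
    """Keep the k best labels under the cheap scorer, then re-rank only those."""
    shortlist = sorted(labels, key=lambda lab: cheap_score(doc, lab), reverse=True)[:k]
    return sorted(shortlist, key=lambda lab: expensive_score(doc, lab), reverse=True)


expensive_calls = []  # records every label the costly scorer is asked about


def toy_expensive_score(doc, label):
    """Stand-in for a costly zero-shot model; here just string similarity."""
    expensive_calls.append(label)
    return SequenceMatcher(None, doc.lower(), label.lower()).ratio()


doc = "senior python software engineer"
labels = ["software engineer", "registered nurse", "truck driver", "python developer"]
ranked = filter_then_rerank(doc, labels, toy_expensive_score, k=2)
```

Here the expensive scorer runs on 2 of 4 labels; with a taxonomy of thousands of classes and a small `k`, the same pattern is what would make large savings in computation plausible.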