AITopics | zerogen

zerogen

GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation

Gholami, Mohsen, Akbari, Mohammad, Hu, Cindy, Masrani, Vaden, Wang, Z. Jane, Zhang, Yong

arXiv.org Artificial Intelligence, Mar-28-2024

Knowledge distillation from LLMs is essential for the efficient deployment of language models. Prior works have proposed data generation using LLMs for preparing distilled models. We argue that generating data with LLMs is prone to sampling mainly from the center of the original content distribution. This limitation hinders the distilled model from learning the true underlying data distribution and causes it to forget the tails of the distribution (samples with lower probability). To this end, we propose GOLD, a task-agnostic data generation and knowledge distillation framework, which employs an iterative out-of-distribution-guided feedback mechanism for the LLM. As a result, the generated data improves the generalizability of distilled models. An energy-based OOD evaluation approach is also introduced to deal with noisy generated data. Our extensive experiments on 10 different classification and sequence-to-sequence tasks in NLP show that GOLD outperforms prior art and the LLM by an average of 5% and 14%, respectively. We also show that the proposed method is applicable to less explored and novel tasks. The code is available.
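The abstract does not say how the energy-based OOD evaluation is computed. As a rough illustration only, the sketch below uses the energy score common in the OOD-detection literature, E(x) = -T * log(sum_k exp(z_k / T)) over classifier logits z, where lower energy indicates a more in-distribution sample. The function names, the threshold, and the choice to filter (rather than reweight) samples are assumptions, not details taken from GOLD.

```python
import math

def energy_score(logits, temperature=1.0):
    """Return E(x) = -T * log(sum_k exp(logit_k / T)) for one example."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp

def keep_low_energy(samples, threshold):
    """Keep texts whose energy is at or below a (hypothetical) threshold.

    `samples` is a list of (text, logits) pairs; the threshold would have to
    be tuned on held-out data and is not specified by the source.
    """
    return [text for text, logits in samples if energy_score(logits) <= threshold]
```

A confidently classified sample (one dominant logit) receives a lower energy than a sample with flat logits, so a threshold separates the two.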

dataset, gold, llm, (17 more...)

arXiv: 2403.19754

Country:

Europe > France (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Middle East > Iraq (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

Wang, Ruida, Zhou, Wangchunshu, Sachan, Mrinmaya

arXiv.org Artificial Intelligence, Oct-20-2023

Data Synthesis is a promising way to train a small model with very little labeled data. One approach for data synthesis is to leverage the rich knowledge from large language models to synthesize pseudo training examples for small models, making it possible to achieve both data and compute efficiency at the same time. However, a key challenge in data synthesis is that the synthesized dataset often suffers from a large distributional discrepancy from the real task data distribution. Thus, in this paper, we propose Synthesis Step by Step (S3), a data synthesis framework that shrinks this distribution gap by iteratively extrapolating the errors made by a small model trained on the synthesized dataset on a small real-world validation dataset using a large language model. Extensive experiments on multiple NLP tasks show that our approach improves the performance of a small model by reducing the gap between the synthetic …

Figure 1: Training and testing accuracy of DistilBert with ZeroGen (Ye et al., 2022b) on the IMDb dataset with 200k training datapoints. Also shown are the training and testing accuracy of the model trained on Gold-Data. We can see here that ZeroGen's training accuracy quickly reaches nearly 100%, but testing accuracy remains low.
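The iterative loop the S3 abstract describes (train a small model on synthetic data, find its errors on a small real validation set, synthesize more data from those errors, retrain) might be outlined as follows. This is a toy sketch under stated assumptions: `train` and `synthesize_like` are hypothetical stand-ins, and in particular `synthesize_like` simply echoes the misclassified examples, whereas the paper uses a large language model to extrapolate new examples from them.

```python
def s3_loop(seed_data, validation, train, synthesize_like, rounds=2):
    """Grow a synthetic dataset by targeting the small model's validation errors."""
    data = list(seed_data)
    model = train(data)
    for _ in range(rounds):
        # Collect validation examples the current small model gets wrong.
        errors = [(x, y) for x, y in validation if model(x) != y]
        if not errors:
            break
        # Add new examples derived from those errors, then retrain.
        data.extend(synthesize_like(errors))
        model = train(data)
    return model, data

def train(data):
    """Toy 'small model': memorizes the data and defaults to 'neg' (assumption)."""
    table = dict(data)
    return lambda x: table.get(x, "neg")

def synthesize_like(errors):
    """Stand-in for the LLM step: echoes the errors instead of extrapolating."""
    return list(errors)

seed = [("good", "pos")]
validation = [("good", "pos"), ("fine", "pos"), ("bad", "neg")]
model, data = s3_loop(seed, validation, train, synthesize_like)
```

In this toy run the first round finds one error, adds one example, and the second round stops because no validation errors remain.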

large language model, machine learning, natural language, (18 more...)

arXiv: 2310.13671

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Overview -- ZeroGen, Efficient Zero-shot Learning via Dataset Generation

#artificialintelligence, Apr-29-2022, 20:45:24 GMT

A paper dated Feb 16 introduced an interesting take on zero-shot learning. Its authors explore more efficient and flexible ways to conduct zero-shot learning with PLMs. They take the dataset generation method to the extreme and study ZeroGen, a flexible and efficient zero-shot learning framework via dataset generation. A pseudo-dataset is first generated with the PLM; a tiny task model (TAM) is then trained on it to perform the given task. This procedure is highly flexible: any model architecture, loss function, and training strategy can be used.
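The two-stage procedure described above (generate a pseudo-dataset, then train a tiny task model on it) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the prompts are invented, `stub_plm` returns canned text in place of a real PLM call, and the word-count classifier stands in for whatever small model one would actually train.

```python
from collections import Counter

# Hypothetical label-conditioned prompts; the source does not give the real ones.
PROMPTS = {
    "positive": "The movie review in positive sentiment is:",
    "negative": "The movie review in negative sentiment is:",
}

def stub_plm(prompt):
    """Stand-in for a PLM call: returns canned texts keyed on the prompt."""
    canned = {
        "positive": ["a wonderful and moving film", "great acting and a great script"],
        "negative": ["a dull and boring film", "terrible acting and a weak script"],
    }
    label = "positive" if "positive" in prompt else "negative"
    return canned[label]

def generate_pseudo_dataset(plm):
    """Step 1: query the generator once per label to build (text, label) pairs."""
    return [(text, label) for label, prompt in PROMPTS.items() for text in plm(prompt)]

def train_tiny_model(dataset):
    """Step 2: fit a per-label word-count model on the pseudo-dataset."""
    counts = {}
    for text, label in dataset:
        counts.setdefault(label, Counter()).update(text.split())
    return counts

def predict(model, text):
    """Step 3: inference uses only the small model, never the PLM."""
    words = text.split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))

dataset = generate_pseudo_dataset(stub_plm)
model = train_tiny_model(dataset)
```

Because the training step only consumes (text, label) pairs, the word-count model could be swapped for any architecture, loss, or training strategy without touching the generation step.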

dataset generation, efficient zero-shot learning, zero-shot learning framework, (4 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Meet ZEROGEN: An Extreme Method for Dataset Generation via PLMs for Zero-Shot Learning

#artificialintelligence, Feb-24-2022, 16:18:56 GMT

The impressive generative capacity of large-scale pretrained language models (PLMs) has inspired machine learning researchers to explore methods for generating model training examples via PLMs and data augmentation procedures, i.e., dataset generation. A novel contribution in this research direction is proposed in the new paper ZeroGen: Efficient Zero-shot Learning via Dataset Generation, from researchers at the University of Hong Kong, Shanghai AI Lab, Huawei Noah's Ark Lab and the University of Washington. The team describes their proposed ZEROGEN as an "extreme instance" of dataset generation via PLMs for zero-shot learning. ZEROGEN is a framework for prompt-based zero-shot learning (PROMPTING). Unlike existing approaches that rely on gigantic PLMs during inference, ZEROGEN introduces a more flexible and efficient approach for conducting zero-shot learning with PLMs.

dataset generation, plm, zerogen, (8 more...)

#artificialintelligence

Country:

Asia > China > Shanghai > Shanghai (0.26)
Asia > China > Hong Kong (0.26)

Genre: Research Report (0.38)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)
