AITopics | Tong, Simon

RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting

Shu, Lei, Luo, Liangchen, Hoskere, Jayakumar, Zhu, Yun, Liu, Yinxiao, Tong, Simon, Chen, Jindong, Meng, Lei

arXiv.org Artificial IntelligenceDec-19-2023

Large Language Models (LLMs) have demonstrated impressive capabilities in creative tasks such as storytelling and E-mail generation. However, as LLMs are primarily trained on final text results rather than intermediate revisions, it might be challenging for them to perform text rewriting tasks. Most studies in the rewriting tasks focus on a particular transformation type within the boundaries of single sentences. In this work, we develop new strategies for instruction tuning and reinforcement learning to better align LLMs for cross-sentence rewriting tasks using diverse wording and structures expressed through natural languages including 1) generating rewriting instruction data from Wiki edits and public corpus through instruction generation and chain-of-thought prompting; 2) collecting comparison data for reward model training through a new ranking function. To facilitate this research, we introduce OpenRewriteEval, a novel benchmark covers a wide variety of rewriting types expressed through natural language instructions. Our results show significant improvements over a variety of baselines. The public repository is available on GitHub under Google Research (https://github.com/google-research/google-research/tree/master/rewritelm).

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2305.15685

Country:

Europe > Spain (0.14)
Asia (0.14)

Genre: Research Report > New Finding (0.54)

Industry: Health & Medicine > Consumer Health (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Corpora Generation for Grammatical Error Correction

Lichtarge, Jared, Alberti, Chris, Kumar, Shankar, Shazeer, Noam, Parmar, Niki, Tong, Simon

arXiv.org Machine LearningApr-10-2019

Grammatical Error Correction (GEC) has been recently modeled using the sequence-to-sequence framework. However, unlike sequence transduction problems such as machine translation, GEC suffers from the lack of plentiful parallel data. We describe two approaches for generating large parallel datasets for GEC using publicly available Wikipedia data. The first method extracts source-target pairs from Wikipedia edit histories with minimal filtration heuristics, while the second method introduces noise into Wikipedia sentences via round-trip translation through bridge languages. Both strategies yield similar sized parallel corpora containing around 4B tokens. We employ an iterative decoding strategy that is tailored to the loosely supervised nature of our constructed corpora. We demonstrate that neural GEC models trained using either type of corpora give similar performance. Fine-tuning these models on the Lang-8 corpus and ensembling allows us to surpass the state of the art on both the CoNLL-2014 benchmark and the JFLEG task. We provide systematic analysis that compares the two approaches to data generation and highlights the effectiveness of ensembling.

artificial intelligence, machine translation, natural language, (17 more...)

arXiv.org Machine Learning

1904.0578

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Active Learning for Parameter Estimation in Bayesian Networks

Tong, Simon, Koller, Daphne

Neural Information Processing SystemsDec-31-2001

Bayesian networks are graphical representations of probability distributions. In virtually all of the work on learning these networks, the assumption is that we are presented with a data set consisting of randomly generated instances from the underlying distribution. In many situations, however, we also have the option of active learning, where we have the possibility of guiding the sampling process by querying for certain types of samples. This paper addresses the problem of estimating the parameters of Bayesian networks in an active learning setting. We provide a theoretical framework for this problem, and an algorithm that chooses which active learning queries to generate based on the model learned so far. We present experimental results showing that our active learning algorithm can significantly reduce the need for training data in many situations.

active learning, artificial intelligence, bayesian inference, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Active Learning for Parameter Estimation in Bayesian Networks

Tong, Simon, Koller, Daphne

Neural Information Processing SystemsDec-31-2001

Bayesian networks are graphical representations of probability distributions. In virtually all of the work on learning these networks, the assumption is that we are presented with a data set consisting of randomly generated instances from the underlying distribution. In many situations, however, we also have the option of active learning, where we have the possibility of guiding the sampling process by querying for certain types of samples. This paper addresses the problem of estimating the parameters of Bayesian networks in an active learning setting. We provide a theoretical framework for this problem, and an algorithm that chooses which active learning queries to generate based on the model learned so far. We present experimental results showing that our active learning algorithm can significantly reduce the need for training data in many situations.

active learning, artificial intelligence, bayesian inference, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Active Learning for Parameter Estimation in Bayesian Networks

Tong, Simon, Koller, Daphne

Neural Information Processing SystemsDec-31-2001

Bayesian networks are graphical representations of probability distributions. In virtually all of the work on learning these networks, the assumption is that we are presented with a data set consisting of randomly generated instances from the underlying distribution. In many situations, however, we also have the option of active learning, where we have the possibility of guiding the sampling process by querying for certain types of samples. This paper addresses the problem of estimating the parameters of Bayesian networks in an active learning setting. We provide a theoretical framework for this problem, and an algorithm that chooses which active learning queries to generate based on the model learned so far. We present experimental results showing that our active learning algorithm can significantly reduce the need for training data in many situations.

active learning, artificial intelligence, bayesian inference, (18 more...)

Neural Information Processing Systems

Country: