 Large Language Model


Empirical Evaluation of Post-Training Quantization Methods for Language Tasks

arXiv.org Artificial Intelligence

Transformer-based architectures like BERT have achieved great success in a wide range of natural language tasks. Despite their strong performance, the models still have numerous parameters and high computational complexity, impeding their deployment in resource-constrained environments. Post-Training Quantization (PTQ), which enables low-bit computations without extra training, could be a promising tool. In this work, we conduct an empirical evaluation of three PTQ methods on BERT-Base and BERT-Large: Linear Quantization (LQ), Analytical Clipping for Integer Quantization (ACIQ), and Outlier Channel Splitting (OCS). OCS theoretically surpasses the others in minimizing the mean squared quantization error and avoiding distortion of the weights' outliers. That is consistent with the evaluation results on most language tasks of the GLUE benchmark and on a reading comprehension task, SQuAD. Moreover, low-bit quantized BERT models could outperform the corresponding 32-bit baselines on several small language tasks, which we attribute to the alleviation of over-parameterization. We further explore the limit of the quantization bit width and show that OCS could quantize BERT-Base and BERT-Large to 3 bits and retain 98% and 96% of the performance on the GLUE benchmark, respectively. Finally, we conduct quantization on the whole BERT family, i.e., BERT models in different configurations, and comprehensively evaluate their performance on the GLUE benchmark and SQuAD, hoping to provide valuable guidelines for their deployment in various computation environments.
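To make the idea of linear quantization and its mean squared error concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the symmetric per-tensor scheme, the function names, and the toy weight list are all assumptions chosen for illustration. It quantizes a short list of weights to a given bit width, maps the integers back to floats, and measures how much error the round trip introduces.

```python
def linear_quantize(weights, num_bits):
    """Symmetric uniform quantization followed by dequantization.

    Hypothetical helper: each weight is snapped to the nearest point on a
    signed integer grid and then mapped back to a float.
    """
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for a signed grid")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    q_max = 2 ** (num_bits - 1) - 1   # largest positive integer level
    scale = max_abs / q_max           # float width of one integer step
    # Python's round() resolves exact ties to the nearest even integer.
    return [round(w / scale) * scale for w in weights]


def mean_squared_error(original, restored):
    """Average squared difference between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(original, restored)) / len(original)


# Toy weights (made up); 1.9 plays the role of an outlier.
weights = [0.02, -0.15, 0.4, -0.33, 0.07, 1.9]
for bits in (8, 4, 3):
    restored = linear_quantize(weights, bits)
    print(bits, mean_squared_error(weights, restored))
```

Because the scale is set by the largest magnitude, a single outlier stretches the grid and leaves the small weights with very few levels at low bit widths. That trade-off is the one that clipping and outlier-splitting methods are designed to address.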


Mutual Information Alleviates Hallucinations in Abstractive Summarization

arXiv.org Artificial Intelligence

Despite significant progress in the quality of language generated from abstractive summarization models, these models still exhibit the tendency to hallucinate, i.e., output content not supported by the source document. A number of works have tried to fix, or at least uncover the source of, the problem with limited success. In this paper, we identify a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty. This finding offers a potential explanation for hallucinations: models default to favoring text with high marginal probability, i.e., high-frequency occurrences in the training set, when uncertain about a continuation. It also motivates possible routes for real-time intervention during decoding to prevent such hallucinations. We propose a decoding strategy that switches to optimizing for pointwise mutual information of the source and target token, rather than purely the probability of the target token, when the model exhibits uncertainty. Experiments on the XSum dataset show that our method decreases the probability of hallucinated tokens while maintaining the ROUGE and BERTScore values of top-performing decoding strategies.
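The decoding rule described in this abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' code: the entropy threshold, the `pmi_weight` parameter, the function names, and the toy distributions are all assumptions. When the source-conditioned distribution is confident, the sketch picks the most probable token; when its entropy is high, it scores tokens by log p(token | source, prefix) minus a weighted log p(token | prefix), which is a pointwise mutual information style score.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a token -> probability mapping."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0.0)


def pick_next_token(cond_probs, marginal_probs,
                    entropy_threshold=1.0, pmi_weight=1.0):
    """Choose the next token greedily, switching to a PMI-style score
    when the source-conditioned distribution is uncertain.

    cond_probs:     p(token | source, prefix), assumed strictly positive
    marginal_probs: p(token | prefix) from a source-free language model,
                    assumed strictly positive and over the same tokens
    The threshold and weight are hypothetical tuning knobs.
    """
    if entropy(cond_probs) < entropy_threshold:
        return max(cond_probs, key=cond_probs.get)

    def pmi_score(token):
        return (math.log(cond_probs[token])
                - pmi_weight * math.log(marginal_probs[token]))

    return max(cond_probs, key=pmi_score)


# Toy distributions, invented for illustration.
marginal = {"the": 0.6, "Paris": 0.1, "London": 0.3}
confident = {"the": 0.05, "Paris": 0.9, "London": 0.05}
uncertain = {"the": 0.4, "Paris": 0.35, "London": 0.25}

print(pick_next_token(confident, marginal))
print(pick_next_token(uncertain, marginal))
```

In the uncertain case, plain greedy decoding would pick the high-frequency token "the", whereas the PMI-style score favors the token whose probability rises most once the source is taken into account.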


Large language models are not zero-shot communicators

#artificialintelligence

Understanding of pragmatics is an essential and ubiquitous part of human communication. We show large language models (LLMs) mostly don't capture this aspect of language, hindering their applicability in the real world. Our analysis indicates where the largest room for improvement is to ultimately make this technology more useful. Recently, a large language model (LLM) called LaMDA beautifully passed (a variation of) the Turing test. In our most recent paper's title we state that LLMs are not zero-shot communicators.


Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

arXiv.org Artificial Intelligence

Large language models produce human-like text that drives a growing number of applications. However, recent literature and, increasingly, real world observations, have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful. Though work to evaluate language model harms is under way, translating foresight about which harms may arise into rigorous benchmarks is not straightforward. To facilitate this translation, we outline six ways of characterizing harmful text which merit explicit consideration when designing new benchmarks. We then use these characteristics as a lens to identify trends and gaps in existing benchmarks. Finally, we apply them in a case study of the Perspective API, a toxicity classifier that is widely used in harm benchmarks. Our characteristics provide one piece of the bridge that translates between foresight and effective evaluation.


Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders

arXiv.org Artificial Intelligence

Text-based voice editing (TBVE) uses synthetic output from text-to-speech (TTS) systems to replace words in an original recording. Recent work has used neural models to produce edited speech that is similar to the original speech in terms of clarity, speaker identity, and prosody. However, one limitation of prior work is the usage of finetuning to optimise performance: this requires further model training on data from the target speaker, which is a costly process that may incorporate potentially sensitive data into server-side models. In contrast, this work focuses on the zero-shot approach which avoids finetuning altogether, and instead uses pretrained speaker verification embeddings together with a jointly trained reference encoder to encode utterance-level information that helps capture aspects such as speaker identity and prosody. Subjective listening tests find that both utterance embeddings and a reference encoder improve the continuity of speaker identity and prosody between the edited synthetic speech and unedited original recording in the zero-shot setting.


Zero-Shot Text Matching for Automated Auditing using Sentence Transformers

arXiv.org Artificial Intelligence

Natural language processing methods have several applications in automated auditing, including document or passage classification, information retrieval, and question answering. However, training such models requires a large amount of annotated data which is scarce in industrial settings. At the same time, techniques like zero-shot and unsupervised learning allow for application of models pre-trained using general domain data to unseen domains. In this work, we study the efficiency of unsupervised text matching using Sentence-Bert, a transformer-based model, by applying it to the semantic similarity of financial passages. Experimental results show that this model is robust to documents from in- and out-of-domain data.
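Unsupervised text matching of this kind usually comes down to comparing embedding vectors. The sketch below shows only that comparison step, using cosine similarity in plain Python. The three-dimensional vectors and passage names are invented stand-ins for real sentence embeddings, and the helper names are hypothetical; no actual Sentence-BERT model is called here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def best_match(query_vec, passage_vecs):
    """Return the name of the passage whose vector is closest to the query."""
    return max(passage_vecs,
               key=lambda name: cosine_similarity(query_vec, passage_vecs[name]))


# Made-up embeddings; a real pipeline would obtain these from an encoder.
query = [0.9, 0.1, 0.0]
passages = {
    "revenue_note": [0.8, 0.2, 0.1],
    "board_bio": [0.0, 0.3, 0.9],
}
print(best_match(query, passages))
```

No labeled data is needed for this step, which is what makes the approach zero-shot: the quality of the match depends entirely on how well the pretrained encoder places related passages near each other.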


Controllable Fake Document Infilling for Cyber Deception

arXiv.org Artificial Intelligence

Recent works in cyber deception study how to deter malicious intrusion by generating multiple fake versions of a critical document to impose costs on adversaries who need to identify the correct information. However, existing approaches are context-agnostic, resulting in sub-optimal and unvaried outputs. We propose a novel context-aware model, Fake Document Infilling (FDI), by converting the problem to a controllable mask-then-infill procedure. FDI masks important concepts of varied lengths in the document, then infills a realistic but fake alternative considering both the previous and future contexts. We conduct comprehensive evaluations on technical documents and news stories. Results show that FDI outperforms the baselines in generating highly believable fakes with moderate modification to protect critical information and deceive adversaries.


The risks posed by artificial intelligence demand serious consideration

#artificialintelligence

Amidst the Russian invasion of Ukraine, the risk of nuclear war is now larger than it has been since the end of the Cold War. The spectre of nuclear annihilation, once thought a thing of the past, has returned. While technology can avert some forms of annihilation, for example by diverting major asteroid strikes, these naturally occurring risks are likely small, evidenced by our long history free from them. The same cannot be said for those caused or exacerbated by technology. Nuclear war, climate change, engineered bioweapons, and even pandemics: these risks are unfortunately all too familiar.


TheSequence

#artificialintelligence

TheSequence is an ML community media outlet trusted by more than 144,000 specialists from all over the world, including top AI labs such as DeepMind, OpenAI, Google Brain, MSFT Research, and LinkedIn, universities such as MIT, Cornell, Berkeley, Carnegie Mellon, and Columbia, and hundreds of large enterprises. Sent bi-weekly.


How DeepMind's AlphaTensor AI Devised a Faster Matrix Multiplication

#artificialintelligence

After developing an artificial intelligence that can achieve superhuman mastery of games like chess and Go, along with another AI that can predict how proteins fold themselves in three-dimensional space, the researchers over at DeepMind have done it again, this time using a deep learning AI model to efficiently solve a fundamental mathematics problem, while beating a 50-year-old record to boot. In a blog post from earlier this month, the DeepMind team introduces AlphaTensor, an AI system designed for discovering new and more efficient algorithms for solving crucial mathematical operations, in this case matrix multiplication. Whether it is used to process or compress images or video, recognize spoken commands, or run simulations to predict the weather, matrix multiplication underpins much of modern computing. So it's little wonder that experts and companies all over the world are constantly looking for more efficient ways to improve the algorithms for solving the mathematical operations behind such tasks. Matrix multiplication is one of the simplest mathematical operations in algebra, where individual numbers arranged in grids, or matrices, are multiplied together and then added in a specific way in order to generate a new matrix.
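As a rough illustration of what "multiplied together and then added" means, and of how an algorithm can save multiplications, the sketch below compares the schoolbook method with Strassen's well-known 1969 scheme for 2x2 matrices, which uses 7 scalar multiplications instead of 8. This is not an algorithm discovered by AlphaTensor; the function names are chosen here for illustration.

```python
def matmul_naive(a, b):
    """Schoolbook product: entry (i, j) is the sum over k of a[i][k] * b[k][j]."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]


def matmul_strassen_2x2(a, b):
    """Strassen's 2x2 scheme: 7 multiplications instead of 8."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_naive(a, b))         # [[19, 22], [43, 50]]
print(matmul_strassen_2x2(a, b))  # [[19, 22], [43, 50]]
```

Saving one multiplication looks minor at this size, but when the scheme is applied recursively to large matrices split into blocks, the savings compound, which is why even small reductions in the multiplication count are of interest.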