 Large Language Model


Code Prediction by Feeding Trees to Transformers

arXiv.org Artificial Intelligence

We advance the state of the art in the accuracy of code prediction (next token prediction) used in autocomplete systems. First, we report that using the recently proposed Transformer architecture even out-of-the-box outperforms previous neural and non-neural systems for code prediction. We then show that by making the Transformer architecture aware of the syntactic structure of code, we further increase the margin by which a Transformer-based system outperforms previous systems. With this, it outperforms the accuracy of an RNN-based system (similar to Hellendoorn et al., 2018) by 18.3%, the Deep3 system (Raychev et al., 2016) by 14.1%, and an adaptation of Code2Seq (Alon et al., 2018) for code prediction by 14.4%. We present in the paper several ways of communicating the code structure to the Transformer, which is fundamentally built for processing sequence data. We provide a comprehensive experimental evaluation of our proposal, along with alternative design choices, on a standard Python dataset, as well as on a Facebook internal Python corpus. Our code and data preparation pipeline will be available in open source.
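To make the idea of feeding code structure to a sequence model concrete, here is a minimal sketch that flattens a Python syntax tree into a token sequence by pre-order traversal. It is only an illustration under assumptions: the function name `linearize` and the choice of emitting node types plus identifier and constant values are hypothetical, and this is not claimed to be the encoding used in the paper.

```python
import ast
from typing import List


def linearize(source: str) -> List[str]:
    """Flatten the AST of `source` into a pre-order (depth-first) token list."""
    tokens: List[str] = []

    def visit(node: ast.AST) -> None:
        # Emit the node type first, so parents always precede their children.
        tokens.append(type(node).__name__)
        # Emit leaf values for identifiers and literals (an illustrative choice).
        if isinstance(node, ast.Name):
            tokens.append(node.id)
        elif isinstance(node, ast.Constant):
            tokens.append(repr(node.value))
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


# The resulting sequence could be fed to any sequence model in place of raw source tokens.
seq = linearize("x = 1")
print(seq)
```

A flat pre-order sequence keeps the model input one-dimensional while still exposing syntactic node types; richer encodings (for example, paths between nodes) would need additional machinery.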


The AI Index 2021 Annual Report

arXiv.org Artificial Intelligence

Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.


GPT-3 & Beyond: 10 NLP Research Papers You Should Read

#artificialintelligence

NLP research advances in 2020 are still dominated by large pre-trained language models, and specifically transformers. There were many interesting updates introduced this year that have made transformer architecture more efficient and applicable to long documents. Another hot topic relates to the evaluation of NLP models in different applications. We still lack evaluation approaches that clearly show where a model fails and how to fix it. Also, with the growing capabilities of language models such as GPT-3, conversational AI is enjoying a new wave of interest. Chatbots are improving, with several impressive bots like Meena and Blender introduced this year by top technology companies.


Rethinking Return-over-Investment for Machine Learning

#artificialintelligence

It is very common for ML practitioners to split their dataset into three disjoint subsets: training, validation, and test. Multiple instantiations of the model are trained on the training set and evaluated on the validation set, in search of the best combination of hyperparameter values. The combination that yields the highest validation performance is selected, and the final model is judged based on its performance on the test set, often expressed as a single-valued metric like accuracy. This established best practice deliberately turns a blind eye to the results on the validation set: once a piece of data is used to alter the model (in this case the value of the hyperparameters), it is considered tainted and loses its ability to evaluate how well the model generalizes. Dodge et al. [1] argue that discarding the validation metrics is a missed opportunity to quantify how much computation has gone into finding the right set of hyperparameters.
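The workflow described above can be sketched in a few lines. This is a toy illustration, not a method from the article: the helper names (`split`, `search`, `accuracy`), the 80/10/10 proportions, and the threshold "model" are all assumptions chosen to keep the example self-contained. Note that the search returns every validation score rather than only the winner, which is the information Dodge et al. suggest should not be thrown away.

```python
import random


def split(data, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle and split data into disjoint train, validation, and test lists."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    return items[n_test + n_val:], items[n_test:n_test + n_val], items[:n_test]


def search(candidates, train_rows, val_rows, fit, score):
    """Train one model per hyperparameter value; keep ALL validation scores."""
    scores = [score(fit(train_rows, hp), val_rows) for hp in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index], scores


def accuracy(threshold, rows):
    """Fraction of rows where the rule `x >= threshold` matches the label."""
    return sum(int(x >= threshold) == y for x, y in rows) / len(rows)


# Toy data: the label is 1 exactly when x >= 50.
data = [(x, int(x >= 50)) for x in range(100)]
train, val, test = split(data)

# The hypothetical "model" is just the threshold itself, so fitting is trivial.
best_hp, best_score, val_scores = search(
    [50, 10, 90], train, val, fit=lambda rows, hp: hp, score=accuracy
)
test_score = accuracy(best_hp, test)  # Reported once, after selection is done.
print(best_hp, best_score, val_scores, test_score)
```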


Intuitive Introduction to BERT – MachineCurve

#artificialintelligence

Transformers are taking the world of NLP by storm. After being introduced in Vaswani et al.'s "Attention Is All You Need" paper back in 2017, they – and particularly their self-attention mechanism, which removes the need for recurrent elements – have shown state-of-the-art performance on a wide variety of language tasks. Nevertheless, what's good can still be improved, and this process has been applied to Transformers as well. After the introduction of the 'vanilla' Transformer by Vaswani and colleagues, a group of people at OpenAI used just the decoder segment and built a model that works great. However, according to Devlin et al., the authors of a 2018 paper about pretrained Transformers in NLP, they do one thing wrong: the attention that they apply is unidirectional. This hampers learning unnecessarily, they argue, and they proposed a bidirectional variant instead: BERT, or Bidirectional Encoder Representations from Transformers.
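The difference between unidirectional and bidirectional attention comes down to which positions each token is allowed to look at. The sketch below is a simplified illustration with a hypothetical helper `attention_weights`; it computes softmax attention weights from a square matrix of raw scores, with or without a causal mask, and leaves out everything else a real Transformer layer does.

```python
import math


def attention_weights(scores, causal):
    """Row-wise softmax over `scores`; with `causal=True`, token i sees only j <= i."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Unidirectional (decoder-style): mask out future positions j > i.
        # Bidirectional (BERT-style): every position is visible.
        allowed = [j for j in range(n) if j <= i or not causal]
        peak = max(scores[i][j] for j in allowed)  # Subtract max for stability.
        exps = [math.exp(scores[i][j] - peak) if j in allowed else 0.0
                for j in range(n)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# With equal scores, the weights show directly which positions are visible.
zeros = [[0.0] * 3 for _ in range(3)]
causal = attention_weights(zeros, causal=True)
full = attention_weights(zeros, causal=False)
print(causal)  # First token attends only to itself.
print(full)    # Every token attends to all three positions.
```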


Annual index finds AI is 'industrializing' but needs better metrics and testing

#artificialintelligence

China has overtaken the United States in total number of AI research citations, fewer AI startups are receiving funding, and Congress is talking about AI more than ever. Those are three major trends highlighted in the 2021 AI Index, an annual report released today by Stanford University. Now in its fourth year, the AI Index attempts to document advances in artificial intelligence, as well as the technology's impact on education, startups, and government policy. The report details progress in the performance of major subdomains of AI, like deep learning, image recognition, and object detection, as well as in areas like protein folding. The AI Index is compiled by the Stanford Institute for Human-Centered Artificial Intelligence and an 11-member steering committee, with contributors from Harvard University, OECD, the Partnership on AI, and SRI International.


GPT-3 for Corporates -- Is Data Privacy an Issue?

#artificialintelligence

Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by OpenAI. GPT-3 is an extension and scaled-up version of the GPT-2 model architecture: it includes the modified initialization, pre-normalization, and reversible tokenization, and shows strong performance on many NLP tasks in the zero-shot, one-shot, and few-shot settings. In the above graph, it is clearly visible how GPT-3 dominates all the small models and achieves substantial gains on almost all the NLP tasks. It is based on the approach of pretraining on a large dataset followed by fine-tuning or priming for a specific task.


Generating Images with Sparse Representations

arXiv.org Machine Learning

The high dimensionality of images presents architecture and sampling-efficiency challenges for likelihood-based generative models. Previous approaches such as VQ-VAE use deep autoencoders to obtain compact representations, which are more practical as inputs for likelihood-based models. We present an alternative approach, inspired by common image compression methods like JPEG, and convert images to quantized discrete cosine transform (DCT) blocks, which are represented sparsely as a sequence of DCT channel, spatial location, and DCT coefficient triples. We propose a Transformer-based autoregressive architecture, which is trained to sequentially predict the conditional distribution of the next element in such sequences, and which scales effectively to high resolution images. On a range of image datasets, we demonstrate that our approach can generate high quality, diverse images, with sample metric scores competitive with state of the art methods. We additionally show that simple modifications to our method yield effective image colorization and super-resolution models.
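As a rough illustration of the representation described above, the following sketch splits a grayscale image into 8x8 blocks, applies a 2D DCT, quantizes the coefficients, and keeps only the nonzero ones as (channel, position, value) triples. It is a sketch under assumptions, not the authors' pipeline: the function names, the uniform quantization step of 16, the flat indexing of channels and block positions, and the requirement that image dimensions be multiples of 8 are all choices made here for brevity.

```python
import math

N = 8  # Block size, as in JPEG.


def dct2(block):
    """Orthonormal 2D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def to_triples(image, step=16):
    """Return (channel, position, value) triples for nonzero quantized coefficients.

    Assumes image height and width are multiples of N.
    """
    rows, cols = len(image), len(image[0])
    triples = []
    for bi in range(0, rows, N):
        for bj in range(0, cols, N):
            block = [row[bj:bj + N] for row in image[bi:bi + N]]
            coeffs = dct2(block)
            position = (bi // N) * (cols // N) + (bj // N)  # Block index.
            for u in range(N):
                for v in range(N):
                    value = round(coeffs[u][v] / step)  # Uniform quantization.
                    if value != 0:  # Sparse: zero coefficients are dropped.
                        triples.append((u * N + v, position, value))
    return triples


# An 8x16 image: a flat gray block on the left, a black block on the right.
image = [[16] * N + [0] * N for _ in range(N)]
triples = to_triples(image)
print(triples)
```

A flat block carries all its energy in the lowest-frequency (DC) coefficient and an all-zero block produces nothing, so smooth regions collapse to very few triples; that sparsity is what makes the sequence short enough for an autoregressive model.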


Artificial intelligence is going industrial, says Stanford report

#artificialintelligence

Artificial intelligence is becoming a true industry, with all the pluses and minuses that entails, according to a sweeping new report.
Why it matters: AI is now in nearly every area of business, with the pandemic pushing even more investment in drug design and medicine. But as the technology matures, challenges around ethics and diversity grow.
Driving the news: This morning, the Stanford Institute for Human-Centered Artificial Intelligence (HAI) released its annual AI Index, a top overview of the current state of the field. A majority of North American AI Ph.D.s (65%) now go into industry, up from 44% in 2010, a sign of the growing role that large companies are playing both in AI research and implementation. "The striking thing to me is that AI is moving from a research phase to much more of an industrial practice," says Erik Brynjolfsson, a senior fellow at HAI and director of the Stanford Digital Economy Lab.
By the numbers: Even with the pandemic, private AI investment grew by 9.3% in 2020, a bigger increase than in 2019. For the third year in a row, however, the number of newly funded companies decreased, a sign that "we're moving from pure research and exploratory small startups to industrial-stage companies," says Brynjolfsson. While academia remains the single-biggest source worldwide for peer-reviewed AI papers, corporate-affiliated research now represents nearly a fifth of all papers in the U.S., making it the second-biggest source. The drug and medical industries took in by far the biggest share of overall AI private investment in 2020, absorbing more than $13.8 billion, 4.5 times greater than in 2019 and nearly three times more than the next category of autonomous vehicles.
The catch: While the field has experienced sudden busts in the past (the "AI winters" that vaporized funding), there's little indication such a collapse is on the horizon. But industrialization comes with its own growing pains. Cutting-edge AI increasingly requires huge amounts of computing and data, which puts more power in the hands of fewer big players. Conversely, the commoditization of AI technologies like facial recognition means more players in the field, both domestically and internationally, which makes it more difficult to regulate their use. As AI grows, the ethical challenges embedded in the field, and the fact that 45% of new AI Ph.D.s are white, compared to just about 2% who are Black, will mean "there's a new frontier of potential privacy violations and other abuses," says Brynjolfsson. The AI Index found that while the field of AI ethics is growing, the interest level of big companies is still "disappointingly small," says Brynjolfsson.
Details: Those growing pains are at play in one of the most exciting applications in AI today: massive text-generating models. Systems like OpenAI's GPT-3, released last year, swallow hundreds of billions of words along the way to producing original text that can be eerily human-like in its execution. Text-generating AI models could help polish human-written resumes for job search, but could also potentially be used to spam corporate competitors with realistic computer-generated applicants, not to mention warp our shared reality. "What we increasingly have with these models is a double-edged sword," says Kristin Tynski, a co-founder and senior VP at Fractl, a data-driven marketing company.
What to watch: The growing geopolitical AI competition between the U.S. and China. The National Security Commission on Artificial Intelligence warned in a major report this week that "China possesses the might, talent, and ambition to surpass the United States as the world’s leader in AI in the next decade if current trends do not change." "We don’t have to go to war with China," former Google CEO Eric Schmidt, who chaired the committee that authored the report, told my Axios colleague Ina Fried. "We do need to be competitive."
Yes, but: While researchers in China publish the most AI papers, the U.S. still leads on quality, according to the Stanford survey. And while a majority of AI Ph.D.s in the U.S. are from abroad, more than 80% remain in the country when they take jobs, a sign of the lasting attraction of the U.S. tech sector.
The bottom line: AI still has a long way to go, but the challenges the field faces are shifting from what it can do to what it should do.


Reinforcement learning and reasoning

#artificialintelligence

Reinforcement learning has seen a lot of progress in recent years: from DeepMind's success in teaching machines to play Atari games, to AlphaGo beating world champions at Go, to OpenAI's recent progress on Dota 2, a multiplayer game in which two teams of players compete against each other. The common thread is an artificial agent operating in a virtual world, where the prize is clear (e.g. On the other hand, people are experimenting with AI agents operating in the real world. Each clip from Boston Dynamics gets a lot of press, showing robots performing amazing stunts, as you can see yourself here or here.