Large Language Model


Robust fine-tuning of zero-shot models

#artificialintelligence

Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve accuracy on a given target distribution, they often reduce robustness to distribution shifts. We address this tension by introducing a simple and effective method for improving robustness while fine-tuning: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution. On ImageNet and five derived distribution shifts, WiSE-FT improves accuracy under distribution shift by 4 to 6 percentage points (pp) over prior work while increasing ImageNet accuracy by 1.6 pp. WiSE-FT achieves similarly large robustness gains (2 to 23 pp) on a diverse set of six further distribution shifts, and accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on seven commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.
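The weight ensembling described in this abstract can be illustrated with a minimal sketch. It assumes the ensemble is a linear interpolation between the two sets of weights, controlled by a mixing coefficient. The name `alpha`, the function `interpolate_weights`, and the toy dictionaries standing in for model parameters are all hypothetical and not taken from the abstract.

```python
from typing import Dict, List

# Toy stand-in for a model's parameters: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, parameter by parameter.

    alpha = 0 keeps the zero-shot weights; alpha = 1 keeps the fine-tuned weights.
    The linear form and the name `alpha` are assumptions made for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must share the same parameter names")

    merged: Weights = {}
    for name, zs_values in zero_shot.items():
        ft_values = fine_tuned[name]
        if len(zs_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * zs + alpha * ft
            for zs, ft in zip(zs_values, ft_values)
        ]
    return merged


# Hypothetical two-parameter "models" used only to exercise the function.
zero_shot = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}

merged = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
print(merged)
```

Because the result is a single set of weights, the merged model is evaluated like any one model. This is consistent with the abstract's statement that the method adds no computational cost at inference.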


The Best Machine Learning Company of 2021

#artificialintelligence

There were many developments, with plenty of twists and turns. The sheer number and quality of the papers and results released in the ML space were amazing. There were innovations in GPUs, newer models, a lot of research across different fields, and some ground-breaking discoveries. The machine learning industry continued to grow by leaps and bounds. Here are some interesting stats.


Experts Say That Soon, Almost the Entire Internet Could Be Generated by AI

#artificialintelligence

The Internet of the future could be written by bots, but will that make it better or worse? Experts at the Copenhagen Institute for Future Studies (CIFS) are raising questions about AI-generated content, and how it could come to dominate the metaverse and other digital locations. CIFS expert Timothy Shoup estimates that 99 percent to 99.9 percent of the internet's content will be AI-generated by 2025 to 2030, especially if models like OpenAI's GPT-3 achieve wider adoption. "The internet would be completely unrecognizable," Shoup told colleague Sofie Hvitved. As its capabilities advance, the idea is that AI could start to generate entire online worlds, along with all the stuff that inhabits them -- not to mention all the online material that's currently mostly made by humans.


La veille de la cybersécurité

#artificialintelligence

DeepMind has opened new paths for drug discovery and design by solving a 50-year-old problem in biology. By the end of 2020, DeepMind, the UK-based artificial-intelligence lab, had already produced many impressive achievements in AI. Still, when the group's program for predicting protein folding was released in November of that year, biologists were shocked by how well it worked. Nearly everything your body does, it does with proteins. Understanding what individual proteins do is therefore crucial for most drug development and for understanding many diseases. And what a protein does is determined by its three-dimensional shape.


Meet ZEROGEN: An Extreme Method for Dataset Generation via PLMs for Zero-Shot Learning

#artificialintelligence

The impressive generative capacity of large-scale pretrained language models (PLMs) has inspired machine learning researchers to explore methods for generating model training examples via PLMs and data augmentation procedures, i.e. dataset generation. A novel contribution in this research direction is proposed in the new paper ZeroGen: Efficient Zero-shot Learning via Dataset Generation, from researchers at the University of Hong Kong, Shanghai AI Lab, Huawei Noah's Ark Lab and the University of Washington. The team describes their proposed ZEROGEN as an "extreme instance" of dataset generation via PLMs for zero-shot learning. ZEROGEN is a framework for prompt-based zero-shot learning (PROMPTING). Unlike existing approaches that rely on gigantic PLMs during inference, ZEROGEN introduces a more flexible and efficient approach for conducting zero-shot learning with PLMs.


OpenAI's GPT-3 Inspired Model can Solve Problems from the Math Olympiads

#artificialintelligence

Originally published on Towards AI. Formal mathematics has long been considered one of the toughest challenges for deep learning.


AI for protein folding

#artificialintelligence

The software, which uses an AI technique called deep learning, can predict the shape of proteins to the nearest atom, the first time a computer has matched the slow but accurate techniques used in the lab. Scientific teams around the world have started using it for research on cancer, antibiotic resistance, and COVID-19. DeepMind has also set up a public database that it's filling with protein structures as AlphaFold2 predicts them. It currently has around 800,000 entries, and DeepMind says it will add more than 100 million (nearly every protein known to science) in the next year. DeepMind has spun off this work into a company called Isomorphic Labs, which it says will collaborate with existing biotech and pharma companies.


DeepMind Trains Agents to Control Computers as Humans Do to Solve Everyday Tasks

#artificialintelligence

While the design and development of contemporary AI systems has been largely results-oriented, there are also scenarios where it could be advantageous if models learned to do things "as a human would" to help with everyday tasks. That's the premise of the new DeepMind paper A Data-driven Approach for Learning To Control Computers, which proposes agents that can operate our digital devices via keyboard and mouse with goals specified in natural language. The study builds on recent developments in natural language processing, code production, and multimodal interactive behaviour in 3D simulated worlds that have enabled the generation of models with remarkable domain knowledge and desirable human-agent interaction capabilities. The proposed agents are trained on keyboard and mouse computer control for specific tasks with pixel and Document Object Model (DOM) observations, and achieve state-of-the-art and human-level mean performance across all tasks on the MiniWob benchmark. MiniWob is a challenging suite of web-browser-based tasks for computer control, ranging from simple button clicking to complex form filling.



When DeepMind's 'AlphaCode' Competed Against Human Programmers

#artificialintelligence

Among at least a few programmers, this has already provoked some concern. Recently a programming student on Hacker News complained of "AlphaCode Anxiety" (as well as worries about GitHub's Copilot). "Now it feels like I'm running against a clock until the career I am working very hard for will automate itself away," the student wrote. When a blog post at CodeForces declared "The future has arrived," one worried programmer even argued that "there is a limit to what humans should automate." The programmer added pointedly that the DeepMind developers who built AlphaCode "think that they are irreplaceable, but they would be the first ones to get replaced." But the fact that AlphaCode finished in the bottom half was also greeted with a very human disparagement. "AI is such a noob," the first commenter responded.