 Large Language Model


The NLP Cypher

#artificialintelligence

Around five percent of papers from the conference were on graphs, so there is lots to discuss. A new paper (with authors from every major big tech company) was recently published showing how one can attack language models like GPT-2 and extract information verbatim, such as personally identifiable information, just by querying the model. The extracted information came from the model's training data, which was based on scraped internet data. This is a big problem, especially when you train a language model on a private custom dataset. It looks like Booking.com wants a new recommendation engine, and they are offering up their dataset of over 1 million anonymized hotel reservations to get you in the game.


Leveraging GPT-2 for Classifying Spam Reviews with Limited Labeled Data via Adversarial Training

arXiv.org Artificial Intelligence

Online reviews are a vital source of information when purchasing a service or a product. Opinion spammers manipulate these reviews, deliberately altering the overall perception of the service. Though there exists a corpus of online reviews, only a few have been labeled as spam or non-spam, making it difficult to train spam detection models. We propose an adversarial training mechanism leveraging the capabilities of Generative Pre-Training 2 (GPT-2) for classifying opinion spam with limited labeled data and a large set of unlabeled data. Experiments on TripAdvisor and YelpZip datasets show that the proposed model outperforms state-of-the-art techniques by at least 7% in terms of accuracy when labeled data is limited. The proposed model can also generate synthetic spam/non-spam reviews with reasonable perplexity, thereby providing additional labeled data during training.
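The abstract above judges generated reviews by their perplexity. As a rough illustration of that metric only (not the paper's evaluation code), perplexity is the exponential of the average negative log-likelihood per token. The function name and the log-probability values below are hypothetical, chosen for the example.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of a sequence from per-token natural-log probabilities.

    Lower values mean the language model found the text more predictable.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_neg_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_likelihood)


# Hypothetical per-token probabilities, not taken from any real model.
fluent_review = [math.log(0.5)] * 4    # each token given probability 0.5
awkward_review = [math.log(0.05)] * 4  # each token given probability 0.05

print(perplexity(fluent_review))   # roughly 2
print(perplexity(awkward_review))  # roughly 20
```

In practice the log-probabilities would come from a language model scoring the generated review token by token; the sketch only shows how those scores combine into a single number.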


DeepMind's latest AI can master games without being told their rules

Engadget

In 2016, Alphabet's DeepMind came out with AlphaGo, an AI which consistently beat the best human Go players. One year later, the subsidiary went on to refine its work, creating AlphaGo Zero. Where its predecessor learned to play Go by observing amateur and professional matches, AlphaGo Zero mastered the ancient game by simply playing against itself. DeepMind then created AlphaZero, which could play Go, chess and shogi with a single algorithm. What tied all those AIs together is that they knew the rules of the games they had to master going into their training.



Toward Transformer-Based Object Detection

arXiv.org Artificial Intelligence

Transformers have become the dominant model in natural language processing, owing to their ability to pretrain on massive amounts of data, then transfer to smaller, more specific tasks via fine-tuning. The Vision Transformer was the first major attempt to apply a pure transformer model directly to images as input, demonstrating that as compared to convolutional networks, transformer-based architectures can achieve competitive results on benchmark classification tasks. However, the computational complexity of the attention operator means that we are limited to low-resolution inputs. For more complex tasks such as detection or segmentation, maintaining a high input resolution is crucial to ensure that models can properly identify and reflect fine details in their output. This naturally raises the question of whether or not transformer-based architectures such as the Vision Transformer are capable of performing tasks other than classification. In this paper, we determine that Vision Transformers can be used as a backbone by a common detection task head to produce competitive COCO results. The model that we propose, ViT-FRCNN, demonstrates several known properties associated with transformers, including large pretraining capacity and fast fine-tuning performance. We also investigate improvements over a standard detection backbone, including superior performance on out-of-domain images, better performance on large objects, and a lessened reliance on non-maximum suppression. We view ViT-FRCNN as an important stepping stone toward a pure-transformer solution of complex vision tasks such as object detection.
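To make the resolution limit mentioned in the abstract concrete, the sketch below counts how many patch tokens an image produces and how many entries a single self-attention matrix then needs. It assumes a patch size of 16 pixels and ignores any extra class token; both are illustrative assumptions, and the function name is hypothetical rather than taken from the paper.

```python
def attention_cost(height, width, patch_size=16):
    """Return (num_tokens, attention_entries) for one attention head in one layer.

    Assumes the image is split into non-overlapping square patches and that
    every patch token attends to every other patch token.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    num_tokens = (height // patch_size) * (width // patch_size)
    return num_tokens, num_tokens ** 2


for side in (224, 448, 896):
    tokens, entries = attention_cost(side, side)
    print(f"{side}x{side}: {tokens} tokens, {entries} attention entries")
# 224x224: 196 tokens, 38416 attention entries
# 448x448: 784 tokens, 614656 attention entries
# 896x896: 3136 tokens, 9834496 attention entries
```

Doubling the image side quadruples the token count and multiplies the attention matrix size by sixteen, which is why high-resolution detection inputs are costly for a plain transformer backbone.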


A powerful AI generated some predictions for the future and they're quite outrageous

#artificialintelligence

A powerful AI algorithm has some, well, unusual predictions for what lies in store down the road. It's been a weird year, what with monoliths, terrifying animals, and of course a global pandemic dominating the news cycle. Inspired by all that chaos, research scientist and author Janelle Shane asked GPT-3, a powerful text-generating algorithm, to guess the future. With killer orchids, monster toads, and deadly puffballs, the algorithm seems to have missed the mark. But then again, who could have predicted half of the nonsense we've endured lately?


Could GPT-3 Change The Way Future AI Models Are Developed and Deployed?

#artificialintelligence

Much has been said about GPT-3 already. Traditionally, we start with data for a problem and develop the model based on the data. The model is specific to the problem. If you want to train a model to predict traffic patterns in New York, you build a model of New York traffic patterns. If you want to model air pollution in New York, that's a different model. With GPT-3, you start with the model instead of the data.


DeepMind's latest AI breakthrough could turbocharge drug discovery

#artificialintelligence

While impressive, the technology wasn't yet capable of replacing the existing expensive and time-consuming experimental methods for determining what these proteins look like. However, its latest software comes close. In November, AlphaFold again outperformed all the other competing groups at CASP. The technology solved protein structures other labs had been working on for years. Scientists think the technology could have immense implications for the way proteins are studied.


DeepMind AI Predicts Protein Structure

#artificialintelligence

If you are even remotely interested in science, you have probably already heard about DeepMind's latest leap. Their AI system AlphaFold 2 has cracked the prediction of proteins' 3D structure. There are plenty of great articles about it. Since I have written about machine learning/AI in an earlier series of posts, I decided to write a brief post about this development as well. For more details, do check the Nature/New Scientist/DeepMind articles linked above.


Everything Product People Need to Know About Transformers, GPT-3, and HuggingFace (

#artificialintelligence

This is Part 1 in the 3-part series on Transformers for Product People. Natural language processing (NLP) has passed an industry-changing inflection point. More than 20 long-standing NLP challenges have been solved with near-human results in the past year, all by a single model: the attention-based transformer. This model was developed and published in December 2017, and has since kicked off an arms race between Google and OpenAI, with both labs shattering state-of-the-art results with each new model release. With models like GPT-3 making a splash in the media, decision makers are wondering just how big this development is.