Goto

Collaborating Authors

 Large Language Model


Detecting Generated Scientific Papers using an Ensemble of Transformer Models

arXiv.org Artificial Intelligence

The paper describes neural models developed for the DAGPap22 shared task hosted at the Third Workshop on Scholarly Document Processing. This shared task targets the automatic detection of generated scientific papers. Our work focuses on comparing different transformer-based models as well as using additional datasets and techniques to deal with imbalanced classes. As a final submission, we utilized an ensemble of SciBERT, RoBERTa, and DeBERTa fine-tuned using random oversampling technique. Our model achieved 99.24% in terms of F1-score. The official evaluation results have put our system at the third place.


Twitter pranksters derail GPT-3 bot with newly discovered "prompt injection" hack

#artificialintelligence

On Thursday, a few Twitter users discovered how to hijack an automated tweet bot, dedicated to remote jobs, running on the GPT-3 language model by OpenAI. Using a newly discovered technique called a "prompt injection attack," they redirected the bot to repeat embarrassing and ridiculous phrases. The bot is run by Remoteli.io, It would normally respond to tweets directed to it with generic statements about the positives of remote work. After the exploit went viral and hundreds of people tried the exploit for themselves, the bot shut down late yesterday. This recent hack came just four days after data researcher Riley Goodside discovered the ability to prompt GPT-3 with "malicious inputs" that order the model to ignore its previous directions and do something else instead.


The Download: discovering proteins, and Pakistan's climate crisis

MIT Technology Review

What's happened?: A new AI tool could help researchers discover previously unknown proteins and design entirely new ones. When harnessed, it could help unlock the development of more efficient vaccines, speed up research into cures for cancer, or lead to completely new materials. How it works: ProteinMPNN, developed by a group of researchers from the University of Washington, offers scientists a tool that will complement DeepMind's AlphaFold tool's ability to predict the shapes of all proteins known to science. ProteinMPNN will help researchers with the inverse problem. If they already have an exact protein structure in mind, it will help them find the amino acid sequence that folds into that shape.


GhostWriter Beta and AI Mode

#artificialintelligence

Showing the correct code recommendation is only half of the equation. Very often, users keep typing even after the code recommendation is shown. If you type something that matches the suggestion, we need to adjust the suggestion shown on screen to hide the part the user has typed--and resurface it when you hit backspace. What's more, when you type, the editor would autocomplete a beginning bracket whose corresponding ending bracket; we want to make sure the latter appears in the right spot.


Is Sanskrit the best language to program computers and AI?

#artificialintelligence

Ramachandran quotes a variety of sources--Indian government officials, a motley bunch of academics and Indian-American author Rajiv Malhotra, who goes on to claim that Sanskrit should be credited with the last 20 years of development in Natural Language Processing (NLP), the technology behind prominent LLMs like GPT-3, DALL-E 2, etc. The claims are wide-ranging: Sanskrit is the most'scientific' language, and so the "best to programme computers, or code AI/ML"; it is the "language for future super computers", etc. One common source that everyone cites, and which Ramachandran explores in detail, is "Nasa". Yes, the same Nasa that sends rockets into space. The reference actually has a published source, a 1985 paper'Knowledge Representation in Sanskrit and Artificial Intelligence' by Nasa researcher Rick Briggs (bit.ly/3qrIjMr).


Out of One, Many: Using Language Models to Simulate Human Samples

#artificialintelligence

We propose and explore the possibility that language models can be studied as effective proxies for specific human sub-populations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the "algorithmic bias" within one such tool -- the GPT-3 language model -- is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property "algorithmic fidelity" and explore its extent in GPT-3. We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and socio-cultural context that characterize human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.


LATTE: LAnguage Trajectory TransformEr

arXiv.org Artificial Intelligence

Natural language is one of the most intuitive ways to express human intent. However, translating instructions and commands towards robotic motion generation and deployment in the real world is far from being an easy task. The challenge of combining a robot's inherent low-level geometric and kinodynamic constraints with a human's high-level semantic instructions traditionally is solved using task-specific solutions with little generalizability between hardware platforms, often with the use of static sets of target actions and commands. This work instead proposes a flexible language-based framework that allows a user to modify generic robotic trajectories. Our method leverages pre-trained language models (BERT and CLIP) to encode the user's intent and target objects directly from a free-form text input and scene images, fuses geometrical features generated by a transformer encoder network, and finally outputs trajectories using a transformer decoder, without the need of priors related to the task or robot information. We significantly extend our own previous work presented in Bucker et al. by expanding the trajectory parametrization space to 3D and velocity as opposed to just XY movements. In addition, we now train the model to use actual images of the objects in the scene for context (as opposed to textual descriptions), and we evaluate the system in a diverse set of scenarios beyond manipulation, such as aerial and legged robots. Our simulated and real-life experiments demonstrate that our transformer model can successfully follow human intent, modifying the shape and speed of trajectories within multiple environments. Codebase available at: https://github.com/arthurfenderbucker/LaTTe-Language-Trajectory-TransformEr.git


APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations

arXiv.org Artificial Intelligence

Using style-transfer models to reduce offensiveness of social media comments can help foster a more inclusive environment. However, there are no sizable datasets that contain offensive texts and their inoffensive counterparts, and fine-tuning pretrained models with limited labeled data can lead to the loss of original meaning in the style-transferred text. To address this issue, we provide two major contributions. First, we release the first publicly-available, parallel corpus of offensive Reddit comments and their style-transferred counterparts annotated by expert sociolinguists. Then, we introduce the first discourse-aware style-transfer models that can effectively reduce offensiveness in Reddit text while preserving the meaning of the original text. These models are the first to examine inferential links between the comment and the text it is replying to when transferring the style of offensive Reddit text. We propose two different methods of integrating discourse relations with pretrained transformer models and evaluate them on our dataset of offensive comments from Reddit and their inoffensive counterparts. Improvements over the baseline with respect to both automatic metrics and human evaluation indicate that our discourse-aware models are better at preserving meaning in style-transferred text when compared to the state-of-the-art discourse-agnostic models.


Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval

arXiv.org Artificial Intelligence

State-of-the-art neural (re)rankers are notoriously data-hungry which -- given the lack of large-scale training data in languages other than English -- makes them rarely used in multilingual and cross-lingual retrieval settings. Current approaches therefore commonly transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders: they fine-tune all parameters of pretrained massively multilingual Transformers (MMTs, e.g., multilingual BERT) on English relevance judgments, and then deploy them in the target language(s). In this work, we show that two parameter-efficient approaches to cross-lingual transfer, namely Sparse Fine-Tuning Masks (SFTMs) and Adapters, allow for a more lightweight and more effective zero-shot transfer to multilingual and cross-lingual retrieval tasks. We first train language adapters (or SFTMs) via Masked Language Modelling and then train retrieval (i.e., reranking) adapters (SFTMs) on top, while keeping all other parameters fixed. At inference, this modular design allows us to compose the ranker by applying the (re)ranking adapter (or SFTM) trained with source language data together with the language adapter (or SFTM) of a target language. We carry out a large scale evaluation on the CLEF-2003 and HC4 benchmarks and additionally, as another contribution, extend the former with queries in three new languages: Kyrgyz, Uyghur and Turkish. The proposed parameter-efficient methods outperform standard zero-shot transfer with full MMT fine-tuning, while being more modular and reducing training times. The gains are particularly pronounced for low-resource languages, where our approaches also substantially outperform the competitive machine translation-based rankers.


La veille de la cybersécurité

#artificialintelligence

For AI to reach its potential and for society to benefit from it, AI needs to be decentralised, i.e., different stakeholders in the AI community should have equal access to all resources like datasets, compute power and the source codes for different AI models. But that is not the case today. Today, most of the breakthroughs in the field of AI come from big organisations. AI text-to-image generators such as DALL-E2 and Imagen to Large Language Models (LLM) such as GPT-3, have all come from large organisations. However, none of these AI models are open-sourced.