 Large Language Model


BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling

arXiv.org Artificial Intelligence

The pre-training of large language models usually requires massive amounts of resources, both in terms of computation and data. Frequently used web sources such as Common Crawl might contain enough noise to make this pre-training sub-optimal. In this work, we experiment with different sampling methods from the Spanish version of mC4, and present a novel data-centric technique which we name perplexity sampling that enables the pre-training of language models in roughly half the amount of steps and using one fifth of the data. The resulting models are comparable to the current state-of-the-art, and even achieve better results for certain tasks. Our work is proof of the versatility of Transformers, and paves the way for small teams to train their models on a limited budget. Our models are available at https://huggingface.co/bertin-project.
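The abstract does not spell out how perplexity sampling weights documents, so the following is only a minimal sketch under stated assumptions: each document already has a perplexity score from some reference language model, and documents are kept with a probability that peaks at a chosen perplexity and falls off on both sides. The Gaussian shape and the names `gaussian_weight`, `perplexity_sample`, `center`, and `width` are illustrative choices, not taken from the excerpt.

```python
import math
import random


def gaussian_weight(perplexity, center, width):
    """Keep-probability in [0, 1] that peaks at `center` (assumed shape)."""
    return math.exp(-((perplexity - center) ** 2) / (2 * width ** 2))


def perplexity_sample(docs, perplexities, center, width, seed=0):
    """Keep each document with probability gaussian_weight(its perplexity).

    `perplexities` are assumed to be precomputed by a reference language
    model; this function only performs the subsampling step.
    """
    rng = random.Random(seed)
    kept = []
    for doc, ppl in zip(docs, perplexities):
        if rng.random() < gaussian_weight(ppl, center, width):
            kept.append(doc)
    return kept
```

For example, `perplexity_sample(docs, ppls, center=100, width=10)` would favor documents whose perplexity is close to 100 and drop most documents with very low or very high scores, which is one way to thin out noisy web text before pre-training.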


A Personalized Zero-Shot ECG Arrhythmia Monitoring System: From Sparse Representation Based Domain Adaption to Energy Efficient Abnormal Beat Detection for Practical ECG Surveillance

arXiv.org Artificial Intelligence

This paper proposes a low-cost and highly accurate ECG-monitoring system intended for personalized early arrhythmia detection for wearable mobile sensors. Earlier supervised approaches for personalized ECG monitoring require both abnormal and normal heartbeats for the training of the dedicated classifier. However, in a real-world scenario where the personalized algorithm is embedded in a wearable device, such training data is not available for healthy people with no cardiac disorder history. In this study, (i) we propose a null space analysis on the healthy signal space obtained via sparse dictionary learning, and investigate how a simple null space projection or alternatively regularized least squares-based classification methods can reduce the computational complexity, without sacrificing the detection accuracy, when compared to sparse representation-based classification. (ii) Then we introduce a sparse representation-based domain adaptation technique in order to project other existing users' abnormal and normal signals onto the new user's signal space, enabling us to train the dedicated classifier without having any abnormal heartbeat of the new user. Therefore, zero-shot learning can be achieved without the need for synthetic abnormal heartbeat generation. An extensive set of experiments performed on the benchmark MIT-BIH ECG dataset shows that when this domain adaptation-based training data generator is used with a simple 1-D CNN classifier, the method outperforms the prior work by a significant margin. (iii) Then, by combining (i) and (ii), we propose an ensemble classifier that further improves the performance. This approach for zero-shot arrhythmia detection achieves an average accuracy level of 98.2% and an F1-Score of 92.8%. Finally, a personalized energy-efficient ECG monitoring scheme is proposed using the above-mentioned innovations.
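The null space projection idea in (i) can be illustrated with a small sketch. It assumes the healthy signal space is spanned by a handful of atoms (here plain lists of floats standing in for a learned dictionary) and flags a beat as abnormal when too much of its energy lies outside that span. The helper names, the Gram-Schmidt step, and the fixed threshold are hypothetical simplifications, not the paper's actual pipeline.

```python
import math


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            dot = sum(a * b for a, b in zip(w, q))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:  # skip atoms that add no new direction
            basis.append([a / norm for a in w])
    return basis


def null_space_energy(beat, basis):
    """Norm of the component of `beat` orthogonal to the healthy subspace."""
    residual = list(beat)
    for q in basis:
        dot = sum(a * b for a, b in zip(residual, q))
        residual = [a - dot * b for a, b in zip(residual, q)]
    return math.sqrt(sum(a * a for a in residual))


def is_abnormal(beat, basis, threshold):
    """Flag a beat whose out-of-subspace energy exceeds `threshold` (assumed)."""
    return null_space_energy(beat, basis) > threshold
```

The appeal of this kind of test is that classification reduces to a few dot products per beat, which is the sort of cost saving over solving a sparse representation problem that the abstract alludes to.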


An open-source model that dwarfs GPT-3 aims to free AI from Big Tech

#artificialintelligence

A language model bigger than GPT-3 has arrived with a bold ambition: freeing AI from Big Tech's clutches. Named BLOOM, the large language model (LLM) promises a similar performance to Silicon Valley's leading systems -- but with a radically different approach to access. While tech giants tend to keep their vaunted LLMs hidden from the public, BLOOM is available to anyone for free. These features could democratize access to technology that's set to make a deep impact on society. Powerful AI models can be trained and released in an open way.


Inner Monologue: Embodied Reasoning through Planning with Language Models - Technology Org

#artificialintelligence

Large language models (LLMs) have rich internalized knowledge about the world and are able to carry out some degree of deduction and respond to questions requiring reasoning and inference. An example of ViLD object detection segmentation mask and bounding box predictions. The Inner Monologue system is created to chain together these components in a shared language prompt. As a result, the system can accomplish complex, long-horizon, and unseen tasks in simulation as well as on real-world robotic platforms. Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots.


AI21 Labs raises $64M to help it compete against OpenAI

#artificialintelligence

AI21 Labs has raised $64 million in a funding round to help it compete against OpenAI and other NLP leaders. Competition in NLP (Natural Language Processing) is heating up. OpenAI is currently seen as the industry leader with its GPT-3 model but rivals are gaining traction. Investors see AI21 Labs as one of the most promising contenders. "We completed this round during a period of market uncertainty, which highlights the confidence our investors have in AI21's vision to change the way people consume and produce information," said Ori Goshen, Co-Founder and Co-CEO of AI21 Labs.


Large language models might reason--if you know how to speak to them

#artificialintelligence

This article is part of our coverage of the latest in AI research. Large language models (LLMs), neural networks trained on huge corpora of text (or other types of data), have become a hot topic of discussion in the artificial intelligence community, especially since a Google engineer claimed that one of the company's LLMs was sentient. On the one hand, large language models can perform wonderful feats, generating long sequences of text that are mostly coherent and create the impression that they have indeed mastered human language and its underlying skills. On the other hand, numerous experiments show that LLMs are just parroting their training data: they show impressive results only because they have been exposed to huge amounts of text, and they break as soon as they are presented with tasks and problems that require reasoning, common sense, and implicitly learned skills. But a new study by researchers at the University of Tokyo shows that if you provide LLMs with well-crafted prompts, you can steer them toward answering questions that require reasoning and step-by-step thinking.
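As a rough illustration of what a prompt that invites step-by-step thinking might look like, here is a minimal sketch. The excerpt does not quote the prompt used in the study, so the trigger phrase, the `Q:`/`A:` layout, and the function name `build_reasoning_prompt` are all assumptions made for this example.

```python
def build_reasoning_prompt(question, trigger="Let's think step by step."):
    """Wrap a question so the model is nudged to reason before answering.

    The default trigger phrase is an assumed example, not a quote from
    the article.
    """
    return f"Q: {question.strip()}\nA: {trigger}"


prompt = build_reasoning_prompt(
    "A juggler has 16 balls and half of them are golf balls. "
    "How many golf balls are there?"
)
```

The resulting string would be sent to a language model as-is; the model's continuation is then expected to contain intermediate reasoning before the final answer.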


La veille de la cybersécurité

#artificialintelligence

Unlike other, more famous large language models such as OpenAI's GPT-3 and Google's LaMDA, BLOOM (which stands for BigScience Large Open-science Open-access Multilingual Language Model) is designed to be as transparent as possible, with researchers sharing details about the data it was trained on, the challenges in its development, and the way they evaluated its performance. OpenAI and Google have not shared their code or made their models available to the public, and external researchers have very little understanding of how these models are trained. BLOOM was created over the last year by over 1,000 volunteer researchers in a project called BigScience, which was coordinated by AI startup Hugging Face using funding from the French government. It officially launched on July 12. The researchers hope developing an open-access LLM that performs as well as other leading models will lead to long-lasting changes in the culture of AI development and help democratize access to cutting-edge AI technology for researchers around the world.


La veille de la cybersécurité

#artificialintelligence

Inspired by research into how infants learn, computer scientists have created a program that can learn simple physical rules about the behaviour of objects -- and express surprise when they seem to violate those rules. The results were published on 11 July in Nature Human Behaviour. Developmental psychologists test how babies follow the motion of objects by tracking their gaze. When shown a video of, for example, a ball that suddenly disappears, the children express surprise, which researchers measure by how long they stare in a particular direction. Luis Piloto, a computer scientist at Google-owned company DeepMind in London, and his collaborators wanted to develop a similar test for artificial intelligence (AI).


First 50 ODSC West 2022 Speakers Announced

#artificialintelligence

Having just wrapped up a successful ODSC Europe, we're now turning our attention to ODSC West 2022 and we couldn't be more excited to announce our first group of speakers. These innovators and experts have helped shape the fields of data science and AI into what we have today, and will continue to do so in the years to come. You can find a full list of our currently confirmed ODSC West speakers here, and a sneak peek of just a few of them (and their session topics) below. In recent years, the fields of NLP, robotics, and computer vision, among others, have seen significant advancement thanks to self-supervised and unsupervised learning techniques. This session will provide hands-on examples of how you can apply large language models and transformers to zero-shot and few-shot learning in NLP applications.


Large Language Models Are Being Open-Sourced

#artificialintelligence

Large Language Models (LLMs) have received much attention of late, with Co:here, OpenAI and AI21 Labs being the big commercial offerings. There are also solutions like Botpress' OpenBook that leverage large language models to bootstrap a chatbot implementation. And recently I have written about and shared an architecture for bootstrapping a chatbot with LLMs. But what are the advantages of LLMs? The unique differentiators of Large Language Models (LLMs) are: Language modelling usually refers to the type of supervision objective used during training.