AITopics

2209.03118

Country:

Europe > France (0.24)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > Europe Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial Intelligence Sep-7-2022

AILAB-Udine@SMM4H 22: Limits of Transformers and BERT Ensembles

Portelli, Beatrice, Scaboro, Simone, Chersoni, Emmanuele, Santus, Enrico, Serra, Giuseppe

This paper describes the models developed by the AILAB-Udine team for the SMM4H 22 Shared Task. We explored the limits of Transformer-based models on text classification, entity extraction and entity normalization, tackling Tasks 1, 2, 5, 6 and 10. The main takeaways from participating in different tasks are the overwhelmingly positive effects of combining different architectures when using ensemble learning, and the great potential of generative models for term normalization.

bert mul, classification, task 5, (15 more...)

2209.03452

Country:

Europe > Italy (0.05)
Asia > China > Hong Kong (0.05)
North America > United States > New Jersey (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)

#artificialintelligence Sep-6-2022, 22:35:15 GMT

Forget chess, DeepMind's training its new AI to play football

Researchers from DeepMind, the UK's juggernaut AI lab, have forsaken the noble games of chess and Go for a more plebeian delight: football. The Google sister company yesterday published a research paper and accompanying blog post detailing its new neural probabilistic motor primitives (NPMP) -- a method by which artificial intelligence agents can learn to operate physical bodies. An NPMP is a general-purpose motor control module that translates short-horizon motor intentions to low-level control signals, and it's trained offline or via RL by imitating motion capture (MoCap) data, recorded with trackers on humans or animals performing motions of interest. Up front: Essentially, the DeepMind team created an AI system that can learn how to do things inside of a physics simulator by watching videos of other agents performing those tasks. And, of course, if you've got a giant physics engine and an endless supply of curious robots, the only rational thing to do is to teach it how to dribble and shoot: We optimized teams of agents to play simulated football via reinforcement learning, constraining the solution space to that of plausible movements learned using human motion capture data. Background: In order to train AI to operate and control robots in the world, researchers have to prepare the machines for reality.

agent, deepmind, football, (9 more...)

Industry:

Leisure & Entertainment > Sports (1.00)
Leisure & Entertainment > Games > Chess (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

#artificialintelligence Sep-6-2022, 17:17:43 GMT

LLMs have not learned our language -- we're trying to learn theirs

Large language models (LLMs) are currently a red-hot area of research in the artificial intelligence (AI) community. Scientific progress in LLMs in the past couple of years has been nothing short of impressive, and at the same time, there is growing interest and momentum to create platforms and products powered by LLMs. However, in tandem with advances in the field, the shortcomings of large language models have also become evident.

language model, llm, university, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > North Carolina (0.05)
North America > United States > Arizona (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial Intelligence Sep-6-2022

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models

Du, Jiangsu, Liu, Ziming, Fang, Jiarui, Li, Shenggui, Li, Yongbin, Lu, Yutong, You, Yang

Large transformer models display promising performance on a wide range of natural language processing (NLP) tasks. Although the AI community has expanded the model scale to the trillion-parameter level, the practical deployment of 10-100 billion parameter models remains uncertain due to latency, throughput, and memory constraints. In this paper, we propose EnergonAI to address the challenges of efficiently deploying 10-100 billion parameter transformer models on single- or multi-GPU systems. EnergonAI adopts a hierarchy-controller system architecture to coordinate multiple devices and efficiently support different parallel patterns. It delegates the execution of sub-models to multiple workers in a single-controller style and applies tensor parallelism and pipeline parallelism among the workers in a multi-controller style. Building on this architecture, we propose three techniques: non-blocking pipeline parallelism, distributed redundant computation elimination, and peer memory pooling. EnergonAI lets users write complex parallel code the same way as serial code. Compared with FasterTransformer, EnergonAI shows superior latency and throughput. In our experiments, EnergonAI can achieve a 37% latency reduction in tensor parallelism and a 10% scalability improvement in pipeline parallelism, and it increases the model scale that can be inferred on a single GPU by using a larger heterogeneous memory space at the cost of a limited performance reduction.

energonai, inference, parallelism, (17 more...)

2209.02341

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

arXiv.org Artificial Intelligence Sep-6-2022

Reconstructing Action-Conditioned Human-Object Interactions Using Commonsense Knowledge Priors

Wang, Xi, Li, Gen, Kuo, Yen-Ling, Kocabas, Muhammed, Aksan, Emre, Hilliges, Otmar

We present a method for inferring diverse 3D models of human-object interactions from images. Reasoning about how humans interact with objects in complex scenes from a single 2D image is a challenging task given ambiguities arising from the loss of information through projection. In addition, modeling 3D interactions requires the ability to generalize to diverse object categories and interaction types. We propose an action-conditioned modeling of interactions that allows us to infer diverse 3D arrangements of humans and objects without supervision on contact regions or 3D scene geometry. Our method extracts high-level commonsense knowledge from large language models (such as GPT-3) and applies it to perform 3D reasoning about human-object interactions. Our key insight is that priors extracted from large language models can help in reasoning about human-object contacts from textual prompts only. We quantitatively evaluate the inferred 3D models on a large human-object interaction dataset and show how our method leads to better 3D reconstructions. We further qualitatively evaluate the effectiveness of our method on real images and demonstrate its generalizability across interaction types and object categories.

category, human-object interaction, interaction, (16 more...)

2209.02485

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Massachusetts (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

#artificialintelligence Sep-5-2022, 20:20:11 GMT

We Asked GPT-3 to Write an Academic Paper about Itself--Then We Tried to Get It Published

On a rainy afternoon earlier this year, I logged into my OpenAI account and typed a simple instruction for the research company's artificial-intelligence algorithm, GPT-3: Write an academic thesis in 500 words about GPT-3 and add scientific references and citations inside the text. As it started to generate text, I stood in awe. Here was novel content written in academic language, with references cited in the right places and in relation to the right context. It looked like any other introduction to a fairly good scientific publication. Given the very vague instruction I'd provided, I had meager expectations.

academic paper, algorithm, gpt-3, (14 more...)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Sep-5-2022, 18:40:36 GMT

Forget chess, DeepMind's training its new AI to play football

Researchers from DeepMind, the UK's juggernaut AI lab, have forsaken the noble games of chess and Go for a more plebeian delight: football. The Google sister company yesterday published a research paper and accompanying blog post detailing its new neural probabilistic motor primitives (NPMP) -- a method by which artificial intelligence agents can learn to operate physical bodies. An NPMP is a general-purpose motor control module that translates short-horizon motor intentions to low-level control signals, and it's trained offline or via RL by imitating motion capture (MoCap) data, recorded with trackers on humans or animals performing motions of interest. Up front: Essentially, the DeepMind team created an AI system that can learn how to do things inside of a physics simulator by watching videos of other agents performing those tasks. And, of course, if you've got a giant physics engine and an endless supply of curious robots, the only rational thing to do is to teach it how to dribble and shoot: We optimized teams of agents to play simulated football via reinforcement learning, constraining the solution space to that of plausible movements learned using human motion capture data.

agent, deepmind, football, (9 more...)

Industry:

Leisure & Entertainment > Sports (1.00)
Leisure & Entertainment > Games > Chess (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

#artificialintelligence Sep-5-2022, 11:40:29 GMT

mezzacotta - Square Root of Minus Garfield

Jon: You are not funny. Garfield: That wasn't a laugh, that was a snorta sound. Jon: I'm gonna get you one day, ya big lug. The dialogue was generated using GPT-3. The data I supplied was a basic summary of Garfield, Jon, and Odie's personalities, and three transcripts of sample comics.

garfield, mezzacotta, minus garfield, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.31)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

#artificialintelligence Sep-5-2022, 01:15:09 GMT

Recent developments in the applications of BERT model (Artificial Intelligence)

Abstract: A well-formed query is defined as a query that is formulated in the manner of an inquiry, with correct interrogatives, spelling and grammar. While identifying well-formed queries is an important task, few works have attempted to address it. In this paper we apply a transformer-based language model, Bidirectional Encoder Representations from Transformers (BERT), to this task. We further imbue BERT with part-of-speech information, inspired by earlier works. Furthermore, we also train the model in multiple curriculum settings to improve performance.

application, artificial intelligence, bert model, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.58)