AITopics

2202.0217

Country:

Europe > Netherlands (0.25)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Polu, Stanislas, Han, Jesse Michael, Zheng, Kunhao, Baksys, Mantas, Babuschkin, Igor, Sutskever, Ilya

Formal Mathematics Statement Curriculum Learning

arXiv.org Artificial Intelligence Feb-2-2022

We explore the use of expert iteration in the context of language modeling applied to formal mathematics. We show that at the same compute budget, expert iteration, by which we mean proof search interleaved with learning, dramatically outperforms proof search only. We also observe that when applied to a collection of formal statements of sufficiently varied difficulty, expert iteration is capable of finding and solving a curriculum of increasingly difficult problems, without the need for associated ground-truth proofs. Finally, by applying this expert iteration to a manually curated set of problem statements, we achieve state-of-the-art on the miniF2F benchmark, automatically solving multiple challenging problems drawn from high school olympiads.
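The loop described in this abstract, proof search interleaved with learning, can be sketched in a few lines. This is a toy illustration and not the authors' system: `expert_iteration`, `toy_search`, and `toy_learn` are hypothetical names, the "model" is a single integer skill level, and each "statement" is just an integer difficulty. It only shows why interleaving learning with search can unlock a curriculum of harder statements that search alone never reaches.

```python
from typing import Callable, Dict, List, Optional, Tuple


def expert_iteration(
    model: int,
    statements: List[int],
    search: Callable[[int, int], Optional[str]],
    learn: Callable[[int, Dict[int, str]], int],
    rounds: int,
) -> Tuple[int, Dict[int, str]]:
    """Alternate proof search and learning for a fixed number of rounds."""
    solved: Dict[int, str] = {}
    for _ in range(rounds):
        found: Dict[int, str] = {}
        # Search phase: try every unsolved statement with the current model.
        for statement in statements:
            if statement in solved:
                continue
            proof = search(model, statement)
            if proof is not None:
                found[statement] = proof
        if not found:
            break  # nothing new was proved, so there is nothing to learn from
        solved.update(found)
        # Learning phase: update the model on the proofs it just found.
        model = learn(model, found)
    return model, solved


# Toy assumptions: a statement is provable only if its difficulty is at most
# one step above the current skill, and each new proof raises skill by one.
def toy_search(skill: int, difficulty: int) -> Optional[str]:
    return f"proof-of-{difficulty}" if difficulty <= skill + 1 else None


def toy_learn(skill: int, found: Dict[int, str]) -> int:
    return skill + len(found)


statements = [1, 2, 3, 4, 5]

# Baseline: proof search only, with the initial model and no learning.
search_only = [d for d in statements if toy_search(0, d) is not None]

# Expert iteration: the same search, interleaved with learning.
_, with_learning = expert_iteration(0, statements, toy_search, toy_learn, rounds=5)

print("search only:", search_only)
print("expert iteration:", sorted(with_learning))
```

In this toy setting no ground-truth proofs are supplied; the only training signal is the set of proofs the search itself finds, which mirrors the setup the abstract describes.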

iteration, objective, proof search

2202.01344

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)

#artificialintelligence Feb-1-2022, 05:25:15 GMT

Should AI Be Centered on Machine Learning Algorithms or Data?

Arun Shastri, PhD, leads ZS's global AI strategy practice, which spans research, helping clients build their capabilities, and platform solutions. In this role, he also oversees analytics services and solutions for several industry sectors. PKS Prakash, PhD, is a principal at ZS Associates; he designs and implements advanced data science and AI techniques across multiple verticals, including healthcare, hospitality, retail and manufacturing.

centered, machine learning algorithm

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)

Majewska, Olga, Razumovskaia, Evgeniia, Ponti, Edoardo Maria, Vulić, Ivan, Korhonen, Anna

Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation

arXiv.org Artificial Intelligence Jan-31-2022

Multilingual task-oriented dialogue (ToD) facilitates access to services and information for many (communities of) speakers. Nevertheless, the potential of this technology is not fully realised, as current datasets for multilingual ToD - both for modular and end-to-end modelling - suffer from severe limitations. 1) When created from scratch, they are usually small in scale and fail to cover many possible dialogue flows. 2) Translation-based ToD datasets might lack naturalness and cultural specificity in the target language. In this work, to tackle these limitations we propose a novel outline-based annotation process for multilingual ToD datasets, where domain-specific abstract schemata of dialogue are mapped into natural language outlines. These in turn guide the target language annotators in writing a dialogue by providing instructions about each turn's intents and slots. Through this process we annotate a new large-scale dataset for training and evaluation of multilingual and cross-lingual ToD systems. Our Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages: Arabic, Indonesian, Russian, and Kiswahili. Qualitative and quantitative analyses of COD versus an equivalent translation-based dataset demonstrate improvements in data quality, unlocked by the outline-based approach. Finally, we benchmark a series of state-of-the-art systems for cross-lingual ToD, setting reference scores for future work and demonstrating that COD prevents over-inflated performance, typically met with prior translation-based ToD datasets.

artificial intelligence, machine learning, natural language

doi: 10.1162/tacl_a_00539

2201.13405

Country:

Africa > Niger (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.82)

Industry:

Transportation > Air (0.93)
Transportation > Passenger (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

#artificialintelligence Jan-28-2022, 16:18:04 GMT

Use a web browser plugin to quickly translate text with Amazon Translate

Web browsers can be a single pane of glass for organizations to interact with their information: all of the tools can be viewed and accessed on one screen, so users don't have to switch between applications and interfaces. For example, a customer call center might have several different applications to see customer reviews, social media feeds, and customer data. Each of these applications is accessed through a web browser. If the information is in a language that the user doesn't speak, however, a separate application often needs to be opened to translate the text. Web browser plugins enable customization of this user experience.

amazon translate, browser plugin, plugin

Industry: Retail > Online (0.40)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.55)

#artificialintelligence Jan-27-2022, 11:00:23 GMT

Best Machine language Translators

Machine language translators have improved a lot over the years. They have become easier to use and produce accurate translations at low or no cost. For localization, machine translation services and software have been a boon. Neural machine translation algorithms make the resulting translations read more naturally. Let's take a look at the best machine translation engines in 2022.

best machine language translator

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

AIHub Jan-26-2022, 14:30:55 GMT

New voices in AI: David Adelani

Welcome to the first episode of New voices in AI! You can find David on Twitter @davlanade and find out more about Masakhane here. The music used is 'Wholesome' by Kevin MacLeod, licensed under Creative Commons. Daly: Hello and welcome to New voices in AI. This is a new series from AIhub where we celebrate the voices of PhD students, early career researchers, and those with a new perspective on AI. And without further ado, let's begin. First up, a big welcome to our very first guest on "New voices in AI" and if you could introduce yourself, who are you? Adelani: Thank you very much for having me. So, Masakhane is this grassroots organization whose mission is to strengthen and spur NLP research in African languages, by Africans for Africans. Currently the organization operates mainly on Slack, where we already have over 1000 members. Of course, not everyone is active, but we have more than 100 or close to 100 active members as well, yeah. Daly: So how did you get into AI?

adelani, african language, daly

AIHub

Country:

Africa > Nigeria (0.05)
North America > United States (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)

Genre: Personal > Interview (0.67)

Technology:

Information Technology > Communications > Social Media (0.87)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.30)

#artificialintelligence Jan-19-2022, 06:05:44 GMT

"Artificial Intelligence" Science-Research, January 2022, Week 3 -- summary from Europe PMC

Background: The liver is one of the most common metastatic sites of colon cancer cells, and liver metastasis (LM) determines subsequent therapy and the prognosis of patients, particularly in T1 patients. There is still no effective model to predict the risk of LM in T1 CRC patients. Objectives: Chest radiographs are commonly performed in emergency units, yet their interpretation requires radiology experience. At present, high-quality English-Chinese parallel corpora are in short supply. The multilingual dictionary induced by the translation model is then combined with the language model, the unsupervised translation model is initialized, and the unsupervised English-Chinese neural machine translation model is optimized with the back-translation technique.

artificial intelligence, science-research, translation model

Country: Europe (0.40)

Industry:

Health & Medicine > Therapeutic Area (0.99)
Health & Medicine > Nuclear Medicine (0.99)
Health & Medicine > Diagnostic Medicine > Imaging (0.99)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.96)

arXiv.org Artificial Intelligence Jan-17-2022

An Empirical Study on the Overlapping Problem of Open-Domain Dialogue Datasets

Wen, Yuqiao, Luo, Guoqing, Mou, Lili

Open-domain dialogue systems aim to converse with humans through text, and research on them has relied heavily on benchmark datasets. In this work, we first identify the overlapping problem in DailyDialog and OpenSubtitles, two popular open-domain dialogue benchmark datasets. Our systematic analysis then shows that such overlapping can be exploited to obtain fake state-of-the-art performance. Finally, we address this issue by cleaning these datasets and setting up a proper data processing procedure for future research.
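The kind of check this abstract calls for can be sketched as follows. This is a minimal illustration, not the authors' cleaning procedure: it assumes "overlapping" means that dialogues in the test split also appear in the training split, and it treats two dialogues as identical when their turns match after lowercasing and whitespace normalization. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Dialogue = Sequence[str]  # one dialogue = an ordered list of utterances


def normalize(dialogue: Dialogue) -> Tuple[str, ...]:
    """Lowercase each turn and collapse runs of whitespace."""
    return tuple(" ".join(turn.lower().split()) for turn in dialogue)


def overlap_rate(train: List[Dialogue], test: List[Dialogue]) -> float:
    """Fraction of test dialogues that also occur in the training split."""
    if not test:
        return 0.0
    seen = {normalize(d) for d in train}
    hits = sum(1 for d in test if normalize(d) in seen)
    return hits / len(test)


def remove_overlap(train: List[Dialogue], test: List[Dialogue]) -> List[Dialogue]:
    """Keep only the test dialogues that do not occur in the training split."""
    seen = {normalize(d) for d in train}
    return [d for d in test if normalize(d) not in seen]


# Hypothetical sample data: the first test dialogue duplicates a training one
# up to casing and spacing.
train = [["Hi there", "Hello!"], ["How are you?", "Fine."]]
test = [["hi  there", "hello!"], ["Bye", "See you"]]

rate = overlap_rate(train, test)
clean_test = remove_overlap(train, test)
print(f"overlap rate: {rate:.2f}")
print("clean test split:", clean_test)
```

Exact matching after normalization is the simplest possible criterion; near-duplicate detection (for example, shared context prefixes) would need a looser comparison than the set lookup used here.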

computational linguistic, dataset, proceedings

2201.06219

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

arXiv.org Artificial Intelligence Jan-14-2022

Cost-Effective Training in Low-Resource Neural Machine Translation

Koneru, Sai, Liu, Danni, Niehues, Jan

While Active Learning (AL) techniques are explored in Neural Machine Translation (NMT), only a few works focus on tackling low annotation budgets where a limited number of sentences can get translated. Such situations are especially challenging and can occur for endangered languages with few human annotators or having cost constraints to label large amounts of data. Although AL is shown to be helpful with large budgets, it is not enough to build high-quality translation systems in these low-resource conditions. In this work, we propose a cost-effective training procedure to increase the performance of NMT models utilizing a small number of annotated sentences and dictionary entries. Our method leverages monolingual data with self-supervised objectives and a small-scale, inexpensive dictionary for additional supervision to initialize the NMT model before applying AL. We show that improving the model using a combination of these knowledge sources is essential to exploit AL strategies and increase gains in low-resource conditions. We also present a novel AL strategy inspired by domain adaptation for NMT and show that it is effective for low budgets. We propose a new hybrid data-driven approach, which samples sentences that are diverse from the labelled data and also most similar to unlabelled data. Finally, we show that initializing the NMT model and further using our AL strategy can achieve gains of up to 13 BLEU compared to conventional AL methods.
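The hybrid selection idea in this abstract, preferring sentences that differ from the labelled data while resembling the unlabelled pool, can be illustrated with a small scoring function. This is a sketch under stated assumptions and not the paper's method: similarity is approximated by Jaccard overlap of lowercased tokens, the score is mean similarity to the rest of the unlabelled pool minus maximum similarity to any labelled sentence, and `jaccard` and `select_for_annotation` are hypothetical names.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two sentences (assumed similarity measure)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def select_for_annotation(
    labelled: List[str], unlabelled: List[str], budget: int
) -> List[str]:
    """Pick up to `budget` unlabelled sentences with the highest hybrid score."""
    scored = []
    for i, candidate in enumerate(unlabelled):
        pool = unlabelled[:i] + unlabelled[i + 1:]
        # Representativeness: how typical the candidate is of the unlabelled pool.
        representativeness = (
            sum(jaccard(candidate, other) for other in pool) / len(pool)
            if pool
            else 0.0
        )
        # Redundancy: how close the candidate is to something already labelled.
        redundancy = max((jaccard(candidate, s) for s in labelled), default=0.0)
        scored.append((representativeness - redundancy, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:budget]]


# Hypothetical data: one candidate nearly duplicates the labelled sentence,
# two resemble each other, and one is an outlier.
labelled = ["the cat sat on the mat"]
unlabelled = [
    "the cat sat on the mat today",
    "the river flows to the sea",
    "the river runs to the sea",
    "quantum flux",
]

chosen = select_for_annotation(labelled, unlabelled, budget=2)
print(chosen)
```

With these inputs, the near-duplicate of the labelled sentence is penalized for redundancy and the outlier scores low on representativeness, so the two river sentences are selected. A real system would likely replace token overlap with model-based sentence representations.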

computational linguistic, proceedings, translation

2201.057

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > India (0.04)

Genre: Research Report (0.40)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)