computational linguist
For the First Time, AI Analyzes Language as Well as a Human Expert
If language is what makes us human, what does it mean now that large language models have gained "metalinguistic" abilities? Among the myriad abilities that humans possess, which ones are uniquely human? Language has been a top candidate at least since Aristotle, who wrote that humanity was "the animal that has language." Even as large language models such as ChatGPT superficially replicate ordinary speech, researchers want to know if there are specific aspects of human language that simply have no parallels in the communication systems of other animals or artificially intelligent devices. In particular, researchers have been exploring the extent to which language models can reason about language itself.
The Analysis of Lexical Errors in Machine Translation from English into Romanian
This research analyzes errors in Machine Translation from English into Romanian, focusing on lexical errors in texts that contain official information provided by the World Health Organization (WHO), the Gavi Organization, and patient information leaflets (information about the active ingredients of the vaccines or medication, the indications, the dosage instructions, the storage instructions, the side effects and warnings, etc.). All of these texts relate to COVID-19 and were translated by Google Translate, a multilingual Machine Translation service created by Google. In recent decades, Google has actively worked to develop a more accurate and fluent automatic translation system. This research, specifically focused on improving Google Translate, aims to enhance the overall quality of Machine Translation by achieving better lexical selection and reducing errors. The investigation involves a comprehensive analysis of 230 texts translated from English into Romanian.
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge
This paper provides a proof of concept that audio of tabletop role-playing games (TTRPGs) could serve as a challenge for diarization systems. TTRPGs are carried out mostly through conversation, and participants often alter their voices to indicate that they are speaking as a fictional character. Audio processing systems are susceptible to voice conversion, with or without technological assistance. TTRPGs present a conversational phenomenon in which voice conversion is an inherent part of an immersive gaming experience. This could make it more challenging for diarizers to identify the real speaker and to recognize that an impersonation is just that. We present the creation of a small TTRPG audio dataset and compare it against the AMI and ICSI corpora. The performance of two diarizers, pyannote.audio and wespeaker, was evaluated. We observed that the properties of TTRPGs result in a higher confusion rate for both diarizers. Additionally, wespeaker strongly underestimates the number of speakers in the TTRPG audio files. We propose TTRPG audio as a promising challenge for diarization systems.
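To make the notion of a confusion rate concrete, here is a minimal sketch of one common way to compute it at the frame level. It is not the paper's evaluation code and not the pyannote.audio or wespeaker implementation. The function name `confusion_rate`, the per-frame label lists, and the use of `None` for non-speech are all assumptions made for illustration, and the sketch ignores overlapping speech and scoring collars.

```python
from collections import Counter
from itertools import permutations


def confusion_rate(reference, hypothesis):
    """Fraction of reference speech frames attributed to the wrong speaker.

    Both arguments are equal-length lists with one speaker label per frame
    (hypothetical format: a string label, or None for non-speech).
    Hypothesis labels are arbitrary, so they are first mapped one-to-one
    onto reference speakers in the way that maximizes agreement.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("sequences must have the same number of frames")

    total_speech = sum(r is not None for r in reference)
    if total_speech == 0:
        return 0.0

    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_speakers = sorted({h for h in hypothesis if h is not None})

    # Count frames for each (reference, hypothesis) label pair where
    # both sequences contain speech.
    overlap = Counter(
        (r, h)
        for r, h in zip(reference, hypothesis)
        if r is not None and h is not None
    )
    both_speech = sum(overlap.values())

    # Pad the shorter speaker list with None so that some speakers may
    # stay unmatched, then brute-force every one-to-one mapping.
    # This is only practical for a handful of speakers.
    size = max(len(ref_speakers), len(hyp_speakers))
    padded_ref = ref_speakers + [None] * (size - len(ref_speakers))
    padded_hyp = hyp_speakers + [None] * (size - len(hyp_speakers))

    best_match = 0
    for perm in permutations(padded_ref):
        matched = sum(
            overlap[(r, h)]
            for r, h in zip(perm, padded_hyp)
            if r is not None and h is not None
        )
        best_match = max(best_match, matched)

    # Frames where both have speech but the mapped speakers disagree.
    return (both_speech - best_match) / total_speech


# One real player using two voices, which the system splits in two.
one_player = confusion_rate(["A"] * 4, ["x", "x", "y", "y"])

# Two real players merged into one detected speaker.
merged = confusion_rate(["A", "A", "B", "B"], ["x", "x", "x", "x"])
```

In both toy cases half of the speech frames count as confused. The first case mirrors a player switching to a character voice, and the second mirrors a system that underestimates the number of speakers.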
Machines Beat Humans on a Reading Test. But Do They Understand? Quanta Magazine
In the fall of 2017, Sam Bowman, a computational linguist at New York University, figured that computers still weren't very good at understanding the written word. Sure, they had become decent at simulating that understanding in certain narrow domains, like automatic translation or sentiment analysis (for example, determining if a sentence sounds "mean or nice," he said). But Bowman wanted measurable evidence of the genuine article: bona fide, human-style reading comprehension in English. So he came up with a test. In an April 2018 paper coauthored with collaborators from the University of Washington and DeepMind, the Google-owned artificial intelligence company, Bowman introduced a battery of nine reading-comprehension tasks for computers called GLUE (General Language Understanding Evaluation). The test was designed as "a fairly representative sample of what the research community thought were interesting challenges," said Bowman, but also "pretty straightforward for humans." For example, one task asks whether a sentence is true based on information offered in a preceding sentence.
A day in the life of... a data scientist in an AI company
This week's Day in the Life comes from the burgeoning world of AI-powered marketing. Neil Yager is Chief Scientist at Phrasee, a company that uses artificial intelligence and natural language processing to generate and optimise marketing copy. Phrasee also happens to be one of the sponsors of Supercharged, a July 2017 event from Econsultancy which looks at exciting new AI technology in marketing. Neil Yager: My role at Phrasee is lead 'data scientist'. This is a job that has only existed (at least with its own name) for a few years.