passive voice
Semantic Prosody in Machine Translation: the English-Chinese Case of Passive Structures
Ma, Xinyue, Pastells, Pol, Farrús, Mireia, Taulé, Mariona
Semantic prosody is a collocational meaning formed through the co-occurrence of a linguistic unit and a consistent series of collocates, and it should be treated separately from semantic meaning. Since words that are literal translations of each other may have different semantic prosody, more attention should be paid to this linguistic property to generate accurate translations. However, current machine translation models cannot handle this problem. To bridge the gap, we propose an approach to teach machine translation models about the semantic prosody of a specific structure. We focus on Chinese BEI passives and create a dataset of English-Chinese sentence pairs designed to demonstrate the negative semantic prosody of BEI passives. We then fine-tune the OPUS-MT, NLLB-600M and mBART50 models on our dataset for the English-Chinese translation task. Our results show that the fine-tuned MT models perform better at using BEI passives to translate unfavourable content and at avoiding them for neutral and favourable content. Moreover, in NLLB-600M, a multilingual model, this knowledge of semantic prosody transfers from English-Chinese translation to other language pairs, such as Spanish-Chinese.
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
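The semantic prosody described in the abstract above can be made concrete with a small corpus statistic: the balance of positive versus negative collocates of a unit. The sketch below is illustrative only; the `prosody_polarity` helper, the toy lexicon and the collocate list are invented for this example and are not from the paper.

```python
from collections import Counter

def prosody_polarity(collocates, sentiment_lexicon):
    """Score the semantic prosody of a unit from its collocates.

    Returns a value in [-1, 1]: -1 means all polar collocates are
    negative (negative prosody), +1 means all are positive.
    """
    counts = Counter(sentiment_lexicon.get(w, "neutral") for w in collocates)
    pos, neg = counts["positive"], counts["negative"]
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

# Toy sentiment lexicon and collocates of a BEI-passive-like unit
# (hypothetical data for illustration).
lexicon = {"criticize": "negative", "steal": "negative",
           "praise": "positive", "arrest": "negative"}
collocates = ["criticize", "steal", "arrest", "praise"]
print(prosody_polarity(collocates, lexicon))  # -0.5: mostly negative prosody
```

A fine-tuning dataset like the one described would then pair BEI-passive targets mainly with source sentences whose content scores on the negative side of such a measure.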
Controlling Topic-Focus Articulation in Meaning-to-Text Generation using Graph Neural Networks
Wang, Chunliu, van Noord, Rik, Bos, Johan
A bare meaning representation can be expressed in various ways using natural language, depending on how the information is structured on the surface level. We are interested in finding ways to control topic-focus articulation when generating text from meaning. We focus on distinguishing active and passive voice for sentences with transitive verbs. The idea is to add pragmatic information such as topic to the meaning representation, thereby forcing either active or passive voice when given to a natural language generation system. We use graph neural models because there is no explicit information about word order in a meaning represented by a graph. We try three different methods for topic-focus articulation (TFA) employing graph neural models for a meaning-to-text generation task. We propose a novel node-aggregation encoding strategy for graph neural models which, instead of the traditional encoding that aggregates adjacent node information, learns node representations using depth-first search. The results show that our approach achieves performance competitive with state-of-the-art graph models on general text generation and leads to significant improvements on the task of active-passive conversion compared with traditional adjacency-based aggregation strategies. Different types of TFA can have a huge impact on the performance of the graph models.
- Europe > Germany > Berlin (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
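The depth-first encoding strategy mentioned above can be illustrated by the node ordering such a traversal produces over a meaning graph. A minimal sketch, assuming the graph is stored as an adjacency mapping; the graph, node labels and `dfs_order` helper are hypothetical, not the paper's code:

```python
def dfs_order(graph, start):
    """Return nodes of a directed meaning graph in depth-first order.

    `graph` maps each node to its ordered list of neighbours.
    """
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        order.append(node)
        for nxt in graph.get(node, []):
            visit(nxt)

    visit(start)
    return order

# Toy meaning graph for "the dog chased the cat" (hypothetical):
graph = {"chase": ["dog", "cat"], "dog": [], "cat": []}
print(dfs_order(graph, "chase"))  # ['chase', 'dog', 'cat']
```

An encoder following this strategy would build each node's representation from its position in this traversal rather than from a one-hop neighbourhood aggregation.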
Language Models Can Learn Exceptions to Syntactic Rules
Leong, Cara Su-Yi, Linzen, Tal
Artificial neural networks can generalize productively to novel contexts. Can they also learn exceptions to those productive rules? We explore this question using the case of restrictions on English passivization (e.g., the fact that "The vacation lasted five days" is grammatical, but "*Five days was lasted by the vacation" is not). We collect human acceptability judgments for passive sentences with a range of verbs, and show that the probability distribution defined by GPT-2, a language model, matches the human judgments with high correlation. We also show that the relative acceptability of a verb in the active vs. passive voice is positively correlated with the relative frequency of its occurrence in those voices. These results provide preliminary support for the entrenchment hypothesis, according to which learners track and use the distributional properties of their input to learn negative exceptions to rules. At the same time, this hypothesis fails to explain the magnitude of unpassivizability demonstrated by certain individual verbs, suggesting that other cues to exceptionality are available in the linguistic input.
- North America > United States > New York (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Dominican Republic (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
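The entrenchment hypothesis above predicts a positive correlation between a verb's relative frequency in the passive and its relative acceptability in the passive. A minimal sketch of that correlation test; all per-verb numbers and names below are invented for illustration and do not come from the paper:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-verb data: log-odds of occurring in the passive in a
# corpus, and the human passive-minus-active acceptability difference.
passive_log_odds = [-4.1, -2.0, -0.5, 0.3]   # e.g. 'last' ... 'build'
acceptability_gap = [-3.5, -1.8, -0.2, 0.4]
print(round(pearson(passive_log_odds, acceptability_gap), 3))  # near 1 here
```

Under entrenchment, rarely-passivized verbs like "last" should sit at the bottom of both lists, which is what drives the high correlation in this toy example.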
A Large-Scale Multilingual Study of Visual Constraints on Linguistic Selection of Descriptions
Berger, Uri, Frermann, Lea, Stanovsky, Gabriel, Abend, Omri
We present a large-scale, multilingual study of how vision constrains linguistic choice, covering four languages and five linguistic properties, such as verb transitivity or use of numerals. We propose a novel method that leverages existing corpora of images with captions written by native speakers, and apply it to nine corpora, comprising 600k images and 3M captions. We study the relation between visual input and linguistic choices by training classifiers to predict the probability of expressing a property from raw images, and find evidence supporting the claim that linguistic properties are constrained by visual context across languages. We complement this investigation with a corpus study, taking the test case of numerals. Specifically, we use existing annotations (number or type of objects) to investigate the effect of different visual conditions on the use of numeral expressions in captions, and show that similar patterns emerge across languages. Our methods and findings both confirm and extend existing research in the cognitive literature. We additionally discuss possible applications for language generation.
- Europe > Germany > Berlin (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
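The numeral corpus study above boils down to comparing, across visual conditions, how often captions contain a numeral. A toy sketch with invented data and a deliberately tiny numeral list; the `numeral_rate` helper and its annotation format are this example's assumptions, not the authors' code:

```python
NUMERALS = {"one", "two", "three", "four", "five"}

def numeral_rate(captions_by_count):
    """Share of captions containing a numeral, keyed by the number of
    depicted objects (hypothetical annotation format)."""
    rates = {}
    for n_objects, captions in captions_by_count.items():
        with_num = sum(any(tok.lower() in NUMERALS for tok in c.split())
                       for c in captions)
        rates[n_objects] = with_num / len(captions)
    return rates

# Captions grouped by annotated object count (invented examples).
data = {1: ["a dog on grass", "one dog running"],
        3: ["three dogs playing", "three cats", "a group of dogs"]}
print(numeral_rate(data))  # {1: 0.5, 3: 0.666...}
```

Comparing such rates per language and per visual condition is one way the cross-lingual pattern described in the abstract could be surfaced.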
Investigating Stylistic Profiles for the Task of Empathy Classification in Medical Narrative Essays
One important aspect of language is how speakers generate utterances and texts to convey their intended meanings. In this paper, we bring various aspects of Construction Grammar (CxG) and Systemic Functional Grammar (SFG) theory into a deep learning computational framework to model empathic language. Our corpus consists of 440 essays written by premed students narrating simulated patient-doctor interactions. We start with baseline classifiers (state-of-the-art recurrent neural networks and transformer models). Then, we enrich these models with a set of linguistic constructions, demonstrating the importance of this novel approach to the task of empathy classification for this dataset. Our results indicate the potential of such constructions to contribute to the overall empathy profile of first-person narrative essays.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Health & Medicine > Therapeutic Area > Oncology (0.67)
- Health & Medicine > Health Care Technology (0.67)
- Education > Curriculum > Subject-Specific Education (0.66)
- Education > Educational Setting > Higher Education (0.46)
On Neurons Invariant to Sentence Structural Changes in Neural Machine Translation
Patel, Gal, Choshen, Leshem, Abend, Omri
We present a methodology that explores how sentence structure is reflected in neural representations of machine translation systems. We demonstrate our model-agnostic approach with the Transformer English-German translation model. We analyze neuron-level correlation of activations between paraphrases while discussing the methodology challenges and the need for confound analysis to isolate the effects of shallow cues. We find that similarity between activation patterns can be mostly accounted for by similarity in word choice and sentence length. Following that, we manipulate neuron activations to control the syntactic form of the output. We show this intervention to be somewhat successful, indicating that deep models capture sentence-structure distinctions, despite finding no such indication at the neuron level. To conduct our experiments, we develop a semi-automatic method to generate meaning-preserving minimal pair paraphrases (active-passive voice and adverbial clause-noun phrase) and compile a corpus of such pairs.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > New York > New York County > New York City (0.04)
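The neuron-level analysis above can be sketched as a per-neuron correlation between activations on original sentences and on their paraphrases: a neuron that is invariant to the structural change should correlate strongly across the pairs. All numbers and names below are invented for illustration; this is not the authors' code:

```python
import math

def neuron_correlations(acts_a, acts_b):
    """Per-neuron Pearson correlation between activations on original
    sentences (acts_a) and their paraphrases (acts_b).

    Both arguments are lists of equal-length activation vectors, one
    vector per sentence; column j is neuron j.
    """
    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy) if sx and sy else 0.0

    n_neurons = len(acts_a[0])
    return [pearson([v[j] for v in acts_a], [v[j] for v in acts_b])
            for j in range(n_neurons)]

# Toy activations for 3 active/passive sentence pairs and 2 neurons
# (hypothetical numbers): neuron 0 is roughly invariant to the voice
# change, neuron 1 is not.
active = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.3]]
passive = [[0.8, 0.7], [0.3, 0.2], [0.4, 0.9]]
print(neuron_correlations(active, passive))
```

As the abstract notes, raw correlations like these must still be checked against confounds such as word overlap and sentence length before being read as structural invariance.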
Working With AI – The Passive Voice
In August, first prize in the digital-art category of the Colorado State Fair's fine-art competition went to a man who used artificial intelligence (AI) to generate his submission, "Théâtre d'Opéra Spatial." He supplied the AI, a program called Midjourney, with only a "prompt"--a textual description of what he wanted. Systems like Midjourney and the similar DALL-E 2 have led to a new role in our AI age: "prompt engineer." Such people can even sell their textual wares in an online market called PromptBase. Midjourney and DALL-E 2 emerged too late to be included in "Working With AI: Real Stories of Human-Machine Collaboration," by Thomas Davenport and Steven Miller, information-systems professors at Babson College and Singapore Management University, respectively.
- North America > United States > Colorado (0.25)
- Asia > Singapore (0.25)
- North America > United States > California (0.16)
- North America > United States > Arkansas (0.05)
How Animacy and Information Status Determine Word Order in Translation of the Passive Voice
Fain, Ashli (Northern Illinois University) | Freedman, Reva (Northern Illinois University)
English uses the passive voice more frequently than French. One method of translating the passive is to render the sentence as active, using an active verb and changing the placement of the verb's arguments. We are studying extra-syntactic features that predict where this method of translating the passive voice is used, including animacy and information status. We have obtained data from the Hansard, the proceedings of the Canadian Parliament, which is published in both languages. This paper presents the results of a small mechanized corpus analysis on the relevance of the relative animacy of the agent (or experiencer) and the theme. This information will help to achieve the desired stylistic output in a bilingual surface realizer.
- North America > Canada (0.15)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
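The relative-animacy feature studied above can be turned into an explicit (and deliberately oversimplified) decision rule for a surface realizer. The scale, the rule and the function name below are this example's assumptions, not the paper's model:

```python
# Toy animacy hierarchy: higher values outrank lower ones.
ANIMACY_SCALE = {"human": 3, "animal": 2, "inanimate": 1}

def predict_rendering(agent_animacy, theme_animacy):
    """Toy heuristic: keep the French passive when the theme outranks
    the agent in animacy; otherwise render the sentence as active,
    promoting the agent to subject position."""
    if ANIMACY_SCALE[theme_animacy] > ANIMACY_SCALE[agent_animacy]:
        return "passive"
    return "active"

# "The report was written by the minister": human agent, inanimate theme.
print(predict_rendering("human", "inanimate"))  # active
```

A corpus analysis like the one described would test how well rules of this shape, possibly combined with information status, match the renderings actually chosen by Hansard translators.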
The machine learning in Microsoft Word's new Editor is scarily good
I mean, you really suck. That's what I wanted to write, but a new feature in the Office 365 version of Word called Editor made an interesting suggestion. The "machine" noted how the word "really" is superfluous, and it's true. The extra word doesn't add anything to the sentence, so I removed it. I've been writing professionally since 2001 (around 10,000 published articles now), but I'm still learning, I guess.