Disambiguation of morpho-syntactic features of African American English -- the case of habitual be
Santiago, Harrison; Martin, Joshua; Moeller, Sarah; Tang, Kevin
Recent research has highlighted that natural language processing (NLP) systems exhibit a bias against African American speakers. These errors are often caused by poor representation of linguistic features unique to African American English (AAE), due to the relatively low probability of occurrence of many such features in training data. We present a workflow to overcome such bias in the case of habitual "be". Habitual "be" is isomorphic, and therefore ambiguous, with other forms of "be" found in both AAE and other varieties of English. This ambiguity creates a clear challenge for reducing bias in NLP technologies. To overcome the scarcity of habitual "be" in training data, we employ a combination of rule-based filters and data augmentation that generates a corpus balanced between habitual and non-habitual instances. With this balanced corpus, we train unbiased machine learning classifiers, as demonstrated on a corpus of transcribed AAE texts, achieving an F$_1$ score of 0.65 in disambiguating habitual "be".
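As a rough illustration of the two ingredients named in the abstract, the sketch below pairs a toy rule-based filter with a class-balancing step. Everything in it is an assumption made for illustration: the cue list, the function names `rule_label` and `balance_by_oversampling`, and the label strings are hypothetical, and random oversampling is swapped in for the data augmentation the authors describe, since the abstract does not specify their rules or augmentation method.

```python
import random
import re

# Hypothetical cue words: "be" directly after a modal or "to" is treated as
# non-habitual. This is an illustrative rule, not the authors' rule set.
NON_HABITUAL_CUES = {"to", "will", "would", "can", "could", "should",
                     "must", "might", "may", "shall"}


def rule_label(sentence):
    """Return 'non-habitual' if the toy rule fires, else None (undecided)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    for i, tok in enumerate(tokens):
        if tok != "be" or i == 0:
            continue
        prev = tokens[i - 1]
        # Contracted "will" (e.g. "they'll be") counts as a modal cue too.
        if prev in NON_HABITUAL_CUES or prev.endswith("'ll"):
            return "non-habitual"
    return None


def balance_by_oversampling(examples, seed=0):
    """Duplicate random items of smaller classes until all classes match.

    Stand-in for data augmentation: it only repeats existing examples.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced
```

Sentences for which `rule_label` returns `None` are left undecided; in a workflow like the one described, those would be the cases passed on to a trained classifier.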
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA
Virgo, Felix; Cheng, Fei; Pereira, Lis Kanashiro; Asahara, Masayuki; Kobayashi, Ichiro; Kurohashi, Sadao
We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. Human evaluation demonstrates that our pseudo labels exhibit surprisingly high accuracy and balanced coverage. In the temporal commonsense QA task, experimental results show that, using pseudo examples of only 400 events, we achieve performance comparable to the existing BERT-based weakly supervised approaches that require a significant number of training examples. When compared to the RoBERTa baselines, our best approach establishes state-of-the-art performance with a 7% improvement in Exact Match.
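The idea of voting to obtain a pseudo label can be sketched as a simple majority vote. This is a minimal sketch under assumptions: the abstract does not say what casts the votes or how ties are handled, so the function name `vote_duration`, the use of duration units as labels, and the `min_agreement` threshold are all hypothetical.

```python
from collections import Counter


def vote_duration(candidates, min_agreement=0.5):
    """Return the majority duration unit, or None if agreement is too low.

    `candidates` is a list of duration units (e.g. "minutes") proposed for
    one event by several hypothetical voters.
    """
    if not candidates:
        return None
    unit, count = Counter(candidates).most_common(1)[0]
    # Require a strict majority by default so that ties yield no label.
    if count / len(candidates) > min_agreement:
        return unit
    return None


votes = ["minutes", "minutes", "hours"]
print(vote_duration(votes))
```

Events for which the vote returns `None` would simply be left out of the pseudo-labeled set, which trades coverage for label accuracy.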
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- Leisure & Entertainment (0.70)
- Media (0.48)
Sex robots are coming. We might even fall in love with them.
Is mutual love with a robot possible? And if it is possible, would it make relationships between human beings less desirable? Those are the questions examined by Lily Eva Frank, a philosophy professor at the Technical University of Eindhoven in the Netherlands who wrote an essay with Sven Nyholm for the new book Robot Sex. We already have sex robots, but the technology is still limited. Eventually, the machines will become sufficiently lifelike that the line between person and robot will be blurred.
Tennessee Offender Management Information System
Sentences for the 50,000 offenders vary from community work release and probation to lifelong incarceration. Tennessee was one of 38 states required by court order to improve prison conditions and reduce overcrowding; it is the target of over 300 inmate lawsuits each year. The new $14 million system is the largest and most comprehensive computer system ever developed in the field of corrections. Sentences C and D are consecutive to sentence B, and sentence B is consecutive to sentence A. To calculate sentences A, B, C, and D of an offender, as shown in figure 1, it must first be determined which sentence is not consecutive to any others. In this case, A is the sentence that must be calculated first because its dates do not depend on a previous sentence.
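The ordering problem in this example amounts to a dependency ordering: a sentence can be calculated only after the sentence it is consecutive to. A minimal sketch follows, assuming each sentence is consecutive to at most one other; the names `CONSECUTIVE_TO` and `calculation_order` are hypothetical and this is not how the described system is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Maps each sentence to the sentence it is consecutive to, or None if it
# depends on no other sentence. Mirrors the A-D example in the text.
CONSECUTIVE_TO = {"A": None, "B": "A", "C": "B", "D": "B"}


def calculation_order(consecutive_to):
    """Return sentences in an order where each follows its predecessor."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {
        sentence: ({prior} if prior is not None else set())
        for sentence, prior in consecutive_to.items()
    }
    return list(TopologicalSorter(graph).static_order())


# A comes first, then B, then C and D in either order.
print(calculation_order(CONSECUTIVE_TO))
```

If the sentence relationships ever formed a loop, `static_order()` would raise `graphlib.CycleError`, which is a useful check on the input data.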