 subtext


HistRED: A Historical Document-Level Relation Extraction Dataset

arXiv.org Artificial Intelligence

Despite the extensive applications of relation extraction (RE) tasks in various domains, little has been explored in the historical context, which contains promising data across hundreds and thousands of years. To promote historical RE research, we present HistRED, constructed from Yeonhaengnok. Yeonhaengnok is a collection of records originally written in Hanja, the classical Chinese writing, and later translated into Korean. HistRED provides bilingual annotations such that RE can be performed on Korean and Hanja texts. In addition, HistRED supports various self-contained subtexts with different lengths, from a sentence level to a document level, supporting diverse context settings for researchers to evaluate the robustness of their RE models. To demonstrate the usefulness of our dataset, we propose a bilingual RE model that leverages both Korean and Hanja contexts to predict relations between entities. Our model outperforms monolingual baselines on HistRED, showing that employing multiple language contexts supplements the RE predictions. The dataset is publicly available at https://huggingface.co/datasets/Soyoung/HistRED under the CC BY-NC-ND 4.0 license.


Training computers to tease out subtext behind text

#artificialintelligence

WEST LAFAYETTE, Ind. – It is hard enough for humans to interpret the deeper meaning and context of social media and news articles. Asking computers to do it is a nearly impossible task. Even C-3PO, fluent in over 6 million forms of communication, misses the subtext much of the time. Natural language processing, the subfield of artificial intelligence connecting computers with human languages, uses statistical methods to analyze language, often without incorporating the real-world context needed for understanding the shifts and currents of human society. To do that, you have to translate online communication, and the context from which it emerges, into something the computers can parse and reason over.


SASICM A Multi-Task Benchmark For Subtext Recognition

arXiv.org Artificial Intelligence

Subtext is a kind of deep semantics that can be acquired after one or more rounds of expression transformation. As a popular way of expressing one's intentions, it is well worth studying. In this paper, we try to make computers recognize whether a subtext is present by means of machine learning. We build a Chinese dataset whose source data comes from popular social media platforms (e.g., Weibo, Netease Music, Zhihu, and Bilibili). In addition, we build a baseline model called SASICM for subtext recognition. The F1 score of SASICMg, whose pretrained model is GloVe, is as high as 64.37%, which is 3.97% higher than that of the BERT-based model, 12.7% higher on average than that of traditional methods (support vector machine, logistic regression, maximum entropy, naive Bayes, and decision tree classifiers), and 2.39% higher than that of the state of the art (MARIN and BTM). The F1 score of SASICMBERT, whose pretrained model is BERT, is 65.12%, which is 0.75% higher than that of SASICMg. The accuracy rates of SASICMg and SASICMBERT are 71.16% and 70.76%, respectively, which are competitive with those of the other methods mentioned above.
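The abstract above reports most results as margins rather than as raw scores. A minimal sketch of the implied baseline F1 values follows, assuming the quoted differences are absolute percentage points; the abstract does not say so explicitly, but the 0.75 gap between the two SASICM variants matches plain subtraction. All variable and function names here are illustrative, and the "traditional methods" figure is an average over five classifiers, not any single model's score.

```python
# F1 scores quoted in the abstract, in percent.
SASICM_G_F1 = 64.37
SASICM_BERT_F1 = 65.12

# Margins by which SASICMg is reported to beat each comparison group.
# Assumption: these are absolute percentage points, not relative gains.
MARGINS = {
    "BERT-based model": 3.97,
    "traditional methods (average)": 12.7,
    "state of the art (MARIN, BTM)": 2.39,
}


def implied_baselines(reference_f1, margins):
    """Subtract each reported margin from the reference F1 score."""
    return {name: round(reference_f1 - gap, 2) for name, gap in margins.items()}


baselines = implied_baselines(SASICM_G_F1, MARGINS)

# Consistency check on the one gap that can be verified from the abstract itself.
bert_variant_gain = round(SASICM_BERT_F1 - SASICM_G_F1, 2)

if __name__ == "__main__":
    for name, score in baselines.items():
        print(f"{name}: {score:.2f}")
    print(f"SASICMBERT over SASICMg: {bert_variant_gain:.2f}")
```

Under that assumption the BERT-based baseline would sit at roughly 60.40% F1, the traditional methods at about 51.67% on average, and the prior state of the art at about 61.98%.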


A Human Responds to a Robot's Essay

#artificialintelligence

What does GPT-3's AI-generated op-ed teach us about ourselves? The answers are in the subtext. Well, readers, it finally happened. I've been replaced by a robot. Last week, The Guardian published an essay "written" by GPT-3, OpenAI's new language generator. According to the news outlet, "GPT-3 is a cutting edge language model that uses machine learning to produce human like text. It takes in a prompt, and attempts to complete it."


A.I. Artificial Intelligence shows us a future where we neglect to dream

#artificialintelligence

The Verge is a place where you can consider the future. In Yesterday's Future, we revisit a movie about the future and consider the things it tells us about today, tomorrow, and yesterday. The future: A.I. begins with a brief summary of the sorry state of the world: climate change has melted the polar ice caps, wiping out coastal cities and severely reducing the human population. With regulations in place for reproduction on a resource-starved planet, corporations developed Mecha -- androids that appear human but lack emotions. They're seen as objects -- useful for labor or sex work, just human enough to not be strange but machine enough to not mistake them for people.


The Secret of Nym.health: Autonomous Medical Coding | The official blog for dotHealth LLC - .health domain names

#artificialintelligence

We recently asked Alexa if she could code a few medical charts for us. "Sorry I don't know that." After all, the U.S. healthcare industry spends billions of dollars on 250,000 medical coders every year to do the job. This way of doing business might be error-prone, inefficient, and bound by constantly changing regulations, but hey, IT IS a solution. But you know what else is a solution?


When your tech knows you better than you know yourself

#artificialintelligence

For more on new technology that can read human emotions, check out the third episode of Should This Exist?, the podcast that debates how emerging technologies will impact humanity. If we were sitting across a table from each other at a cafe and I asked about your day, you might answer with a polite response, like, "Fine." But if you were lying, I'd know from your expression, tone, twitches, and tics. We read subtext--unspoken clues--to get at the truth, to cut through what people say to understand what they mean. And now, with so many of our exchanges taking place in text online, much of our messaging, traditionally delivered via subtext, tells us less than ever before.