 Information Extraction


UK regulator to write to WhatsApp over Facebook data sharing

The Guardian

The UK's data regulator is writing to WhatsApp to demand that the chat app does not hand user data to Facebook, as millions worldwide continue to sign up for alternatives such as Signal and Telegram to avoid forthcoming changes to its terms of service. Elizabeth Denham, the information commissioner, told a parliamentary committee that in 2017, WhatsApp had committed not to hand any user information over to Facebook until it could prove that doing so respected GDPR. But, she said, that agreement was enforced by the Irish data protection authority until the Brexit transition period ended on 1 January. Now that Britain is fully outside the EU, ensuring that those promises are being kept falls to the Information Commissioner's Office. "The change in the terms of service, and the requirement of users to share information with Facebook, does not apply to UK users or to users in the EU," Denham told the digital, culture, media and sport sub-committee on online harms and disinformation, "and that's because in 2017 my office negotiated with WhatsApp so that they agreed not to share user information and contact information until they could show that they complied with the GDPR."


LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction

arXiv.org Artificial Intelligence

Open Information Extraction (OIE) systems seek to compress the factual propositions of a sentence into a series of n-ary tuples. These tuples are useful for downstream tasks in natural language processing like knowledge base creation, textual entailment, and natural language understanding. However, current OIE datasets are limited in both size and diversity. We introduce a new dataset by converting the QA-SRL 2.0 dataset to a large-scale OIE dataset (LSOIE). Our LSOIE dataset is 20 times larger than the next largest human-annotated OIE dataset. We construct and evaluate several benchmark OIE models on LSOIE, providing baselines for future improvements on the task. Our LSOIE data, models, and code are made publicly available.
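
As a rough illustration of what an n-ary tuple looks like, the sketch below models one extraction as a predicate plus an ordered list of arguments. The `Extraction` class, its field names, and the example sentence are hypothetical; they are not taken from LSOIE or its released code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extraction:
    """One n-ary OIE tuple: a predicate and its ordered arguments (hypothetical schema)."""
    predicate: str
    arguments: Tuple[str, ...]

    def as_tuple(self) -> Tuple[str, ...]:
        # Flat form used here for display: (arg0, predicate, arg1, arg2, ...)
        return (self.arguments[0], self.predicate) + self.arguments[1:]


# Hand-written example, not the output of any model.
sentence = "The regulator wrote to WhatsApp in January."
extractions: List[Extraction] = [
    Extraction("wrote", ("The regulator", "to WhatsApp", "in January")),
]

for extraction in extractions:
    print(extraction.as_tuple())
```

A binary (subject, relation, object) triple is the special case with two arguments; allowing more arguments is what makes the tuple n-ary.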


Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks

arXiv.org Artificial Intelligence

In zero-shot cross-lingual transfer, a supervised NLP task trained on a corpus in one language is directly applicable to another language without any additional training. A source of cross-lingual transfer can be as straightforward as lexical overlap between languages (e.g., use of the same scripts, shared subwords) that naturally forces text embeddings to occupy a similar representation space. Recently introduced cross-lingual language model (XLM) pretraining brings out neural parameter sharing in Transformer-style networks as the most important factor for the transfer. In this paper, we aim to validate the hypothetically strong cross-lingual transfer properties induced by XLM pretraining. Particularly, we take XLM-RoBERTa (XLMR) in our experiments that extend semantic textual similarity (STS), SQuAD and KorQuAD for machine reading comprehension (MRC), sentiment analysis, and alignment of sentence embeddings under various cross-lingual settings. Our results indicate that cross-lingual transfer is most pronounced in STS, followed by sentiment analysis, and least pronounced in MRC. That is, the complexity of a downstream task softens the degree of cross-lingual transfer. All of our results are empirically observed and measured, and we make our code and data publicly available.


Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

arXiv.org Artificial Intelligence

Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education. Most existing works decouple this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, which completely ignores the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) towards real-world scenarios, which is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction that takes a single document image as input and outputs the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and, conversely, provides higher-level semantic clues to contribute to the optimization of text spotting. Moreover, regarding the shortage of public benchmarks, we construct a fully-annotated dataset called EPHOIE (https://github.com/HCIILAB/EPHOIE), which is the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper heads with complex layouts and backgrounds, including a total of 15,771 Chinese handwritten or printed text instances. Compared with the state-of-the-art methods, our VIES shows significantly superior performance on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario.


Artificial Intelligence for Emotion-Semantic Trending and People Emotion Detection During COVID-19 Social Isolation

arXiv.org Artificial Intelligence

This outbreak, now more than a year long, is likely to have a significant impact on the mental health of many individuals who lost loved ones or lost personal contact with others due to strictly enforced public health guidelines of mandatory social segregation. Complex psychological reactions to COVID-19 regulatory mechanisms of mandatory quarantine and related emotional reactions have been recognized as hard to disentangle [1] - [4]. A study conducted in Belgium found social media to be positively associated with constructive coping for adolescents with anxious feelings during the quarantine period of COVID-19 [4]. Another study conducted among social media users during the COVID-19 pandemic in Spain was able to capture added stress placed on people's emotional health during the pandemic period [5]. However, because social media provides a platform for risk communication and for exchanging feelings and emotions to curb social isolation, its text data offers a wealth of information on the natural flow of people's emotional feelings and expressions [6]. This rich source of data can be utilized to curb the data collection barriers during the pandemic. The goal of this research was to use AI to uncover the hidden, implicit signal related to the emotional health of people subject to mandatory quarantine, embedded in a latent manner in their Twitter messages. Within the context of this paper, an NLP-based emotion detection system aims to provide useful information by examining unstructured text data used in social media. The purpose of the NLP system used herein is to show the meaning and emotions of users' expressions related to a particular topic, which can be used to understand their psychological health and emotional wellbeing.


Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis

arXiv.org Artificial Intelligence

Video sentiment analysis as a decision-making process is inherently complex, involving the fusion of decisions from multiple modalities and the resulting cognitive biases. Inspired by recent advances in quantum cognition, we show that the sentiment judgment from one modality could be incompatible with the judgment from another, i.e., the order matters and they cannot be jointly measured to produce a final decision. Thus the cognitive process exhibits "quantum-like" biases that cannot be captured by classical probability theories. Accordingly, we propose a fundamentally new, quantum cognitively motivated fusion strategy for predicting sentiment judgments. In particular, we formulate utterances as quantum superposition states of positive and negative sentiment judgments, and uni-modal classifiers as mutually incompatible observables, on a complex-valued Hilbert space with positive-operator valued measures. Experiments on two benchmarking datasets illustrate that our model significantly outperforms various existing decision-level and a range of state-of-the-art content-level fusion approaches. The results also show that the concept of incompatibility allows effective handling of all combination patterns, including those extreme cases that are wrongly predicted by all uni-modal classifiers.
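
The claim that "the order matters" for incompatible judgments can be seen in a toy calculation. The sketch below is a deliberate simplification and not the paper's model: it assumes a real two-dimensional state space and plain projective measurements instead of a complex-valued Hilbert space with positive-operator valued measures. The names `p_text` and `p_audio` and the numbers in `state` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def projector(u: Vector) -> Matrix:
    """Projector onto the direction of u (u need not be normalized)."""
    norm_sq = sum(x * x for x in u)
    return [[a * b / norm_sq for b in u] for a in u]


def sequential_prob(first: Matrix, second: Matrix, state: Vector) -> float:
    """Probability of answering 'positive' to `first` and then to `second`."""
    out = apply(second, apply(first, state))
    return sum(x * x for x in out)


# Utterance as a superposition of positive (1, 0) and negative (0, 1) judgments.
state = [0.8, 0.6]

# Two hypothetical uni-modal judgments whose 'positive' axes differ,
# so their projectors do not commute.
p_text = projector([1.0, 0.0])
p_audio = projector([1.0, 1.0])

text_then_audio = sequential_prob(p_text, p_audio, state)
audio_then_text = sequential_prob(p_audio, p_text, state)
print(text_then_audio, audio_then_text)
```

The two printed probabilities differ because the projectors do not commute; under classical probability the joint probability of two events would not depend on the order in which they are evaluated.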


WhatsApp Has Shared Your Data With Facebook for Years

WIRED

Since Facebook acquired WhatsApp in 2014, users have wondered and worried about how much data would flow between the two platforms. Many of them experienced a rude awakening this week, as a new in-app notification raises awareness about a step WhatsApp actually took to share more with Facebook back in 2016. On Monday, WhatsApp updated its terms of use and privacy policy, primarily to expand on its practices around how WhatsApp business users can store their communications. A pop-up has been notifying users that as of February 8, the app's privacy policy will change and they must accept the terms to keep using the app. As part of that privacy policy refresh, WhatsApp also removed a passage about opting out of sharing certain data with Facebook: "If you are an existing user, you can choose not to have your WhatsApp account information shared with Facebook to improve your Facebook ads and products experiences."


Digital.com Reviews & Comparisons Give Online Businesses Wings To Succeed - Digital.com

#artificialintelligence

Unlike so many review sites, we look at what real people say. We apply sentiment analysis to reviews about small business online tools, products and services, and we use real people's approval ratings to score companies, because we think this is the right way to provide you with honest and unbiased reviews of brands like SiteGround, BlueHost & Wix. Being a small business, we know that it is often difficult to decide which service to use, especially when it comes to something you are going to spend hundreds of dollars on, like web hosting, a website builder or an ecommerce platform.


WhatsApp: Let us share your data with Facebook or else

Engadget

In a surprise move, WhatsApp recently gave many of its users a difficult choice: they could either accept a revised privacy policy that explicitly allowed the service to share information with parent company Facebook by February 8th, or decline and risk not being able to use the service at all. The company informed those users through an in-app notification which lays out the changes in very broad terms: the updates to the policy include "more information about WhatsApp's service and how we process your data, how businesses can use Facebook hosted services to store and manage their WhatsApp chats, [and] how we partner with Facebook to offer integrations across the Facebook Company Products." Upon further inspection, the updated policy makes clear that data collected by WhatsApp -- including user phone numbers, "transaction data, service-related information, information on how you interact with others (including businesses) when using our Services, mobile device information, your IP address" and more -- is subject to being shared with other properties owned and controlled by Facebook. "As part of the Facebook Companies, WhatsApp receives information from, and shares information (see here) with, the other Facebook Companies," the updated privacy policy reads. "We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Facebook Company Products."


A Joint Training Dual-MRC Framework for Aspect Based Sentiment Analysis

arXiv.org Artificial Intelligence

Aspect based sentiment analysis (ABSA) involves three fundamental subtasks: aspect term extraction, opinion term extraction, and aspect-level sentiment classification. Early works focused only on solving one of these subtasks individually. Some recent work focused on solving a combination of two subtasks, e.g., extracting aspect terms along with sentiment polarities or extracting the aspect and opinion terms pairwise. More recently, the triple extraction task has been proposed, i.e., extracting the (aspect term, opinion term, sentiment polarity) triples from a sentence. However, previous approaches fail to solve all subtasks in a unified end-to-end framework. In this paper, we propose a complete solution for ABSA. We construct two machine reading comprehension (MRC) problems, and solve all subtasks by jointly training two BERT-MRC models with parameter sharing. We conduct experiments on these subtasks, and results on several benchmark datasets demonstrate the effectiveness of our proposed framework, which significantly outperforms existing state-of-the-art methods.
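
To make the target of triple extraction concrete, the sketch below writes out the (aspect term, opinion term, sentiment polarity) triples for one hand-labelled sentence. The `Triple` class, the helper functions, and the example sentence are hypothetical; the sketch shows only the output format and says nothing about how the dual-MRC models produce it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Triple:
    """One ABSA triple (hypothetical schema)."""
    aspect: str    # aspect term, a span of the sentence
    opinion: str   # opinion term, a span of the sentence
    polarity: str  # "positive", "negative", or "neutral"


# Hand-labelled example, not model output.
sentence = "The pizza was great but the service was slow."
triples: List[Triple] = [
    Triple("pizza", "great", "positive"),
    Triple("service", "slow", "negative"),
]


def spans_in_sentence(text: str, items: List[Triple]) -> bool:
    """Extraction means every aspect and opinion term is copied from the text."""
    return all(t.aspect in text and t.opinion in text for t in items)


def polarity_by_aspect(items: List[Triple]) -> Dict[str, str]:
    """Aspect-level sentiment classification, read off the triples."""
    return {t.aspect: t.polarity for t in items}


print(spans_in_sentence(sentence, triples))
print(polarity_by_aspect(triples))
```

Each of the three subtasks corresponds to one field of the triple, which is why a single sentence can carry opposite polarities for different aspects.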