 Information Extraction


Mining Facebook data for science

#artificialintelligence

It seems Christmas is coming early this year for social scientists. That's because just months after Harvard's Gary King wrote an academic paper about a system that would allow researchers to access the massive data troves held by Facebook and other private companies, it is set to become a reality. Along with collaborator Nathaniel Persily at Stanford University, King, the Albert J. Weatherhead III University Professor, created an organization called Social Science One that will lead the effort to identify data inside Facebook, prepare it for researchers, and fund numerous scholars to analyze the data. The organization is today making available for research the first of what King says will be many data sets, more than half a trillion numbers that include every link clicked by Facebook users in the last year, information on the types of people who clicked, and indicators of whether links were judged to be intentionally false news stories. "As social scientists, our goal is to understand and solve the greatest challenges that affect human society," King said.


World's First Cognitive Dance Party - Daybreaker with Watson

#artificialintelligence

IBM Watson and Daybreaker hosted the World's First Cognitive Dance Party in San Francisco using Watson Tone Analyzer, Watson Personality Insights, Chef Watson and Watson Beat. With the Personality Insights API, Daybreaker was able to base the colors, music playlists, kick-off fitness session, healthy breakfast, and intention card on each attendee's personality. Tone Analyzer drove the color of a rising cognitive sun based on sentiment analysis of tweets from around the country, while Watson Beat created new riffs using inputs from pianist ELEW, applying one or several of its musical filters. Even the breakfast was courtesy of Chef Watson; it featured unexpected ingredient combinations, again tailored to attendee personality.


Britain to fine Facebook over data breach

The Japan Times

LONDON – Britain's data regulator will fine Facebook half a million pounds ($660,000) for failing to protect users' data, in an inquiry into whether personal information had been misused by campaigns on both sides of Britain's 2016 EU referendum. An investigation by the Information Commissioner's Office (ICO) has focused on the social media giant since earlier this year, when evidence emerged that an app had been used to harvest the data of tens of millions of Facebook users worldwide. In a progress report early Wednesday the watchdog said it plans to issue Facebook with the maximum fine available to it for breaches of the Data Protection Act. "The ICO's investigation concluded that Facebook contravened the law by failing to safeguard people's information," it said, adding that the company had "failed to be transparent about how people's data was harvested by others." Facebook has admitted that up to 87 million users may have had their data hijacked by British consultancy firm Cambridge Analytica, which was working for U.S. President Donald Trump's 2016 campaign.


Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis

arXiv.org Machine Learning

Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities. The central challenge in multimodal learning involves learning representations that can process and relate information from multiple modalities. In this paper, we propose two methods for unsupervised learning of joint multimodal representations using sequence to sequence (Seq2Seq) methods: a Seq2Seq Modality Translation Model and a Hierarchical Seq2Seq Modality Translation Model. We also explore multiple different variations on the multimodal inputs and outputs of these seq2seq models. Our experiments on multimodal sentiment analysis using the CMU-MOSI dataset indicate that our methods learn informative multimodal representations that outperform the baselines and achieve improved performance on multimodal sentiment analysis, specifically in the Bimodal case where our model is able to improve F1 Score by twelve points. We also discuss future directions for multimodal Seq2Seq methods.


Natural Language Processing for Information Extraction

arXiv.org Artificial Intelligence

With the rise of the digital age, there is an explosion of information in the form of news, articles, social media, and so on. Much of this data is unstructured, and managing and making effective use of it manually is tedious and labor intensive. This explosion of information, and the need for more sophisticated and efficient information handling tools, gives rise to Information Extraction (IE) and Information Retrieval (IR) technology. Information Extraction systems take natural language text as input and produce structured information, specified by certain criteria, that is relevant to a particular application. Various sub-tasks of IE, such as Named Entity Recognition, Coreference Resolution, Named Entity Linking, Relation Extraction, and Knowledge Base reasoning, form the building blocks of high-end Natural Language Processing (NLP) tasks such as Machine Translation, Question-Answering Systems, Natural Language Understanding, Text Summarization, and Digital Assistants like Siri, Cortana and Google Now. This paper introduces Information Extraction technology and its various sub-tasks, and highlights state-of-the-art research in those sub-tasks, current challenges, and future research directions.
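As a rough illustration of "natural language text in, structured information out", the sketch below extracts one relation type with a hand-written pattern. The pattern, the `employed_by` label, the function name and the sample names are all assumptions made for this example and do not come from the paper; practical IE systems typically rely on learned models for entity recognition and relation extraction rather than a single regular expression.

```python
import re

# Hypothetical pattern: a capitalised multi-word name, a trigger phrase,
# then a capitalised organisation name. Purely illustrative.
PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)+) "
    r"(?:works at|joined) "
    r"(?P<org>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)"
)

def extract_relations(text):
    """Return one structured record per pattern match in the text."""
    return [
        {
            "person": m.group("person"),
            "relation": "employed_by",  # assumed label for this sketch
            "organization": m.group("org"),
        }
        for m in PATTERN.finditer(text)
    ]

sample = "Jane Doe works at Acme Labs. John Smith joined Example Corp last year."
records = extract_relations(sample)
```

Here `records` holds two dictionaries, one per sentence, each with `person`, `relation` and `organization` fields that a downstream application could store or query.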


Sentiment Analysis: nearly everything you need to know | MonkeyLearn

#artificialintelligence

Sentiment analysis is the automated process of understanding an opinion about a given subject from written or spoken language. In a world where we generate 2.5 quintillion bytes of data every day, sentiment analysis has become a key tool for making sense of that data. This has allowed companies to get key insights and automate all kinds of processes. But… How does it work? What are the different approaches? What are its caveats and limitations? How can you use sentiment analysis in your business? Below, you'll find the answers to these questions and everything you need to know about sentiment analysis. Whether you are an experienced data scientist, a coder, a marketer, a product analyst, or just getting started, this comprehensive guide is for you. How Does Sentiment Analysis Work? Sentiment analysis, also known as opinion mining, is a field within Natural Language Processing (NLP) that builds systems that try to identify and extract opinions within text. Currently, sentiment analysis is a topic of great interest and development since it has many practical applications. Since the amount of publicly and privately available information on the Internet is constantly growing, a large number of texts expressing opinions are available on review sites, forums, blogs, and social media. With the help of sentiment analysis systems, this unstructured information can be automatically transformed into structured data about public opinions on products, services, brands, politics, or any topic that people can express opinions about. This data can be very useful for commercial applications like marketing analysis, public relations, product reviews, net promoter scoring, product feedback, and customer service. Before going into further details, let's first give a definition of opinion. Text information can be broadly categorized into two main types: facts and opinions. Facts are objective expressions about something. 
Opinions are usually subjective expressions that describe people's sentiments, appraisals, and feelings toward a subject or topic. In an opinion, the entity the text talks about can be an object, its components, its aspects, its attributes, or its features.
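To make the facts-versus-opinions distinction concrete, here is a minimal sketch of one simple approach, a word-list (lexicon) scorer. The lexicon entries, weights and function names are assumptions invented for this example, not something the guide specifies, and production systems are usually far more elaborate.

```python
import re

# Toy lexicon: words and weights are illustrative assumptions only.
LEXICON = {
    "great": 1, "love": 1, "good": 1,
    "bad": -1, "terrible": -1, "hate": -1,
}

def sentiment_score(text):
    """Sum the lexicon weights of every word found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)

def classify(text):
    """Map the numeric score to a coarse polarity label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

opinion = classify("I love this phone, the camera is great")
fact = classify("The phone has a 6-inch screen")
```

With this lexicon the opinionated sentence is labelled positive, while the factual sentence contains no lexicon words and falls through to neutral. A scorer this simple ignores negation, sarcasm and context.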


Extracting Actionable Knowledge from Domestic Violence Discourses on Social Media

arXiv.org Machine Learning

Domestic violence (DV) is considered a major social issue, and there is a strong relationship between DV and its impact on public health. Existing research has used social media to track and analyse real-world events such as emerging trends, natural disasters, user sentiment, political opinions, and health care. However, less attention has been given to social welfare issues like DV and their impact on public health. Recently, victims of DV have turned to social media platforms to express their feelings in posts and to seek social and emotional support, sympathetic encouragement, compassion and empathy from the public. However, it is difficult to mine actionable knowledge from large conversational social media datasets because the data are high-dimensional, short, noisy, high-volume, high-velocity, and so on. Hence, this paper proposes a novel framework to model and discover the various themes related to DV from the public domain. The proposed framework could provide valuable information to public health researchers, national family health organizations, government and the public, with data enrichment and consolidation, to improve the social welfare of the community. It thus provides actionable knowledge by monitoring and analysing continuous and rich user-generated content.


The FBI, SEC and Justice Department Now Want to Know What Facebook Knew About Cambridge Analytica

TIME - Tech

A federal probe into Facebook's sharing of user data with Cambridge Analytica now involves the FBI, the Securities and Exchange Commission and the Justice Department, the Washington Post reported. Representatives from these agencies have joined the Federal Trade Commission in the inquiry, the newspaper reported, citing five unnamed people familiar with the matter. Those people spoke on condition of anonymity because the probes are not complete. The probe reportedly centers on what Facebook knew in 2015, when it learned that the political data-mining firm Cambridge Analytica had improperly accessed the personal data of tens of millions of Facebook users. Facebook didn't disclose the incident with the political firm, which later worked for the Trump campaign and other Republican candidates, until this March.


Probe Into Facebook's Data Breach Broadens: Washington Post

U.S. News

The emphasis has been on what Facebook has reported publicly about its sharing of information with Cambridge Analytica, whether those representations square with the underlying facts and whether Facebook made sufficiently complete and timely disclosures to the public and investors about the matter, the Washington Post report said.


Learning Semantic Sentence Embeddings using Pair-wise Discriminator

arXiv.org Artificial Intelligence

While the problem of securing word-level embeddings is very well studied, we propose a novel method for obtaining sentence-level embeddings. This is obtained by a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrases, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far. This is ensured by using a sequential pair-wise discriminator that shares weights with the encoder and is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances from being too large. This loss is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings on a sentiment analysis task. The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis tasks on standard datasets. These results are also shown to be statistically significant.
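The abstract does not give the loss itself, so the following is only a generic margin-based pairwise loss, sketched to show the "paraphrases close, unrelated sentences far" idea. The Euclidean distance, the `margin` value and the function names are assumptions, not the authors' formulation.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pairwise_margin_loss(anchor, paraphrase, unrelated, margin=1.0):
    """Hinge-style loss: zero once the unrelated sentence is at least
    `margin` farther from the anchor than the true paraphrase is."""
    d_pos = euclidean(anchor, paraphrase)  # should be small
    d_neg = euclidean(anchor, unrelated)   # should be large
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings, chosen only to exercise both branches.
well_separated = pairwise_margin_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0])
too_close = pairwise_margin_loss([0.0, 0.0], [0.1, 0.0], [0.2, 0.0])
```

In the first call the unrelated embedding is far away, so the loss is zero; in the second it sits almost as close as the paraphrase, so the loss is positive and a gradient step would push the two apart.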