Information Extraction
Towards Controlled Transformation of Sentiment in Sentences
Leeftink, Wouter, Spanakis, Gerasimos
An obstacle to the development of many natural language processing products is the vast amount of training examples necessary to get satisfactory results. The generation of these examples is often a tedious and time-consuming task. This paper proposes a method to transform the sentiment of sentences in order to limit the work necessary to generate more training data. This means that one sentence can be transformed to an opposite sentiment sentence and should reduce by half the work required in the generation of text. The proposed pipeline consists of a sentiment classifier with an attention mechanism to highlight the short phrases that determine the sentiment of a sentence. Then, these phrases are changed to phrases of the opposite sentiment using a baseline model and an autoencoder approach. Experiments are run on both the separate parts of the pipeline and the end-to-end model. The sentiment classifier is tested on its accuracy and is found to perform adequately. The autoencoder is tested on how well it is able to change the sentiment of an encoded phrase, and it was found that such a task is possible. We use human evaluation to judge the performance of the full (end-to-end) pipeline, which reveals that a model using word vectors outperforms the encoder model. Numerical evaluation shows that a success rate of 54.7% is achieved on the sentiment change.
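The two-stage idea in the abstract above (highlight the sentiment-bearing words, then swap them for opposite-sentiment ones) can be illustrated with a toy sketch. This is not the authors' model: the lexicon, the softmax over lexicon scores standing in for learned attention, and the function names `attention_weights` and `flip_sentiment` are all assumptions made for illustration.

```python
import math

# Hypothetical toy lexicon: word -> (sentiment strength, opposite-sentiment word).
# A real system would learn these signals; the values here are made up.
LEXICON = {
    "great": (2.0, "terrible"),
    "terrible": (2.0, "great"),
    "good": (1.0, "bad"),
    "bad": (1.0, "good"),
    "love": (1.5, "hate"),
    "hate": (1.5, "love"),
}


def attention_weights(tokens):
    """Softmax over lexicon strengths; a stand-in for learned attention."""
    scores = [LEXICON.get(t.lower(), (0.0, None))[0] for t in tokens]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def flip_sentiment(sentence):
    """Replace highlighted sentiment words with their lexicon opposites."""
    tokens = sentence.split()
    if not tokens:
        return sentence
    weights = attention_weights(tokens)
    # A token counts as highlighted when its weight exceeds the uniform weight.
    threshold = 1.0 / len(tokens)
    flipped = []
    for token, weight in zip(tokens, weights):
        entry = LEXICON.get(token.lower())
        if entry is not None and weight > threshold:
            flipped.append(entry[1])
        else:
            flipped.append(token)
    return " ".join(flipped)


print(flip_sentiment("the food was great"))
```

The lexicon swap plays the role of a very crude baseline; the autoencoder approach described in the abstract would instead operate on encoded phrases.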
The Most Disturbing Thing About Facebook's Controversial Data Research Program
Facebook is ending a controversial research program in which it paid users up to $20 a month to install a smartphone app that gave the company nearly unfettered access to their activity. The move comes after the program was highlighted in a report by TechCrunch, and after Apple said the app violated its policies and revoked its certificate. A Facebook spokesperson says the program was not "secret," as some early reports suggested, and that it was opt-in. "It wasn't 'spying' as all of the people who signed up to participate went through a clear on-boarding process asking for their permission and were paid to participate," said the spokesperson. But the program had major privacy implications even still, and seemed likely to prey on the vulnerabilities of Facebook's most financially desperate users.
Apple escalates its fight with Facebook following report of data-collecting iPhone app
Apple has had it up to here with Facebook. Following a report by TechCrunch Tuesday night that the company had circumvented the App Store to distribute a "research" app to users, Apple has revoked a developer license from the social media giant, effectively shutting down any iOS apps that haven't already been approved for the App Store. While the move won't have an effect on your ability to post and message your friends using your iPhone, Facebook employees will certainly feel the repercussions. Without the developer certificate, Facebook's internal iOS apps, which likely include beta versions of its consumer apps as well as company-specific resources, will no longer work. Apple hasn't indicated whether this is a temporary ban or how it will monitor Facebook's activities in the future, but it sends a clear message: Play by our rules or pay the price.
QA4IE: A Question Answering based Framework for Information Extraction
Qiu, Lin, Zhou, Hao, Qu, Yanru, Zhang, Weinan, Li, Suoheng, Rong, Shu, Ru, Dongyu, Qian, Lihua, Tu, Kewei, Yu, Yong
Information Extraction (IE) refers to automatically extracting structured relation tuples from unstructured texts. Common IE solutions, including Relation Extraction (RE) and open IE systems, can hardly handle cross-sentence tuples, and are severely restricted by limited relation types as well as informal relation specifications (e.g., free-text based relation tuples). In order to overcome these weaknesses, we propose a novel IE framework named QA4IE, which leverages the flexible question answering (QA) approaches to produce high quality relation triples across sentences. Based on the framework, we develop a large IE benchmark with high quality human evaluation. This benchmark contains 293K documents, 2M golden relation triples, and 636 relation types. We compare our system with some IE baselines on our benchmark and the results show that our system achieves great improvements.
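The general idea of casting relation extraction as question answering can be sketched as follows: each relation type becomes a question about an entity, and the answer span (if any) becomes the object of a triple. This is only a minimal illustration, not the QA4IE system; the relation names, question templates, and the regex-based `toy_qa` function (a placeholder for a trained QA model) are all hypothetical.

```python
import re

# Hypothetical relation types mapped to question templates.
RELATION_QUESTIONS = {
    "birth_place": "Where was {entity} born?",
    "employer": "Who does {entity} work for?",
}


def toy_qa(document, question):
    """Placeholder for a trained QA model: picks a regex by a cue word."""
    cues = {
        "born": r"born in ([A-Z][a-z]+)",
        "work": r"works for ([A-Z][A-Za-z]+)",
    }
    for cue, pattern in cues.items():
        if cue in question:
            match = re.search(pattern, document)
            return match.group(1) if match else None
    return None


def extract_triples(document, entity):
    """Ask one question per relation type and keep the answered ones."""
    triples = []
    for relation, template in RELATION_QUESTIONS.items():
        answer = toy_qa(document, template.format(entity=entity))
        if answer is not None:
            triples.append((entity, relation, answer))
    return triples


doc = "Ada Lovelace was born in London. Years later she works for Acme."
print(extract_triples(doc, "Ada Lovelace"))
```

Because the QA step reads the whole document rather than a single sentence, the second triple is recovered even though the entity is only mentioned by pronoun in that sentence.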
Text Analytics and its applications - Dimensionless Data Science Blog
More than 90 per cent of total data was generated in the last two years. Of this, text data has a large chunk. Just think about how much you type each day. Ironically, I was thinking about it while writing this blog. And more than chunks of characters, there is information hidden in the text which can be extremely insightful if harnessed well.
Machine Learning & Applications: Complete Bundle - Total Training
This bundle includes 8 courses that will immerse you in the fields of Machine Learning & Analytics by teaching you the skills used to master both theory & practice. Learn how to install Python, and then use it to perform sentiment analysis, build a recommendation system, and so much more. With over 40 hours of expert instruction, by the time you've completed this bundle of courses, you'll have a firm grasp of core machine learning concepts and be on your way to applying this essential technology in your career.
Russia launches case against Facebook and Twitter over 'breach of data laws'
Russia has launched a civil case against Facebook and Twitter for failing to provide details about how they will comply with the country's data laws, according to local media reports. Communication watchdog Roskomnadzor said the social media firms had failed to explain exactly how local laws would be adhered to considering the companies both store data in centres outside of Russia. The Interfax news agency quoted the watchdog as saying that Twitter and Facebook had not explained how and when they would comply with legislation that requires all servers used to store Russians' personal data to be located in Russia. The agency's head, Alexander Zharov, was quoted as saying the companies have a month to provide information or else action would be taken against them. Russia has introduced tougher internet laws in the last five years, requiring search engines to delete some search results, messaging services to share encryption keys with security services and social networks to store Russian users' personal data on servers within the country.
Most people don't know about Facebook's invasive data practices, study finds
Three quarters of Facebook users are unaware that the social network lists their personal interests and traits for advertisers, according to new research. A study published by Pew Research Center revealed the scale of Facebook users' ignorance when it comes to how the tech giant uses their data to make money. Facebook consistently claims that it is transparent in its data collection practices, making it possible for people to find out how its algorithm categorises their interests via the 'Your ad preferences' page. But the research suggests that most users are unaware of this. In a survey of 963 US adults, 74 per cent said they did not know Facebook maintained lists of their interests and 51 per cent said they are not comfortable with Facebook compiling this information.
Deloitte: Agencies Could Use Natural Language Processing to Derive Insights From Unstructured Data
A Deloitte article says government agencies seeking to generate insights from unstructured data to facilitate the decision-making process and policy analysis could use artificial intelligence in the form of natural language processing. NLP has seven technical capabilities that could help agencies identify patterns, analyze public opinion and categorize topics, according to the article published Wednesday. Those capabilities are topic modeling; text categorization; text clustering; information extraction; named entity resolution; relationship extraction; and sentiment analysis. NLP could help address several issues across defense and national security, health care, energy and financial services domains and those issues include analyzing public feedback and improving regulatory compliance and predictions. The article cited how the Defense Advanced Research Projects Agency uses NLP in the Deep Exploration and Filtering of Text program to glean insights from data.