 Information Extraction


Getting your data out of Tinder is hard. It shouldn't be | Paul-Olivier Dehaye

The Guardian

When a journalist approached me to help her get a copy of her personal data from Tinder, I knew this would be a good story. Judith Duportail had read my work researching the use of psychometrics during the US elections and the Brexit referendum. Duportail knew that Tinder computes a "desirability score" for its users: Tinder's CEO had told another journalist their score, emphasising how complex and advanced its algorithm supposedly was. Curiosity piqued, Duportail wondered whether Tinder would tell her, or any other user who asked, their score, and how it was computed. In theory, any European company is obliged to disclose the personal data it holds about any individual who asks. Companies even have to disclose the "logic of the processing" of that data.


Amazon Machine Learning Summary

#artificialintelligence

Amazon Machine Learning is a part of the Amazon Artificial Intelligence (AI) family, which includes the Amazon Rekognition, Amazon Lex, and Amazon Polly services. Amazon Machine Learning provides AWS customers with an easy way to take advantage of the benefits of complex machine learning capabilities without requiring extensive AI domain expertise. The Machine Learning service enables easy addition of features like fraud detection, sentiment analysis, and customer churn prediction to applications and products. The Amazon Machine Learning service was announced at the 2016 AWS re:Invent conference along with other members of the AI family: Rekognition, Lex, and Polly.


Characterizing Diabetes, Diet, Exercise, and Obesity Comments on Twitter

arXiv.org Machine Learning

Social media provide a platform for users to express their opinions and share information. Understanding public health opinions on social media, such as Twitter, offers a unique approach to characterizing common health issues such as diabetes, diet, exercise, and obesity (DDEO). However, collecting and analyzing a large-scale conversational public health data set is a challenging research task. The goal of this research is to analyze the characteristics of the general public's opinions in regard to DDEO as expressed on Twitter. A multi-component semantic and linguistic framework was developed to collect Twitter data, discover topics of interest about DDEO, and analyze the topics. From the extracted 4.5 million tweets, 8% of tweets discussed diabetes, 23.7% diet, 16.6% exercise, and 51.7% obesity. The strongest correlation among the topics was determined between exercise and obesity. Other notable correlations were diabetes and obesity, and diet and obesity. DDEO terms were also identified as subtopics of each of the DDEO topics. The frequent subtopics discussed along with Diabetes, excluding the DDEO terms themselves, were blood pressure, heart attack, yoga, and Alzheimer. The non-DDEO subtopics for Diet included vegetarian, pregnancy, celebrities, weight loss, religious, and mental health, while subtopics for Exercise included computer games, brain, fitness, and daily plan. Non-DDEO subtopics for Obesity included Alzheimer, cancer, and children. With 2.67 billion social media users in 2016, publicly available data such as Twitter posts can be utilized to support clinical providers, public health experts, and social scientists in better understanding common public opinions in regard to diabetes, diet, exercise, and obesity.


Computational Content Analysis of Negative Tweets for Obesity, Diet, Diabetes, and Exercise

arXiv.org Machine Learning

Social media based digital epidemiology has the potential to support faster response and deeper understanding of public health related threats. This study proposes a new framework to analyze unstructured health related textual data via Twitter users' posts (tweets) to characterize the negative health sentiments and non-health related concerns in relation to the corpus of negative sentiments regarding Diet, Diabetes, Exercise, and Obesity (DDEO). Through the collection of 6 million tweets over one month, this study identified the prominent topics of users as they relate to the negative sentiments. Our proposed framework uses two text mining methods, sentiment analysis and topic modeling, to discover negative topics. The negative sentiments of Twitter users support the literature narratives and the many morbidity issues that are associated with DDEO and the linkage between obesity and diabetes. The framework offers a potential method to understand the public's opinions and sentiments regarding DDEO. More importantly, this research provides new opportunities for computational social scientists, medical experts, and public health professionals to collectively address DDEO-related issues.
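The two-stage idea in this abstract (filter tweets by negative sentiment, then look for topics among them) can be illustrated with a minimal Python sketch. This is not the authors' framework: the word lists are hypothetical, the sentiment step is a crude lexicon lookup, and plain term counting stands in for a real topic model.

```python
from collections import Counter

# Hypothetical word lists, chosen only for illustration.
NEGATIVE_WORDS = {"hate", "bad", "sick", "tired", "worst"}
STOPWORDS = {"i", "my", "the", "a", "is", "of", "and", "this", "so"}

def is_negative(tweet):
    """Crude lexicon-based sentiment check: any negative word marks the tweet."""
    return any(token in NEGATIVE_WORDS for token in tweet.lower().split())

def top_terms(tweets, n=3):
    """Count content words; a stand-in for topic modeling, not a topic model."""
    counts = Counter()
    for tweet in tweets:
        for token in tweet.lower().split():
            if token not in STOPWORDS and token not in NEGATIVE_WORDS:
                counts[token] += 1
    return counts.most_common(n)

def negative_topics(tweets, n=3):
    """Stage 1: keep negative tweets. Stage 2: summarize their frequent terms."""
    return top_terms([t for t in tweets if is_negative(t)], n)

tweets = ["I hate this diet", "love my exercise plan", "diet is the worst"]
print(negative_topics(tweets, n=1))
```

A real pipeline would replace both stages with trained models; the sketch only shows how the output of the sentiment step feeds the topic step.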


Sentiment Analysis Just Got Smarter

@machinelearnbot

Sentiment analysis, sometimes called opinion mining, is one of the easiest and quickest ways to find out what consumers are thinking about a brand, product or event. It's a natural language processing technique, often used in social listening scenarios, that aims to systematically identify opinions in a document and give the document a score of positive, negative or neutral. There are few things as mind-numbingly tedious as manually tagging documents with the right sentiment because the technology doesn't get it. Sentiment analysis (ironically) has a bad reputation in the social listening industry because, truth be told, it needs a lot of manual work to deliver great results. Our data science guys (the brains behind our award-winning image recognition technology) have been working on fixing this behind the scenes, and I'm excited to finally share their fantastic results.


Text Compression for Sentiment Analysis via Evolutionary Algorithms

arXiv.org Machine Learning

Can textual data be compressed intelligently without losing accuracy in evaluating sentiment? In this study, we propose a novel evolutionary compression algorithm, PARSEC (PARts-of-Speech for sEntiment Compression), which makes use of Parts-of-Speech tags to compress text in a way that sacrifices minimal classification accuracy when used in conjunction with sentiment analysis algorithms. An analysis of PARSEC with eight commercial and non-commercial sentiment analysis algorithms on twelve English sentiment data sets reveals that accurate compression is possible with (0%, 1.3%, 3.3%) loss in sentiment classification accuracy for (20%, 50%, 75%) data compression with PARSEC using LingPipe, the most accurate of the sentiment algorithms. Other sentiment analysis algorithms are more severely affected by compression. We conclude that significant compression of text data is possible for sentiment analysis depending on the accuracy demands of the specific application and the specific sentiment analysis algorithm used.
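As a rough illustration of what compressing text by Parts-of-Speech tag means, the Python sketch below drops every token whose tag is in a chosen set and reports the fraction of tokens removed. It is not PARSEC: the tag set is hand-picked rather than evolved, and the pre-tagged input and tag names are hypothetical.

```python
def compress_by_pos(tagged_tokens, drop_tags):
    """Keep only the words whose POS tag is not in drop_tags."""
    return [word for word, tag in tagged_tokens if tag not in drop_tags]

def compression_ratio(original_len, compressed_len):
    """Fraction of tokens removed (0.0 means nothing was dropped)."""
    if original_len == 0:
        return 0.0
    return 1 - compressed_len / original_len

# Hypothetical pre-tagged sentence; a real pipeline would use a POS tagger.
tagged = [("the", "DET"), ("movie", "NOUN"), ("was", "VERB"),
          ("really", "ADV"), ("great", "ADJ")]

kept = compress_by_pos(tagged, drop_tags={"DET", "VERB"})
print(kept, compression_ratio(len(tagged), len(kept)))
```

In an evolutionary setting, the set passed as `drop_tags` would be the thing being searched over, scored by how much classification accuracy survives at a given compression level.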


Google rolls out improvements to classification, sentiment analysis in Natural Language API

#artificialintelligence

This is a Techmeme archive page. It shows how the site appeared at 3:50 PM ET, September 19, 2017.


I Feel, Therefore I Am

#artificialintelligence

Although the quest for Artificial Intelligence (AI), equipping trading algorithms with human qualities such as self-learning, continues to fascinate, it will be the explosion of the Internet of Things that will soon re-energize trading in capital markets. The Internet of Things (IoT) is rapidly growing through the addition of sensors to machines that allow them to "feel." Once they are equipped with feelings (particularly sight, sound and touch), machines can behave more intelligently, for example optimizing operations to use less fuel or predicting when they need maintenance. However, an interesting side effect is that the data from the IoT could be a new source of "insider" data for trading firms. For example, if combine harvesters (accessorized with sensors) signal a bumper wheat crop in the U.S. grain belt, traders can take advantage of this information before the crop report is issued.


3 ways cognitive technology can help you better understand people - Watson

#artificialintelligence

February 7, 2017. Written by: Susan C. Daffron. IBM surveyed more than 600 decision-makers about their cognitive initiatives, and 62 percent of respondents stated that the results of their cognitive implementations exceed expectations. Cognitive services, like those offered by IBM Watson, can help you find out how your customers feel and help you predict what they might do. With Watson, IBM is pioneering the development of models that can tell you about different, and often hidden, aspects of an individual. These insights can then be used by an organization to deepen relationships, shape initiatives and drive innovation. REST APIs, like Watson Personality Insights and Watson Emotion Analysis, allow organizations to learn about an individual's: Organizations can now train apps to quickly analyze and interpret large volumes of unstructured sensory data.


Overcoming Language Variation in Sentiment Analysis with Social Attention

arXiv.org Artificial Intelligence

Variation in language is ubiquitous, particularly in newer forms of writing such as social media. Fortunately, variation is not random; it is often linked to social properties of the author. In this paper, we show how to exploit social networks to make sentiment analysis more robust to social language variation. The key idea is linguistic homophily: the tendency of socially linked individuals to use language in similar ways. We formalize this idea in a novel attention-based neural network architecture, in which attention is divided among several basis models, depending on the author's position in the social network. This has the effect of smoothing the classification function across the social network, and makes it possible to induce personalized classifiers even for authors for whom there is no labeled data or demographic metadata. This model significantly improves the accuracies of sentiment analysis on Twitter and on review data.
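The idea of dividing attention among several basis models can be sketched in Python as a softmax-weighted mixture of their predictions. This is a simplified illustration under stated assumptions, not the paper's architecture: the attention scores are passed in directly as hypothetical numbers, whereas in the paper they depend on the author's position in the social network, and each basis model is reduced to a single probability of positive sentiment.

```python
import math

def softmax(scores):
    """Turn raw attention scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mix_basis_models(basis_probs, attention_scores):
    """Attention-weighted average of each basis model's P(positive).

    basis_probs: one probability per basis model for the same text.
    attention_scores: one raw score per basis model for this author
    (hypothetical inputs here; the paper derives them from the social network).
    """
    weights = softmax(attention_scores)
    return sum(w * p for w, p in zip(weights, basis_probs))

# Two basis models disagree; the author's scores favor the first one.
print(mix_basis_models([0.9, 0.1], [2.0, 0.0]))
```

Because socially linked authors would receive similar attention scores, they would also receive similar mixtures, which is one way to read the smoothing effect described in the abstract.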