 Information Extraction


My data security is better than yours: tech CEOs throw shade in privacy wars

The Guardian

"Privacy cannot be a luxury good offered only to people who can afford to buy premium products and services," declared Sundar Pichai, the chief executive officer of Google, in a New York Times op-ed this week. "Privacy must be equally available to everyone in the world." Pichai's column, published in conjunction with Google's annual developer conference, was a two-pronged public relations offensive: an attempt by the company that has been one of the chief architects and primary beneficiaries of digital surveillance to wrap itself in the mantle of privacy, while simultaneously taking a swipe at one of its competitors. In Silicon Valley, "privacy" is in 2019 what reclaimed wood was in 2010: a must-have design feature that signals a certain degree of authenticity and hipness and could also double as a weapon in a pinch. Pichai's broadside, in case you're not attuned to the subtleties of tech CEO shade, was aimed at Apple.


A Modern Hands-On Approach to Sentiment Analysis - Synerzip

#artificialintelligence

Human emotions are complex and difficult to decode. However, recent advancements in artificial intelligence and deep learning are enabling new leaps in sentiment analysis. Put simply, sentiment analysis is a machine decoding human emotions for a specific purpose. Applications range from mining opinions and gauging political inclinations to seeing how product reviews affect sales in real time. Social media companies actively use sentiment analysis to root out offensive and prejudiced content.


Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

arXiv.org Artificial Intelligence

In this paper, we consider the problem of open information extraction (OIE) for extracting entity- and relation-level intermediate structures from open-domain sentences. We focus on four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and propose a unified knowledge expression form, SAOKE, to express them. We publicly release a data set which contains more than forty thousand sentences and the corresponding facts in the SAOKE format labeled by crowd-sourcing. To our knowledge, this is the largest publicly available human-labeled data set for open information extraction tasks. Using this labeled SAOKE data set, we train an end-to-end neural model using the sequence-to-sequence paradigm, called Logician, to transform sentences into facts. Unlike existing algorithms, which generally extract each fact in isolation without considering other possible facts, Logician performs a global optimization over all facts involved in a sentence, in which facts not only compete with each other to attract the attention of words, but also cooperate to share words. An experimental study on various types of open-domain relation extraction tasks reveals the consistent superiority of Logician over other state-of-the-art algorithms. The experiments verify the reasonableness of the SAOKE format, the value of the SAOKE data set, the effectiveness of the proposed Logician model, and the feasibility of applying the end-to-end learning paradigm to supervised data sets for the challenging task of open information extraction.


Personal Facebook data was made publicly available on the internet, company admits

The Independent - Tech

Personal Facebook data was uploaded to be publicly accessible on the internet, the company has admitted. Hundreds of millions of records which included people's activity on the site had been stored on the internet in a way that allowed anyone to access them, cybersecurity firm UpGuard found. In all, more than 540 million of the records, including account names, comments and likes, were publicly available on Amazon's cloud servers after they were uploaded by two third-party apps.


Hundreds of millions of Facebook records exposed on public servers – report

The Guardian

More than 540m Facebook records were left exposed on public internet servers, cybersecurity researchers said on Wednesday, in just the latest security black eye for the company. Researchers for the firm UpGuard discovered two separate sets of Facebook user data on public Amazon cloud servers, the company detailed in a blogpost. One dataset, linked to the Mexican media company Cultura Colectiva, contained more than 540m records, including comments, likes, reactions, account names, Facebook IDs and more. The other set, linked to a defunct Facebook app called At the Pool, was significantly smaller, but contained plaintext passwords for 22,000 users. The large dataset was secured on Wednesday after Bloomberg, which first reported the leak, contacted Facebook.


Another scandal: Facebook user data reportedly at risk again

USATODAY - Tech Top Stories

In what seems like a broken record, Facebook is facing another scandal related to the transparency of its user data. The UpGuard cybersecurity firm reports that it uncovered two cases in which massive buckets of third-party Facebook app data were left exposed on the public internet. In one such case, a Mexico-based media company named Cultura Colectiva amassed 146 gigabytes of data with more than 540 million records. The records are said to include user comments, likes, reactions, account names, Facebook IDs and more. Another exposure, UpGuard says, came from a since-discontinued Facebook-integrated app called At The Pool and was apparently posted on a public Amazon cloud server.


Another 540 Million Facebook Users' Data Has Been Exposed

Slate

Facebook is still a privacy nightmare. The company's history of porous data sharing continues to haunt both it and us (its fairly helpless users) on the regular. On Wednesday, researchers from the cybersecurity firm UpGuard shared that they found two massive troves of exposed Facebook user data that had been posted publicly on Amazon cloud servers. The data included users' passwords, names, comments, and likes. The scope of this particular privacy foul from Facebook is tremendous: More than 540 million user records were sitting in plain sight, available to anyone who found them.


Third-party errors left over 540 million Facebook records exposed

Engadget

Facebook is embroiled in another privacy scandal, although this time it's not of the company's direct making. UpGuard researchers have discovered over 540 million Facebook interaction records left exposed by third parties using Amazon's cloud services. Nearly all of them come from Mexican media company Cultura Colectiva, which recorded account names, comments, Facebook IDs and likes, among other details. Another exposure comes from At the Pool, a long-defunct app that left 22,000 passwords unprotected in addition to other sensitive details. UpGuard didn't have much success getting Amazon to take down the content.


Facebook Exposed Data Again, but This Viral Cat Can Save Lives

WIRED

Researchers discovered hundreds of millions of Facebook users' data was left unprotected once again, this time on Amazon's servers. The information exposed was stuff like names, passwords, comments, interests, and likes. The tl;dr: Facebook doesn't seem to have much control over what third parties do with your data, basically ever, so you might want to lock down those privacy settings. President Trump has hosted everyone from foreign dignitaries to sitting members of Congress at his home away from home--Mar-a-Lago. But after a woman was arrested for sneaking in this week, it raised the question: How safe is this place where Donald Trump conducts major presidential business?


Deep Learning Sentiment Analysis of Amazon.com Reviews and Ratings

arXiv.org Machine Learning

Our study employs sentiment analysis to evaluate the compatibility of Amazon.com reviews with their corresponding ratings. Sentiment analysis is the task of identifying and classifying the sentiment expressed in a piece of text as being positive or negative. On e-commerce websites such as Amazon.com, consumers can submit their reviews along with a specific polarity rating. In some instances, there is a mismatch between the review and the rating. To identify the reviews with mismatched ratings, we performed sentiment analysis using deep learning on Amazon.com product review data. Product reviews were converted to vectors using paragraph vectors, which were then used to train a recurrent neural network with gated recurrent units. Our model incorporated both the semantic relationships in the review text and product information. We also developed a web service application that predicts the rating score for a submitted review using the trained model and, if there is a mismatch between the predicted rating score and the submitted rating score, provides feedback to the reviewer.