AITopics | Information Extraction

Information Extraction

Brooklyn Nine-Nine Meets Data Science

#artificialintelligence, Dec-25-2019, 14:29:30 GMT

"This job [Data Scientist] is eating me alive. I spent all these years trying to be the good guy, the man in the white hat. I'm not becoming like them… I am them." -- Jake Peralta, Pilot

I recently binge-watched a show on Netflix called Brooklyn Nine-Nine and I really enjoyed it. As I eagerly await the release of the next season, I thought it'd be fun to perform exploratory data analysis and sentiment analysis on the pilot episode. I found the script online and extracted the text into CSV file format.

brooklyn nine-nine meet data science, character use, sentiment analysis, (4 more...)

#artificialintelligence

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.36)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.36)

Simultaneous Identification of Tweet Purpose and Position

Iyer, Rahul Radhakrishnan, Pei, Yulong, Sycara, Katia

arXiv.org Machine Learning, Dec-24-2019

Tweet classification has attracted considerable attention recently. Most existing work on tweet classification focuses on topic classification, which assigns tweets to several predefined categories, and sentiment classification, which labels tweets as positive, negative or neutral. Because tweets differ from conventional text in that they are generally short and contain informal, irregular or new words, it is difficult to determine a user's intention in publishing a tweet and the user's attitude towards a certain topic. In this paper, we aim to simultaneously classify tweet purpose, i.e., the user's intention in publishing a tweet, and position, i.e., supporting, opposing or being neutral to a given topic. By transforming this problem into a multi-label classification problem, we propose a multi-label classification method with post-processing. Experiments on real-world data sets demonstrate the effectiveness of this method, and the results outperform the individual classification methods.
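
The abstract does not spell out the label encoding or the post-processing step, so the following is only a minimal sketch of one way a joint purpose/position task could be cast as multi-label classification. The purpose names, the `encode` and `postprocess` helpers, and the "pick the best label in each group" rule are all assumptions made for illustration, not the authors' method; only the three position labels come from the abstract.

```python
# Hypothetical purpose labels (the abstract does not list them).
PURPOSES = ["share_news", "express_opinion", "call_to_action"]
# Position labels as described in the abstract.
POSITIONS = ["support", "oppose", "neutral"]
LABELS = PURPOSES + POSITIONS


def encode(purpose, position):
    """Turn one (purpose, position) pair into a multi-hot label vector."""
    return [1 if name in (purpose, position) else 0 for name in LABELS]


def _argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def postprocess(scores):
    """Assumed post-processing: keep exactly one purpose and one position.

    `scores` is a list of per-label scores in LABELS order, such as a
    multi-label classifier might produce.
    """
    n = len(PURPOSES)
    purpose = PURPOSES[_argmax(scores[:n])]
    position = POSITIONS[_argmax(scores[n:])]
    return purpose, position
```

For instance, `encode("express_opinion", "oppose")` marks one entry in each label group, and `postprocess` maps a raw score vector back to a single (purpose, position) pair even when a multi-label model would otherwise fire on several labels in the same group.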

classification, post-processing strategy, tweet, (14 more...)

arXiv.org Machine Learning

2001.00051

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Oregon (0.04)
Asia (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Law (0.48)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.68)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.50)

More than 267 millions of Facebook user phone numbers exposed online

#artificialintelligence, Dec-20-2019, 14:18:29 GMT

Security expert Bob Diachenko, along with Comparitech, has discovered more than 267 million Facebook user IDs, phone numbers and names in an unsecured database. The huge trove of data is likely the result of an illegal scraping operation or Facebook API abuse by a group of hackers in Vietnam. The exposed data could be used by threat actors to conduct large-scale SMS spam and phishing campaigns. "A database containing more than 267 million Facebook user IDs, phone numbers, and names was left exposed on the web for anyone to access without a password or any other authentication." "Comparitech partnered with security researcher Bob Diachenko to uncover the Elasticsearch cluster."

database, phone number, user phone number, (12 more...)

#artificialintelligence

Country:

Asia > Vietnam (0.27)
North America > United States (0.07)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Data Science > Data Mining > Web Mining (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)

GoodNewsEveryone: A Corpus of News Headlines Annotated with Emotions, Semantic Roles, and Reader Perception

Bostan, Laura, Kim, Evgeny, Klinger, Roman

arXiv.org Artificial Intelligence, Dec-19-2019

Most research on emotion analysis from text focuses on the task of emotion classification or emotion intensity regression. Fewer works address emotions as structured phenomena, which can be explained by the lack of relevant datasets and methods. We fill this gap by releasing a dataset of 5000 English news headlines annotated via crowdsourcing with their dominant emotions, emotion experiencers and textual cues, emotion causes and targets, as well as the reader's perception and emotion of the headline. We propose a multiphase annotation procedure which leads to high quality annotations on such a task via crowdsourcing. Finally, we develop a baseline for the task of automatic prediction of structures and discuss results. The corpus we release enables further research on emotion classification, emotion intensity prediction, emotion cause detection, and supports further qualitative studies.

artificial intelligence, computational linguistic, natural language, (17 more...)

arXiv.org Artificial Intelligence

1912.03184

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > El Salvador (0.14)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
(25 more...)

Genre: Research Report (0.82)

Industry:

Government > Regional Government (0.67)
Leisure & Entertainment > Sports (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.83)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)

A Heterogeneous Graphical Model to Understand User-Level Sentiments in Social Media

Iyer, Rahul Radhakrishnan, Chen, Jing, Sun, Haonan, Xu, Keyang

arXiv.org Machine Learning, Dec-17-2019

Social media has seen tremendous growth in the last decade and continues to grow at a rapid pace. With such adoption, it is increasingly becoming a rich source of data for opinion mining and sentiment analysis. The detection and analysis of sentiment in social media is thus a valuable topic and attracts considerable research effort. Most earlier efforts focus on supervised learning approaches, which require expensive human annotations, limiting their practical use. In our work, we propose a semi-supervised approach to predict user-level sentiments for specific topics. We define and utilize a heterogeneous graph built from the social networks of the users, with the knowledge that connected users in social networks typically share similar sentiments. Compared with previous work, we offer several novelties: (1) we incorporate the influence/authoritativeness of the users into the model, (2) we include comment-based and like-based user-user links in the graph, and (3) we superimpose multiple heterogeneous graphs into one, thereby allowing multiple types of links to exist between two users.
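
To make the idea of "connected users share similar sentiments" concrete, here is a rough sketch using plain weighted label propagation over a merged user graph. This is a generic stand-in technique, not the heterogeneous graphical model from the paper; the function names `superimpose` and `propagate`, the dict-of-dicts graph format, and the use of edge weights as a proxy for link strength are all assumptions for illustration.

```python
def superimpose(*graphs):
    """Merge several user-user graphs by summing the weights of shared edges.

    Each graph is a dict: {user: {neighbor: weight}} (assumed format).
    """
    merged = {}
    for graph in graphs:
        for user, neighbors in graph.items():
            row = merged.setdefault(user, {})
            for other, weight in neighbors.items():
                row[other] = row.get(other, 0.0) + weight
                merged.setdefault(other, {})  # make sure every user has a row
    return merged


def propagate(edges, seeds, rounds=20):
    """Spread sentiment scores from labeled users to unlabeled ones.

    `seeds` maps a few users to known scores in [-1, 1]; those stay fixed.
    Every other user takes the weighted mean of its neighbors' scores.
    """
    scores = {user: seeds.get(user, 0.0) for user in edges}
    for _ in range(rounds):
        updated = {}
        for user, neighbors in edges.items():
            total = sum(neighbors.values())
            if user in seeds or total == 0:
                updated[user] = scores[user]
            else:
                weighted = sum(w * scores[v] for v, w in neighbors.items())
                updated[user] = weighted / total
        scores = updated
    return scores


# Hypothetical toy data: a follow graph plus a like-based graph.
follows = {"a": {"b": 1}, "b": {"a": 1, "c": 1}, "c": {"b": 1}}
likes = {"b": {"a": 2}}
graph = superimpose(follows, likes)
result = propagate(graph, seeds={"a": 1.0, "c": -1.0})
```

In this toy run only two users carry labels; the unlabeled user `b` ends up leaning towards `a` because the merged graph gives the `b`-`a` link more weight than the `b`-`c` link.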

machine learning, natural language, sentiment, (18 more...)

arXiv.org Machine Learning

1912.07911

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > New York (0.04)
Asia (0.04)

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.89)
(2 more...)

Myths and Realities: Sentiment Analysis for Crypto Assets

#artificialintelligence, Dec-16-2019, 22:50:49 GMT

In Act II, Scene II of the famous play Richelieu; Or the Conspiracy, British playwright Edward Bulwer-Lytton coined a phrase that has transcended generations: "The pen is mightier than the sword."

crypto market, sentiment, sentiment analysis, (14 more...)

#artificialintelligence

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.70)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.60)

How AI is Making Sentiment Analysis Easy

#artificialintelligence, Dec-16-2019, 22:42:20 GMT

It's a far more complex way of analyzing how consumers feel about our products and services, using not just simple words but longer sentence fragments. Yes, AI is becoming smart enough to understand the tone of a statement, rather than simply understanding whether certain words within a group of text have a positive or negative connotation. This is incredibly impactful for companies seeking to optimize their message, improve customer engagement, or even identify top influencers in their customer base. The possibilities of sentiment analysis are incredibly far-reaching. The types of information that AI can gather from both unstructured data and affective computing in sentiment analysis are huge.

affective computing, customer, sentiment analysis, (7 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)

NaïveRole: Author-Contribution Extraction and Parsing from Biomedical Manuscripts

Tkaczyk, Dominika, Collins, Andrew, Beel, Joeran

arXiv.org Machine Learning, Dec-15-2019

Information about the contributions of individual authors to scientific publications is important for assessing authors' achievements. Some biomedical publications have a short section that describes authors' roles and contributions. It is usually written in natural language and hence author contributions cannot be trivially extracted in a machine-readable format. In this paper, we present 1) a statistical analysis of roles in author contributions sections, and 2) NaïveRole, a novel approach to extract structured authors' roles from author contribution sections. For the first part, we used co-clustering techniques, as well as Open Information Extraction, to semi-automatically discover the popular roles within a corpus of 2,000 contributions sections from PubMed Central. The discovered roles were used to automatically build a training set for NaïveRole, our role extractor approach, based on Naïve Bayes. NaïveRole extracts roles with a micro-averaged precision of 0.68, recall of 0.48 and F1 of 0.57. It is, to the best of our knowledge, the first attempt to automatically extract author roles from research papers. This paper is an extended version of a previous poster published at JCDL 2018.

corpus, manuscript, role mention, (15 more...)

arXiv.org Machine Learning

1912.1017

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

LexNLP: Natural language processing and information extraction for legal and regulatory texts

#artificialintelligence, Dec-13-2019, 02:58:23 GMT

By accepting the Deed and closing the Transaction, Buyer, on behalf of itself and its successors and assigns, shall thereby release each of the Seller Parties from, and waive any and all Liabilities against each of the Seller Parties for, attributable to, or in connection with the Property, whether arising or accruing before, on or after the Closing and whether attributable to events or circumstances which arise or occur before, on or after the Closing, including, without limitation, the following: (a) any and all statements or opinions heretofore or hereafter made, or information furnished, by any Seller Parties to any Buyer's Representatives; and (b) any and all Liabilities with respect to the structural, physical, or environmental condition of the Property, including, without limitation, all Liabilities relating to the release, presence, discovery or removal of any hazardous or regulated substance, chemical, waste or material that may be located in, at, about or under the Property, or connected with or arising out of any and all claims or causes of action based upon CERCLA (Comprehensive Environmental Response, Compensation, and Liability Act of 1980, 42 U.S.C. Notwithstanding the foregoing, the foregoing release and waiver is not intended and shall not be construed as affecting or impairing any rights or remedies that Buyer may have against Seller with respect to (i) a breach of any of Seller's Warranties, (ii) a breach of any Surviving Covenants, or (iii) any acts constituting fraud by Seller.

artificial intelligence, data mining, natural language, (9 more...)

#artificialintelligence

Country: North America > United States (0.77)

Industry:

Law > Environmental Law (0.77)
Government > Regional Government > North America Government > United States Government (0.77)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)

Model Deployment for Data Scientists Using TensorFlow: Part 1 - Nightfall AI

#artificialintelligence, Dec-12-2019, 20:05:51 GMT

In the world of machine learning, model deployment is a crucial piece of the puzzle. While data scientists excel at other parts of the pipeline, deploying machine learning models tends to fall under the umbrella of software engineering or IT operations. And for good reason--successful deployments require a myriad of complex tasks, including building infrastructure, implementing APIs, load balancing, and integrating with data pipelines. We'll briefly walk you through a basic model deployment example by picking out tools and planning out an approach to construct a simple sentiment classification model. By the end of this post you will have the tools to serve your deep learning (DL) models via an API.

deployment, server, tensorflow, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.35)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.35)
