AITopics | Information Extraction

Information Extraction

Free Trial Signup - Gather Twitter Data | DiscoverText

#artificialintelligence | Jul-21-2019, 21:20:36 GMT

Use this information to train machine-learning classifiers to recognize relevant text and social media data. Jump into data using an interactive word CloudExplorer or build a mini topic dictionary using "defined" search.

artificial intelligence, machine learning, natural language, (5 more...)

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.38)

The Great Hack: the film that goes behind the scenes of the Facebook data scandal

#artificialintelligence | Jul-20-2019, 20:35:36 GMT

Cambridge Analytica may have become the byword for a scandal, but it's not entirely clear that anyone knows exactly what that scandal is. It's more like toxic word association: "Facebook", "data", "harvested", "weaponised", "Trump" and, in this country, most controversially, "Brexit". It was a media firestorm that's yet to be extinguished, a year on from whistleblower Christopher Wylie's revelations in the Observer and the New York Times about how the company acquired the personal data of tens of millions of Facebook users in order to target them in political campaigns. This week sees the release of The Great Hack, a Netflix documentary that is the first feature-length attempt to gather all the strands of the affair into some sort of narrative – though it is one contested even by those appearing in the film. "This is not about one company," Julian Wheatland, the ex-chief operating officer of Cambridge Analytica, claims at one point. "This technology is going on unabated and will continue to go on unabated.[…] There was always going to be a Cambridge Analytica. It just sucks to me that it's Cambridge Analytica."

artificial intelligence, natural language, social media, (19 more...)

Country:

North America > United States (1.00)
Europe > United Kingdom (1.00)

Genre: Financial News (0.34)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Information Technology > Services (1.00)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.61)

Text Analytics: the convergence of Big Data and Artificial Intelligence

#artificialintelligence | Jul-17-2019, 12:49:26 GMT

The analysis of the text content in emails, blogs, tweets, forums and other forms of textual communication constitutes what we call text analytics. Text analytics is applicable to most industries: it can help analyze millions of emails; you can analyze customers' comments and questions in forums; you can perform sentiment analysis by measuring positive or negative perceptions of a company, brand, or product. Text analytics has also been called text mining, and is a subcategory of Natural Language Processing (NLP), one of the founding branches of Artificial Intelligence, dating back to the 1950s, when an interest in understanding text originally developed. Currently, text analytics is often considered the next step in Big Data analysis. Text analytics has a number of subdivisions: Information Extraction, Named Entity Recognition, Semantic Web annotated domain's representation, and many more.

artificial intelligence, big data and artificial intelligence, text analytic, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
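
The sentiment-analysis use mentioned in the excerpt above (measuring positive or negative perceptions of a company, brand, or product) can be illustrated with a very small lexicon-based scorer. This is only a sketch under stated assumptions: the word lists `POSITIVE` and `NEGATIVE` and the function names are hypothetical and invented for illustration, and the excerpt does not say which method any particular text analytics product uses.

```python
import re

# Hypothetical toy lexicons, invented for this sketch. Real systems rely on
# much larger lexicons or on trained classifiers.
POSITIVE = {"good", "great", "love", "excellent", "helpful"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "broken"}


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum((tok in POSITIVE) - (tok in NEGATIVE) for tok in tokens)


def sentiment_label(text: str) -> str:
    """Map the raw score to a coarse positive / negative / neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for comment in (
        "I love this product, support was excellent",
        "The app is broken and the service is terrible",
        "It arrived on Tuesday",
    ):
        print(sentiment_label(comment), "|", comment)
```

A word-counting approach like this ignores negation, sarcasm and context, which is one reason production systems tend to use trained models instead.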

How Bots Can Tell When the C-Suite Is Lying

#artificialintelligence | Jul-16-2019, 23:06:40 GMT

CEOs and CFOs are decidedly more nervous when fielding questions about China during earnings calls this year. What's more, they are more likely to be deceptive with their answers. "Deception associated with questions on China has skyrocketed this quarter, up about 50% from last quarter and more than double a year ago," according to a study by text analytics provider Amenity Analytics. Amenity Analytics is one of a handful of companies that are applying natural language processing (NLP), sentiment analysis and machine learning to the financial sector, evaluating earnings calls and other public meetings to unearth information of value to an investor. It is also a rare technology that offers a clear path to ROI.

artificial intelligence, earnings call, natural language, (16 more...)

Country: Asia > China (0.47)

Genre: Financial News (0.61)

Industry: Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.59)

Multi-modal Sentiment Analysis using Deep Canonical Correlation Analysis

Sun, Zhongkai, Sarma, Prathusha K, Sethares, William, Bucy, Erik P.

arXiv.org Machine Learning | Jul-15-2019

This paper learns multi-modal embeddings from text, audio, and video views/modes of data in order to improve upon down-stream sentiment classification. The experimental framework also allows investigation of the relative contributions of the individual views in the final multi-modal embedding. Individual features derived from the three views are combined into a multi-modal embedding using Deep Canonical Correlation Analysis (DCCA) in two ways: (i) One-Step DCCA and (ii) Two-Step DCCA. This paper learns text embeddings using BERT, the current state-of-the-art in text encoders. We posit that this highly optimized algorithm dominates over the contribution of other views, though each view does contribute to the final result. Classification tasks are carried out on two benchmark datasets and on a new Debate Emotion dataset, and together these demonstrate that One-Step DCCA outperforms the current state-of-the-art in learning multi-modal embeddings.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1907.08696

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.72)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases

Maudet, Estelle, Cattan, Oralie, de Seyssel, Maureen, Servan, Christophe

arXiv.org Machine Learning | Jul-6-2019

Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact of different preprocessing and matching techniques on the results. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We have experimented with two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis.

apprentissage, clinique, corpus, (16 more...)

arXiv.org Machine Learning

1907.0579

Country:

Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Wayfair Walkout, Facebook Data Value, and More News

#artificialintelligence | Jun-26-2019, 22:48:48 GMT

Tech employees are taking a stand against migrant detention centers; a proposal asking tech companies to disclose the value of your data; and a live reading of the Mueller report. Here's the news you need to know, in two minutes or less. This afternoon, 550 employees at the Boston-based ecommerce company Wayfair staged a walkout opposing the sale of company furniture to migrant detention centers. Last week, Wayfair workers discovered an order for $200,000 worth of beds and other furniture reportedly placed by government contractor BCFS for a new detention center in Carrizo Springs, Texas.

artificial intelligence, facebook data value, natural language, (11 more...)

Country:

North America > United States > Texas > Harris County > Spring (0.26)
North America > United States > Texas > Dimmit County > Carrizo Springs (0.26)

Industry: Information Technology > Services (0.88)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)

Constructing Information-Lossless Biological Knowledge Graphs from Conditional Statements

Jiang, Tianwen, Zhao, Tong, Qin, Bing, Liu, Ting, Chawla, Nitesh V., Jiang, Meng

arXiv.org Artificial Intelligence | Jun-26-2019

Conditions are essential in the statements of biological literature. Without the conditions (e.g., environment, equipment) that were precisely specified, the facts (e.g., observations) in the statements may no longer be valid. One biological statement has one or multiple fact(s) and/or condition(s). Their subject and object can be either a concept or a concept's attribute. Existing information extraction methods do not consider the role of condition in the biological statement nor the role of attribute in the subject/object. In this work, we design a new tag schema and propose a deep sequence tagging framework to structure conditional statements into fact and condition tuples from biological text. Experiments demonstrate that our method yields an information-lossless structure of the literature.

artificial intelligence, constructing information-lossless biological knowledge graph, natural language, (13 more...)

arXiv.org Artificial Intelligence

1907.0072

Country:

North America > United States (0.16)
Asia > China (0.16)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.38)
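
The abstract above describes statements made of fact tuples and condition tuples, whose subject and object are each either a concept or a concept's attribute. One possible way to hold such output in code is sketched below. All class and field names (`Term`, `Relation`, `Statement`) are assumptions made for illustration, not the paper's tag schema, and the example sentence is invented and hand-annotated rather than produced by the authors' tagging framework.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    """A subject or object: a concept, optionally narrowed to one attribute."""
    concept: str
    attribute: Optional[str] = None


@dataclass(frozen=True)
class Relation:
    """One (subject, predicate, object) tuple; used for facts and conditions."""
    subject: Term
    predicate: str
    obj: Term


@dataclass
class Statement:
    """A statement keeps its fact tuples together with their condition tuples."""
    text: str
    facts: List[Relation] = field(default_factory=list)
    conditions: List[Relation] = field(default_factory=list)

    def is_conditional(self) -> bool:
        """True when at least one condition tuple is attached."""
        return bool(self.conditions)


# Invented sentence with hypothetical, hand-written annotations.
example = Statement(
    text="Drug X reduced tumor size when the mice were fed a high-fat diet.",
    facts=[Relation(Term("drug X"), "reduced", Term("tumor", "size"))],
    conditions=[Relation(Term("mice"), "were fed", Term("high-fat diet"))],
)

if __name__ == "__main__":
    print(example.is_conditional())
    print(example.facts[0].obj)
```

Keeping conditions next to the facts they qualify is what lets a downstream consumer decide whether a fact still applies in a different setting.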

Open Datasets for Machine Learning | Lionbridge AI

#artificialintelligence | Jun-25-2019, 00:55:59 GMT

Datasets are an integral part of machine learning. Without high-quality training datasets, machine learning algorithms would have no way of knowing how to conduct sentiment analysis, categorize products or understand foreign languages. This spreadsheet contains the ultimate list of open datasets for machine learning. Organized by industry and use case, this database contains a diverse range of 300 datasets to train machine learning models.

artificial intelligence, machine learning lionbridge ai, natural language, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.39)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.39)

Event extraction based on open information extraction and ontology

Sahnoun, Sihem

arXiv.org Artificial Intelligence | Jun-24-2019

The work presented in this master's thesis consists of extracting a set of events from texts written in natural language. For this purpose, we build on the basic notions of information extraction as well as open information extraction. First, we applied an open information extraction (OIE) system for relation extraction, to highlight the importance of OIE in event extraction, and we used an ontology for event modeling. We tested the results of our approach with test metrics. As a result, the two-level event extraction approach has shown good performance but requires a lot of expert intervention in the construction of classifiers, which takes time. In this context, we have proposed an approach that reduces expert intervention in relation extraction, entity recognition, and reasoning, which are automatic and based on techniques of adaptation and correspondence. Finally, to prove the relevance of the extracted results, we conducted a set of experiments using different test metrics as well as a comparative study.

data mining, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1907.00692

Country:

Europe (1.00)
North America > United States (0.67)

Genre:

Overview (0.92)
Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(5 more...)
