AITopics

2011.00905

Country:

North America > United States > New York (0.04)
North America > United States > Illinois (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Genre: Research Report (1.00)

Industry:

Transportation (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.93)

#artificialintelligence Oct-27-2020, 19:10:04 GMT

How to balance transformation decisions, feature selection, and model tuning vs time in text analytics?

Being new to text analytics, I haven't gotten the hang of my typical ML workflow, given how long processes take to run in the typically large feature space of text analytics. I would like to know what the typical strategy is for balancing effort and time across optimizing transformation decisions, feature down-selection, and model tuning. To get a sense of which of these decision points deserves further tuning, I ran untuned RF, Logistic, Naive Bayes, SGD, and KNN models (with cross-validation). No decision point was consistently "better" in the resulting F1 scores, and the differences are often noteworthy. As I have no bias towards a particular algorithm type (I only want the best F1 score), I'm stuck in a quandary: I have not successfully narrowed my decision space enough.
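The kind of comparison described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's actual setup: the toy corpus, the TF-IDF transformation, the 3-fold split, and the macro-averaged F1 metric are all placeholders chosen to make the example runnable. Timing each run alongside its score makes the effort/time trade-off visible.

```python
import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus (1 = positive, 0 = negative); replace with real data.
texts = [
    "great product works well", "really good value and quality",
    "excellent service very happy", "love it works great",
    "good quality fast delivery", "happy with this great buy",
    "terrible product broke quickly", "really bad quality and service",
    "awful experience very unhappy", "hate it stopped working",
    "bad quality slow delivery", "unhappy with this terrible buy",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Untuned models: library defaults, with seeds fixed only for repeatability.
models = {
    "RF": RandomForestClassifier(random_state=0),
    "Logistic": LogisticRegression(),
    "Naive Bayes": MultinomialNB(),
    "SGD": SGDClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}

results = {}
for name, model in models.items():
    # Vectorizer lives inside the pipeline so it is refit on each training fold.
    pipeline = make_pipeline(TfidfVectorizer(), model)
    start = time.perf_counter()
    scores = cross_val_score(pipeline, texts, labels, cv=3, scoring="f1_macro")
    elapsed = time.perf_counter() - start
    results[name] = (scores.mean(), scores.std(), elapsed)

# Report mean F1, its spread across folds, and wall-clock cost per model.
for name, (mean_f1, std_f1, seconds) in sorted(
    results.items(), key=lambda item: -item[1][0]
):
    print(f"{name:12s} F1={mean_f1:.3f} +/- {std_f1:.3f}  ({seconds:.2f}s)")
```

Comparing the fold-to-fold spread with the gaps between models is one way to judge whether a difference is large enough to justify spending tuning time on it.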

artificial intelligence, machine learning, natural language, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.63)

Moradi, Milad, Samwald, Matthias

Explaining black-box text classifiers for disease-treatment information extraction

arXiv.org Artificial Intelligence Oct-21-2020

Deep neural networks and other intricate Artificial Intelligence (AI) models have reached high levels of accuracy on many biomedical natural language processing tasks. However, their applicability in real-world use cases may be limited due to their vague inner working and decision logic. A post-hoc explanation method can approximate the behavior of a black-box AI model by extracting relationships between feature values and outcomes. In this paper, we introduce a post-hoc explanation method that utilizes confident itemsets to approximate the behavior of black-box classifiers for medical information extraction. Incorporating medical concepts and semantics into the explanation process, our explanator finds semantic relations between inputs and outputs in different parts of the decision space of a black-box classifier. The experimental results show that our explanation method can outperform perturbation and decision set based explanators in terms of fidelity and interpretability of explanations produced for predictions on a disease-treatment information extraction task.

data mining, explanation, machine learning, (20 more...)

2010.10873

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Air (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining > Text Mining (0.92)

#artificialintelligence Oct-18-2020, 05:55:37 GMT

NLP, AI, and Machine Learning: What's The Difference?

Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that studies how machines understand human language. Its goal is to build systems that can make sense of text and perform tasks like translation, grammar checking, or topic classification. Companies are increasingly using NLP-equipped tools to gain insights from data and to automate routine tasks. This sentiment analyzer, for instance, can help brands detect emotions in text, such as negative comments on social media. But what exactly is Natural Language Processing?

artificial intelligence, natural language processing, nlp, (12 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.71)

Banerjee, Shubhanker, Jayapal, Arun, Thavareesan, Sajeetha

NUIG-Shubhanker@Dravidian-CodeMix-FIRE2020: Sentiment Analysis of Code-Mixed Dravidian text using XLNet

arXiv.org Artificial Intelligence Oct-15-2020

Social media has penetrated multilingual societies; however, most of their members use English as a preferred language for communication. It is therefore natural for them to mix their cultural language with English during conversations, resulting in an abundance of multilingual data, called code-mixed data, in today's world. Downstream NLP tasks using such data are challenging because the semantics are spread across multiple languages. One such Natural Language Processing task is sentiment analysis; for this, we use an auto-regressive XLNet model to perform sentiment analysis on code-mixed Tamil-English and Malayalam-English datasets.

large language model, machine learning, sentiment analysis, (18 more...)

2010.07773

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.15)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
Europe > Ireland > Connaught > County Galway > Galway (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.90)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Wu, Zhengxuan, Ong, Desmond C.

Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis

arXiv.org Artificial Intelligence Oct-15-2020

Aspect-based sentiment analysis (ABSA) and Targeted ABSA (TABSA) allow finer-grained inferences about sentiment to be drawn from the same text, depending on context. For example, a given text can have different targets (e.g., neighborhoods) and different aspects (e.g., price or safety), with different sentiment associated with each target-aspect pair. In this paper, we investigate whether adding context to self-attention models improves performance on (T)ABSA. We propose two variants of Context-Guided BERT (CG-BERT) that learn to distribute attention under different contexts. We first adapt a context-aware Transformer to produce a CG-BERT that uses context-guided softmax-attention. Next, we propose an improved Quasi-Attention CG-BERT model that learns a compositional attention that supports subtractive attention. We train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and SemEval-2014 (Task 4). Both models achieve new state-of-the-art results, with our QACG-BERT model having the best performance. Furthermore, we provide analyses of the impact of context in our proposed models. Our work provides more evidence for the utility of adding context-dependencies to pretrained self-attention-based language models for context-based natural language tasks.

artificial intelligence, machine learning, natural language, (15 more...)

2010.07523

Country:

Asia > Singapore (0.04)
North America > Canada (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.75)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.75)

WSJ.com: WSJD - Technology Oct-14-2020, 22:45:00 GMT

Twitter Data-Breach Case Won't Be Resolved Before Year's End, Ireland's Regulator Says

Helen Dixon, head of Ireland's Data Protection Commission, in May submitted a draft decision to more than two dozen of the bloc's privacy regulators for review, as required under the law. Eleven regulators objected to the proposed ruling, sparking a lengthy dispute-resolution mechanism, she said. The contents of the draft decision haven't been disclosed. Twitter's European operations are based in Dublin. "It's a long process," Ms. Dixon said at The Wall Street Journal's virtual CIO Network conference.

artificial intelligence, determann, natural language, (6 more...)

WSJ.com: WSJD - Technology

Country: Europe > Ireland (0.63)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)

#artificialintelligence Oct-14-2020, 20:45:24 GMT

How to Run Sentiment Analysis in Python using VADER

We have explained how to get a sentiment score for words in Python. Instead of building our own lexicon, we can use a pre-built one such as VADER, which stands for Valence Aware Dictionary and sEntiment Reasoner and is specifically attuned to sentiments expressed in social media. You can install the VADER library with pip (pip install vaderSentiment), or you can get it directly from NLTK. You can have a look at the VADER documentation. Notice that the pos, neu and neg probabilities add up to 1. Also, the compound score is a very useful metric in case we want a single measure of sentiment.
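To make the two properties mentioned above concrete (pos, neu and neg summing to 1, and a single compound score), here is a deliberately simplified, standard-library-only sketch of lexicon-based scoring. It is not the VADER implementation: the mini-lexicon and the toy_polarity_scores function are hypothetical, and the real library applies extra rules (negation, punctuation, capitalization, intensifiers) that are omitted here. The normalization x / sqrt(x^2 + 15) is the form VADER uses to squash the summed valence into the range -1 to 1.

```python
import math

# Hypothetical mini-lexicon of word valences; NOT the real VADER lexicon.
TOY_LEXICON = {"great": 3.0, "good": 2.0, "bad": -2.5, "terrible": -3.0}


def normalize(score, alpha=15.0):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return score / math.sqrt(score * score + alpha)


def toy_polarity_scores(text):
    """Return pos/neu/neg shares and a compound score for a piece of text."""
    tokens = [tok.strip(".,!?").lower() for tok in text.split()]
    valences = [TOY_LEXICON.get(tok, 0.0) for tok in tokens if tok]

    pos_mass = sum(v for v in valences if v > 0)
    neg_mass = -sum(v for v in valences if v < 0)
    neu_mass = float(sum(1 for v in valences if v == 0))
    total = pos_mass + neg_mass + neu_mass

    if total == 0:  # empty input: nothing to score
        return {"neg": 0.0, "neu": 0.0, "pos": 0.0, "compound": 0.0}

    return {
        "neg": neg_mass / total,
        "neu": neu_mass / total,
        "pos": pos_mass / total,
        # Single summary number: normalized sum of all word valences.
        "compound": normalize(sum(valences)),
    }


scores = toy_polarity_scores("Great movie, good acting!")
print(scores)
```

Because the three shares are each divided by the same total, they always sum to 1 for non-empty input, while the compound score carries the overall direction and strength.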

artificial intelligence, compound score, natural language, (9 more...)

Technology:

Information Technology > Communications > Social Media (0.98)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.47)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.47)

Kerroumi, Mohamed, Sayem, Othmane, Shabou, Aymen

VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach

arXiv.org Artificial Intelligence Oct-13-2020

We introduce a novel approach for scanned document representation to perform field extraction. It allows the simultaneous encoding of the textual, visual and layout information in a 3D matrix used as an input to a segmentation model. We improve on the recent Chargrid and Wordgrid models in several ways: first by taking into account the visual modality, then by boosting robustness on small datasets while keeping the inference time low. Our approach is tested on public and private document-image datasets, showing higher performance than recent state-of-the-art methods.

artificial intelligence, machine learning, natural language, (14 more...)

2010.02358

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > France > Île-de-France > Hauts-de-Seine > Montrouge (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)

#artificialintelligence Oct-10-2020, 07:50:42 GMT

Top 10 Text Analytics Companies to Watch in 2020

Companies typically capture data from diverse sources, often as raw, unstructured information. It can be a challenge for businesses to process and translate that data into more quantitative insights. Text analytics has been recognized as an evolving data processing tool that converts voluminous amounts of data into meaningful insights. The technology is being used in an array of applications across different sectors, including retail, BFSI, healthcare, FMCG, telecommunications, and government, among others. Text analytics can also be combined with other advanced analytics. Widely used data sources for text analytics include social networks, internal email, inbound customer email, news articles, online discussion forums and CRM customer service notes.

data mining, natural language, platform, (15 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.05)

Industry: Information Technology > Services (0.70)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)