AITopics | Text Classification

Text Classification

"A text classifier is an automated means of determining some metadata about a document. Text classifiers are used for such diverse needs as spam filtering, suggesting categories for indexing a document created in a content management system, or automatically sorting help desk requests."
– John Graham-Cumming, Naive Bayesian Text Classification. Dr. Dobb's. May 1 2005.

Text Classification: Applications and Use Cases

#artificialintelligence | Feb-23-2018, 00:17:40 GMT

Text analysis, as a whole, is an emerging field of study. Fields such as marketing, product management, academia, and governance already analyze textual data and extract information from it. We discussed the technology behind text classification, one of the essential parts of text analysis. Text classification, or text categorization, is the activity of labeling natural language texts with relevant categories from a predefined set. In layman's terms, text classification is a process of extracting generic tags from unstructured text.

artificial intelligence, natural language, text classification, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)

Multi-Class Text Classification with Scikit-Learn – Towards Data Science

#artificialintelligence | Feb-21-2018, 22:36:57 GMT

There are lots of applications of text classification in the commercial world. However, the vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering (spam vs. ham) or sentiment analysis (positive vs. negative). In most cases, our real-world problems are much more complicated than that. Therefore, this is what we are going to do today: classify Consumer Finance Complaints into 12 pre-defined classes. The data can be downloaded from data.gov.
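
To make the jump from binary to multi-class classification concrete, here is a minimal sketch. It is not the scikit-learn pipeline from the linked article: it swaps in a hand-written multinomial Naive Bayes classifier in plain Python so that it runs without third-party packages. The training sentences, the three class names, and the `NaiveBayesClassifier` helper are all hypothetical and only illustrate the idea of choosing one label from more than two.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy complaints; the class names are illustrative only.
train_texts = [
    "my mortgage payment was applied late",
    "the mortgage escrow account is wrong",
    "unauthorized charge on my credit card",
    "credit card interest rate increased",
    "debt collector keeps calling me",
    "collector reported a debt i do not owe",
]
train_labels = [
    "Mortgage", "Mortgage",
    "Credit card", "Credit card",
    "Debt collection", "Debt collection",
]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = model.predict("a charge i did not make appeared on my card")
print(prediction)
```

The only difference from the binary case is that the loop in `predict` scores every class and keeps the best one, so the same code would handle 12 classes given labeled examples for each.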

machine learning, natural language, unigram, (18 more...)

#artificialintelligence

Industry: Banking & Finance > Loans (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification

Li, Zheng (Hong Kong University of Science and Technology) | Wei, Ying (Hong Kong University of Science and Technology) | Zhang, Yu (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

AAAI Conferences | Feb-8-2018

Cross-domain sentiment classification aims to leverage useful information in a source domain to help do sentiment classification in a target domain that has no or little supervised information. Existing cross-domain sentiment classification methods cannot automatically capture non-pivots, i.e., the domain-specific sentiment words, and pivots, i.e., the domain-shared sentiment words, simultaneously. In order to solve this problem, we propose a Hierarchical Attention Transfer Network (HATN) for cross-domain sentiment classification. The proposed HATN provides a hierarchical attention transfer mechanism which can transfer attentions for emotions across domains by automatically capturing pivots and non-pivots. Besides, the hierarchy of the attention mechanism mirrors the hierarchical structure of documents, which can help locate the pivots and non-pivots better. The proposed HATN consists of two hierarchical attention networks, with one named P-net aiming to find the pivots and the other named NP-net aligning the non-pivots by using the pivots as a bridge. Specifically, P-net firstly conducts individual attention learning to provide positive and negative pivots for NP-net. Then, P-net and NP-net conduct joint attention learning such that the HATN can simultaneously capture pivots and non-pivots and realize transferring attentions for emotions across domains. Experiments on the Amazon review dataset demonstrate the effectiveness of HATN.

classification, natural language, text classification, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

280 Birds With One Stone: Inducing Multilingual Taxonomies From Wikipedia Using Character-Level Classification

Gupta, Amit (Ecole Polytechnique Fédérale de Lausanne) | Lebret, Rémi (Ecole Polytechnique Fédérale de Lausanne) | Harkous, Hamza (Ecole Polytechnique Fédérale de Lausanne) | Aberer, Karl (Ecole Polytechnique Fédérale de Lausanne)

AAAI Conferences | Feb-8-2018

We propose a novel fully-automated approach towards inducing multilingual taxonomies from Wikipedia. Given an English taxonomy, our approach first leverages the interlanguage links of Wikipedia to automatically construct training datasets for the isa relation in the target language. Character-level classifiers are trained on the constructed datasets, and used in an optimal path discovery framework to induce high-precision, high-coverage taxonomies in other languages. Through experiments, we demonstrate that our approach significantly outperforms the state-of-the-art, heuristics-heavy approaches for six languages. As a consequence of our work, we release presumably the largest and the most accurate multilingual taxonomic resource spanning over 280 languages.

machine learning, natural language, text classification, (22 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > Canada (1.00)
Europe (0.93)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Social Media (0.87)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.53)
(2 more...)

EMD Metric Learning

Zhang, Zizhao (Tsinghua University) | Zhang, Yubo (Tsinghua University) | Zhao, Xibin (Tsinghua University) | Gao, Yue (Tsinghua University)

AAAI Conferences | Feb-8-2018

Earth Mover's Distance (EMD), which measures many-to-many distances, has shown its superiority and been widely applied in computer vision tasks, such as object recognition, hyperspectral image classification and gesture recognition. However, there is still little effort concentrated on optimizing the EMD metric towards better matching performance. To tackle this issue, we propose an EMD metric learning algorithm in this paper. In our method, the objective is to learn a discriminative distance metric for EMD ground distance matrix generation which can better measure the similarity between compared subjects. More specifically, given a group of labeled data from different categories, we first select a subset of training data and then optimize the metric for ground distance matrix generation. Here, the EMD metric and the EMD flow-network are alternately optimized until a steady EMD value is reached. This method is able to generate a discriminative ground distance matrix which can further improve the EMD distance measurement. We then apply our EMD metric learning method to two tasks, i.e., multi-view object classification and document classification. The experimental results show better performance of our proposed EMD metric learning method compared with the traditional EMD method and the state-of-the-art methods. The proposed EMD metric learning method can also be used in other applications.
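
For readers unfamiliar with the terms "ground distance matrix" and "flow" used in the abstract, the commonly cited textbook definition of EMD is sketched below. It is not taken from the paper itself, and the symbols (signatures with weights $w_{p_i}$ and $w_{q_j}$, ground distances $d_{ij}$, flows $f_{ij}$) are notation assumed here for illustration.

```latex
% Optimal flow: move mass from P to Q at minimum total cost.
f^{*} = \arg\min_{f} \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}\, d_{ij}
\quad \text{subject to} \quad
f_{ij} \ge 0, \quad
\sum_{j=1}^{n} f_{ij} \le w_{p_i}, \quad
\sum_{i=1}^{m} f_{ij} \le w_{q_j}, \quad
\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}
  = \min\!\Big( \sum_{i=1}^{m} w_{p_i},\; \sum_{j=1}^{n} w_{q_j} \Big)

% EMD: cost of the optimal flow, normalized by the total flow.
\mathrm{EMD}(P, Q) =
\frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f^{*}_{ij}\, d_{ij}}
     {\sum_{i=1}^{m} \sum_{j=1}^{n} f^{*}_{ij}}
```

Read against this definition, the alternation described in the abstract amounts to holding the ground distances $d_{ij}$ fixed while solving for the flow, then holding the flow fixed while updating the metric that generates $d_{ij}$.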

machine learning, natural language, pattern recognition, (15 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
(3 more...)

[P] Build a text classification model without any training data • r/MachineLearning

@machinelearnbot | Jan-31-2018, 01:43:08 GMT

Imagine predicting the emotion of a tweet without providing any training examples of tweets with that emotion label. This research discusses the paradigm of zero-shot learning for text classification, and the paper is aptly titled "Train Once, Test Anywhere: Zero-shot Learning For Text Classification". You can read the paper here or try a demo here.

large language model, machine learning, natural language, (5 more...)

@machinelearnbot

Industry: Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.78)

Automated Text Classification Using Machine Learning

@machinelearnbot | Jan-30-2018, 19:05:54 GMT

Digitization has changed the way we process and analyze information. The amount of information available online is increasing exponentially. Web pages, emails, science journals, e-books, learning content, news, and social media are all full of textual data. The idea is to create, analyze, and report information fast. This is where automated text classification steps in.

classification, machine learning, natural language, (15 more...)

@machinelearnbot

Industry: Media (0.55)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.79)

Automated Text Classification Using Machine Learning

#artificialintelligence | Jan-26-2018, 07:00:05 GMT

classification, machine learning, natural language, (15 more...)

#artificialintelligence

Industry: Media (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.78)

Impact of Batch Size on Stopping Active Learning for Text Classification

Beatty, Garrett, Kochis, Ethan, Bloodgood, Michael

arXiv.org Machine Learning | Jan-24-2018

When using active learning, smaller batch sizes are typically more efficient from a learning efficiency perspective. However, in practice due to speed and human annotator considerations, the use of larger batch sizes is necessary. While past work has shown that larger batch sizes decrease learning efficiency from a learning curve perspective, it remains an open question how batch size impacts methods for stopping active learning. We find that large batch sizes degrade the performance of a leading stopping method over and above the degradation that results from reduced learning efficiency. We analyze this degradation and find that it can be mitigated by changing the window size parameter of how many past iterations of learning are taken into account when making the stopping decision. We find that when using larger batch sizes, stopping methods are more effective when smaller window sizes are used.
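
The role of the window size parameter can be illustrated with a small sketch. This is not the stopping method evaluated in the paper: it is a simplified, hypothetical rule that stops once the predictions of consecutive models on a fixed, unlabeled stop set agree for the last `window` comparisons. The function names, the use of a plain agreement fraction, and the `threshold` default are all assumptions made for illustration.

```python
def agreement(preds_a, preds_b):
    """Fraction of stop-set items on which two prediction lists agree."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and equal length")
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


def should_stop(prediction_history, window=3, threshold=0.99):
    """Return True when the last `window` consecutive-model agreements
    are all at or above `threshold`."""
    if len(prediction_history) < window + 1:
        return False  # not enough iterations to fill the window
    recent = prediction_history[-(window + 1):]
    scores = [agreement(prev, curr) for prev, curr in zip(recent, recent[1:])]
    return all(score >= threshold for score in scores)


# Hypothetical stop-set predictions after each active learning iteration.
history = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

stop_small_window = should_stop(history, window=1)  # looks at 1 comparison
stop_large_window = should_stop(history, window=3)  # looks at 3 comparisons
print(stop_small_window, stop_large_window)
```

With the same history, the smaller window triggers a stop while the larger one keeps waiting. With large batches, each extra iteration the rule waits costs many more annotations, which is one way to read the abstract's finding that smaller windows work better in that setting.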

machine learning, natural language, text classification, (17 more...)

arXiv.org Machine Learning

1801.07887

Country:

North America > United States > New Jersey > Mercer County > Ewing (0.17)
North America > United States > California > Orange County > Laguna Hills (0.15)

Genre: Research Report > New Finding (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.42)

TensorFlow -- Text Classification – Machine Learnings

@machinelearnbot | Jan-22-2018, 16:34:39 GMT

Nov 9 marked one official year since TensorFlow was released. Looking back, a lot of progress has been made toward making TensorFlow the most used machine learning framework. As this milestone passed, I realized that I still hadn't published the long-promised blog post about text classification. Even though examples have been in the TensorFlow repository, they didn't have a very good description. Text classification is one of the most important parts of machine learning, as most of people's communication is done via text.

machine learning, natural language, text classification, (11 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
