Concept Labeling: Building Text Classifiers with Minimal Supervision

Chenthamarakshan, Vijil (IBM T J Watson Research Center Yorktown Heights) | Melville, Prem (IBM T J Watson Research Center Yorktown Heights) | Sindhwani, Vikas (IBM T J Watson Research Center Yorktown Heights) | Lawrence, Richard D (IBM T J Watson Research Center Yorktown Heights)

Jul-19-2011–AAAI Conferences

The rapid construction of supervised text classification models is becoming a pervasive need across many modern applications. To reduce human-labeling bottlenecks, many new statistical paradigms (e.g., active, semi-supervised, transfer and multi-task learning) have been vigorously pursued in recent literature with varying degrees of empirical success. Concurrently, the emergence of Web 2.0 platforms in the last decade has enabled a world-wide, collaborative human effort to construct a massive ontology of concepts with very rich, detailed and accurate descriptions. In this paper we propose a new framework to extract supervisory information from such ontologies and complement it with a shift in human effort from direct labeling of examples in the domain of interest to the much more efficient identification of concept-class associations. Through empirical studies on text categorization problems using the Wikipedia ontology, we show that this shift allows very high-quality models to be immediately induced at virtually no cost.
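The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the idea could be read, not the authors' method. It assumes a toy ontology of concept descriptions (the hypothetical `CONCEPTS`), a handful of human-supplied concept-class associations (the hypothetical `CONCEPT_TO_CLASS`), and a plain multinomial naive Bayes classifier swapped in as the learner. No document from the target domain is labeled directly; the concept descriptions serve as the training text.

```python
import math
from collections import Counter

# Hypothetical toy ontology: concept name -> short description text.
# In the paper's setting these would be Wikipedia concept descriptions.
CONCEPTS = {
    "Diabetes": "insulin glucose blood sugar pancreas disease treatment",
    "Cardiology": "heart artery blood pressure cardiac disease treatment",
    "Basketball": "team players court score game league season",
    "Soccer": "team players goal match league season football",
}

# The only human input in this sketch: concept -> class associations.
CONCEPT_TO_CLASS = {
    "Diabetes": "health",
    "Cardiology": "health",
    "Basketball": "sports",
    "Soccer": "sports",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def train(concepts, concept_to_class):
    """Pool the descriptions of associated concepts into per-class word counts."""
    counts = {}
    for name, description in concepts.items():
        label = concept_to_class.get(name)
        if label is None:
            continue  # concept not associated with any class: ignore it
        counts.setdefault(label, Counter()).update(tokenize(description))
    return counts


def classify(text, counts):
    """Multinomial naive Bayes with add-one smoothing and uniform class priors."""
    vocab = set().union(*counts.values())
    best_label, best_score = None, -math.inf
    for label, word_counts in counts.items():
        total = sum(word_counts.values()) + len(vocab)
        # Words outside the ontology vocabulary are skipped.
        score = sum(
            math.log((word_counts[tok] + 1) / total)
            for tok in tokenize(text)
            if tok in vocab
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(CONCEPTS, CONCEPT_TO_CLASS)
print(classify("the heart and blood pressure", model))
print(classify("players score a goal this season", model))
```

The point of the sketch is the shift in effort: the human touches only the four entries of `CONCEPT_TO_CLASS`, while all training text comes from the ontology.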

category, classifier, ontology

Country:
- Asia (0.04)
- Africa (0.04)
- North America > United States
  - Wisconsin > Dane County
    - Madison (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Health & Medicine > Therapeutic Area (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language
    - Text Processing (0.51)
    - Text Classification (0.49)
  - Machine Learning
    - Inductive Learning (0.49)
    - Statistical Learning (0.48)
    - Performance Analysis (0.47)
