Explicit Document Modeling through Weighted Multiple-Instance Learning
Pappas, Nikolaos, Popescu-Belis, Andrei
Journal of Artificial Intelligence Research
Representing documents is a crucial component in many NLP tasks, for instance predicting aspect ratings in reviews. Previous methods for this task treat documents globally, and do not acknowledge that target categories are often assigned by their authors with generally no indication of the specific sentences that motivate them. To address this issue, we adopt a weakly supervised learning model, which jointly learns to focus on relevant parts of a document according to the context along with a classifier for the target categories. Derived from the weighted multiple-instance regression (MIR) framework, the model learns decomposable document vectors for each individual category and thus overcomes the representational bottleneck in previous methods due to a fixed-length document vector. During prediction, the estimated relevance or saliency weights explicitly capture the contribution of each sentence to the predicted rating, thus offering an explanation of the rating. Our model achieves state-of-the-art performance on multi-aspect sentiment analysis, improving over several baselines. Moreover, the predicted saliency weights are close to human estimates obtained by crowdsourcing, and increase the performance of lexical and topical features for review segmentation and summarization.
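As a rough illustration of the weighted multiple-instance idea described in the abstract, the following minimal sketch treats a review as a bag of sentence vectors, assigns each sentence a saliency weight, and predicts a rating from the weighted sum. It is a sketch under stated assumptions, not the authors' implementation: the softmax scoring, the linear regressor, the function names, and the toy vectors are all hypothetical choices made for illustration.

```python
import math


def saliency_weights(sentences, scorer):
    """Softmax over per-sentence scores (assumed form of the weighting).

    The resulting weights are non-negative and sum to 1.
    """
    scores = [sum(x * o for x, o in zip(s, scorer)) for s in sentences]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def document_vector(sentences, weights):
    """Weighted sum of sentence vectors: one vector per document and aspect."""
    dim = len(sentences[0])
    return [sum(w * s[j] for w, s in zip(weights, sentences)) for j in range(dim)]


def predict_rating(sentences, scorer, regressor):
    """Return the predicted rating and the per-sentence saliency weights."""
    weights = saliency_weights(sentences, scorer)
    doc = document_vector(sentences, weights)
    rating = sum(d * r for d, r in zip(doc, regressor))
    return rating, weights


# Toy data (hypothetical): three sentences with 2-dimensional features.
sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scorer = [2.0, 0.0]      # hypothetical parameters of the weighting model
regressor = [4.0, 1.0]   # hypothetical parameters of the rating model

rating, weights = predict_rating(sentences, scorer, regressor)
```

Because the prediction is linear in the document vector, the rating decomposes into one contribution per sentence (its weight times its own regressor score), which is what allows the weights to be read as an explanation of the predicted rating.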
Mar-22-2017
- Country:
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - North America
    - United States
      - District of Columbia > Washington (0.04)
      - Maryland > Baltimore (0.04)
      - Texas > Travis County
        - Austin (0.14)
      - Michigan > Washtenaw County
        - Ann Arbor (0.04)
      - Colorado > Denver County
        - Denver (0.14)
      - Ohio > Franklin County
        - Columbus (0.04)
      - New York
        - New York County > New York City (0.04)
        - Monroe County > Rochester (0.04)
      - Rhode Island > Providence County
        - Providence (0.04)
      - Virginia > Arlington County
        - Arlington (0.04)
      - Oregon
        - Multnomah County > Portland (0.04)
        - Benton County > Corvallis (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.04)
      - Georgia > Fulton County
        - Atlanta (0.04)
      - Washington > King County
        - Seattle (0.14)
      - California
        - San Francisco County > San Francisco (0.14)
        - Los Angeles County > Los Angeles (0.14)
        - San Diego County > San Diego (0.04)
    - Canada
      - Quebec
        - Montreal (0.04)
        - Capitale-Nationale Region
          - Québec (0.04)
          - Quebec City (0.04)
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.04)
      - Alberta > Census Division No. 15
        - Improvement District No. 9 > Banff (0.04)
  - Europe
    - Switzerland (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Germany
      - Saarland > Saarbrücken (0.04)
      - Berlin (0.04)
    - Bulgaria > Sofia City Province
      - Sofia (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
    - South Korea (0.04)
    - Middle East
    - China
- Genre:
  - Research Report > New Finding (0.92)
- Industry:
  - Education (0.46)
  - Government (0.45)
- Technology: