Multi-Instance Multi-Label Learning

Zhou, Zhi-Hua, Zhang, Min-Ling, Huang, Sheng-Jun, Li, Yu-Feng

Oct-23-2011–arXiv.org Artificial Intelligence

Nanjing University, Nanjing 210046, China
Email: zhouzh@lamda.nju.edu.cn

Abstract

In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework, where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the D-MimlSvm algorithm, which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects and thus cannot capture more information from real objects by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single-instance examples into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning from the single-instance or single-label examples directly.

1 Introduction

In traditional supervised learning, an object is represented by an instance, i.e., a feature vector, and associated with a class label. Formally, let X denote the instance space (or feature space) and Y the set of class labels. In particular, each object in this framework belongs to only one concept, and therefore the corresponding instance is associated with a single class label. However, many real-world objects are complicated and may belong to multiple concepts simultaneously. For example, an image can belong to several classes simultaneously, e.g., grasslands, lions, Africa, etc.; a text document can be classified into several categories if it is viewed from different aspects, e.g., scientific novel, Jules Verne's writing, or even books on traveling; a web page can be recognized as a news page, sports page, soccer page, etc.

In a specific real task, maybe only one of the multiple concepts is the right semantic meaning. For example, in image retrieval, when a user is interested in an image with lions, s/he may be interested only in the concept lions rather than the other concepts, grasslands and Africa, associated with that image. The difficulty here is caused by those objects that involve multiple concepts. Choosing the right semantic meaning for such objects in a specific scenario is the fundamental difficulty of many tasks.
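To make the representation concrete, the following is a minimal Python sketch of a MIML example as described above: one object held as a bag of instances together with a set of class labels. The names `MimlExample`, `mean_instance`, and `per_label_bags` are hypothetical and chosen only for illustration. The two helper functions show naive ways of reducing a MIML example to a simpler learning setting. They are not the MimlBoost or MimlSvm algorithms, whose details this excerpt does not give, and the averaging step in particular illustrates how such a reduction can lose information.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple

Instance = Tuple[float, ...]  # one feature vector


@dataclass(frozen=True)
class MimlExample:
    """One object: a bag of instances plus a set of class labels."""
    instances: Tuple[Instance, ...]
    labels: FrozenSet[str]


def mean_instance(example: MimlExample) -> Instance:
    """Collapse the bag into a single vector by averaging each feature.

    Illustrative only: this discards the per-instance structure.
    """
    n = len(example.instances)
    return tuple(sum(column) / n for column in zip(*example.instances))


def per_label_bags(
    example: MimlExample, label_space: Sequence[str]
) -> List[Tuple[Tuple[Instance, ...], str, int]]:
    """Pair the bag with every candidate label and a +1/-1 membership flag.

    Illustrative only: yields one binary multi-instance problem per label.
    """
    return [
        (example.instances, label, 1 if label in example.labels else -1)
        for label in label_space
    ]


# An image described by three region-level feature vectors (made-up numbers)
# and carrying the three labels used in the text's example.
image = MimlExample(
    instances=((0.0, 1.0), (2.0, 3.0), (4.0, 8.0)),
    labels=frozenset({"lions", "grasslands", "Africa"}),
)

flat = mean_instance(image)
bags = per_label_bags(image, ["Africa", "grasslands", "lions", "ocean"])
```

In this sketch, `flat` is a single feature vector that could be fed to an ordinary multi-label learner, while `bags` keeps every instance but turns the label set into one yes/no question per candidate label.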

artificial intelligence, data mining, machine learning


Country:
- Africa (0.44)
- South America > Brazil
  - Bahia > Salvador (0.04)
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- North America
  - Barbados (0.04)
  - United States
    - District of Columbia > Washington (0.04)
    - Nebraska > Lancaster County
      - Lincoln (0.14)
    - Florida
      - Palm Beach County > Boca Raton (0.04)
      - Orange County > Orlando (0.04)
    - Nevada > Clark County
      - Las Vegas (0.04)
    - Tennessee > Davidson County
      - Nashville (0.04)
    - Oregon > Benton County
      - Corvallis (0.04)
    - Wisconsin > Dane County
      - Madison (0.04)
    - Mississippi > Madison County
      - Madison (0.04)
    - Massachusetts
      - Suffolk County > Boston (0.04)
      - Middlesex County
        - Cambridge (0.05)
        - Reading (0.04)
    - California
      - San Francisco County > San Francisco (0.14)
      - Orange County > Irvine (0.14)
      - Los Angeles County > Los Angeles (0.14)
      - San Diego County > San Diego (0.04)
    - New York > New York County
      - New York City (0.04)
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - United Kingdom
    - Scotland > City of Edinburgh
      - Edinburgh (0.04)
    - England > East Sussex
      - Brighton (0.04)
  - Slovenia > Upper Carniola
    - Municipality of Bled > Bled (0.04)
  - Italy
    - Tuscany > Pisa Province
      - Pisa (0.04)
    - Piedmont > Turin Province
      - Turin (0.14)
  - Germany
    - Baden-Württemberg > Freiburg (0.04)
    - Saxony > Leipzig (0.04)
    - North Rhine-Westphalia > Cologne Region
      - Bonn (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
- Asia
  - India (0.04)
  - Middle East > Jordan (0.04)
  - Taiwan > Taiwan Province
    - Taipei (0.04)
  - China > Jiangsu Province
    - Nanjing (0.44)

Genre:
- Research Report
  - New Finding (0.67)
  - Experimental Study (0.67)

Industry:
- Information Technology > Security & Privacy (0.46)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:
- Information Technology
  - Data Science > Data Mining (1.00)
  - Artificial Intelligence > Machine Learning
    - Inductive Learning (0.88)
    - Statistical Learning > Nearest Neighbor Methods (0.46)
