Jointly learning relevant subgraph patterns and nonlinear models of their indicators

Shirakawa, Ryo, Yokoyama, Yusei, Okazaki, Fumiya, Takigawa, Ichigaku

Jul-9-2018–arXiv.org Machine Learning

Classification and regression with graph inputs of arbitrary size and shape have attracted attention in fields such as computational chemistry and bioinformatics. Subgraph indicators are often used as the most fundamental features, but the number of possible subgraph patterns is intractably large due to combinatorial explosion. We propose a novel, efficient algorithm to jointly learn relevant subgraph patterns and nonlinear models of their indicators. Previous methods for such joint learning of subgraph features and models search for the single best subgraph feature using specific pruning and boosting procedures that add indicators one by one, which results in linear models of subgraph indicators. In contrast, the proposed approach directly learns regression trees for graph inputs using a newly derived bound on the total sum of squares for data partitions induced by a given subgraph feature, and can thus learn nonlinear models through standard gradient boosting. We also present an illustrative example, which we call the Graph-XOR problem, to examine nonlinearity; numerical experiments with real datasets; and scalability comparisons to naive approaches using explicit pattern enumeration.
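As a rough illustration of why trees over subgraph indicators can capture what a one-indicator-at-a-time model cannot, the sketch below computes the total sum of squares (TSS) of a data partition induced by a 0/1 subgraph indicator. It is not the paper's algorithm: it does not implement the derived bound or any subgraph search, and the toy data, the XOR-style target, and the names `tss`, `split_tss`, and `depth2_tss` are all hypothetical. The target here may also differ from the paper's actual Graph-XOR construction.

```python
from statistics import fmean


def tss(ys):
    """Total sum of squares of ys around their mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = fmean(ys)
    return sum((y - mean) ** 2 for y in ys)


def split_tss(indicator, ys):
    """TSS after partitioning the data by a 0/1 subgraph indicator."""
    present = [y for x, y in zip(indicator, ys) if x == 1]
    absent = [y for x, y in zip(indicator, ys) if x == 0]
    return tss(present) + tss(absent)


def depth2_tss(first, second, ys):
    """TSS of a depth-2 tree: split on `first`, then on `second` in each branch."""
    total = 0.0
    for value in (0, 1):
        idx = [i for i, x in enumerate(first) if x == value]
        total += split_tss([second[i] for i in idx], [ys[i] for i in idx])
    return total


# Hypothetical toy data: four graphs, indicators for two subgraph patterns A and B.
has_a = [0, 0, 1, 1]
has_b = [0, 1, 0, 1]
y = [float(a ^ b) for a, b in zip(has_a, has_b)]  # XOR-style target (assumption)

root_tss = tss(y)                       # TSS before any split
after_a = split_tss(has_a, y)           # one split on A alone
after_b = split_tss(has_b, y)           # one split on B alone
after_ab = depth2_tss(has_a, has_b, y)  # split on A, then on B

print(root_tss, after_a, after_b, after_ab)
```

On this toy target, a single split on either indicator leaves the TSS unchanged, while the depth-2 tree drives it to zero, which is the kind of interaction between indicators that a regression tree can represent.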

artificial intelligence, decision tree learning, machine learning

Country:
- North America
  - United States
    - District of Columbia > Washington (0.04)
    - Hawaii (0.04)
    - California (0.04)
    - Nevada > Clark County
      - Las Vegas (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
  - Canada > Alberta
    - Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Europe
  - United Kingdom > England
    - Greater London > London (0.05)
  - France > Grand Est
    - Meurthe-et-Moselle > Nancy (0.04)
- Asia > Japan
  - Hokkaidō (0.05)

Genre:
- Research Report (0.82)

Industry:
- Health & Medicine (0.93)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Decision Tree Learning (1.00)
  - Statistical Learning (0.89)
