Fast semi-supervised discriminant analysis for binary classification of large data-sets

Tavernier, Joris, Simm, Jaak, Meerbergen, Karl, Wegner, Joerg Kurt, Ceulemans, Hugo, Moreau, Yves

Mar-1-2018, arXiv.org Artificial Intelligence

High-dimensional data requires scalable algorithms. We propose and analyze three related, scalable algorithms for semi-supervised discriminant analysis (SDA). These methods build on Krylov subspace methods, which exploit the sparsity of the data and the shift-invariance of Krylov subspaces. In addition, we improve the problem definition by adding centralization to the semi-supervised setting. The proposed methods are evaluated on an industry-scale data set from a pharmaceutical company to predict compound activity on target proteins. The results show that SDA achieves good predictive performance and that our methods require only a few seconds, a significant improvement in computation time over the previous state of the art.
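The shift-invariance mentioned in the abstract is the property that the Krylov subspace span{b, Ab, ..., A^(k-1)b} is unchanged when A is replaced by A + σI. The following is a minimal NumPy sketch of that property only, not the authors' algorithms; the helper names `krylov_basis` and `subspace_distance`, the random test matrix, and the shift value are assumptions made for illustration. A plausible reading is that this lets one basis be reused across several shifted systems, but the abstract does not spell that out.

```python
import numpy as np

def krylov_basis(A, b, k):
    """Orthonormal basis of span{b, Ab, ..., A^(k-1) b} (Arnoldi-style)."""
    n = b.shape[0]
    Q = np.zeros((n, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(1, k):
        w = A @ Q[:, j - 1]
        # Orthogonalize twice against the previous vectors for stability.
        for _ in range(2):
            w -= Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    return Q

def subspace_distance(Q1, Q2):
    """Spectral norm of the difference between the two orthogonal projectors."""
    return np.linalg.norm(Q1 @ Q1.T - Q2 @ Q2.T, 2)

# Hypothetical test problem: a random symmetric matrix and an arbitrary shift.
rng = np.random.default_rng(0)
n, k, sigma = 50, 5, 0.7
M = rng.standard_normal((n, n))
A = M + M.T
b = rng.standard_normal(n)

Q_plain = krylov_basis(A, b, k)
Q_shift = krylov_basis(A + sigma * np.eye(n), b, k)

# The two bases span the same subspace, so the distance is at rounding level.
dist = subspace_distance(Q_plain, Q_shift)
print(f"distance between subspaces: {dist:.2e}")
```

The example uses a small dense matrix for readability; with sparse data the same idea applies because the basis is built from matrix-vector products only.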

artificial intelligence, eigenvalue problem, machine learning

Country:
- North America > United States
  - New York > New York County > New York City (0.04)
- Europe > Belgium
  - Flanders > Flemish Brabant > Leuven (0.04)

Genre:
- Research Report (0.70)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence > Machine Learning
    - Statistical Learning (1.00)
