AITopics | Pahikkala, Tapio

Pahikkala, Tapio

Identification of functionally related enzymes by learning-to-rank methods

Stock, Michiel, Fober, Thomas, Hüllermeier, Eyke, Glinca, Serghei, Klebe, Gerhard, Pahikkala, Tapio, Airola, Antti, De Baets, Bernard, Waegeman, Willem

arXiv.org Machine Learning, May-17-2014

Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually starts from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We use the Enzyme Commission (EC) classification hierarchy to obtain annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.

artificial intelligence, enzyme, health & medicine, (19 more...)

1405.4394

Country:

Europe > Germany (0.28)
North America > United States (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

Pahikkala, Tapio, Airola, Antti, Stock, Michiel, De Baets, Bernard, Waegeman, Willem

arXiv.org Machine Learning, Jun-8-2013

In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.
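One common way to build symmetry or reciprocity into a kernel model over pairs of objects is to symmetrize or antisymmetrize a Kronecker product pairwise kernel. The sketch below illustrates that idea only; the function names and the linear base kernel are assumptions made for illustration and are not taken from the paper.

```python
def linear_kernel(u, v):
    """Illustrative base kernel on single objects (an assumption, any kernel works)."""
    return sum(x * y for x, y in zip(u, v))


def kronecker_kernel(pair1, pair2, base=linear_kernel):
    """Pairwise kernel K((a, b), (c, d)) = k(a, c) * k(b, d)."""
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d)


def symmetric_kernel(pair1, pair2, base=linear_kernel):
    """Symmetrized kernel: unchanged when the objects inside a pair are swapped."""
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


def reciprocal_kernel(pair1, pair2, base=linear_kernel):
    """Antisymmetrized kernel: changes sign when the objects inside a pair are swapped."""
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) - base(a, d) * base(b, c)
```

Any model of the form f(a, b) = sum_i alpha_i * K((a, b), (a_i, b_i)) inherits the property of K: with the symmetrized kernel f(a, b) = f(b, a), and with the antisymmetrized kernel f(a, b) = -f(b, a), so the constraint holds by construction rather than being learned from data.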

inductive learning, relation, us government, (23 more...)

1209.4825

Country:

North America > United States > New York (0.14)
Europe > Belgium > Flanders (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Leisure & Entertainment > Games (0.93)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(4 more...)

A kernel-based framework for learning graded relations from data

Waegeman, Willem, Pahikkala, Tapio, Airola, Antti, Salakoski, Tapio, Stock, Michiel, De Baets, Bernard

arXiv.org Machine Learning, Nov-28-2011

Driven by a large number of potential applications in areas like bioinformatics, information retrieval and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated quite intensively in the machine learning community. To this end, current approaches typically consider datasets containing crisp relations, so that standard classification methods can be adopted. However, relations between objects like similarities and preferences are often expressed in a graded manner in real-world applications. A general kernel-based framework for learning relations from data is introduced here. It extends existing approaches because both crisp and graded relations are considered, and it unifies existing approaches because different types of graded relations can be modeled, including symmetric and reciprocal relations. This framework establishes important links between recent developments in fuzzy set theory and machine learning. Its usefulness is demonstrated through various experiments on synthetic and real-world data.

fuzzy logic, health & medicine, relation, (20 more...)

doi: 10.1109/TFUZZ.2012.2194151

1111.6473

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.67)

Training linear ranking SVMs in linearithmic time using red-black trees

Airola, Antti, Pahikkala, Tapio, Salakoski, Tapio

arXiv.org Machine Learning, Jan-31-2011

We introduce an efficient method for training the linear ranking support vector machine. The method combines cutting plane optimization with a red-black tree based approach to subgradient calculations, and has O(m*s+m*log(m)) time complexity, where m is the number of training examples and s is the average number of non-zero features per example. The best previously known training algorithms achieve the same efficiency only for restricted special cases, whereas the proposed approach allows any real-valued utility scores in the training data. Experiments demonstrate the superior scalability of the proposed approach when compared to the fastest existing RankSVM implementations.
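To illustrate why an ordered search structure brings pairwise ranking computations down from quadratic to O(m*log(m)) time, the sketch below counts misordered pairs for arbitrary real-valued utility scores. It is not the method from the abstract: it uses a Fenwick tree (binary indexed tree) in place of a red-black tree, and it computes only a pair count, not the RankSVM subgradient. All names are hypothetical.

```python
from bisect import bisect_left
from itertools import groupby


def count_misordered_pairs(utilities, predictions):
    """Count pairs (i, j) with utilities[i] < utilities[j] but predictions[i] > predictions[j]."""
    distinct = sorted(set(predictions))
    tree = [0] * (len(distinct) + 1)  # Fenwick tree over prediction ranks, 1-indexed

    def add(pos):
        while pos < len(tree):
            tree[pos] += 1
            pos += pos & -pos

    def prefix_count(pos):
        total = 0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    # Visit examples in order of increasing utility, one tie group at a time.
    order = sorted(range(len(utilities)), key=lambda i: utilities[i])
    inserted = 0
    misordered = 0
    for _, tied in groupby(order, key=lambda i: utilities[i]):
        ranks = [bisect_left(distinct, predictions[i]) + 1 for i in tied]
        for rank in ranks:
            # Everything already in the tree has strictly lower utility;
            # count how many of those were predicted strictly higher.
            misordered += inserted - prefix_count(rank)
        for rank in ranks:
            add(rank)
        inserted += len(ranks)
    return misordered
```

For example, `count_misordered_pairs([1, 2, 3], [3, 2, 1])` returns 3, since every pair is ranked in the wrong order. Pairs with tied utilities are never counted, because each tie group is queried before any of its members is inserted.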

artificial intelligence, implementation, optimization problem, (19 more...)

doi: 10.1016/j.patrec.2011.03.014

1005.0928

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Linear Time Feature Selection for Regularized Least-Squares

Pahikkala, Tapio, Airola, Antti, Salakoski, Tapio

arXiv.org Machine Learning, Mar-18-2010

We propose a novel algorithm for greedy forward feature selection for regularized least-squares (RLS) regression and classification, also known as the least-squares support vector machine or ridge regression. The algorithm, which we call greedy RLS, starts from the empty feature set, and on each iteration adds the feature whose addition provides the best leave-one-out cross-validation performance. Our method is considerably faster than the previously proposed ones, since its time complexity is linear in the number of training examples, the number of features in the original data set, and the desired size of the set of selected features. Therefore, as a side effect we obtain a new training algorithm for learning sparse linear RLS predictors, which can be used for large-scale learning. This speed is possible due to matrix calculus based shortcuts for leave-one-out and feature addition. We experimentally demonstrate the scalability of our algorithm and its ability to find good quality feature sets.
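The selection criterion described above can be sketched as a naive baseline. This is not greedy RLS itself: it refits a ridge regression model from scratch for every held-out example and every candidate feature, so it has none of the linear-time shortcuts the abstract refers to. Function names, the squared-error criterion, and the default regularization value are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ridge_fit(rows, targets, lam):
    """Primal ridge regression: w = (X^T X + lam * I)^-1 X^T y."""
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve(gram, rhs)


def loo_squared_error(rows, targets, lam):
    """Leave-one-out squared error, computed the slow way by refitting n times."""
    error = 0.0
    for i in range(len(rows)):
        w = ridge_fit(rows[:i] + rows[i + 1:], targets[:i] + targets[i + 1:], lam)
        prediction = sum(wi * xi for wi, xi in zip(w, rows[i]))
        error += (targets[i] - prediction) ** 2
    return error


def greedy_forward_selection(X, y, k, lam=1.0):
    """Start from the empty set; each round, add the feature with the lowest LOO error."""
    selected = []
    for _ in range(k):
        candidates = [f for f in range(len(X[0])) if f not in selected]
        best = min(candidates, key=lambda f: loo_squared_error(
            [[row[j] for j in selected + [f]] for row in X], y, lam))
        selected.append(best)
    return selected
```

The cost of this baseline grows much faster than linearly in the number of examples and features, which is the overhead that closed-form leave-one-out and feature-addition updates are meant to avoid.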

algorithm, artificial intelligence, health & medicine, (18 more...)

1003.357

Country:

Europe (0.67)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)
