Goto

Collaborating Authors

 Mamitsuka, Hiroshi


Wasserstein Gradient Flow over Variational Parameter Space for Variational Inference

arXiv.org Machine Learning

Many machine learning problems involve the challenge of approximating an intractable target distribution, which might only be known up to a normalization constant. Bayesian inference is a typical example, where the intractable and unnormalized target distribution is the product of the prior and likelihood functions (see [11, 18, 4]). Variational Inference (VI), widely employed across various application domains, seeks to approximate this intractable target distribution with a variational distribution (see [3, 7, 20] and references therein). VI is typically formulated as an optimization problem with the objective of maximizing the evidence lower bound (ELBO), which is equivalent to minimizing the Kullback-Leibler (KL) divergence between the variational distribution and the target distribution. The conventional method for maximizing the ELBO is gradient descent, as in black-box VI (BBVI, [16]). The gradient of the ELBO can be expressed as an expectation over the variational distribution, which is typically estimated with Monte Carlo samples from this distribution.
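To make the last point concrete, below is a minimal sketch of the score-function (REINFORCE) Monte Carlo estimator of the ELBO gradient that BBVI-style methods use, assuming a diagonal-Gaussian variational family. The function name and the unnormalized log-target `log_p_tilde` (which should map an array of samples of shape (n_samples, d) to per-sample log-densities) are hypothetical; this is the baseline estimator the abstract refers to, not the Wasserstein gradient flow method the paper develops.

```python
import numpy as np

def elbo_grad_score_function(log_p_tilde, mu, log_sigma, n_samples=1000, rng=None):
    """Score-function Monte Carlo estimate of the ELBO gradient for a
    diagonal-Gaussian variational family q(z) = N(mu, exp(log_sigma)^2).
    Uses grad ELBO = E_q[ grad_lambda log q(z) * (log p_tilde(z) - log q(z)) ].
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.exp(log_sigma)
    z = mu + sigma * rng.standard_normal((n_samples, mu.size))  # z ~ q

    # log q(z) and its gradients with respect to the variational parameters
    log_q = -0.5 * np.sum(((z - mu) / sigma) ** 2 + 2 * log_sigma
                          + np.log(2 * np.pi), axis=1)
    score_mu = (z - mu) / sigma ** 2                  # d log q / d mu
    score_ls = ((z - mu) / sigma) ** 2 - 1.0          # d log q / d log_sigma

    w = log_p_tilde(z) - log_q                        # per-sample ELBO term
    grad_mu = np.mean(score_mu * w[:, None], axis=0)
    grad_ls = np.mean(score_ls * w[:, None], axis=0)
    return grad_mu, grad_ls
```

For example, with `log_p_tilde = lambda z: -0.5 * np.sum(z**2, axis=1)` (a standard-normal target), ascending these gradients drives `mu` toward 0 and `log_sigma` toward 0, as expected.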


Central-Smoothing Hypergraph Neural Networks for Predicting Drug-Drug Interactions

arXiv.org Artificial Intelligence

Predicting drug-drug interactions (DDI) is the problem of predicting the side effects (unwanted outcomes) of a pair of drugs using drug information and the known side effects of many pairs. This problem can be formulated as predicting labels (i.e., side effects) for each pair of nodes in a DDI graph, whose nodes are drugs and whose edges connect pairs of interacting drugs with known labels. State-of-the-art methods for this problem are graph neural networks (GNNs), which leverage neighborhood information in the graph to learn node representations. For DDI, however, there are many labels with complicated relationships due to the nature of side effects. Usual GNNs often fix labels as one-hot vectors, which do not reflect label relationships and may not obtain the highest performance in the difficult cases of infrequent labels. In this paper, we formulate DDI as a hypergraph where each hyperedge is a triple: two nodes for drugs and one node for a label. We then present CentSmoothie, a hypergraph neural network that learns representations of nodes and labels together through a novel central-smoothing formulation. We empirically demonstrate the performance advantages of CentSmoothie in simulations as well as on real datasets.
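As a concrete illustration of the hypergraph construction described above, here is a small sketch (with hypothetical names) that encodes DDI triples as hyperedges over drug nodes plus label nodes; the central-smoothing network itself is not reproduced, since its details are beyond the abstract.

```python
import numpy as np

def build_ddi_hypergraph(interactions, n_drugs, n_labels):
    """Encode DDI triples (drug_i, drug_j, side_effect_k) as hyperedges of a
    hypergraph whose nodes are the drugs plus one node per label, as the
    abstract describes. Returns a dense incidence matrix H (nodes x edges);
    a sparse matrix would be used in practice.
    """
    n_nodes = n_drugs + n_labels                 # label nodes follow drug nodes
    H = np.zeros((n_nodes, len(interactions)))
    for e, (i, j, k) in enumerate(interactions):
        H[i, e] = 1.0                            # first drug in the pair
        H[j, e] = 1.0                            # second drug in the pair
        H[n_drugs + k, e] = 1.0                  # label node for side effect k
    return H
```

For instance, `build_ddi_hypergraph([(0, 1, 2), (1, 3, 0)], n_drugs=4, n_labels=3)` yields a 7 x 2 incidence matrix with one column per drug-drug-side-effect triple.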


On Convex Clustering Solutions

arXiv.org Machine Learning

Convex clustering is an attractive clustering algorithm with favorable properties such as efficiency and optimality owing to its convex formulation. It is thought to generalize both k-means clustering and agglomerative clustering. However, it is not known whether convex clustering preserves the desirable properties of these algorithms. A common expectation is that convex clustering may learn difficult cluster types such as non-convex ones. The current understanding of convex clustering is limited to consistency results on well-separated clusters. We provide new understanding of its solutions. We prove that convex clustering can only learn convex clusters. We then show that the clusters have disjoint bounding balls with significant gaps between them. We further characterize the solutions, regularization hyperparameters, inclusterable cases, and consistency.
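For reference, the convex program whose solutions the paper analyzes is usually written as a sum-of-norms problem; the sketch below (uniform pairwise weights, hypothetical names) just evaluates that standard objective. Points whose optimal centroids coincide form one cluster.

```python
import numpy as np

def convex_clustering_objective(U, X, lam):
    """Standard convex clustering objective with uniform pair weights:
        0.5 * sum_i ||x_i - u_i||^2  +  lam * sum_{i<j} ||u_i - u_j||,
    where row i of U is the centroid assigned to data point x_i. A sketch
    for intuition; the paper analyzes solutions of this program rather
    than proposing a solver.
    """
    fit = 0.5 * np.sum((X - U) ** 2)             # fidelity to the data
    n = X.shape[0]
    reg = sum(np.linalg.norm(U[i] - U[j])        # fusion penalty on centroids
              for i in range(n) for j in range(i + 1, n))
    return fit + lam * reg
```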


Manifold-based Similarity Adaptation for Label Propagation

Neural Information Processing Systems

Label propagation is one of the state-of-the-art methods for semi-supervised learning; it estimates labels by propagating label information through a graph. Label propagation assumes that data points (nodes) connected in a graph should have similar labels. Consequently, the label estimation heavily depends on the edge weights of the graph, which represent the similarity of each node pair. We propose a method for constructing a graph that captures the manifold structure of input features, using edge weights parameterized by a similarity function. In this approach, edge weights simultaneously represent both similarity and local reconstruction weight, both of which are reasonable for label propagation.
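For context, the classic propagation step that such methods run on a weighted graph looks as follows. This is a standard sketch (in the style of Zhou et al.'s label spreading), not the paper's contribution, which concerns how the weight matrix W is parameterized and learned.

```python
import numpy as np

def label_propagation(W, Y, alpha=0.99, n_iter=100):
    """Classic label propagation on a weighted graph.
    W: symmetric nonnegative edge weights (n x n);
    Y: one-hot label matrix (n x c), all-zero rows for unlabeled nodes.
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt              # normalized similarity
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * S @ F + (1 - alpha) * Y      # propagate, then pull to Y
    return F.argmax(axis=1)                      # predicted label per node
```

The fixed point of this iteration balances smoothness along edges against fidelity to the observed labels, which is why the edge weights matter so much.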


Scalable Probabilistic Matrix Factorization with Graph-Based Priors

arXiv.org Machine Learning

In matrix factorization, available graph side-information may not be well suited for the matrix completion problem, having edges that disagree with the latent-feature relations learnt from the incomplete data matrix. We show that removing these $\textit{contested}$ edges improves prediction accuracy and scalability. We identify the contested edges through a highly efficient graphical lasso approximation. The identification and removal of contested edges adds no computational complexity to state-of-the-art graph-regularized matrix factorization, remaining linear with respect to the number of non-zeros. The computational load even decreases in proportion to the number of edges removed. Formulating a probabilistic generative model and using expectation maximization to extend graph-regularized alternating least squares (GRALS) guarantees convergence. Rich simulated experiments illustrate the desired properties of the resulting algorithm. In real-data experiments we demonstrate improved prediction accuracy with fewer graph edges (empirical evidence that graph side-information is often inaccurate). A graph with 300 thousand nodes and three million edges (Yahoo music side-information) can be analyzed in under ten minutes on a standard laptop computer, demonstrating the efficiency of our graph update.
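For orientation, graph-regularized matrix factorization of the GRALS family minimizes a loss of roughly the following form; a hedged sketch with hypothetical names, since the paper's contribution (contested-edge identification and removal) changes which edges enter the graph Laplacians rather than the loss itself.

```python
import numpy as np

def grals_objective(U, V, M, mask, L_u, L_v, lam):
    """Graph-regularized matrix factorization loss: squared error on
    observed entries of M plus Laplacian penalties tr(U^T L_u U) and
    tr(V^T L_v V) that pull latent factors of graph-adjacent rows and
    columns toward each other.
    mask: binary matrix marking observed entries of M.
    """
    err = mask * (M - U @ V.T)                   # error on observed entries
    reg = np.trace(U.T @ L_u @ U) + np.trace(V.T @ L_v @ V)
    return 0.5 * np.sum(err ** 2) + 0.5 * lam * reg
```

A contested edge is one whose Laplacian penalty pulls two latent factors together even though the observed entries say they should differ; removing such an edge shrinks L_u or L_v, which is why the computational load drops with each removal.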


Efficient Convex Completion of Coupled Tensors using Coupled Nuclear Norms

Neural Information Processing Systems

Coupled norms have emerged as a convex method to solve coupled tensor completion. A limitation of coupled norms is that they only induce low-rankness through the multilinear rank of coupled tensors. In this paper, we introduce a new set of coupled norms, known as coupled nuclear norms, by constraining the CP rank of coupled tensors. We propose new coupled completion models using the coupled nuclear norms as regularizers, which can be optimized using computationally efficient optimization methods. We derive excess risk bounds for the proposed coupled completion models and show that the proposed norms lead to better performance. Through simulation and real-data experiments, we demonstrate that the proposed norms achieve better performance for coupled completion compared to existing coupled norms.
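To illustrate the coupling structure such norms exploit, the sketch below generates a toy tensor and matrix that share a common CP factor along their first mode; the names and sizes are hypothetical, and the coupled nuclear norms themselves are not reproduced here.

```python
import numpy as np

def make_coupled_pair(n1, n2, n3, n4, rank, rng=None):
    """Generate a tensor T (n1 x n2 x n3) and a matrix M (n1 x n4) that
    share the mode-1 factor A. This shared-factor structure is what
    coupled completion models exploit: observing entries of M helps
    recover missing entries of T, and vice versa.
    """
    rng = np.random.default_rng() if rng is None else rng
    A = rng.standard_normal((n1, rank))          # shared mode-1 factor
    B = rng.standard_normal((n2, rank))
    C = rng.standard_normal((n3, rank))
    D = rng.standard_normal((n4, rank))
    T = np.einsum('ir,jr,kr->ijk', A, B, C)      # CP-rank-`rank` tensor
    M = A @ D.T                                  # matrix coupled through A
    return T, M
```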


Learning on Hypergraphs with Sparsity

arXiv.org Machine Learning

A hypergraph is a general way of representing high-order relations on a set of objects; it generalizes a graph, in which only pairwise relations can be represented. Hypergraphs find applications in various domains where relationships among more than two objects are observed. On a hypergraph, one wishes to learn a smooth function with respect to its topology. A fundamental issue is to find suitable smoothness measures for functions on the nodes of a graph/hypergraph. We show a general framework that generalizes previously proposed smoothness measures and also gives rise to new ones. To address the problem of irrelevant or noisy data, we incorporate a sparse learning framework into learning on hypergraphs. We propose sparsely smooth formulations that learn smooth functions and induce sparsity on hypergraphs at both the hyperedge and node levels. We show their properties and sparse support recovery results. We conduct experiments showing that our sparsely smooth models are beneficial for irrelevant and noisy data, and usually give similar or improved performance compared to dense models.
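One standard member of the family of smoothness measures the paper generalizes is the hypergraph Laplacian quadratic form in the style of Zhou et al.; a minimal sketch with hypothetical names, and without the sparsity-inducing variants the paper proposes:

```python
import numpy as np

def hypergraph_smoothness(H, f, w=None):
    """Smoothness f^T L f under the standard hypergraph Laplacian
        L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2},
    where H (nodes x hyperedges) is the incidence matrix, w the hyperedge
    weights, Dv/De the node/hyperedge degree matrices. Small values of
    f^T L f mean f varies little within hyperedges.
    """
    w = np.ones(H.shape[1]) if w is None else w
    dv = H @ w                                   # node degrees
    de = H.sum(axis=0)                           # hyperedge degrees
    Dv_is = np.diag(1.0 / np.sqrt(np.maximum(dv, 1e-12)))
    Theta = Dv_is @ H @ np.diag(w / np.maximum(de, 1e-12)) @ H.T @ Dv_is
    L = np.eye(H.shape[0]) - Theta
    return float(f @ L @ f)
```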


Convex Coupled Matrix and Tensor Completion

arXiv.org Machine Learning

We propose a set of convex low-rank-inducing norms for coupled matrices and tensors (hereafter coupled tensors), which share information between matrices and tensors through common modes. More specifically, we propose a mixture of the overlapped trace norm and the latent norms with the matrix trace norm, and then we propose a new completion algorithm based on the proposed norms. A key advantage of the proposed norms is that they are convex, so a globally optimal solution can be found, whereas existing methods for coupled learning are non-convex. Furthermore, we analyze excess risk bounds of the completion model regularized by our proposed norms, which show that our proposed norms can exploit the low-rankness of coupled tensors, leading to better bounds compared to uncoupled norms. Through synthetic and real-world data experiments, we show that the proposed completion algorithm compares favorably with existing completion algorithms.
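As one ingredient named above, the overlapped trace norm of a tensor is the sum of the nuclear norms of its mode unfoldings. The sketch below computes it (hypothetical function name); the mixture with latent and matrix trace norms proposed in the paper is not reproduced.

```python
import numpy as np

def overlapped_trace_norm(T):
    """Overlapped trace norm: sum over modes of the matrix nuclear norm
    of the corresponding unfolding. A standard convex surrogate for low
    multilinear rank, used here as a building block for intuition.
    """
    total = 0.0
    for mode in range(T.ndim):
        # unfold: bring `mode` to the front, flatten the remaining modes
        unfolding = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        total += np.linalg.norm(unfolding, ord='nuc')
    return total
```

Because each nuclear norm term is convex, the sum is convex as well, which is what allows completion models built on such norms to be solved to global optimality.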


Ultra High-Dimensional Nonlinear Feature Selection for Big Biological Data

arXiv.org Machine Learning

Machine learning methods are used to discover complex nonlinear relationships in biological and medical data. However, sophisticated learning models are computationally infeasible for data with millions of features. Here we introduce the first feature selection method for nonlinear learning problems that can scale up to large, ultra-high-dimensional biological data. More specifically, we scale up the novel Hilbert-Schmidt Independence Criterion Lasso (HSIC Lasso) to handle millions of features with tens of thousands of samples. The proposed method is guaranteed to find an optimal subset of maximally predictive features with minimal redundancy, yielding higher predictive power and improved interpretability. Its effectiveness is demonstrated through applications to classifying phenotypes based on module expression in human prostate cancer patients and to detecting enzymes among protein structures. We achieve high accuracy with as few as 20 out of one million features, a dimensionality reduction of 99.998%. Our algorithm can be implemented on commodity cloud computing platforms. The dramatic reduction in features may lead to the ubiquitous deployment of sophisticated prediction models in mobile health-care applications.
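For intuition, the underlying HSIC Lasso criterion on toy-sized dense data can be sketched as below (hypothetical names, Gaussian kernels, plain projected gradient); the paper's contribution is scaling this criterion to millions of features, which this dense sketch does not attempt.

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian kernel matrix of a 1-D feature or output vector."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic_lasso(X, y, lam=1e-3, n_iter=500):
    """Dense sketch of HSIC Lasso feature selection:
        min_{a >= 0} 0.5 * ||Lbar - sum_k a_k Kbar_k||_F^2 + lam * sum_k a_k,
    with per-feature centered, normalized Gaussian kernels. Features with
    nonzero a_k are selected: predictive of y yet mutually non-redundant.
    """
    n, d = X.shape
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Lbar = H @ gaussian_kernel(y, y.std() + 1e-12) @ H
    Lbar /= (np.linalg.norm(Lbar) + 1e-12)
    cols = []
    for k in range(d):
        K = H @ gaussian_kernel(X[:, k], X[:, k].std() + 1e-12) @ H
        cols.append((K / (np.linalg.norm(K) + 1e-12)).ravel())
    Kvec = np.stack(cols, axis=1)                # n^2 x d design matrix
    a, lr = np.zeros(d), 1.0 / d                 # safe step: unit-norm columns
    for _ in range(n_iter):
        grad = Kvec.T @ (Kvec @ a - Lbar.ravel()) + lam
        a = np.maximum(a - lr * grad, 0.0)       # projected gradient step
    return a
```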