Goto

Collaborating Authors: Obuchi, Tomoyuki


Analysis of High-dimensional Gaussian Labeled-unlabeled Mixture Model via Message-passing Algorithm

arXiv.org Machine Learning

Semi-supervised learning (SSL) is a machine learning methodology that leverages unlabeled data in conjunction with a limited amount of labeled data. Although SSL has been applied in a variety of domains and its effectiveness has been empirically demonstrated, it is still not fully understood when and why SSL performs well. Some existing theoretical studies have attempted to address this issue by modeling classification problems with the so-called Gaussian Mixture Model (GMM). These studies provide notable and insightful interpretations, but their analyses are focused on specific purposes, and a thorough investigation of the properties of the GMM in the context of SSL has been lacking. In this paper, we conduct such a detailed analysis of the high-dimensional GMM for binary classification in the SSL setting. To this end, we employ the approximate message passing and state evolution methods, which are widely used in high-dimensional settings and originate from statistical mechanics. We consider two estimation approaches: the Bayesian one and $\ell_2$-regularized maximum likelihood estimation (RMLE). We conduct a comprehensive comparison between the two, examining the global phase diagram, the estimation error for the parameters, and the prediction error for the labels. A specific comparison is made between the Bayes-optimal (BO) estimator and RMLE, since the BO setting provides the optimal estimation performance and is the ideal benchmark. Our analysis shows that, with appropriate regularization, RMLE can achieve near-optimal performance in terms of both estimation error and prediction error, especially when there is a large amount of unlabeled data. These results demonstrate that the $\ell_2$ regularization term plays an effective role in estimation and prediction in SSL approaches.
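
As a minimal illustration of the setting (not the paper's AMP/state-evolution machinery), the Python sketch below generates a high-dimensional two-component GMM with labeled and unlabeled portions and fits an $\ell_2$-regularized logistic regression as an RMLE-style baseline. All dimensions and parameter values are our own assumptions, and unlike the paper's RMLE this baseline ignores the unlabeled data during fitting.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy instance of the high-dimensional GMM for binary SSL.
rng = np.random.default_rng(0)
d, n_lab, n_unlab = 500, 100, 2000

w_true = rng.standard_normal(d) / np.sqrt(d)       # cluster-center direction
y = rng.choice([-1, 1], size=n_lab + n_unlab)      # balanced binary labels
X = y[:, None] * w_true + rng.standard_normal((n_lab + n_unlab, d))

X_lab, y_lab = X[:n_lab], y[:n_lab]                # labeled portion
X_unlab = X[n_lab:]                                # unlabeled portion

# ell_2-regularized MLE on the labeled data only; the paper's RMLE also
# exploits X_unlab through the mixture likelihood, which sklearn does not do.
rmle = LogisticRegression(penalty="l2", C=1.0).fit(X_lab, y_lab)
print("accuracy on the unlabeled portion:", rmle.score(X_unlab, y[n_lab:]))
```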


Transfer Learning in $\ell_1$ Regularized Regression: Hyperparameter Selection Strategy based on Sharp Asymptotic Analysis

arXiv.org Machine Learning

Transfer learning techniques aim to leverage information from multiple related datasets to enhance prediction quality on a target dataset. Such methods have been adopted in the context of high-dimensional sparse regression, and several Lasso-based algorithms have been proposed; Trans-Lasso and Pretraining Lasso are two examples. These algorithms require the statistician to select hyperparameters that control the extent and type of information transferred from the related datasets. However, selection strategies for these hyperparameters, as well as the impact of these choices on the algorithm's performance, have been largely unexplored. To address this, we conduct a thorough, precise study of the algorithm in a high-dimensional setting via an asymptotic analysis using the replica method. Our approach reveals a surprisingly simple behavior: ignoring one of the two types of information transferred to the fine-tuning stage has little effect on generalization performance, implying that the effort spent on hyperparameter selection can be significantly reduced. Our theoretical findings are also empirically supported by a real-world application to the IMDb dataset.
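
To make the two-stage structure concrete, here is a schematic Python sketch in the spirit of Pretraining Lasso, under assumptions of ours (synthetic data, partially shared support, illustrative regularization strengths); the actual algorithms transfer richer information to the fine-tuning stage than this simple residual fit does.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical source/target regression pair sharing part of their support.
rng = np.random.default_rng(1)
p, n_src, n_tgt = 200, 400, 80
beta_common = np.zeros(p); beta_common[:10] = 1.0
X_src = rng.standard_normal((n_src, p))
y_src = X_src @ beta_common + 0.1 * rng.standard_normal(n_src)
delta = np.zeros(p); delta[10:15] = 0.5            # target-specific shift
X_tgt = rng.standard_normal((n_tgt, p))
y_tgt = X_tgt @ (beta_common + delta) + 0.1 * rng.standard_normal(n_tgt)

# Stage 1 ("pretraining"): Lasso on the plentiful source data.
beta_pre = Lasso(alpha=0.05).fit(X_src, y_src).coef_

# Stage 2 ("fine-tuning"): Lasso on the target residuals around beta_pre,
# so only the source-target difference needs to be sparse.
correction = Lasso(alpha=0.05).fit(X_tgt, y_tgt - X_tgt @ beta_pre).coef_
beta_hat = beta_pre + correction
```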


On Model Selection Consistency of Lasso for High-Dimensional Ising Models on Tree-like Graphs

arXiv.org Machine Learning

We consider the problem of high-dimensional Ising model selection using the neighborhood-based least absolute shrinkage and selection operator (Lasso). It is rigorously proved that, under mild coherence conditions on the population covariance matrix of the Ising model, consistent model selection can be achieved with sample size $n=\Omega(d^3\log p)$ for any tree-like graph in the paramagnetic phase, where $p$ is the number of variables and $d$ is the maximum node degree. When the same conditions are imposed directly on the sample covariance matrices, a reduced sample size $n=\Omega(d^2\log p)$ is shown to suffice. The obtained sufficient conditions for consistent model selection with the Lasso match, in the scaling of the sample complexity, those of $\ell_1$-regularized logistic regression. Given the popularity and efficiency of the Lasso, our rigorous analysis provides theoretical backing for its practical use in Ising model selection.
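
A minimal sketch of neighborhood-based Lasso selection, assuming $\pm 1$ samples and illustrative values of the regularization strength and pruning threshold (both choices are ours, not the paper's):

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_lasso(S, lam=0.05, thr=0.1):
    """Estimate the edge set of an Ising model from +/-1 samples S (n x p)
    by regressing each spin on all the others with the Lasso and keeping
    the couplings whose estimates exceed a threshold in magnitude."""
    n, p = S.shape
    edges = set()
    for i in range(p):
        others = np.delete(np.arange(p), i)
        coef = Lasso(alpha=lam).fit(S[:, others], S[:, i]).coef_
        for j, c in zip(others, coef):
            if abs(c) > thr:
                edges.add((min(i, j), max(i, j)))
    return edges
```

As written, an edge $(i,j)$ is declared present when either of the two neighborhood regressions selects it; OR versus AND combination rules are a standard design choice in neighborhood selection.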


Ising Model Selection Using $\ell_{1}$-Regularized Linear Regression

arXiv.org Artificial Intelligence

We theoretically investigate the performance of $\ell_1$-regularized linear regression ($\ell_1$-LinR) for the problem of Ising model selection using the replica method from statistical mechanics. A regular random graph is considered under the paramagnetic assumption. Our results show that, despite the model misspecification, the $\ell_1$-LinR estimator can successfully recover the graph structure of an Ising model with $N$ variables using $M=\mathcal{O}\left(\log N\right)$ samples, the same order as that of $\ell_1$-regularized logistic regression. Moreover, we provide a computationally efficient method to accurately predict the non-asymptotic performance of the $\ell_1$-LinR estimator for moderate $M$ and $N$. Simulations show excellent agreement between the theoretical predictions and the experimental results, which supports our findings.
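
For reference, in our own notation, the $\ell_1$-LinR estimator of the couplings incident to spin $i$ solves the (misspecified) squared-loss problem

$\hat{J}_{\setminus i} = \operatorname{argmin}_{J_{\setminus i}} \left\{ \frac{1}{2M}\sum_{\mu=1}^{M}\Bigl(s_i^{(\mu)} - \sum_{j\neq i} J_{ij}\, s_j^{(\mu)}\Bigr)^2 + \lambda \sum_{j\neq i} \lvert J_{ij}\rvert \right\}$,

where $s^{(1)},\dots,s^{(M)}$ are the spin samples; the support of $\hat{J}_{\setminus i}$ is then read off as the estimated neighborhood of spin $i$.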


Structure Learning in Inverse Ising Problems Using $\ell_2$-Regularized Linear Estimator

arXiv.org Machine Learning

The inference performance of the pseudolikelihood method is discussed in the framework of the inverse Ising problem when $\ell_2$-regularized (ridge) linear regression is adopted. This setup is introduced to theoretically investigate the situation where the data-generation model differs from the inference model, namely the model-mismatch situation. In the teacher-student scenario, under the assumption that the teacher couplings are sparse, the analysis is conducted using the replica and cavity methods, with a special focus on whether the presence or absence of teacher couplings is correctly inferred. The result indicates that, despite the model mismatch, one can perfectly identify the network structure using naive linear regression without regularization when the number of spins $N$ is smaller than the dataset size $M$, in the thermodynamic limit $N\to \infty$. Further, to access the underdetermined region $M < N$, we examine the effect of the $\ell_2$ regularization and find that biases appear in all the coupling estimates, preventing perfect identification of the network structure. The biases, however, are shown to decay exponentially fast as the distance from the center spin chosen in the pseudolikelihood method grows. Based on this finding, we propose a two-stage estimator: in the first stage, ridge regression is used and the estimates are pruned by a relatively small threshold; in the second stage, naive linear regression is conducted only on the remaining couplings, and the resulting estimates are again pruned by another, relatively large threshold. With appropriate regularization coefficient and thresholds, this estimator is shown to achieve perfect identification of the network structure even in the underdetermined region $0 < M/N < 1$.
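
The two-stage estimator admits a compact sketch; the regularization coefficient and the two thresholds below are illustrative placeholders of ours, not the tuned values of the paper:

```python
import numpy as np

def two_stage_couplings(S, i, lam=0.1, thr1=0.05, thr2=0.2):
    """Two-stage coupling estimate for center spin i from +/-1 samples S (M x N).
    Stage 1: ridge regression, prune by a small threshold thr1.
    Stage 2: plain least squares on the survivors, prune by a larger thr2."""
    M, N = S.shape
    X = np.delete(S, i, axis=1)
    y = S[:, i].astype(float)
    others = np.delete(np.arange(N), i)

    # Stage 1: ridge (ell_2-regularized) linear regression.
    J1 = np.linalg.solve(X.T @ X + lam * M * np.eye(N - 1), X.T @ y)
    keep = np.abs(J1) > thr1

    # Stage 2: unregularized least squares restricted to surviving couplings.
    J2 = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]
    return others[keep][np.abs(J2) > thr2]       # estimated neighborhood of i
```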


Perfect Reconstruction of Sparse Signals via Greedy Monte-Carlo Search

arXiv.org Machine Learning

We propose a Monte-Carlo-based method for reconstructing sparse signals in the formulation of sparse linear regression in a high-dimensional setting. The basic idea of this algorithm is to explicitly select the variables or covariates used to represent a given data vector, or responses, and to accept randomly generated updates of that selection if and only if the energy or cost function decreases. We call this the greedy Monte-Carlo (GMC) search algorithm. Its performance is examined via numerical experiments, which suggest that, in the noiseless case, GMC can achieve perfect reconstruction at reasonable levels of undersampling: it can outperform the $\ell_1$ relaxation, but does not reach the algorithmic limit of MC-based methods clarified theoretically by an earlier analysis. Additionally, an experiment on a real-world dataset supports the practicality of GMC.
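
A minimal sketch of the GMC idea, assuming the support size $k$ is known and taking the least-squares residual as the energy (both assumptions are ours):

```python
import numpy as np

def gmc_search(X, y, k, n_steps=20000, seed=0):
    """Greedy Monte-Carlo support search: propose a random swap between the
    current support and its complement, accept iff the least-squares fit
    improves (i.e., the energy decreases)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    support = list(rng.choice(p, size=k, replace=False))

    def cost(S):
        coef, *_ = np.linalg.lstsq(X[:, S], y, rcond=None)
        return np.sum((y - X[:, S] @ coef) ** 2)

    c = cost(support)
    for _ in range(n_steps):
        out = rng.integers(k)                      # position within the support
        cand = rng.integers(p)                     # candidate covariate
        if cand in support:
            continue
        proposal = support.copy()
        proposal[out] = cand
        c_new = cost(proposal)
        if c_new < c:                              # greedy: accept improvements only
            support, c = proposal, c_new
    return sorted(support), c
```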


Empirical Bayes Method for Boltzmann Machines

arXiv.org Machine Learning

In this study, we consider an empirical Bayes method for Boltzmann machines and propose an algorithm for it. The empirical Bayes method allows estimation of the hyperparameters of the Boltzmann machine by maximizing a specific likelihood function, referred to here as the empirical Bayes likelihood function. However, the maximization is computationally hard because the empirical Bayes likelihood function involves intractable integrations over the partition function. The proposed algorithm avoids this computational problem by using the replica method and the Plefka expansion. Our method requires no iterative procedures and is quite simple and fast, though it introduces a bias into the estimate, which exhibits an unnatural behavior with respect to the size of the dataset. This peculiar behavior is presumably due to the approximate treatment by the Plefka expansion. A possible extension to overcome this behavior is also discussed.
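
Schematically, and in our own notation, the empirical Bayes likelihood maximized over the hyperparameters $\theta$ is the marginal

$L(\theta) = \int dJ\, p(J \mid \theta) \prod_{\mu=1}^{M} \frac{e^{-E(s^{(\mu)};\, J)}}{Z(J)}$,

where $s^{(1)},\dots,s^{(M)}$ are the observed configurations and $Z(J)$ is the partition function of the Boltzmann machine; the intractable $Z(J)$ inside the integral is what the replica method and the Plefka expansion are used to approximate.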


Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration

arXiv.org Machine Learning

We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the smoothly clipped absolute deviation (SCAD) penalty. The contributions of this study are three-fold. First, we present a theoretical analysis of the typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator over $\ell_1$ in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in spin-glass theory, and the associated phase diagrams are given. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula that efficiently computes the cross-validation error without actually conducting the cross-validation; it is also applicable to non-i.i.d. design matrices. This formula is only applicable in the unique-solution region and tends to be unstable in the multiple-solution region. We implement instability-detection procedures, which allow the approximate formula to stand alone and consequently enable us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations on synthetic datasets verify the consistency of the theoretical results and the efficiency of the approximate formula and of nonconvexity annealing.
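
For concreteness, the SCAD thresholding (proximal) function that appears inside coordinate-wise or message-passing solvers can be sketched as follows; $a = 3.7$ is the conventional default of Fan and Li, and the function itself is standard rather than specific to this paper:

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """Elementwise SCAD thresholding operator: soft thresholding near zero,
    a linear interpolation in the middle region, and the identity
    (no shrinkage) for large inputs."""
    z = np.asarray(z, dtype=float)
    out = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)     # |z| <= 2*lam
    mid = (np.abs(z) > 2 * lam) & (np.abs(z) <= a * lam)
    out[mid] = ((a - 1) * z[mid] - np.sign(z[mid]) * a * lam) / (a - 2)
    big = np.abs(z) > a * lam
    out[big] = z[big]
    return out
```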


Perfect reconstruction of sparse signals with piecewise continuous nonconvex penalties and nonconvexity control

arXiv.org Machine Learning

We consider compressed sensing formulated as a minimization problem with nonconvex sparse penalties, the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP). The nonconvexity of these penalties is controlled by nonconvexity parameters, and the $\ell_1$ penalty is contained as a limit with respect to these parameters. When the nonconvexity parameters take suitable values, the analytically derived reconstruction limit exceeds both that of $\ell_1$ and the algorithmic limit of the Bayes-optimal setting. For practical use, we apply approximate message passing (AMP) to these nonconvex penalties and show that the performance of AMP is considerably improved by controlling the nonconvexity parameters.
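
A generic AMP iteration with a pluggable separable denoiser, sketched under the assumption of a sensing matrix with i.i.d. Gaussian entries of variance $1/m$; plugging in a SCAD or MCP thresholding function (such as the scad_threshold sketch above) in place of soft thresholding yields the nonconvex variants, though a practical implementation would also adapt the threshold level via state evolution:

```python
import numpy as np

def amp(A, y, denoiser, n_iter=50):
    """AMP for y = A x + noise with a separable denoiser eta: the estimate is
    x <- eta(x + A^T z), and the residual z carries an Onsager correction
    proportional to the empirical mean derivative of the denoiser."""
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iter):
        r = x + A.T @ z                            # effective observation
        x_new = denoiser(r)
        # Onsager term: average derivative of the denoiser, estimated by
        # finite differences to keep the sketch denoiser-agnostic.
        eps = 1e-6
        deriv = (denoiser(r + eps) - denoiser(r - eps)) / (2 * eps)
        z = y - A @ x_new + (n / m) * z * np.mean(deriv)
        x = x_new
    return x
```

For example, with the earlier sketch, `x_hat = amp(A, y, lambda r: scad_threshold(r, lam=0.1))`.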


Mean-field theory of graph neural networks in graph partitioning

Neural Information Processing Systems

A theoretical performance analysis of the graph neural network (GNN) is presented. For classification tasks, the neural network approach has the advantage of flexibility, in that it can be employed in a data-driven manner, whereas Bayesian inference requires the assumption of a specific model. A fundamental question is then whether GNNs achieve high accuracy in addition to this flexibility. Moreover, whether the achieved performance is predominantly a result of backpropagation or of the architecture itself is a matter of considerable interest. To gain better insight into these questions, a mean-field theory of a minimal GNN architecture is developed for the graph partitioning problem, and it demonstrates good agreement with numerical experiments.
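
As a rough illustration of what "minimal" can mean here, the following untrained GNN sketch (random features, fixed random weights, sign readout; every design choice is our assumption rather than the paper's exact architecture) performs two-way partitioning by message passing alone, with no backpropagation:

```python
import numpy as np

def minimal_gnn_partition(A, dim=16, n_layers=10, seed=0):
    """Minimal untrained GNN for two-way graph partitioning: propagate random
    node features through fixed random weights along a mean-subtracted
    adjacency matrix, then split nodes by the sign of a random readout."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    B = A - A.mean()                               # remove the uniform mode
    H = rng.standard_normal((n, dim))
    for _ in range(n_layers):
        W = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        H = np.tanh(B @ H @ W)                     # one message-passing layer
    readout = H @ rng.standard_normal(dim)
    return (readout > 0).astype(int)               # cluster assignment 0/1
```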