Goto

Collaborating Authors: Takeuchi, Ichiro


Enhancing Exploration in Latent Space Bayesian Optimization

arXiv.org Artificial Intelligence

Latent Space Bayesian Optimization (LSBO) combines generative models, typically Variational Autoencoders (VAEs), with Bayesian Optimization (BO) to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of the BO and the VAE, resulting in poor extrapolation capabilities. In this paper, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the BO-VAE mismatch. To address this, we propose the Latent Consistent Aware-Acquisition Function (LCA-AF), which leverages consistent regions in LSBO. Additionally, we present LCA-VAE, a novel VAE method that generates a latent space with more consistent points, improving BO's extrapolation capabilities. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Experimental evaluations validate the improved performance of LCA-LSBO on image generation and de novo chemical design tasks, showcasing its enhanced extrapolation capabilities in LSBO. Our approach achieves high sample-efficiency and effective exploration, underscoring the importance of addressing latent consistency and leveraging LCA-VAE in LSBO.
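
As a concrete illustration of the loop described above, the sketch below runs a latent-space BO loop with a Gaussian-process surrogate and an expected-improvement acquisition that is down-weighted by a round-trip "drift" term. Note that `encode`, `decode`, and the drift penalty are toy stand-ins invented for this sketch; the paper's actual LCA-AF and LCA-VAE are not reproduced here.

    # Sketch of latent-space BO with a toy consistency-weighted acquisition.
    # encode/decode stand in for a pretrained VAE; the drift penalty is a
    # hypothetical illustration of latent consistency, not the paper's LCA-AF.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(0)
    decode = lambda z: np.round(np.tanh(z), 2)   # lossy toy decoder: the round
    encode = lambda x: np.arctanh(np.clip(x, -0.999, 0.999))  # trip drifts more
    objective = lambda x: -np.sum((x - 0.5) ** 2, axis=-1)    # in outer regions

    Z = rng.uniform(-2, 2, size=(5, 2))          # initial latent designs
    y = objective(decode(Z))

    for _ in range(20):
        gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(Z, y)
        cand = rng.uniform(-2, 2, size=(500, 2))
        mu, sd = gp.predict(cand, return_std=True)
        gamma = (mu - y.max()) / np.maximum(sd, 1e-9)
        ei = sd * (gamma * norm.cdf(gamma) + norm.pdf(gamma))  # expected improvement
        drift = np.linalg.norm(cand - encode(decode(cand)), axis=1)
        z_next = cand[np.argmax(ei * np.exp(-drift))]          # consistency-weighted EI
        Z = np.vstack([Z, z_next])
        y = np.append(y, objective(decode(z_next)))

    print("best score found:", y.max())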


Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning

arXiv.org Artificial Intelligence

Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data such as sets, graphs, and sequences. The main idea is to build a prediction model by treating substructures of the structured data, such as subsets, subgraphs, and subsequences (referred to as patterns), as features of the model. The primary challenge in predictive pattern mining lies in the exponential growth of the number of patterns with the complexity of the structured data. In this study, we propose the Safe Pattern Pruning (SPP) method to address this combinatorial explosion of patterns. We also discuss how it can be employed effectively throughout the entire model-building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.
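
To see why a pruning rule helps, the sketch below mines itemset (subset) patterns by depth-first search over the pattern tree: with binary occurrence features, any superpattern's feature vector is elementwise no larger than its parent's, so the positive and negative parts of the residual vector on the parent's support bound every descendant's score, and a whole subtree can be skipped safely. The fixed threshold here is a simplified stand-in for the paper's dual-based safe rule.

    # Depth-first search over itemset patterns with subtree pruning.
    # For a pattern with support S, any superpattern's score |x^T r| is
    # bounded by max(sum of positive residuals on S, -sum of negative ones);
    # if that bound is below the threshold, the whole subtree is skipped.
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.random((100, 8)) < 0.4       # transactions: 100 rows, 8 binary items
    r = rng.standard_normal(100)         # residual vector from the current model
    thresh = 6.0                         # screening threshold (assumed given)

    def spp_dfs(pattern, support, first_item, active):
        rs = r[support]
        bound = max(rs[rs > 0].sum(), -(rs[rs < 0].sum()))
        if bound < thresh:               # safe: no superpattern can pass
            return                       # -> prune the entire subtree
        if pattern and abs(rs.sum()) >= thresh:
            active.append(tuple(pattern))
        for j in range(first_item, X.shape[1]):
            spp_dfs(pattern + [j], support & X[:, j], j + 1, active)

    active = []
    spp_dfs([], np.ones(len(X), dtype=bool), 0, active)
    print(len(active), "patterns survive screening")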


Generalized Low-Rank Update: Model Parameter Bounds for Low-Rank Training Data Modifications

arXiv.org Artificial Intelligence

In this study, we develop an incremental machine learning (ML) method that efficiently obtains the optimal model when a small number of instances or features are added or removed. This problem holds practical importance in model selection tasks such as cross-validation (CV) and feature selection. For the class of ML methods known as linear estimators, there exists an efficient model-update framework called the low-rank update that can effectively handle changes in a small number of rows and columns of the data matrix. However, for ML methods beyond linear estimators, no comprehensive framework has been available for obtaining information about the updated solution within a guaranteed computational complexity. To fill this gap, our study introduces the Generalized Low-Rank Update (GLRU) method, which extends the low-rank update framework of linear estimators to ML methods formulated as a certain class of regularized empirical risk minimization problems, including commonly used methods such as SVM and logistic regression. The GLRU method not only broadens the range of applicability but also provides information about the updated solution with a computational complexity proportional to the amount of change in the dataset. To demonstrate the effectiveness of the GLRU method, we conduct experiments showcasing its efficiency in performing cross-validation and feature selection compared to baseline methods.
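
The linear-estimator case that GLRU generalizes fits in a few lines. The sketch below updates a ridge-regression solution after one instance is added, using the Sherman-Morrison rank-one update of the inverse Gram matrix in O(d^2) instead of an O(d^3) refit; GLRU's extension to SVM and logistic regression, where only bounds on the updated solution are available, is not shown.

    # Classical rank-one (low-rank) update for a linear estimator: adding one
    # instance to ridge regression via the Sherman-Morrison formula.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, lam = 200, 10, 1.0
    X, y = rng.standard_normal((n, d)), rng.standard_normal(n)

    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))  # inverse regularized Gram
    b = X.T @ y
    w = A_inv @ b                                     # current ridge solution

    x_new, y_new = rng.standard_normal(d), 0.3        # one added instance
    Ax = A_inv @ x_new
    A_inv -= np.outer(Ax, Ax) / (1.0 + x_new @ Ax)    # Sherman-Morrison update
    b += y_new * x_new
    w_updated = A_inv @ b                             # updated solution, no refit

    # check against a full O(d^3) refit
    w_full = np.linalg.solve(X.T @ X + np.outer(x_new, x_new) + lam * np.eye(d), b)
    print(np.allclose(w_updated, w_full))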


Human-In-the-Loop for Bayesian Autonomous Materials Phase Mapping

arXiv.org Artificial Intelligence

Autonomous experimentation (AE) combines machine learning and research hardware automation in a closed loop, guiding subsequent experiments toward user goals. As applied to materials research, AE can accelerate materials exploration, reducing time and cost compared to traditional Edisonian studies. Additionally, integrating knowledge from diverse sources, including theory, simulations, literature, and domain experts, can boost AE performance. Domain experts may provide unique knowledge for tasks that are difficult to automate. Here, we present a set of methods for integrating human input into an autonomous materials-exploration campaign for composition-structure phase mapping. The methods are demonstrated on X-ray diffraction data collected from a thin-film ternary combinatorial library. At any point during the campaign, the user can choose to provide input by indicating regions of interest, likely phase regions, and likely phase boundaries based on their prior knowledge (e.g., knowledge of the phase map of a similar material system), along with a quantification of their certainty. The human input is integrated by defining a set of probabilistic priors over the phase map. The algorithm outputs a probability distribution over potential phase maps, given the data, model, and human input. We demonstrate a significant improvement in phase-mapping performance given appropriate human input.
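
At a cartoon level, folding human input into a probabilistic phase map can look like the sketch below: each grid point carries a categorical distribution over candidate phases, a data-driven likelihood is combined with a prior that concentrates mass where the expert indicated a phase region, and the expert's stated certainty controls how strong that prior is. The likelihood model and phase-map representation in the paper are richer than this toy version.

    # Toy Bayesian combination of a data-driven phase likelihood with a
    # human prior over an expert-labeled region of the measurement grid.
    import numpy as np

    rng = np.random.default_rng(0)
    n_points, K = 50, 3                       # grid points, candidate phases
    scores = rng.standard_normal((n_points, K))
    likelihood = np.exp(scores) / np.exp(scores).sum(1, keepdims=True)

    prior = np.full((n_points, K), 1.0 / K)   # uninformative prior everywhere
    expert_region = np.arange(10, 20)         # points the expert labels phase 0
    certainty = 0.8                           # expert's self-reported certainty
    prior[expert_region] = (1 - certainty) / K
    prior[expert_region, 0] += certainty      # concentrate mass on that phase

    posterior = likelihood * prior            # pointwise Bayes update
    posterior /= posterior.sum(1, keepdims=True)
    print(posterior[15])                      # phase distribution at a labeled point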


Adaptive Defective Area Identification in Material Surface Using Active Transfer Learning-based Level Set Estimation

arXiv.org Artificial Intelligence

In material characterization, identifying defective areas on a material surface is fundamental. The conventional approach involves measuring the relevant physical properties point by point at predetermined mesh grid points on the surface and determining the areas where the property does not reach the desired level. To identify defective areas more efficiently, we propose adaptive mapping methods in which measurement resources are preferentially used to detect the boundaries of defective areas. We interpret this problem as an active learning (AL) formulation of the level set estimation (LSE) problem: the goal of AL-based LSE is to determine the level set of the physical property function defined on the surface with as few measurements as possible. Furthermore, to handle situations in which materials with similar specifications are produced repeatedly, we introduce a transfer learning approach so that information from previously produced materials can be utilized effectively. As a proof of concept, we applied the proposed methods to the red-zone estimation problem of silicon wafers and demonstrated that we could identify the defective areas at significantly lower measurement cost than conventional methods.
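
A standard instantiation of AL-based LSE is a GP surrogate with the straddle acquisition, which prefers points whose credible interval straddles the threshold. The sketch below applies it to a one-dimensional toy surface; the threshold h, the toy property function, and the grid are invented for illustration, and the paper's transfer-learning extension is omitted.

    # Active LSE with a GP and the straddle acquisition 1.96*sigma - |mu - h|:
    # it is largest where the posterior credible interval straddles the level h,
    # i.e. near uncertain parts of the defective-area boundary.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(0)
    f = lambda s: np.sin(3 * s) + 0.5 * s    # hidden property over a surface coord
    h = 0.5                                  # desired level: below h is "defective"
    grid = np.linspace(0, 3, 300).reshape(-1, 1)

    idx = list(rng.choice(len(grid), size=3, replace=False))
    for _ in range(15):
        S = grid[idx]
        gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(S, f(S).ravel())
        mu, sd = gp.predict(grid, return_std=True)
        straddle = 1.96 * sd - np.abs(mu - h)
        idx.append(int(np.argmax(straddle)))  # measure where the boundary is least certain

    print(f"{int((mu < h).sum())} of {len(grid)} grid points estimated defective")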


Valid P-Value for Deep Learning-Driven Salient Region

arXiv.org Artificial Intelligence

Various saliency map methods have been proposed to interpret and explain the predictions of deep learning models. Saliency maps allow us to interpret which parts of the input signals strongly influence the prediction results. However, since a saliency map is obtained through complex computations in a deep learning model, it is often difficult to know how reliable the saliency map itself is. In this study, we propose a method to quantify the reliability of a salient region in the form of p-values. Our idea is to regard a salient region as a hypothesis selected by the trained deep learning model and to employ the selective inference framework. The proposed method provably controls the probability of false-positive detections of salient regions. We demonstrate the validity of the proposed method through numerical examples on synthetic and real datasets. Furthermore, we develop a Keras-based framework for conducting the proposed selective inference for a wide class of CNNs without additional implementation cost.
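
Once the selection event is characterized, the selective p-value itself is a truncated-normal tail computation. The sketch below shows only that final step, under the assumption that the truncation interval [a, b] implied by the network's selection of the salient region has already been derived (that derivation is the hard, paper-specific part); the numbers are purely illustrative.

    # Selective (conditional) p-value: under the null, the test statistic
    # follows N(0, sigma^2) truncated to the interval [a, b] implied by the
    # selection event. The interval is assumed given here.
    from scipy.stats import norm

    def selective_p_value(t, sigma, a, b):
        """Two-sided p-value of t under N(0, sigma^2) truncated to [a, b]."""
        Fa, Fb = norm.cdf(a / sigma), norm.cdf(b / sigma)
        cdf = (norm.cdf(t / sigma) - Fa) / (Fb - Fa)  # truncated-normal CDF at t
        return 2 * min(cdf, 1 - cdf)

    # e.g. observed statistic 2.1, unit noise, selection implies t in [1.5, 5.0]
    print(selective_p_value(2.1, 1.0, 1.5, 5.0))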


Bayesian Optimization for Distributionally Robust Chance-constrained Problem

arXiv.org Machine Learning

In black-box optimization, we often need to consider two types of variables: controllable design variables and uncontrollable environmental variables. Under the presence of these two types of variables, the goal is to identify the design variables that optimize the black-box function while taking into account the uncertainty of the environmental variables. In the past few years, Bayesian Optimization (BO) frameworks that take uncertain environmental variables into consideration have been studied in various setups (see §1.1). In this paper, we study one such problem, called the distributionally robust chance-constrained (DRCC) problem. The DRCC problem is an instance of constrained optimization in an uncertain environment, which is important in a variety of practical problems in science and engineering. The goal of a chance-constrained (CC) problem is to identify the design variables that maximize the expectation of the objective function under the constraint that the probability of the constraint function exceeding a given threshold is greater than a certain level. Let f(x, w) and g(x, w) be the unknown objective and constraint functions, respectively, both of which depend on the design variables x ∈ X and the environmental variables w ∈ Ω.
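
As a rough illustration of the feasibility side of DRCC, the sketch below checks a design x by Monte Carlo: under every distribution in a (toy, finite) ambiguity set for w, the probability that g(x, w) clears the threshold must exceed the level alpha. All functions and the ambiguity set are invented for the sketch, and the paper's GP surrogates and acquisition rule for f and g are omitted.

    # Distributionally robust chance constraint by Monte Carlo: a design x is
    # feasible only if, under the worst distribution in the ambiguity set,
    # P(g(x, w) >= thresh) still exceeds alpha. Toy functions throughout.
    import numpy as np

    rng = np.random.default_rng(0)
    f = lambda x, w: -(x - 1.0) ** 2 + 0.1 * w   # toy objective
    g = lambda x, w: x * np.cos(w) + 0.5         # toy constraint function
    alpha, thresh = 0.9, 0.0

    dists = [lambda n: rng.normal(0.0, 0.3, n),  # ambiguity set: candidate
             lambda n: rng.normal(0.2, 0.5, n),  # distributions for w
             lambda n: rng.uniform(-1.0, 1.0, n)]

    def drcc_feasible(x, n_mc=2000):
        worst = min((g(x, d(n_mc)) >= thresh).mean() for d in dists)
        return worst >= alpha

    cands = np.linspace(0.0, 2.0, 41)
    feasible = [x for x in cands if drcc_feasible(x)]
    w_ref = rng.normal(0.0, 0.3, 2000)           # reference draw for E[f] estimate
    best = max(feasible, key=lambda x: f(x, w_ref).mean())
    print("best feasible design:", best)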


Bayesian Optimization for Cascade-type Multi-stage Processes

arXiv.org Machine Learning

Complex processes in science and engineering are often formulated as multi-stage decision-making problems. In this paper, we consider a type of multi-stage decision-making process called a cascade process: a multi-stage process in which the output of one stage is used as the input to the next stage. When each stage is expensive to run, it is difficult to search exhaustively for the optimal controllable parameters of every stage. To address this problem, we formulate the optimization of the cascade process as an extension of the Bayesian optimization framework and propose two types of acquisition functions (AFs), based on credible intervals and on expected improvement. We investigate the theoretical properties of the proposed AFs and demonstrate their effectiveness through numerical experiments. In addition, we consider an extension called the suspension setting, in which we are allowed to suspend the cascade process in the middle of the multi-stage decision-making process; such situations often arise in practical problems. We apply the proposed method to the optimization of a solar cell simulator, which was the motivating application for this study.
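
For a two-stage cascade, the following sketch gives each stage its own GP surrogate and scores a candidate (x1, x2) by propagating the ends of stage 1's credible interval through the stage-2 surrogate, a crude optimistic stand-in for the paper's credible-interval acquisition functions. The stage functions and parameter ranges are toy choices.

    # Two-stage cascade BO sketch: stage-1 output feeds stage 2; each stage
    # gets its own GP, and candidates are scored by pushing the stage-1
    # credible-interval endpoints through the stage-2 surrogate.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(0)
    f1 = lambda x1: np.sin(2 * x1)               # expensive stage-1 process
    f2 = lambda y1, x2: -(y1 - x2) ** 2 + 1.0    # expensive stage-2 process

    X1 = rng.uniform(0, 3, (5, 1)); Y1 = f1(X1).ravel()
    X2 = np.hstack([Y1.reshape(-1, 1), rng.uniform(0, 1, (5, 1))])
    Y2 = f2(X2[:, 0], X2[:, 1])

    for _ in range(15):
        gp1 = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(X1, Y1)
        gp2 = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(X2, Y2)
        c1 = rng.uniform(0, 3, (200, 1)); c2 = rng.uniform(0, 1, 200)
        m1, s1 = gp1.predict(c1, return_std=True)
        # optimistic propagation: try stage-1 outputs at both ends of the CI
        ucb = np.maximum(gp2.predict(np.column_stack([m1 + s1, c2])),
                         gp2.predict(np.column_stack([m1 - s1, c2])))
        k = np.argmax(ucb)
        y1 = f1(c1[k]).item(); y2 = f2(y1, c2[k])   # run the cascade once
        X1 = np.vstack([X1, c1[k]]); Y1 = np.append(Y1, y1)
        X2 = np.vstack([X2, [y1, c2[k]]]); Y2 = np.append(Y2, y2)

    print("best final output:", Y2.max())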


Valid and Exact Statistical Inference for Multi-dimensional Multiple Change-Points by Selective Inference

arXiv.org Machine Learning

In this paper, we study statistical inference for change-points (CPs) in a multi-dimensional sequence. In CP detection from a multi-dimensional sequence, it is often desirable not only to detect the location of a change but also to identify the subset of components in which the change occurs. Several algorithms have been proposed for such problems, but no valid exact inference method has been established to evaluate the statistical reliability of the detected locations and components. In this study, we propose a method that can guarantee the statistical reliability of both the location and the components of the detected changes. We demonstrate the effectiveness of the proposed method by applying it to the problems of genomic abnormality identification and human behavior analysis.
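
The detection step that such inference is built on can be as simple as the scan below: standardize the before/after mean difference of every component at each candidate location, pick the location with the largest aggregate shift, and flag the components exceeding a cutoff. Computing a naive p-value from the same data would be invalid because the location and components were selected by looking at the data; the paper's selective-inference correction, its actual contribution, is not shown.

    # Scan-based detection of a change location and the changing components
    # in a multi-dimensional sequence (unit-variance noise assumed).
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 120, 5
    X = rng.standard_normal((n, d))
    X[70:, :2] += 1.5                          # true change at t=70, components 0,1

    def shift_stat(t):
        # per-component standardized mean difference before/after t
        m1, m2 = X[:t].mean(0), X[t:].mean(0)
        return np.abs(m1 - m2) / np.sqrt(1.0 / t + 1.0 / (n - t))

    stats = np.array([shift_stat(t) for t in range(10, n - 10)])
    t_hat = 10 + int(np.argmax(stats.sum(1)))      # detected change location
    comps = np.where(stats[t_hat - 10] > 2.0)[0]   # detected changing components
    print("location:", t_hat, "components:", comps)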


Exact Statistical Inference for the Wasserstein Distance by Selective Inference

arXiv.org Machine Learning

In this paper, we study statistical inference for the Wasserstein distance, which has attracted much attention and has been applied to various machine learning tasks. Several inference methods have been proposed in the literature, but almost all of them rely on asymptotic approximation and lack finite-sample validity. In this study, we propose an exact (non-asymptotic) inference method for the Wasserstein distance inspired by the concept of conditional Selective Inference (SI). To our knowledge, this is the first method that can provide a valid confidence interval (CI) for the Wasserstein distance with a finite-sample coverage guarantee, and it can be applied not only to one-dimensional problems but also to multi-dimensional problems. We evaluate the performance of the proposed method on both synthetic and real-world datasets.
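
For one-dimensional samples, the Wasserstein distance itself is easy to compute, and a permutation test gives a finite-sample-valid p-value for the global null, though not the confidence interval for the distance that the paper targets. The sketch below contrasts the observed distance with its permutation distribution; the exact SI-based CI requires the conditioning machinery of the paper and is not reproduced.

    # 1-D Wasserstein distance plus a naive permutation p-value for the
    # global null of equal distributions (not a CI for the distance).
    import numpy as np
    from scipy.stats import wasserstein_distance

    rng = np.random.default_rng(0)
    a, b = rng.normal(0, 1, 80), rng.normal(0.4, 1, 80)
    obs = wasserstein_distance(a, b)

    pool = np.concatenate([a, b])
    perm = [wasserstein_distance(*np.split(rng.permutation(pool), 2))
            for _ in range(999)]
    p = (1 + sum(w >= obs for w in perm)) / 1000
    print(f"W1 = {obs:.3f}, permutation p = {p:.3f}")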