Goto

Collaborating Authors

Yamaguchi, Shoichiro


Virtual Human Generative Model: Masked Modeling Approach for Learning Human Characteristics

arXiv.org Artificial Intelligence

Identifying the relationships among healthcare attributes, lifestyle, and personality is vital for understanding and improving physical and mental conditions. Machine learning approaches are promising for modeling these relationships and offering actionable suggestions. In this paper, we propose the Virtual Human Generative Model (VHGM), a machine learning model for estimating attributes related to healthcare, lifestyle, and personality. VHGM is a deep generative model trained with masked modeling to learn the joint distribution of attributes conditioned on known ones. Using heterogeneous tabular datasets, VHGM learns more than 1,800 attributes efficiently. We numerically evaluate the performance of VHGM and its training techniques. As a proof of concept, we present several applications demonstrating user scenarios, such as virtual measurement of healthcare attributes and hypothesis verification about lifestyles.
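The masked-modeling idea can be illustrated with a minimal sketch (hypothetical code, not the authors' VHGM architecture): attribute vectors are randomly masked, the model sees only the observed values plus the mask indicator, and the loss is computed on the masked entries alone.

```python
# Minimal sketch of masked tabular modeling (hypothetical, not VHGM itself).
import torch
import torch.nn as nn

n_attrs = 16            # small stand-in for the ~1,800 real attributes
model = nn.Sequential(  # takes [values ; mask] and predicts all attributes
    nn.Linear(2 * n_attrs, 128), nn.ReLU(), nn.Linear(128, n_attrs)
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(64, n_attrs)                 # synthetic continuous attributes
    mask = (torch.rand_like(x) < 0.5).float()    # 1 = observed, 0 = masked
    inp = torch.cat([x * mask, mask], dim=1)     # hide masked values, expose mask
    pred = model(inp)
    # Reconstruction loss only on the masked (unobserved) entries.
    loss = (((pred - x) ** 2) * (1 - mask)).sum() / (1 - mask).sum().clamp(min=1)
    opt.zero_grad(); loss.backward(); opt.step()
```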


Learning-based Collision-free Planning on Arbitrary Optimization Criteria in the Latent Space through cGANs

arXiv.org Artificial Intelligence

We propose a new method for collision-free planning that uses Conditional Generative Adversarial Networks (cGANs) to map between the robot's joint space and a latent space that captures only the collision-free regions of the joint space, conditioned on an obstacle map. Generating multiple plausible trajectories is convenient in applications such as robot-arm manipulation, since it enables the selection of trajectories that avoid collisions with the robot itself or the surrounding environment. In the proposed method, diverse obstacle-avoiding trajectories can be generated by connecting the start and goal states with arbitrary line segments in this learned latent space. Because our method provides a collision-free latent space, any planner, with any optimization criteria, can then generate the most suitable paths on the fly. We verified this method with a simulated and an actual UR5e 6-DoF robotic arm, and confirmed that different trajectories can be generated depending on the optimization criteria.
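A minimal sketch of the inference-time idea, assuming a toy generator (the architecture and the obstacle-map encoding below are illustrative placeholders, not the paper's network): interpolate linearly between two latent codes and decode each point, conditioned on the obstacle map, into a joint configuration.

```python
# Sketch of decoding a straight latent-space segment into joint-space waypoints.
# The generator below is untrained; its shape is a hypothetical stand-in.
import torch
import torch.nn as nn

latent_dim, obst_dim, dof = 8, 32, 6   # 6-DoF arm such as the UR5e

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + obst_dim, 128), nn.ReLU(),
            nn.Linear(128, dof),
        )
    def forward(self, z, obstacle_map):
        return self.net(torch.cat([z, obstacle_map], dim=-1))

G = Generator()
obstacle = torch.randn(obst_dim)           # encoded obstacle map (placeholder)
z_start, z_goal = torch.randn(latent_dim), torch.randn(latent_dim)

# Any planner could choose the segment; here we take 20 evenly spaced points.
ts = torch.linspace(0.0, 1.0, steps=20).unsqueeze(1)
zs = (1 - ts) * z_start + ts * z_goal              # straight line in latent space
waypoints = G(zs, obstacle.expand(len(ts), -1))    # decoded joint trajectory
print(waypoints.shape)                             # torch.Size([20, 6])
```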


Out-of-Distribution Generalization with Maximal Invariant Predictor

arXiv.org Machine Learning

The Out-of-Distribution (OOD) generalization problem is the problem of seeking a predictor whose performance in the worst environment is optimal. This paper makes two contributions to the OOD problem. We first use basic results from probability theory to prove the Maximal Invariant Predictor (MIP) condition, a theoretical result that can be used to identify the OOD-optimal solution. We then use MIP to derive the Inter-environmental Gradient Alignment (IGA) algorithm, which helps seek the OOD-optimal predictor. Previous studies that investigated the theoretical aspects of the OOD problem rely on strong structural assumptions such as a causal DAG. However, in cases involving image datasets, for example, identifying the hidden structural relations is itself a difficult problem. Our theoretical results differ from those of many previous studies in that they can be applied to cases in which the underlying structure of the dataset is difficult to analyze. We present an extensive comparison of previous theoretical approaches to the OOD problem based on the assumptions they make. We also present an extension of Colored-MNIST that represents the pathological OOD situation more accurately than the original version, and demonstrate the superiority of IGA over previous methods on both the original and the extended version of Colored-MNIST.
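A minimal sketch of the gradient-alignment idea behind IGA: the loss is the mean per-environment risk plus a penalty on how far each environment's risk gradient deviates from the mean gradient. The penalty weight and the toy model below are assumptions, not the paper's exact setup.

```python
# Sketch of Inter-environmental Gradient Alignment: penalize the variance of
# per-environment risk gradients around their mean (toy model, synthetic data).
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
params = list(model.parameters())
lam = 1.0  # penalty weight (an assumption)

def env_risk(xe, ye):
    return ((model(xe) - ye) ** 2).mean()

envs = [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(3)]
risks = [env_risk(x, y) for x, y in envs]
grads = [torch.autograd.grad(r, params, create_graph=True) for r in risks]

# Flatten each environment's gradient into one vector.
flat = [torch.cat([g.reshape(-1) for g in gs]) for gs in grads]
mean_grad = torch.stack(flat).mean(dim=0)
penalty = sum(((f - mean_grad) ** 2).sum() for f in flat)

loss = torch.stack(risks).mean() + lam * penalty
loss.backward()  # the penalty is differentiable thanks to create_graph=True
```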


Data Interpolating Prediction: Alternative Interpretation of Mixup

arXiv.org Machine Learning

Data augmentation by mixing samples, as in Mixup, has been widely used, typically for classification tasks. However, this strategy is not always effective because of the gap between the augmented samples used for training and the original samples used for testing. This gap may prevent a classifier from learning the optimal decision boundary and may increase the generalization error. To overcome this problem, we propose an alternative framework called Data Interpolating Prediction (DIP). Unlike common data augmentations, we encapsulate the sample-mixing process in the hypothesis class of the classifier so that training and test samples are treated equally. We derive a generalization bound and show that DIP helps to reduce the original Rademacher complexity. We also empirically demonstrate that DIP can outperform existing Mixup.
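A minimal sketch of the DIP idea as described, with the mixing encapsulated inside the hypothesis: the classifier's forward pass averages the base model's outputs over randomly mixed copies of the input, and the same procedure runs at both training and test time. The base network, Beta mixing distribution, and number of Monte Carlo draws below are illustrative assumptions.

```python
# Sketch of Data Interpolating Prediction: the mixing step lives inside the
# hypothesis, so the identical averaged predictor is used at train and test.
import torch
import torch.nn as nn

class DIPClassifier(nn.Module):
    def __init__(self, base, reference_x, n_draws=8, alpha=1.0):
        super().__init__()
        self.base = base
        self.register_buffer("ref", reference_x)  # samples to mix with
        self.n_draws, self.alpha = n_draws, alpha

    def forward(self, x):
        outs = []
        for _ in range(self.n_draws):
            lam = torch.distributions.Beta(self.alpha, self.alpha).sample()
            idx = torch.randint(len(self.ref), (len(x),))
            mixed = lam * x + (1 - lam) * self.ref[idx]
            outs.append(self.base(mixed))
        return torch.stack(outs).mean(dim=0)  # Monte Carlo average over mixes

base = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
ref = torch.randn(100, 20)                    # training inputs as mix partners
clf = DIPClassifier(base, ref)
logits = clf(torch.randn(5, 20))              # same procedure at test time
print(logits.shape)                           # torch.Size([5, 3])
```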


Semi-flat minima and saddle points by embedding neural networks to overparameterization

arXiv.org Machine Learning

We theoretically study the landscape of the training error for neural networks in the overparameterized regime. We consider three basic methods for embedding a network into a wider one with more hidden units, and discuss whether a minimum of the narrower network gives a minimum or a saddle point of the wider one. Our results show that networks with smooth activations and networks with ReLU activations have different partially flat landscapes around the embedded point. We also relate these results to a difference in their generalization abilities under overparameterized realization.
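One natural embedding of this kind can be checked numerically: duplicate a hidden unit and split its outgoing weight by (a, 1-a), so every value of a yields the same function and hence a flat line through the embedded point. Whether this matches the paper's exact construction is an assumption; the sizes below are illustrative.

```python
# Sketch of unit duplication with a split outgoing weight: the wider net
# computes the same function for every split coefficient a.
import torch
import torch.nn as nn

torch.manual_seed(0)
narrow = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

def embed(narrow, unit=0, a=0.3):
    wide = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1))
    w1, b1 = narrow[0].weight.data, narrow[0].bias.data
    w2, b2 = narrow[2].weight.data, narrow[2].bias.data
    # Copy all units, then append a clone of `unit` with split output weight.
    wide[0].weight.data = torch.cat([w1, w1[unit:unit + 1]], dim=0)
    wide[0].bias.data = torch.cat([b1, b1[unit:unit + 1]], dim=0)
    out = torch.cat([w2, w2[:, unit:unit + 1]], dim=1)
    out[:, unit] *= a           # original copy gets weight a
    out[:, -1] *= (1 - a)       # duplicate gets weight 1 - a
    wide[2].weight.data = out
    wide[2].bias.data = b2.clone()
    return wide

x = torch.randn(8, 4)
for a in (0.0, 0.3, 0.9):       # every a gives the same function
    assert torch.allclose(narrow(x), embed(narrow, a=a)(x), atol=1e-6)
print("outputs identical along the embedded line")
```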


A Differentiable Gaussian-like Distribution on Hyperbolic Space for Gradient-Based Learning

arXiv.org Machine Learning

Hyperbolic space is a geometry known to be well suited for representation learning of data with an underlying hierarchical structure. In this paper, we present a novel hyperbolic distribution called the pseudo-hyperbolic Gaussian, a Gaussian-like distribution on hyperbolic space whose density can be evaluated analytically and differentiated with respect to its parameters. Our distribution enables gradient-based learning of probabilistic models on hyperbolic space that could never have been considered before. Moreover, we can sample from this hyperbolic probability distribution without resorting to auxiliary means such as rejection sampling. As applications of our distribution, we develop a hyperbolic analogue of the variational autoencoder and a method for probabilistic word embedding on hyperbolic space. We demonstrate the efficacy of our distribution on various datasets, including MNIST, Atari 2600 Breakout, and WordNet.
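A minimal sketch of the wrapped-normal construction as commonly described for this distribution: sample in the tangent space at the origin of the Lorentz model, parallel-transport the vector to mu, then push it onto the manifold with the exponential map. The helper names and the isotropic covariance below are assumptions.

```python
# Sketch of sampling a wrapped (pseudo-hyperbolic) Gaussian on the Lorentz
# model: tangent sample at the origin -> parallel transport -> exponential map.
import torch

def lorentz_inner(x, y):
    # Lorentzian inner product <x, y> = -x0*y0 + sum_i xi*yi
    return -x[..., 0] * y[..., 0] + (x[..., 1:] * y[..., 1:]).sum(-1)

def sample_wrapped_normal(mu, sigma, n_samples):
    n = mu.shape[-1] - 1
    origin = torch.zeros_like(mu); origin[0] = 1.0
    v = sigma * torch.randn(n_samples, n)                   # N(0, sigma^2 I)
    u = torch.cat([torch.zeros(n_samples, 1), v], dim=-1)   # tangent at origin
    # Parallel transport from the origin to mu (note <origin, u> = 0).
    alpha = -lorentz_inner(origin, mu)
    coef = lorentz_inner(mu, u) / (alpha + 1.0)
    u_mu = u + coef.unsqueeze(-1) * (origin + mu)
    # Exponential map at mu.
    r = torch.sqrt(lorentz_inner(u_mu, u_mu).clamp(min=1e-12)).unsqueeze(-1)
    return torch.cosh(r) * mu + torch.sinh(r) * u_mu / r

# A point on the hyperboloid: mu = (sqrt(1 + |p|^2), p).
p = torch.tensor([0.5, -0.2])
mu = torch.cat([torch.sqrt(1 + (p * p).sum()).reshape(1), p])
z = sample_wrapped_normal(mu, sigma=0.3, n_samples=4)
print(lorentz_inner(z, z))    # ~ -1 everywhere: samples stay on the manifold
```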