
Collaborating Authors

 Ohn, Ilsang


Nonparametric estimation of a factorizable density using diffusion models

arXiv.org Machine Learning

In recent years, diffusion models, and more generally score-based deep generative models, have achieved remarkable success in various applications, including image and audio generation. In this paper, we view diffusion models as an implicit approach to nonparametric density estimation and study them within a statistical framework to analyze their surprising performance. A key challenge in high-dimensional statistical inference is leveraging low-dimensional structures inherent in the data to mitigate the curse of dimensionality. We assume that the underlying density exhibits a low-dimensional structure by factorizing into low-dimensional components, a property common in examples such as Bayesian networks and Markov random fields. Under suitable assumptions, we demonstrate that an implicit density estimator constructed from diffusion models adapts to the factorization structure and achieves the minimax optimal rate with respect to the total variation distance. In constructing the estimator, we design a sparse weight-sharing neural network architecture, where sparsity and weight-sharing are key features of practical architectures such as convolutional neural networks and recurrent neural networks.
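The abstract views a trained score network as an implicit density estimator. A minimal sketch of the underlying training idea, denoising score matching with a plain fully connected network, is given below; the paper's sparse weight-sharing architecture and its factorization-adaptive analysis are not reproduced, and all names and hyperparameters are illustrative.

# Minimal denoising score matching sketch (illustrative only; the paper's
# sparse weight-sharing architecture is not reproduced here).
import torch
import torch.nn as nn

d = 2                                    # data dimension
net = nn.Sequential(                     # score network s(x, sigma) ~ grad log p_sigma(x)
    nn.Linear(d + 1, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, d),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
sigmas = torch.tensor([1.0, 0.5, 0.25, 0.1])    # noise levels

x = torch.randn(4096, d) @ torch.tensor([[1.0, 0.8], [0.0, 0.6]])  # toy data

for step in range(2000):
    idx = torch.randint(len(sigmas), (x.shape[0], 1))
    sigma = sigmas[idx]
    noise = torch.randn_like(x)
    x_noisy = x + sigma * noise
    target = -noise / sigma                        # score of the Gaussian perturbation
    inp = torch.cat([x_noisy, sigma], dim=1)
    loss = ((net(inp) - target) ** 2 * sigma ** 2).mean()  # weighted DSM loss
    opt.zero_grad(); loss.backward(); opt.step()

Once trained, such a score network can be plugged into a reverse diffusion sampler, and the distribution of the generated samples plays the role of the implicit density estimator studied in the paper.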


A Bayesian sparse factor model with adaptive posterior concentration

arXiv.org Artificial Intelligence

In this paper, we propose a new Bayesian inference method for a high-dimensional sparse factor model that allows both the factor dimensionality and the sparse structure of the loading matrix to be inferred. The novelty is to introduce a certain dependence between the sparsity level and the factor dimensionality, which leads to adaptive posterior concentration while keeping computational tractability. We show that the posterior distribution asymptotically concentrates on the true factor dimensionality and, more importantly, that this posterior consistency is adaptive to the sparsity level of the true loading matrix and the noise variance. We also prove that the proposed Bayesian model attains the optimal detection rate of the factor dimensionality in a more general setting than those considered in the literature. Moreover, we obtain a near-optimal posterior concentration rate for the covariance matrix. Numerical studies demonstrate the superiority of the proposed method over competing approaches.
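For intuition, the model underlying the abstract can be written as y_i = B f_i + e_i with a sparse loading matrix B and latent factors f_i of unknown dimension. The sketch below only simulates data from such a structure; it is not the paper's prior or posterior computation, and all sizes are hypothetical.

# Hypothetical data-generating sketch for a sparse factor model
# y_i = B f_i + e_i; this is NOT the paper's inference procedure.
import numpy as np

rng = np.random.default_rng(0)
p, k_true, n = 50, 3, 200           # observed dim, true factor dim, sample size

B = rng.normal(size=(p, k_true))
mask = rng.random((p, k_true)) < 0.2    # sparse support of the loading matrix
B = B * mask

F = rng.normal(size=(n, k_true))        # latent factors
E = rng.normal(scale=0.5, size=(n, p))  # noise
Y = F @ B.T + E                         # observed data, covariance B B^T + 0.25 I

# A Bayesian method in the spirit of the abstract would place a sparsity-inducing
# prior on B that ties the sparsity level to the factor dimensionality, and infer
# both the support of B and the factor dimension k from the posterior.
print(Y.shape)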


Masked Bayesian Neural Networks : Theoretical Guarantee and its Posterior Inference

arXiv.org Artificial Intelligence

Bayesian approaches to learning deep neural networks (BNNs) have received much attention and have been successfully applied to various applications. In particular, BNNs offer better generalization as well as better uncertainty quantification. For a BNN to succeed, searching for an appropriate network architecture is an important task, and various algorithms for finding good sparse neural networks have been proposed. In this paper, we propose a new node-sparse BNN model that has good theoretical properties and is computationally feasible. We prove that the posterior concentration rate to the true model is near minimax optimal and adaptive to the smoothness of the true model; this adaptiveness is the first of its kind for node-sparse BNNs. In addition, we develop a novel MCMC algorithm that makes Bayesian inference for the node-sparse BNN model feasible in practice.


Masked Bayesian Neural Networks : Computation and Optimality

arXiv.org Artificial Intelligence

As data size and computing power grow, the architectures of deep neural networks (DNNs) have become increasingly large and complex, and there is a growing need to simplify them. In this paper, we propose a novel sparse Bayesian neural network (BNN) that searches for a DNN of appropriate complexity. We attach a masking variable to each node, which can turn the node off according to the posterior distribution, yielding a nodewise sparse DNN. We devise a prior distribution under which the posterior has theoretical optimalities (i.e., minimax optimality and adaptiveness), and we develop an efficient MCMC algorithm. Analyses of several benchmark datasets illustrate that the proposed BNN compares favorably with existing methods: it discovers well-condensed DNN architectures whose prediction accuracy and uncertainty quantification are similar to those of large DNNs.
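A minimal sketch of the node-masking idea shared by the two masked-BNN abstracts above: a binary mask per hidden node multiplies that node's output, so nodes whose mask is zero are effectively removed. The papers' prior on the masks and the MCMC sampler over them are not reproduced; the mask values below are fixed purely for illustration.

# Minimal sketch of a node-masked layer (illustrative; the papers' prior on the
# masks and the MCMC sampler over them are not reproduced here).
import torch
import torch.nn as nn

class MaskedLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        # One binary mask per output node; in the Bayesian model these would be
        # latent variables sampled from their posterior, not fixed buffers.
        self.register_buffer("mask", torch.ones(d_out))

    def forward(self, x):
        return self.mask * torch.relu(self.linear(x))

layer = MaskedLayer(10, 32)
layer.mask[16:] = 0.0            # "turn off" half of the nodes -> nodewise sparse layer
out = layer(torch.randn(4, 10))
print(out[:, 16:].abs().max())   # exactly zero for the masked nodes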


Intrinsic and extrinsic deep learning on manifolds

arXiv.org Artificial Intelligence

We propose extrinsic and intrinsic deep neural network architectures as general frameworks for deep learning on manifolds. Specifically, extrinsic deep neural networks (eDNNs) preserve geometric features on manifolds by utilizing an equivariant embedding from the manifold to its image in Euclidean space. Intrinsic deep neural networks (iDNNs), in contrast, incorporate the underlying intrinsic geometry of the manifold via exponential and logarithm maps with respect to a Riemannian structure. We prove that the empirical risk of the empirical risk minimizers (ERM) for both eDNNs and iDNNs converges at optimal rates. Overall, the eDNN framework is simple and easy to compute, while the iDNN framework is accurate and fast converging. To demonstrate the utility of our frameworks, various simulation studies and real data analyses with eDNNs and iDNNs are presented.
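A minimal sketch of the extrinsic idea for data on the unit sphere S^2: the natural inclusion of the sphere into R^3 serves as an (equivariant) embedding, and an ordinary DNN is applied to the embedded coordinates. An iDNN would instead map inputs to a tangent space via the Riemannian log map before the network; neither sketch reproduces the paper's architectures, and the network sizes are placeholders.

# Minimal eDNN sketch for the unit sphere S^2 embedded in R^3 (illustrative).
import torch
import torch.nn as nn

def sphere_embed(x):
    # The inclusion S^2 -> R^3 is already an equivariant embedding, so
    # "embedding" here just normalizes points onto the unit sphere.
    return x / x.norm(dim=-1, keepdim=True)

eDNN = nn.Sequential(            # ordinary MLP applied to embedded coordinates
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

x = sphere_embed(torch.randn(128, 3))    # points on the sphere
y_hat = eDNN(x)                          # regression on manifold-valued inputs
print(y_hat.shape)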


Learning fair representation with a parametric integral probability metric

arXiv.org Machine Learning

Because they have a vital effect on social decision-making, AI algorithms should be not only accurate but also fair. Among various approaches to fair AI, learning a fair representation (LFR), whose goal is to find a representation that is fair with respect to sensitive variables such as gender and race, has received much attention. LFR is popularly implemented via adversarial training, as in generative adversarial network type algorithms; the choice of discriminator, however, is typically made heuristically and without justification. In this paper, we propose a new adversarial training scheme for LFR that uses the integral probability metric (IPM) with a specific parametric family of discriminators. The most notable result for the proposed LFR algorithm is its theoretical guarantee about the fairness of the final prediction model, which has not been established previously. That is, we derive theoretical relations between the fairness of the representation and the fairness of the prediction model built on top of the representation (i.e., using the representation as the input). Moreover, numerical experiments show that our proposed LFR algorithm is computationally lighter and more stable, and that the final prediction model is competitive with or superior to other LFR algorithms that use more complex discriminators.
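A minimal sketch of adversarial fair-representation learning with an IPM whose discriminators come from a small parametric family. Here the family is taken to be bounded linear functions, which is an illustrative choice and not necessarily the parametric family used in the paper; the encoder, data, and the omitted task loss are likewise placeholders.

# Minimal sketch of adversarial LFR with an IPM restricted to a small parametric
# discriminator family (here: bounded linear functions, an illustrative choice).
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 8))   # representation
disc = nn.Linear(8, 1, bias=False)                                    # parametric discriminator
opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-3)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)

x = torch.randn(256, 10)
s = (torch.rand(256) < 0.5)             # binary sensitive attribute

for step in range(200):
    z0, z1 = enc(x[~s]), enc(x[s])
    # IPM between the representation distributions of the two sensitive groups,
    # estimated by maximizing the mean difference over the discriminator family.
    ipm = disc(z0).mean() - disc(z1).mean()
    opt_disc.zero_grad(); (-ipm).backward(); opt_disc.step()
    with torch.no_grad():
        disc.weight.clamp_(-1.0, 1.0)   # keep the parametric family bounded
    opt_enc.zero_grad()
    gap = (disc(enc(x[~s])).mean() - disc(enc(x[s])).mean()).abs()
    # A prediction (task) loss on the labels would be added here in a full algorithm.
    gap.backward(); opt_enc.step()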


SLIDE: a surrogate fairness constraint to ensure fairness consistency

arXiv.org Machine Learning

Because they have a vital effect on social decision-making, AI algorithms should be not only accurate but also fair. Among various approaches to fair AI, learning a prediction model by minimizing an empirical risk (e.g., cross-entropy) subject to a given fairness constraint has received much attention. To avoid computational difficulty, however, the fairness constraint is replaced by a surrogate fairness constraint, just as the 0-1 loss is replaced by a convex surrogate loss in classification. In this paper, we investigate the validity of existing surrogate fairness constraints and propose a new surrogate fairness constraint, called SLIDE, which is computationally feasible and asymptotically valid in the sense that the learned model satisfies the fairness constraint asymptotically and achieves a fast convergence rate. Numerical experiments confirm that SLIDE works well on various benchmark datasets.
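A minimal sketch of empirical risk minimization with a surrogate fairness term: the non-smooth demographic parity gap (a difference of indicator means) is replaced by a smooth relaxation and handled as a penalty. The sigmoid-based relaxation below is an illustrative stand-in, not the actual SLIDE surrogate, and all data and hyperparameters are placeholders.

# Minimal sketch of ERM with a surrogate fairness constraint handled as a penalty.
# The sigmoid relaxation is an illustrative stand-in, not the SLIDE surrogate.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(512, 10)
y = (x[:, 0] + 0.3 * torch.randn(512) > 0).float()
s = (torch.rand(512) < 0.5)             # binary sensitive attribute
lam, tau = 5.0, 0.05                    # penalty weight and allowed slack

for step in range(300):
    logits = model(x).squeeze(-1)
    risk = bce(logits, y)               # empirical risk (cross-entropy)
    # Demographic parity gap with indicators replaced by a smooth surrogate.
    rate0 = torch.sigmoid(logits[~s]).mean()
    rate1 = torch.sigmoid(logits[s]).mean()
    violation = torch.relu((rate0 - rate1).abs() - tau)
    loss = risk + lam * violation
    opt.zero_grad(); loss.backward(); opt.step()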


Adaptive variational Bayes: Optimality, computation and applications

arXiv.org Machine Learning

In this paper, we explore adaptive inference based on variational Bayes. Although a number of studies have been conducted to analyze contraction properties of variational posteriors, there is still a lack of a general and computationally tractable variational Bayes method that can achieve adaptive optimal contraction of the variational posterior. We propose a novel variational Bayes framework, called adaptive variational Bayes, which can operate on a collection of models with varying dimensions and structures. The proposed framework combines variational posteriors over individual models with certain weights to obtain a variational posterior over the entire model. It turns out that this combined variational posterior minimizes the Kullback-Leibler divergence to the original posterior distribution. We show that the proposed variational posterior achieves optimal contraction rates adaptively under very general conditions and attains model selection consistency when the true model structure exists. We apply the general results obtained for the adaptive variational Bayes to several examples including deep learning models and derive some new and adaptive inference results. Moreover, we consider the use of quasi-likelihood in our framework. We formulate conditions on the quasi-likelihood to ensure the adaptive optimality and discuss specific applications to stochastic block models and nonparametric regression with sub-Gaussian errors.
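As a rough illustration of the combination step described above: each candidate model contributes its ELBO, and mixture weights proportional to the prior model probability times the exponentiated ELBO yield the KL-minimizing mixture over models within a mixture variational family. This is the standard expression for that optimization and is offered only as a sketch of the idea; the paper's exact construction and weights may differ, and the numbers below are hypothetical.

# Illustrative sketch of combining per-model variational posteriors: mixture
# weights proportional to (prior model probability) x exp(ELBO). The paper's
# exact construction may differ.
import numpy as np

def combine(elbos, log_prior_model_probs):
    """Return mixture weights over candidate models from their ELBOs."""
    log_w = np.asarray(elbos) + np.asarray(log_prior_model_probs)
    log_w -= log_w.max()                  # numerical stability
    w = np.exp(log_w)
    return w / w.sum()

# Example: three candidate models (e.g., networks of increasing width) with
# hypothetical ELBO values and a uniform prior over models.
elbos = [-1052.3, -1047.8, -1049.1]
weights = combine(elbos, np.log(np.ones(3) / 3))
print(weights)    # the model with the largest ELBO dominates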


Smooth function approximation by deep neural networks with general activation functions

arXiv.org Machine Learning

There has been growing interest in the expressivity of deep neural networks, but most existing work on this topic focuses on specific activation functions such as ReLU or the sigmoid. In this paper, we investigate the approximation ability of deep neural networks with a quite general class of activation functions, which includes most of the frequently used ones. For this large class, we derive the depth, width and sparsity of a deep neural network required to approximate any Hölder smooth function up to a given approximation error. Based on our approximation error analysis, we derive the minimax optimality of deep neural network estimators with such general activation functions in both regression and classification problems.
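As a toy illustration of the theme (networks with non-ReLU activations approximating a smooth function), the sketch below fits a small network with a softplus activation to a Hölder smooth target. It is not connected to the paper's explicit constructions or its depth/width/sparsity bounds; the architecture and target are arbitrary.

# Toy illustration: a small network with a non-ReLU activation (softplus)
# fitted to a smooth target function. Unrelated to the paper's constructions.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(1, 32), nn.Softplus(),
    nn.Linear(32, 32), nn.Softplus(),
    nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 512).unsqueeze(1)
y = torch.sin(3 * x) + 0.5 * x ** 2          # a smooth (hence Hölder) target

for step in range(1000):
    loss = ((net(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print(float(((net(x) - y) ** 2).mean()))     # small L2 approximation error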


Fast convergence rates of deep neural networks for classification

arXiv.org Machine Learning

We derive fast convergence rates for a deep neural network (DNN) classifier with the rectified linear unit (ReLU) activation function learned using the hinge loss. We consider three cases for the true model: (1) a smooth decision boundary, (2) a smooth conditional class probability, and (3) the margin condition (i.e., the probability of inputs falling near the decision boundary is small). We show that the DNN classifier learned with the hinge loss achieves fast convergence rates in all three cases provided that the architecture (i.e., the number of layers, the number of nodes and the sparsity) is carefully selected. An important implication is that DNN architectures are very flexible and can be used in various settings without much modification. In addition, we consider a DNN classifier learned by minimizing the cross-entropy and show that it achieves a fast convergence rate when the conditional class probabilities of most data are sufficiently close to either zero or one. This assumption is not unusual in image recognition because humans are extremely good at recognizing most images. To confirm our theoretical findings, we present the results of a small numerical study comparing the hinge loss and the cross-entropy.
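A minimal sketch of the two training objectives compared in the abstract, applied to the same ReLU network with labels coded as -1/+1 for the hinge loss and 0/1 for the cross-entropy. The data, architecture, and training settings are placeholders, not the paper's numerical study.

# Minimal sketch of training the same ReLU network with the hinge loss versus
# the cross-entropy; data and architecture are placeholders.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 1))

x = torch.randn(1024, 2)
y01 = (x[:, 0] ** 2 + x[:, 1] ** 2 > 1.0).float()    # labels in {0, 1}
ypm = 2 * y01 - 1                                    # labels in {-1, +1}

net_hinge, net_ce = make_net(), make_net()
opt_h = torch.optim.Adam(net_hinge.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(net_ce.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(500):
    f = net_hinge(x).squeeze(-1)
    hinge = torch.clamp(1 - ypm * f, min=0).mean()   # hinge loss
    opt_h.zero_grad(); hinge.backward(); opt_h.step()

    g = net_ce(x).squeeze(-1)
    ce = bce(g, y01)                                 # cross-entropy loss
    opt_c.zero_grad(); ce.backward(); opt_c.step()

err_h = ((net_hinge(x).squeeze(-1) > 0).float() != y01).float().mean()
err_c = ((net_ce(x).squeeze(-1) > 0).float() != y01).float().mean()
print(float(err_h), float(err_c))                    # training error of each classifier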