Goto

Collaborating Authors

 Regression


A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation

arXiv.org Machine Learning

The idea behind the \emph{unsupervised} learning of \emph{disentangled} representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train over $14000$ models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on eight data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, different evaluation metrics do not always agree on what should be considered "disentangled" and exhibit systematic differences in the estimation. Finally, increased disentanglement does not seem to necessarily lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.


Sparse Symmetric Tensor Regression for Functional Connectivity Analysis

arXiv.org Machine Learning

Tensor regression models, such as CP regression and Tucker regression, have many successful applications in neuroimaging analysis where the covariates are of ultrahigh dimensionality and possess complex spatial structures. The high-dimensional covariate arrays, also known as tensors, can be approximated by low-rank structures and fit into the generalized linear models. The resulting tensor regression achieves a significant reduction in dimensionality while remaining efficient in estimation and prediction. Brain functional connectivity is an essential measure of brain activity and has shown significant association with neurological disorders such as Alzheimer's disease. The symmetry nature of functional connectivity is a property that has not been explored in previous tensor regression models. In this work, we propose a sparse symmetric tensor regression that further reduces the number of free parameters and achieves superior performance over symmetrized and ordinary CP regression, under a variety of simulation settings. We apply the proposed method to a study of Alzheimer's disease (AD) and normal ageing from the Berkeley Aging Cohort Study (BACS) and detect two regions of interest that have been identified important to AD.


Information theoretic limits of learning a sparse rule

arXiv.org Machine Learning

We consider generalized linear models in regimes where the number of nonzero components of the signal and accessible data points are sublinear with respect to the size of the signal. We prove a variational formula for the asymptotic mutual information per sample when the system size grows to infinity. This result allows us to derive an expression for the minimum mean-square error (MMSE) of the Bayesian estimator when the signal entries have a discrete distribution with finite support. We find that, for such signals and suitable vanishing scalings of the sparsity and sampling rate, the MMSE is nonincreasing piecewise constant. In specific instances the MMSE even displays an all-or-nothing phase transition, that is, the MMSE sharply jumps from its maximum value to zero at a critical sampling rate. The all-or-nothing phenomenon has previously been shown to occur in high-dimensional linear regression. Our analysis goes beyond the linear case and applies to learning the weights of a perceptron with general activation function in a teacher-student scenario. In particular, we discuss an all-or-nothing phenomenon for the generalization error with a sublinear set of training examples.


Using a Binary Classification Model to Predict the Likelihood of Enrolment to the Undergraduate Program of a Philippine University

arXiv.org Artificial Intelligence

With the recent implementation of the K to 12 Program, academic institutions, specifically, Colleges and Universities in the Philippines have been faced with difficulties in determining projected freshmen enrollees vis-a-vis decision-making factors for efficient resource management. Enrollment targets directly impacts success factors of Higher Education Institutions. This study covered an analysis of various characteristics of freshmen applicants affecting their admission status in a Philippine university. A predictive model was developed using Logistic Regression to evaluate the probability that an admitted student will pursue to enroll in the Institution or not. The dataset used was acquired from the University Admissions Office. The office designed an online application form to capture applicants' details. The online form was distributed to all student applicants, and most often, students, tend to provide incomplete information. Despite this fact, student characteristics, as well as geographic and demographic data based on the students' location are significant predictors of enrollment decision. The results of the study show that given limited information about prospective students, Higher Education Institutions can implement machine learning techniques to supplement management decisions and provide estimates of class sizes, in this way, it will allow the institution to optimize the allocation of resources and will have better control over net tuition revenue.


Expectile Neural Networks for Genetic Data Analysis of Complex Diseases

arXiv.org Machine Learning

The genetic etiologies of common diseases are highly complex and heterogeneous. Classic statistical methods, such as linear regression, have successfully identified numerous genetic variants associated with complex diseases. Nonetheless, for most complex diseases, the identified variants only account for a small proportion of heritability. Challenges remain to discover additional variants contributing to complex diseases. Expectile regression is a generalization of linear regression and provides completed information on the conditional distribution of a phenotype of interest. While expectile regression has many nice properties and holds great promise for genetic data analyses (e.g., investigating genetic variants predisposing to a high-risk population), it has been rarely used in genetic research. In this paper, we develop an expectile neural network (ENN) method for genetic data analyses of complex diseases. Similar to expectile regression, ENN provides a comprehensive view of relationships between genetic variants and disease phenotypes and can be used to discover genetic variants predisposing to sub-populations (e.g., high-risk groups). We further integrate the idea of neural networks into ENN, making it capable of capturing non-linear and non-additive genetic effects (e.g., gene-gene interactions). Through simulations, we showed that the proposed method outperformed an existing expectile regression when there exist complex relationships between genetic variants and disease phenotypes. We also applied the proposed method to the genetic data from the Study of Addiction: Genetics and Environment(SAGE), investigating the relationships of candidate genes with smoking quantity.


Ordinal Neural Network Transformation Models: Deep and interpretable regression models for ordinal outcomes

arXiv.org Machine Learning

Outcomes with a natural order commonly occur in prediction tasks and oftentimes the available input data are a mixture of complex data, like images, and tabular predictors. Although deep Learning (DL) methods have shown outstanding performance on image classification, most models treat ordered outcomes as unordered and lack interpretability. In contrast, classical ordinal regression models yield interpretable predictor effects but are limited to tabular input data. Here, we present the highly modular class of ordinal neural network transformation models (ONTRAMs). Transformation models use a parametric transformation function and a simple distribution to trade off flexibility and interpretability of individual model components. In ONTRAMs, this trade-off is achieved by additively decomposing the transformation function into terms for the tabular and image data using a set of jointly trained neural networks. We show that the most flexible ONTRAMs achieve on-par performance with DL classifiers while outperforming them in training speed. We discuss how to interpret components of ONTRAMs in general and in the case of correlated tabular and image data. Taken together, ONTRAMs join benefits of DL and distributional regression to create interpretable prediction models for ordinal outcomes.


Estimating heterogeneous survival treatment effect in observational data using machine learning

arXiv.org Machine Learning

Methods for estimating heterogeneous treatment effect in observational data have largely focused on continuous or binary outcomes, and have been relatively less vetted with survival outcomes. Using flexible machine learning methods in the counterfactual framework is a promising approach to address challenges due to complex individual characteristics, to which treatments need to be tailored. To evaluate the operating characteristics of recent survival machine learning methods for the estimation of TEH and inform better practice, we carry out a comprehensive simulation study representing a variety of confounded heterogeneous survival treatment effect settings and varying degrees of covariate overlap. Our results indicate that the nonparametric Bayesian Additive Regression Trees within the framework of accelerated failure time model (AFT-BART-NP) consistently carries the best performance, both in terms of bias and precision. Moreover, the credible interval estimators from AFT-BART-NP provide close to nominal frequentist coverage for the individual survival treatment effect when the covariate overlap is at least moderate. Under lack of overlap, where accurate estimation of the average treatment effect becomes challenging, the credible intervals from AFT-BART-NP still provide nominal frequentist coverage among units near the centroid of the propensity score distribution. Finally, we demonstrate the application of these machine learning methods through a comprehensive case study examining the heterogeneous survival effects of two radiotherapy approaches for localized high-risk prostate cancer.


How to code Logistic Regression from scratch with NumPy

#artificialintelligence

Let's first think of the underlying math that we want to use. In the above equations, X is the input matrix that contains observations on the row axis and features on the column axis; y is a column vector that contains the classification labels (0 or 1); f is the sum of squared errors loss function; h is the loss function for the MLE method. So, this is our goal: translate the above equations into code. We plan to use an object-oriented approach for implementation. We'll create a LogisticRegression class with 3 public methods: fit(), predict(), and accuracy().


Linear Regression and Logistic Regression using R Studio

#artificialintelligence

In this section we will learn - What does Machine Learning mean. What are the meanings or different terms associated with machine learning? You will see some examples so that you understand what machine learning actually is. It also contains steps involved in building a machine learning model, not just linear models, any machine learning model.


Latent Network Estimation and Variable Selection for Compositional Data via Variational EM

arXiv.org Machine Learning

Network estimation and variable selection have been extensively studied in the statistical literature, but only recently have those two challenges been addressed simultaneously. In this paper, we seek to develop a novel method to simultaneously estimate network interactions and associations to relevant covariates for count data, and specifically for compositional data, which have a fixed sum constraint. We use a hierarchical Bayesian model with latent layers and employ spike-and-slab priors for both edge and covariate selection. For posterior inference, we develop a variational inference scheme with an expectation maximization step, to enable efficient estimation. Through simulation studies, we demonstrate that the proposed model outperforms existing methods in its accuracy of network recovery. We show the practical utility of our model via an application to microbiome data. The human microbiome has been shown to contribute to many of the functions of the human body, and also to be linked with a number of diseases. In our application, we seek to better understand the interaction between microbes and relevant covariates, as well as the interaction of microbes with each other. We provide a Python implementation of our algorithm, called SINC (Simultaneous Inference for Networks and Covariates), available online.