AITopics

2006.10092

Country:

North America > United States > Virginia > Fairfax County (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
North America > United States > Florida > Volusia County > Daytona Beach (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry: Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

Bhattacharyya, Arnab, Desai, Rathin, Nagarajan, Sai Ganesh, Panageas, Ioannis

Efficient Statistics for Sparse Graphical Models from Truncated Samples

arXiv.org Machine LearningJun-17-2020

In this paper, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models. (i) For Gaussian graphical models, suppose $d$-dimensional samples ${\bf x}$ are generated from a Gaussian $N(\mu,\Sigma)$ and observed only if they belong to a subset $S \subseteq \mathbb{R}^d$. We show that ${\mu}$ and ${\Sigma}$ can be estimated with error $\epsilon$ in the Frobenius norm, using $\tilde{O}\left(\frac{\textrm{nz}({\Sigma}^{-1})}{\epsilon^2}\right)$ samples from a truncated $\mathcal{N}({\mu},{\Sigma})$ and having access to a membership oracle for $S$. The set $S$ is assumed to have non-trivial measure under the unknown distribution but is otherwise arbitrary. (ii) For sparse linear regression, suppose samples $({\bf x},y)$ are generated where $y = {\bf x}^\top{{\Omega}^*} + \mathcal{N}(0,1)$ and $({\bf x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}$. We consider the case that ${\Omega}^*$ is sparse with a support set of size $k$. Our main result is to establish precise conditions on the problem dimension $d$, the support size $k$, the number of observations $n$, and properties of the samples and the truncation that are sufficient to recover the support of ${\Omega}^*$. Specifically, we show that under some mild assumptions, only $O(k^2 \log d)$ samples are needed to estimate ${\Omega}^*$ in the $\ell_\infty$-norm up to a bounded error. For both problems, our estimator minimizes the sum of the finite population negative log-likelihood function and an $\ell_1$-regularization term.

artificial intelligence, machine learning, vec, (18 more...)

2006.09735

Country:

Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Plassier, Vincent, Portier, François, Segers, Johan

Risk bounds when learning infinitely many response functions by ordinary linear regression

Consider the problem of learning a large number of response functions simultaneously based on the same input variables. The training data consist of a single independent random sample of the input variables drawn from a common distribution together with the associated responses. The input variables are mapped into a high-dimensional linear space, called the feature space, and the response functions are modelled as linear functionals of the mapped features, with coefficients calibrated via ordinary least squares. We provide convergence guarantees on the worst-case excess prediction risk by controlling the convergence rate of the excess risk uniformly in the response function. The dimension of the feature map is allowed to tend to infinity with the sample size. The collection of response functions, although potentiallyinfinite, is supposed to have a finite Vapnik-Chervonenkis dimension. The bound derived can be applied when building multiple surrogate models in a reasonable computing time.

artificial intelligence, excess prediction risk, machine learning, (18 more...)

2006.09223

Country:

North America > United States > New York (0.04)
Europe > Belgium > Wallonia > Walloon Brabant > Louvain-la-Neuve (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.64)

Pain Intensity Estimation from Mobile Video Using 2D and 3D Facial Keypoints

Lee, Matthew, Kennedy, Lyndon, Girgensohn, Andreas, Wilcox, Lynn, Lee, John Song En, Tan, Chin Wen, Sng, Ban Leong

For the more than 300 million surgeries performed worldwide every year, managing post-surgical pain is critical for successful surgical outcomes. Pain is the most prominent post-surgical concern, with an estimated 86% of surgical patients in the United States experiencing pain after surgery, with 75% of these patients reporting at least moderate to extreme pain [1]. Higher postoperative pain is associated with more postoperative complications [2], indicating the importance of pain management. Furthermore, the use of opioid analgesics is a powerful tool for managing pain but can pose risks of adverse drug events (experienced by 10% of surgical patients), leading to prolonged length of stay, high hospitalization costs, and potentially addiction [3]. Thus, regular and careful pain assessment is important for balancing between pain relief and potential side effects of powerful opioid analgesics [4]. However, one of the challenges of pain management is accurately assessing the pain level of patients. Pain is inherently subjective, and the personal experience of pain is difficult to observe and measure objectively by those not experiencing it [5] (e.g., care providers). The standard practice used in clinical care requires patients to self-report their pain intensity level using a numeric or visual scale, such as the popular Numerical Pain Rating Scale [6].

artificial intelligence, machine learning, sequence, (9 more...)

2006.12246

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Virginia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Surgery (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science (0.93)
(4 more...)

Lederer, Armin, Conejo, Alejandro Jose Ordonez, Maier, Korbinian, Xiao, Wenxin, Hirche, Sandra

Real-Time Regression with Dividing Local Gaussian Processes

The increased demand for online prediction and the growing availability of large data sets drives the need for computationally efficient models. While exact Gaussian process regression shows various favorable theoretical properties (uncertainty estimate, unlimited expressive power), the poor scaling with respect to the training set size prohibits its application in big data regimes in real-time. Therefore, this paper proposes dividing local Gaussian processes, which are a novel, computationally efficient modeling approach based on Gaussian process regression. Due to an iterative, data-driven division of the input space, they achieve a sublinear computational complexity in the total number of training points in practice, while providing excellent predictive distributions. A numerical evaluation on real-world data sets shows their advantages over other state-of-the-art methods in terms of accuracy as well as prediction and update speed.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

2006.09446

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Costa Rica > Cartago Province > Cartago (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.48)

Industry:

Information Technology (0.46)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Coleman, Benjamin, Shrivastava, Anshumali

A One-Pass Private Sketch for Most Machine Learning Tasks

Differential privacy (DP) is a compelling privacy definition that explains the privacy-utility tradeoff via formal, provable guarantees. Inspired by recent progress toward general-purpose data release algorithms, we propose a private sketch, or small summary of the dataset, that supports a multitude of machine learning tasks including regression, classification, density estimation, near-neighbor search, and more. Our sketch consists of randomized contingency tables that are indexed with locality-sensitive hashing and constructed with an efficient one-pass algorithm. We prove competitive error bounds for DP kernel density estimation. Existing methods for DP kernel density estimation scale poorly, often exponentially slower with an increase in dimensions. In contrast, our sketch can quickly run on large, high-dimensional datasets in a single pass. Exhaustive experiments show that our generic sketch delivers a similar privacy-utility tradeoff when compared to existing DP methods at a fraction of the computation cost. We expect that our sketch will enable differential privacy in distributed, large-scale machine learning settings.

artificial intelligence, machine learning, privacy, (16 more...)

2006.09352

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Vansteelandt, Stijn, Dukes, Oliver

Assumption-lean inference for generalised linear model parameters

arXiv.org Machine LearningJun-15-2020

Inference for the parameters indexing generalised linear models is routinely based on the assumption that the model is correct and a priori specified. This is unsatisfactory because the chosen model is usually the result of a data-adaptive model selection process, which may induce excess uncertainty that is not usually acknowledged. Moreover, the assumptions encoded in the chosen model rarely represent some a priori known, ground truth, making standard inferences prone to bias, but also failing to give a pure reflection of the information that is contained in the data. Inspired by developments on assumption-free inference for so-called projection parameters, we here propose novel nonparametric definitions of main effect estimands and effect modification estimands. These reduce to standard main effect and effect modification parameters in generalised linear models when these models are correctly specified, but have the advantage that they continue to capture respectively the primary (conditional) association between two variables, or the degree to which two variables interact (in a statistical sense) in their effect on outcome, even when these models are misspecified. We achieve an assumption-lean inference for these estimands (and thus for the underlying regression parameters) by deriving their influence curve under the nonparametric model and invoking flexible data-adaptive (e.g., machine learning) procedures.

artificial intelligence, estimand, machine learning, (19 more...)

2006.08402

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

arXiv.org Machine LearningJun-15-2020

Cryptotree: fast and accurate predictions on encrypted structured data

Huynh, Daniel

Applying machine learning algorithms to private data, such as financial or medical data, while preserving their confidentiality, is a difficult task. Homomorphic Encryption (HE) is acknowledged for its ability to allow computation on encrypted data, where both the input and output are encrypted, which therefore enables secure inference on private data. Nonetheless, because of the constraints of HE, such as its inability to evaluate non-polynomial functions or to perform arbitrary matrix multiplication efficiently, only inference of linear models seem usable in practice in the HE paradigm so far. In this paper, we propose Cryptotree, a framework that enables the use of Random Forests (RF), a very powerful learning procedure compared to linear regression, in the context of HE. To this aim, we first convert a regular RF to a Neural RF, then adapt this to fit the HE scheme CKKS, which allows HE operations on real values. Through SIMD operations, we are able to have quick inference and prediction results better than the original RF on encrypted data.

artificial intelligence, machine learning, multiplication, (19 more...)

2006.08299

Country: Europe > France (0.04)

Genre: Research Report (0.51)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

#artificialintelligenceJun-14-2020, 22:33:11 GMT

Polynomial Regression -- explained

This article requires the knowledge of Linear Regression. If you haven't heard of it, then please check out an article on Linear Regression before you proceed here. Till now we assumed that the relationship between independent variable X and dependent Y can be represented with a straight line. But what if when we can't represent the relationship in a straight line because the data might not be linearly separable? In such kind of scenario we can look for polynomial regression.

artificial intelligence, machine learning, regression, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.77)

Meyer, Raphael A., Musco, Christopher

The Statistical Cost of Robust Kernel Hyperparameter Tuning

arXiv.org Machine LearningJun-14-2020

This paper studies the statistical complexity of kernel hyperparameter tuning in the setting of active regression under adversarial noise. We consider the problem of finding the best interpolant from a class of kernels with unknown hyperparameters, assuming only that the noise is square-integrable. We provide finite-sample guarantees for the problem, characterizing how increasing the complexity of the kernel class increases the complexity of learning kernel hyperparameters. For common kernel classes (e.g. squared-exponential kernels with unknown lengthscale), our results show that hyperparameter optimization increases sample complexity by just a logarithmic factor, in comparison to the setting where optimal parameters are known in advance. Our result is based on a subsampling guarantee for linear regression under multiple design matrices, combined with an {\epsilon}-net argument for discretizing kernel parameterizations.

hyperparameter, kernel, sample complexity, (13 more...)

2006.08035

Country:

North America > Canada > Quebec > Montreal (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > Richmond County > New York City (0.04)
(10 more...)

Genre: Research Report > New Finding (0.74)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)