AITopics

arXiv.org Artificial IntelligenceJun-18-2020

AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity

Udrescu, Silviu-Marian, Tan, Andrew, Feng, Jiahai, Neto, Orisvaldo, Wu, Tailin, Tegmark, Max

We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal, in the sense of having the best accuracy for a given complexity. It improves on the previous state-of-the-art by typically being orders of magnitude more robust toward noise and bad data, and also by discovering many formulas that stumped previous methods. We develop a method for discovering generalized symmetries (arbitrary modularity in the computational graph of a formula) from gradient properties of a neural network fit. We use normalizing flows to generalize our symbolic regression method to probability distributions from which we only have samples, and employ statistical hypothesis testing to accelerate robust brute-force search.

artificial intelligence, machine learning, regression, (16 more...)

arXiv.org Artificial Intelligence

2006.10782

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Davoodnia, Vandad, Slinowsky, Monet, Etemad, Ali

Deep Multitask Learning for Pervasive BMI Estimation and Identity Recognition in Smart Beds

arXiv.org Artificial IntelligenceJun-18-2020

Smart devices in the Internet of Things (IoT) paradigm provide a variety of unobtrusive and pervasive means for continuous monitoring of bio-metrics and health information. Furthermore, automated personalization and authentication through such smart systems can enable better user experience and security. In this paper, simultaneous estimation and monitoring of body mass index (BMI) and user identity recognition through a unified machine learning framework using smart beds is explored. To this end, we utilize pressure data collected from textile-based sensor arrays integrated onto a mattress to estimate the BMI values of subjects and classify their identities in different positions by using a deep multitask neural network. First, we filter and extract 14 features from the data and subsequently employ deep neural networks for BMI estimation and subject identification on two different public datasets. Finally, we demonstrate that our proposed solution outperforms prior works and several machine learning benchmarks by a considerable margin, while also estimating users' BMI in a 10-fold cross-validation scheme.

accuracy, dataset, posture, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s12652-020-02210-9

2006.10453

Country:

Oceania > New Zealand (0.04)
North America > United States > Hawaii (0.04)
North America > Canada > Ontario > Kingston (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Therapeutic Area > Endocrinology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Feature Space Saturation during Training

Shenk, Justin, Richter, Mats L., Byttner, Wolf, Arpteg, Anders, Huss, Mikael

We propose layer saturation - a simple, online-computable method for analyzing the information processing in neural networks. First, we show that a layer's output can be restricted to the eigenspace of its variance matrix without performance loss. We propose a computationally lightweight method for approximating the variance matrix during training. From the dimension of its lossless eigenspace we derive layer saturation - the ratio between the eigenspace dimension and layer width. We show that saturation seems to indicate which layers contribute to network performance. We demonstrate how to alter layer saturation in a neural network by changing network depth, filter sizes and input resolution. Furthermore, we show that well-chosen input resolution increases network performance by distributing the inference process more evenly across the network.

input resolution, resolution, tex class file, (16 more...)

2006.08679

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Morioka, Hiroshi, Hyvärinen, Aapo

Independent innovation analysis for nonlinear vector autoregressive process

The nonlinear vector autoregressive (NVAR) model provides an appealing framework to analyze multivariate time series obtained from a nonlinear dynamical system. However, the innovation (or error), which plays a key role by driving the dynamics, is almost always assumed to be additive. Additivity greatly limits the generality of the model, hindering analysis of general NVAR process which have nonlinear interactions between the innovations. Here, we propose a new general framework called independent innovation analysis (IIA), which estimates the innovations from completely general NVAR. We assume mutual independence of the innovations as well as their modulation by a fully observable auxiliary variable (which is often taken as the time index and simply interpreted as nonstationarity). We show that IIA guarantees the identifiability of the innovations with arbitrary nonlinearities, up to a permutation and component-wise invertible nonlinearities. We propose two practical estimation methods, both of which can be easily implemented by ordinary neural network training. We thus provide the first rigorous identifiability result for general NVAR, as well as very general tools for learning such models.

artificial intelligence, innovation, machine learning, (17 more...)

2006.10944

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Robust Meta-learning for Mixed Linear Regression with Small Batches

Kong, Weihao, Somani, Raghav, Kakade, Sham, Oh, Sewoong

A common challenge faced in practical supervised learning, such as medical image processing and robotic interactions, is that there are plenty of tasks but each task cannot afford to collect enough labeled examples to be learned in isolation. However, by exploiting the similarities across those tasks, one can hope to overcome such data scarcity. Under a canonical scenario where each task is drawn from a mixture of k linear regressions, we study a fundamental question: can abundant small-data tasks compensate for the lack of big-data tasks? Existing second moment based approaches show that such a trade-off is efficiently achievable, with the help of medium-sized tasks with $\Omega(k^{1/2})$ examples each. However, this algorithm is brittle in two important scenarios. The predictions can be arbitrarily bad (i) even with only a few outliers in the dataset; or (ii) even if the medium-sized tasks are slightly smaller with $o(k^{1/2})$ examples each. We introduce a spectral approach that is simultaneously robust under both scenarios. To this end, we first design a novel outlier-robust principal component analysis algorithm that achieves an optimal accuracy. This is followed by a sum-of-squares algorithm to exploit the information from higher order moments. Together, this approach is robust against outliers and achieves a graceful statistical trade-off; the lack of $\Omega(k^{1/2})$-size tasks can be compensated for with smaller tasks, which can now be as small as $O(\log k)$.

algorithm, artificial intelligence, machine learning, (17 more...)

2006.09702

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Tukan, Murad, Maalouf, Alaa, Feldman, Dan

Coresets for Near-Convex Functions

Coreset is usually a small weighted subset of $n$ input points in $\mathbb{R}^d$, that provably approximates their loss function for a given set of queries (models, classifiers, etc.). Coresets become increasingly common in machine learning since existing heuristics or inefficient algorithms may be improved by running them possibly many times on the small coreset that can be maintained for streaming distributed data. Coresets can be obtained by sensitivity (importance) sampling, where its size is proportional to the total sum of sensitivities. Unfortunately, computing the sensitivity of each point is problem dependent and may be harder to compute than the original optimization problem at hand. We suggest a generic framework for computing sensitivities (and thus coresets) for wide family of loss functions which we call near-convex functions. This is by suggesting the $f$-SVD factorization that generalizes the SVD factorization of matrices to functions. Example applications include coresets that are either new or significantly improves previous results, such as SVM, Logistic regression, M-estimators, and $\ell_z$-regression. Experimental results and open source are also provided.

artificial intelligence, machine learning, sensitivity, (17 more...)

2006.05482

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Spain > Canary Islands (0.04)
(3 more...)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Transfer Learning for High-dimensional Linear Regression: Prediction, Estimation, and Minimax Optimality

Li, Sai, Cai, T. Tony, Li, Hongzhe

This paper considers the estimation and prediction of a high-dimensional linear regression in the setting of transfer learning, using samples from the target model as well as auxiliary samples from different but possibly related regression models. When the set of "informative" auxiliary samples is known, an estimator and a predictor are proposed and their optimality is established. The optimal rates of convergence for prediction and estimation are faster than the corresponding rates without using the auxiliary samples. This implies that knowledge from the informative auxiliary samples can be transferred to improve the learning performance of the target problem. In the case that the set of informative auxiliary samples is unknown, we propose a data-driven procedure for transfer learning, called Trans-Lasso, and reveal its robustness to non-informative auxiliary samples and its efficiency in knowledge transfer. The proposed procedures are demonstrated in numerical studies and are applied to a dataset concerning the associations among gene expressions. It is shown that Trans-Lasso leads to improved performance in gene expression prediction in a target tissue by incorporating the data from multiple different tissues as auxiliary samples.

artificial intelligence, auxiliary sample, machine learning, (17 more...)

2006.10593

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.93)

Agrawal, Akshay, Barratt, Shane, Boyd, Stephen

Learning Convex Optimization Models

A convex optimization model predicts an output from an input by solving a convex optimization problem. The class of convex optimization models is large, and includes as special cases many well-known models like linear and logistic regression. We propose a heuristic for learning the parameters in a convex optimization model given a dataset of input-output pairs, using recently developed methods for differentiating the solution of a convex optimization problem with respect to its parameters. We describe three general classes of convex optimization models, maximum a posteriori (MAP) models, utility maximization models, and agent models, and present a numerical experiment for each.

artificial intelligence, convex optimization model, machine learning, (13 more...)

2006.04248

Country:

North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

#artificialintelligenceJun-17-2020, 07:27:42 GMT

How Much Math do I need in Data Science?

Can I become a data scientist with little or no math background? What essential math skills are important in data science? There are so many good packages that can be used for building predictive models or for producing data visualizations. Thanks to these packages, anyone can build a model or produce a data visualization. However, very solid background knowledge in mathematics is essential for fine-tuning your models to produce reliable models with optimal performance.

artificial intelligence, data science and machine learning, machine learning, (12 more...)

#artificialintelligence

Genre: Research Report > Experimental Study (0.31)

Industry: Education (0.51)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)