AITopics | Regression

Regression

Machine Learning Classification Methods and Factor Investing

#artificialintelligence | Feb-2-2019, 08:56:31 GMT

Regression predicts a continuous value: for example, the return on an asset. Classification predicts a discrete value: for example, will a stock outperform next period? This is a binary classification problem, predicting a yes/no response. Another example: Which quartile will a stock's performance fall into next month? This is multinomial classification, predicting a categorical variable with 4 possible outcomes.
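As a minimal sketch of the three target types described above, the snippet below derives a binary label and a quartile label from the same set of returns; the raw returns themselves would be the regression target. The sample returns, the helper names `binary_labels` and `quartile_labels`, and the use of the cross-sectional median as the benchmark for "outperform" are illustrative assumptions, not taken from the article.

```python
from statistics import median

def binary_labels(returns):
    """Binary classification target: 1 if a return beats the
    cross-sectional median (assumed benchmark), else 0."""
    m = median(returns)
    return [1 if r > m else 0 for r in returns]

def quartile_labels(returns):
    """Multinomial target: quartile 0..3 by rank (0 = lowest returns)."""
    n = len(returns)
    order = sorted(range(n), key=lambda i: returns[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(3, rank * 4 // n)
    return labels

# Hypothetical next-period returns for eight stocks (regression target).
returns = [0.031, -0.012, 0.004, 0.055, -0.027, 0.018, 0.009, -0.003]

print(binary_labels(returns))    # yes/no: did the stock outperform?
print(quartile_labels(returns))  # which of 4 quartiles did it land in?
```

The same feature set could be paired with any of the three targets; only the loss function and the model's output layer would change.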

artificial intelligence, classification, machine learning, (18 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

CodedPrivateML: A Fast and Privacy-Preserving Framework for Distributed Machine Learning

So, Jinhyun, Guler, Basak, Avestimehr, A. Salman, Mohassel, Payman

arXiv.org Machine Learning | Feb-2-2019

How to train a machine learning model while keeping the data private and secure? We present CodedPrivateML, a fast and scalable approach to this critical problem. CodedPrivateML keeps both the data and the model information-theoretically private, while allowing efficient parallelization of training across distributed workers. We characterize CodedPrivateML's privacy threshold and prove its convergence for logistic (and linear) regression. Furthermore, via experiments over Amazon EC2, we demonstrate that CodedPrivateML can provide an order of magnitude speedup (up to $\sim 34\times$) over the state-of-the-art cryptographic approaches.

codedprivateml, computation, dataset, (11 more...)

arXiv.org Machine Learning

1902.00641

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Spain > Canary Islands (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Learning Direct and Inverse Transmission Matrices

Ancora, Daniele, Leuzzi, Luca

arXiv.org Machine Learning | Feb-2-2019

Linear problems appear in a variety of disciplines, and their application to transmission matrix recovery is one of the most stimulating challenges in biomedical imaging. Knowledge of the transmission matrix turns any random medium into an optical tool that can focus or transmit an image through disorder. Here, converting an input-output problem into a statistical mechanical formulation, we investigate how inference protocols can learn the transmission couplings by pseudolikelihood maximization. Bridging linear regression and thermodynamics lets us propose an innovative framework to pursue the solution of the scattering riddle. A major interest in biomedical imaging is the comprehension of light scattering through disordered media: many recent studies have achieved light focusing and image reconstruction even through complex biological tissues [1,2].

matrix, noise, transmission matrix, (14 more...)

arXiv.org Machine Learning

1901.04816

Country: Europe > Italy > Lazio > Rome (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models

Wu, Shanshan, Sanghavi, Sujay, Dimakis, Alexandros G.

arXiv.org Machine Learning | Feb-2-2019

We characterize the effectiveness of a classical algorithm for recovering the Markov graph of a general discrete pairwise graphical model from i.i.d. samples. The algorithm is (appropriately regularized) maximum conditional log-likelihood, which involves solving a convex program for each node; for Ising models this is $\ell_1$-constrained logistic regression, while for more general alphabets an $\ell_{2,1}$ group-norm constraint needs to be used. We show that this algorithm can recover any arbitrary discrete pairwise graphical model, and also characterize its sample complexity as a function of model width, alphabet size, edge parameter accuracy, and the number of variables. We show that along every one of these axes, it matches or improves on all existing results and algorithms for this problem. Our analysis applies a sharp generalization error bound for logistic regression when the weight vector has an $\ell_1$ constraint (or $\ell_{2,1}$ constraint) and the sample vector has an $\ell_{\infty}$ constraint (or $\ell_{2, \infty}$ constraint). We also show that the proposed convex programs can be efficiently solved in $\tilde{O}(n^2)$ running time (where $n$ is the number of variables) under the same statistical guarantees. We provide experimental results to support our analysis.
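The per-node idea in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it swaps the $\ell_1$ constraint for an $\ell_1$ penalty solved by proximal gradient descent (ISTA), and the 4-node chain, the coupling strength 0.8, the sample size, the edge threshold, and all function names are assumptions chosen for illustration. It relies on the standard fact that in a zero-field Ising model with $P(x) \propto \exp(\sum_{i<j} A_{ij} x_i x_j)$, the conditional law of $x_i$ given the rest is logistic with weights $2A_{ij}$.

```python
import itertools
import math
import random
from collections import Counter

def sample_ising(couplings, n_nodes, n_samples, seed=0):
    """Exact i.i.d. samples from a small zero-field Ising model by enumeration."""
    states = list(itertools.product([-1, 1], repeat=n_nodes))
    weights = [
        math.exp(sum(w * s[i] * s[j] for (i, j), w in couplings.items()))
        for s in states
    ]
    return random.Random(seed).choices(states, weights=weights, k=n_samples)

def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_node(counts, node, n_nodes, lam=0.02, lr=0.5, n_iter=500):
    """L1-penalized logistic regression of x[node] on the other spins (ISTA)."""
    others = [j for j in range(n_nodes) if j != node]
    w = {j: 0.0 for j in others}
    b = 0.0
    n = sum(counts.values())
    for _ in range(n_iter):
        grad_w = {j: 0.0 for j in others}
        grad_b = 0.0
        for s, c in counts.items():  # each distinct state, weighted by its count
            z = b + sum(w[j] * s[j] for j in others)
            err = 1.0 / (1.0 + math.exp(-z)) - (s[node] + 1) / 2  # label in {0, 1}
            grad_b += c * err
            for j in others:
                grad_w[j] += c * err * s[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        for j in others:
            w[j] = soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam)
    return w

def recover_edges(samples, threshold=0.5):
    """Keep edge (i, j) if either node's regression gives it a large weight."""
    n_nodes = len(samples[0])
    counts = Counter(samples)
    edges = set()
    for i in range(n_nodes):
        for j, wij in fit_node(counts, i, n_nodes).items():
            if abs(wij) > threshold:
                edges.add((min(i, j), max(i, j)))
    return edges

# Hypothetical chain graph 0 - 1 - 2 - 3.
true_couplings = {(0, 1): 0.8, (1, 2): 0.8, (2, 3): 0.8}
samples = sample_ising(true_couplings, n_nodes=4, n_samples=5000)
edges = recover_edges(samples)
print(sorted(edges))
```

For larger alphabets the abstract calls for an $\ell_{2,1}$ group-norm constraint instead, which this scalar soft-thresholding step does not cover.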

algorithm, logistic regression, probability, (16 more...)

arXiv.org Machine Learning

1810.11905

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

Understanding Composition of Word Embeddings via Tensor Decomposition

Frandsen, Abraham, Ge, Rong

arXiv.org Machine Learning | Feb-1-2019

Word embedding is a powerful tool in natural language processing. In this paper we consider the problem of word embedding composition: given vector representations of two words, compute a vector for the entire phrase. We give a generative model that can capture specific syntactic relations between words. Under our model, we prove that the correlations between three words (measured by their PMI) form a tensor that has an approximate low rank Tucker decomposition. The result of the Tucker decomposition gives the word embeddings as well as a core tensor, which can be used to produce better compositions of the word embeddings. We also complement our theoretical results with experiments that verify our assumptions, and demonstrate the effectiveness of the new composition method.

exp, tensor, vector, (15 more...)

arXiv.org Machine Learning

1902.00613

Country:

Asia > Russia (0.28)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

The Spatially-Conscious Machine Learning Model

Kiely, Timothy J., Bastian, Nathaniel D.

arXiv.org Machine Learning | Feb-1-2019

Successfully predicting gentrification could have many social and commercial applications; however, real estate sales are difficult to predict because they belong to a chaotic system composed of intrinsic and extrinsic characteristics, perceived value, and market speculation. Using New York City real estate as our subject, we combine modern techniques of data science and machine learning with traditional spatial analysis to create robust real estate prediction models for both classification and regression tasks. We compare several cutting-edge machine learning algorithms across spatial, semi-spatial and non-spatial feature engineering techniques, and we empirically show that spatially-conscious machine learning models outperform non-spatial models when married with advanced prediction techniques such as feed-forward artificial neural networks and gradient boosting machine models.

algorithm, gentrification, numeric 12, (15 more...)

arXiv.org Machine Learning

1902.00562

Country:

North America > United States > New York > Bronx County > New York City (0.04)
North America > United States > Virginia > Fairfax County (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Machine Learning: Choosing a Machine Learning Model

#artificialintelligence | Jan-31-2019, 13:54:58 GMT

As more companies look to leverage their data using the predictive capabilities of machine learning, they find that there is no one-size-fits-all approach to this exciting technology. The machine learning algorithm you choose depends on the size, quality, and type of data as well as the project timeline and your overall goals. Choosing the proper machine learning algorithm lends context to the insights gained from the resulting predictions. Accuracy: Is the goal of your project to determine the most accurate result, or will an approximation satisfy your project needs? Approximating outputs can reduce processing time and keep performance high for large datasets.

artificial intelligence, machine learning, prediction, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)

Agnostic Federated Learning

Mohri, Mehryar, Sivek, Gary, Suresh, Ananda Theertha

arXiv.org Machine Learning | Jan-31-2019

A key learning scenario in large-scale applications is that of federated learning, where a centralized model is trained based on data originating from a large number of clients. We argue that, with the existing training and inference, federated models can be biased towards different clients. Instead, we propose a new framework of agnostic federated learning, where the centralized model is optimized for any target distribution formed by a mixture of the client distributions. We further show that this framework naturally yields a notion of fairness. We present data-dependent Rademacher complexity guarantees for learning with this objective, which guide the definition of an algorithm for agnostic federated learning. We also give a fast stochastic optimization algorithm for solving the corresponding optimization problem, for which we prove convergence bounds, assuming a convex loss function and hypothesis set. We further empirically demonstrate the benefits of our approach in several datasets. Beyond federated learning, our framework and algorithm can be of interest to other learning scenarios such as cloud computing, domain adaptation, drifting, and other contexts where the training and test distributions do not coincide.
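The objective described above, a model optimized for the worst-case mixture of client distributions, can be written as $\min_w \max_{\lambda \in \Delta} \sum_k \lambda_k L_k(w)$. The sketch below illustrates it on a one-dimensional toy problem and is not the paper's algorithm: it uses full-batch gradient descent on $w$ and exponentiated-gradient ascent on $\lambda$ rather than the stochastic method the authors analyze, and the two clients, their data, the step sizes, and the function names are assumptions made for illustration.

```python
import math

def client_loss(w, data):
    """Mean squared error of the scalar model w on one client's data."""
    return sum((w - x) ** 2 for x in data) / len(data)

def client_grad(w, data):
    """Derivative of client_loss with respect to w."""
    return sum(2.0 * (w - x) for x in data) / len(data)

def agnostic_fit(clients, lr_w=0.1, lr_lam=0.05, n_iter=2000):
    """Descent on w, exponentiated-gradient ascent on mixture weights lam."""
    k = len(clients)
    lam = [1.0 / k] * k
    w = 0.0
    for _ in range(n_iter):
        losses = [client_loss(w, d) for d in clients]
        grad = sum(l * client_grad(w, d) for l, d in zip(lam, clients))
        w -= lr_w * grad
        # Up-weight clients with higher loss, then renormalize onto the simplex.
        lam = [l * math.exp(lr_lam * loss) for l, loss in zip(lam, losses)]
        total = sum(lam)
        lam = [l / total for l in lam]
    return w, lam

# Hypothetical clients: a large one centered at 0 and a small one centered at 4.
clients = [[-1.0, 0.0, 1.0] * 3, [3.0, 4.0, 5.0]]

pooled = [x for d in clients for x in d]
w_pooled = sum(pooled) / len(pooled)  # minimizer of the size-weighted loss
w_agnostic, lam = agnostic_fit(clients)

worst_pooled = max(client_loss(w_pooled, d) for d in clients)
worst_agnostic = max(client_loss(w_agnostic, d) for d in clients)
print(w_pooled, worst_pooled)
print(w_agnostic, worst_agnostic)
```

In this toy setting the size-weighted fit sits close to the large client and leaves the small client with a much higher loss, while the minimax fit equalizes the two client losses, which is one way to read the fairness notion mentioned in the abstract.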

algorithm, federated learning, gradient, (17 more...)

arXiv.org Machine Learning

1902.00146

Country:

North America > United States > New York (0.04)
Europe > Czechia > Prague (0.04)
North America > United States > Virginia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Gaussian Conditional Random Fields for Classification

Petrović, Andrija, Nikolić, Mladen, Jovanović, Miloš, Delibašić, Boris

arXiv.org Machine Learning | Jan-31-2019

Gaussian conditional random fields (GCRF) are a well-known structured model for continuous outputs that uses multiple unstructured predictors to form its features and at the same time exploits dependence structure among outputs, which is provided by a similarity measure. In this paper, a Gaussian conditional random fields model for structured binary classification (GCRFBC) is proposed. The model is applicable to classification problems with undirected graphs, intractable for standard classification CRFs. The model representation of GCRFBC is extended by latent variables which yield some appealing properties. Thanks to the GCRF latent structure, the model becomes tractable, efficient and open to improvements previously applied to GCRF regression models. In addition, the model allows for reduction of noise that might appear if structures were defined directly between discrete outputs. Additionally, two different forms of the algorithm are presented: GCRFBCb (GCRFBC - Bayesian) and GCRFBCnb (GCRFBC - non Bayesian). The extended method of local variational approximation of the sigmoid function is used for solving empirical Bayes in the Bayesian GCRFBCb variant, whereas the MAP value of latent variables is the basis for learning and inference in the GCRFBCnb variant. The inference in GCRFBCb is solved by Newton-Cotes formulas for one-dimensional integration. Both models are evaluated on synthetic data and real-world data. It was shown that both models achieve better prediction performance than unstructured predictors. Furthermore, computational and memory complexity is evaluated. Advantages and disadvantages of the proposed GCRFBCb and GCRFBCnb are discussed in detail.
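The abstract does not spell out which one-dimensional integral is evaluated. A plausible reading, assumed here, is the expected sigmoid of a Gaussian latent variable, $P(y=1) = \int \sigma(z)\,\mathcal{N}(z \mid \mu, s^2)\,dz$. The sketch below evaluates that integral with composite Simpson's rule, one member of the Newton-Cotes family; the integration range of eight standard deviations, the number of subintervals, and the function names are illustrative choices, not details from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gaussian_pdf(z, mu, sigma):
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def simpson(f, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals on [a, b]."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def prob_positive(mu, sigma, width=8.0):
    """Approximate E[sigmoid(z)] for z ~ N(mu, sigma^2)."""
    lo, hi = mu - width * sigma, mu + width * sigma
    return simpson(lambda z: sigmoid(z) * gaussian_pdf(z, mu, sigma), lo, hi)

print(prob_positive(0.0, 1.0))  # symmetric case
print(prob_positive(2.0, 1.0))  # latent mean shifted toward the positive class
```

Because the integrand is smooth and decays quickly, a fixed low-order rule on a truncated interval is usually adequate here; the latent uncertainty pulls the predicted probability toward 0.5 relative to plugging the mean into the sigmoid.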

conditional random field, latent variable, random field, (13 more...)

arXiv.org Machine Learning

1902.00045

Country:

Europe > Serbia > Central Serbia > Belgrade (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Norfolk County > Wellesley (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective

Vemula, Anirudh, Sun, Wen, Bagnell, J. Andrew

arXiv.org Machine Learning | Jan-31-2019

Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem. We examine these black-box methods closely to identify situations in which they are worse than action space exploration methods and those in which they are superior. Through simple theoretical analyses, we prove that complexity of exploration in parameter space depends on the dimensionality of parameter space, while complexity of exploration in action space depends on both the dimensionality of action space and horizon length. This is also demonstrated empirically by comparing simple exploration methods on several model problems, including Contextual Bandit, Linear Regression and Reinforcement Learning in continuous control.
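To make "exploration in parameter space" concrete, the sketch below fits a linear regression using only loss evaluations: it perturbs the parameter vector along a random Gaussian direction, compares the loss at the two perturbed points, and steps along that direction. This is a generic two-point zeroth-order update, not necessarily the estimator analyzed in the paper, and it covers only the parameter-space side of the comparison. The synthetic data, step size, perturbation scale, and function names are assumptions.

```python
import random

def mse(theta, xs, ys):
    """Mean squared error of the linear model theta on (xs, ys)."""
    return sum(
        (sum(t * xi for t, xi in zip(theta, x)) - y) ** 2 for x, y in zip(xs, ys)
    ) / len(xs)

def zeroth_order_fit(xs, ys, dim, lr=0.3, sigma=0.1, n_iter=300, seed=1):
    """Minimize mse using only function evaluations (no analytic gradients)."""
    rng = random.Random(seed)
    theta = [0.0] * dim
    for _ in range(n_iter):
        delta = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random direction
        plus = [t + sigma * d for t, d in zip(theta, delta)]
        minus = [t - sigma * d for t, d in zip(theta, delta)]
        # Central difference estimates the directional derivative along delta.
        slope = (mse(plus, xs, ys) - mse(minus, xs, ys)) / (2.0 * sigma)
        theta = [t - lr * slope * d for t, d in zip(theta, delta)]
    return theta

# Hypothetical noiseless data generated from y = 2*x1 - 3*x2.
data_rng = random.Random(0)
true_theta = [2.0, -3.0]
xs = [[data_rng.uniform(-1, 1) for _ in true_theta] for _ in range(50)]
ys = [sum(t * xi for t, xi in zip(true_theta, x)) for x in xs]

theta = zeroth_order_fit(xs, ys, dim=2)
print(theta, mse(theta, xs, ys))
```

The variance of this estimate grows with the number of parameters being perturbed, which is consistent with the abstract's point that the cost of parameter-space exploration depends on the parameter dimension.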

contrasting exploration, experiment, exploration, (13 more...)

arXiv.org Machine Learning

1901.11503

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Togo (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Transportation > Air (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
