AITopics | Regression


Regression


Probabilistic Auto-Associative Models and Semi-Linear PCA

arXiv.org Machine Learning, Sep-20-2012

Auto-Associative models cover a large class of methods used in data analysis. In this paper, we describe the general properties of these models when the projection component is linear, and we propose and test an easy-to-implement Probabilistic Semi-Linear Auto-Associative model in a Gaussian setting. We show it is a generalization of the PCA model to the semi-linear case. Numerical experiments on simulated datasets and a real astronomical application highlight the interest of this approach.
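As a point of reference for the abstract above, ordinary PCA can be read as an auto-associative model in which both the projection and the reconstruction are linear. The sketch below shows only that baseline case, not the probabilistic semi-linear model the paper proposes; the function name `pca_autoassociate` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def pca_autoassociate(X, k):
    """Project rows of X onto the top-k principal axes, then reconstruct them."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                # (d, k) linear projection
    Z = Xc @ W                  # low-dimensional coordinates
    X_hat = Z @ W.T + mean      # linear map back to the data space
    return Z, X_hat

rng = np.random.default_rng(0)
# Hypothetical data: points close to a line in 3-D, plus small Gaussian noise.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 0.01 * rng.normal(size=(200, 3))

Z, X_hat = pca_autoassociate(X, k=1)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In a semi-linear variant, the last step (the map from `Z` back to the data space) would be replaced by a non-linear regression, while the projection `W` stays linear.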

artificial intelligence, machine learning, matrix, (14 more...)

arXiv.org Machine Learning

1209.4551

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)


Learning Parameterized Skills

Da Silva, Bruno, Konidaris, George, Barto, Andrew

arXiv.org Machine Learning, Sep-3-2012

We introduce a method for constructing skills capable of solving tasks drawn from a distribution of parameterized reinforcement learning problems. The method draws example tasks from a distribution of interest and uses the corresponding learned policies to estimate the topology of the lower-dimensional piecewise-smooth manifold on which the skill policies lie. This manifold models how policy parameters change as task parameters vary. The method identifies the number of charts that compose the manifold and then applies non-linear regression in each chart to construct a parameterized skill by predicting policy parameters from task parameters. We evaluate our method on an underactuated simulated robotic arm tasked with learning to accurately throw darts at a parameterized target location.

artificial intelligence, machine learning, parameterized skill, (17 more...)

arXiv.org Machine Learning

1206.6398

Country: North America > United States > Massachusetts (0.46)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


Efficient Algorithm for Extremely Large Multi-task Regression with Massive Structured Sparsity

Lee, Seunghak, Xing, Eric P.

arXiv.org Machine Learning, Aug-14-2012

We develop a highly scalable optimization method called "hierarchical group-thresholding" for solving a multi-task regression model with complex structured sparsity constraints on both input and output spaces. Despite the recent emergence of several efficient optimization algorithms for tackling complex sparsity-inducing regularizers, true scalability in practical high-dimensional problems where a huge number (e.g., millions) of sparsity patterns need to be enforced remains an open challenge, because all existing algorithms must deal with all such patterns exhaustively in every iteration, which is computationally prohibitive. Our proposed algorithm addresses the scalability problem by screening out multiple groups of coefficients simultaneously and systematically. We employ a hierarchical tree representation of group constraints to accelerate the removal of irrelevant constraints by taking advantage of the inclusion relationships between group sparsities, thereby avoiding dealing with all constraints in every optimization step and requiring optimization only on a small number of outstanding coefficients. In our experiments, we demonstrate the efficiency of our method on simulation datasets, and in an application of detecting genetic variants associated with gene expression traits.

artificial intelligence, coefficient, machine learning, (19 more...)

arXiv.org Machine Learning

1208.3014

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)


Detecting Events and Patterns in Large-Scale User Generated Textual Streams with Statistical Learning Methods

Lampos, Vasileios

arXiv.org Machine Learning, Aug-13-2012

A vast amount of textual web streams is influenced by events or phenomena emerging in the real world. The social web forms an excellent modern paradigm, where unstructured user-generated content is published on a regular basis and on most occasions is freely distributed. The present Ph.D. thesis deals with the problem of inferring information, or patterns in general, about events emerging in real life based on the contents of this textual stream. We show that it is possible to extract valuable information about social phenomena, such as an epidemic or even rainfall rates, by automatic analysis of the content published in social media, and in particular Twitter, using statistical machine learning methods. An important intermediate task is the formation and identification of features that characterise a target event; we select and use those textual features in several linear, non-linear and hybrid inference approaches, achieving good performance in terms of the applied loss function. By further examining this rich data set, we also propose methods for extracting various types of mood signals, revealing how affective norms (at least within the social web's population) evolve during the day and how significant events emerging in the real world influence them. Lastly, we present some preliminary findings showing several spatiotemporal characteristics of this textual information as well as the potential of using it to tackle tasks such as the prediction of voting intentions.

artificial intelligence, autocorrelation confidence bound autocorrelation, machine learning, (15 more...)

arXiv.org Machine Learning

1208.2873

Country:

North America > United States (1.00)
Asia (0.92)
Africa (0.67)
Europe > United Kingdom > England > Greater London > London (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > News (1.00)
Leisure & Entertainment > Sports (1.00)
Information Technology > Services (1.00)
(12 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)


Fast global convergence of gradient methods for high-dimensional statistical recovery

Agarwal, Alekh, Negahban, Sahand N., Wainwright, Martin J.

arXiv.org Machine Learning, Jul-25-2012

Many statistical $M$-estimators are based on convex optimization problems formed by the combination of a data-dependent loss function with a norm-based regularizer. We analyze the convergence rates of projected gradient and composite gradient methods for solving such problems, working within a high-dimensional framework that allows the data dimension $d$ to grow with (and possibly exceed) the sample size $n$. This high-dimensional structure precludes the usual global assumptions (namely, strong convexity and smoothness conditions) that underlie much of classical optimization analysis. We define appropriately restricted versions of these conditions, and show that they are satisfied with high probability for various statistical models. Under these conditions, our theory guarantees that projected gradient descent has a globally geometric rate of convergence up to the statistical precision of the model, meaning the typical distance between the true unknown parameter $\theta^*$ and an optimal solution $\hat{\theta}$. This result is substantially sharper than previous convergence results, which yielded sublinear convergence, or linear convergence only up to the noise level. Our analysis applies to a wide range of $M$-estimators and statistical models, including sparse linear regression using Lasso ($\ell_1$-regularized regression); group Lasso for block sparsity; log-linear models with regularization; low-rank matrix recovery using nuclear norm regularization; and matrix decomposition. Overall, our analysis reveals interesting connections between statistical precision and computational efficiency in high-dimensional estimation.
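To make the composite gradient method mentioned above concrete, here is a minimal sketch for the Lasso case, assuming the objective $\frac{1}{2n}\|y - X\theta\|_2^2 + \lambda\|\theta\|_1$ and a constant step size. The function names, the step-size choice, and the synthetic problem (100 samples, 200 dimensions, 5 nonzero coefficients) are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(X, y, theta, lam):
    n = X.shape[0]
    return 0.5 * np.sum((y - X @ theta) ** 2) / n + lam * np.sum(np.abs(theta))

def composite_gradient_lasso(X, y, lam, n_iter=500):
    """Gradient step on the smooth loss, then soft-thresholding for the l1 term."""
    n, d = X.shape
    # Step size 1/L, with L the largest eigenvalue of X^T X / n.
    L = np.linalg.norm(X, 2) ** 2 / n
    theta = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

rng = np.random.default_rng(0)
n, d, s = 100, 200, 5            # more dimensions than samples
X = rng.normal(size=(n, d))
theta_star = np.zeros(d)
theta_star[:s] = 1.0             # hypothetical sparse ground truth
y = X @ theta_star + 0.1 * rng.normal(size=n)

lam = 0.05
theta_hat = composite_gradient_lasso(X, y, lam)
err = np.linalg.norm(theta_hat - theta_star)
print(f"estimation error: {err:.3f}")
```

Even though the problem is not strongly convex when the dimension exceeds the sample size, the iterates here settle near the sparse ground truth, which is the regime the abstract's restricted conditions are meant to explain.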

artificial intelligence, inequality, machine learning, (19 more...)

arXiv.org Machine Learning

1104.4824

Country:

North America > United States > Massachusetts (0.27)
North America > United States > California (0.27)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)


Conditional mean embeddings as regressors - supplementary

Grünewälder, Steffen, Lever, Guy, Baldassarre, Luca, Patterson, Sam, Gretton, Arthur, Pontil, Massimiliano

arXiv.org Machine Learning, Jul-24-2012

We demonstrate an equivalence between reproducing kernel Hilbert space (RKHS) embeddings of conditional distributions and vector-valued regressors. This connection introduces a natural regularized loss function which the RKHS embeddings minimise, providing an intuitive understanding of the embeddings and a justification for their use. Furthermore, the equivalence allows the application of vector-valued regression methods and results to the problem of learning conditional distributions. Using this link we derive a sparse version of the embedding by considering alternative formulations. Further, by applying convergence results for vector-valued regression to the embedding problem we derive minimax convergence rates which are $O(\log(n)/n)$ (compared to current state-of-the-art rates of $O(n^{-1/4})$) and are valid under milder and more intuitive assumptions. These minimax upper rates coincide with lower rates up to a logarithmic factor, showing that the embedding method achieves nearly optimal rates. We study our sparse embedding algorithm in a reinforcement learning task where the algorithm shows significant improvement in sparsity over an incomplete Cholesky decomposition.
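The regression view of conditional embeddings can be illustrated with the standard regularized estimator, in which a conditional expectation at a query point is a weighted sum over training outputs with weights $(K + n\lambda I)^{-1} k_X(x)$. The sketch below assumes a Gaussian kernel and synthetic one-dimensional data; the function names and parameter values are illustrative, and the sparse variant discussed in the abstract is not shown.

```python
import numpy as np

def rbf_gram(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def embedding_weights(X, x_query, lam=1e-3, gamma=1.0):
    """Weights w such that E[g(Y) | X = x_query] is approximated by sum_i w_i g(y_i)."""
    n = X.shape[0]
    K = rbf_gram(X, X, gamma)
    k = rbf_gram(X, x_query[None, :], gamma)[:, 0]
    # Same linear system as kernel ridge regression with ridge n * lam.
    return np.linalg.solve(K + n * lam * np.eye(n), k)

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
Y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)   # hypothetical data

w = embedding_weights(X, np.array([0.5]))
cond_mean = w @ Y                                   # g is the identity here
print(f"estimate {cond_mean:.3f}, target {np.sin(0.5):.3f}")
```

The same weights `w` can be reused for any function `g` of the outputs, which is what makes the embedding a vector-valued regressor rather than a single scalar regression.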

artificial intelligence, assumption, machine learning, (16 more...)

arXiv.org Machine Learning

1205.4656

Country: Europe > United Kingdom (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)


Efficient Online Learning for Large-Scale Sparse Kernel Logistic Regression

Zhang, Lijun (Zhejiang University) | Jin, Rong (Michigan State University) | Chen, Chun (Zhejiang University) | Bu, Jiajun (Zhejiang University) | He, Xiaofei (Zhejiang University)

AAAI Conferences, Jul-21-2012

In this paper, we study the problem of large-scale Kernel Logistic Regression (KLR). A straightforward approach is to apply stochastic approximation to KLR. We refer to this approach as the non-conservative online learning algorithm because it updates the kernel classifier after every received training example, leading to a dense classifier. To improve the sparsity of the KLR classifier, we propose two conservative online learning algorithms that update the classifier in a stochastic manner and generate sparse solutions. With appropriately designed updating strategies, our analysis shows that the two conservative algorithms enjoy theoretical guarantees similar to those of the non-conservative algorithm. Empirical studies on several benchmark data sets demonstrate that, compared to batch-mode algorithms for KLR, the proposed conservative online learning algorithms are able to produce sparse KLR classifiers and achieve similar classification accuracy with significantly shorter training time. Furthermore, both the sparsity and classification accuracy of our methods are comparable to those of the online kernel SVM.
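The contrast between the two update styles can be sketched as follows. The abstract does not give the update rule, so the conservative branch here is an assumption: it adds a support vector with probability equal to the magnitude of the logistic-loss derivative, which keeps the expected update equal to the non-conservative stochastic gradient step. The function names, kernel, step size, and data are all illustrative.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def online_klr(X, y, eta=0.5, conservative=True, seed=0):
    """One pass of online kernel logistic regression with labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    support, alpha = [], []
    for x_t, y_t in zip(X, y):
        f = sum(a * rbf(s, x_t) for s, a in zip(support, alpha))
        # Magnitude of the derivative of log(1 + exp(-y f)) with respect to f.
        p = 1.0 / (1.0 + np.exp(y_t * f))
        if conservative:
            # Assumed rule: update only with probability p, using a full-size step.
            if rng.random() < p:
                support.append(x_t)
                alpha.append(eta * y_t)
        else:
            # Non-conservative: every example becomes a support vector.
            support.append(x_t)
            alpha.append(eta * y_t * p)
    return support, alpha

rng = np.random.default_rng(1)
# Hypothetical data: two Gaussian blobs with labels -1 and +1.
X = np.vstack([rng.normal(-1.0, 0.7, size=(100, 2)),
               rng.normal(+1.0, 0.7, size=(100, 2))])
y = np.concatenate([-np.ones(100), np.ones(100)])
order = rng.permutation(200)
X, y = X[order], y[order]

sv_dense, _ = online_klr(X, y, conservative=False)
sv_sparse, _ = online_klr(X, y, conservative=True)
print(len(sv_dense), len(sv_sparse))
```

Well-classified examples have a small `p` and are therefore rarely stored, which is where the sparsity of the conservative variant comes from in this sketch.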

algorithm, classifier, training example, (15 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan > Ingham County > Lansing (0.04)
(2 more...)

Genre: Research Report > New Finding (0.50)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.62)


Name-Ethnicity Classification and Ethnicity-Sensitive Name Matching

Treeratpituk, Pucktada (Pennsylvania State University) | Giles, C. Lee (Pennsylvania State University)

AAAI Conferences, Jul-21-2012

Personal names are important and common information in many data sources, ranging from social networks and news articles to patient records and scientific documents. They are often used as queries for retrieving records and also as key information for linking documents from multiple sources. Matching personal names can be challenging due to variations in spelling and formatting. While many approximate name matching techniques have been proposed, most are generic string-matching algorithms. Unlike other types of proper names, personal names are highly cultural. Many ethnicities have their own unique naming systems and identifiable characteristics. In this paper we explore such relationships between ethnicities and personal names to improve name matching performance. First, we propose a name-ethnicity classifier based on multinomial logistic regression. Our model can effectively identify name-ethnicity from personal names in Wikipedia, which we use to define name-ethnicity, to within 85% accuracy. Next, we propose a novel alignment-based name matching algorithm, based on the Smith–Waterman algorithm and logistic regression. Different name matching models are then trained for different name-ethnicity groups. Our preliminary experimental result on DBLP's disambiguated author dataset yields a performance of 99% precision and 89% recall. Surprisingly, textual features carry more weight than phonetic ones in name-ethnicity classification.
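For readers unfamiliar with the alignment step, the following is a plain Smith–Waterman local alignment score over characters. It is the textbook algorithm only, not the authors' name matcher; the scoring values (match 2, mismatch -1, linear gap -1) and the function name are assumptions, and the logistic regression layer described in the abstract is omitted.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at zero so an alignment can restart anywhere.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

# A name that appears verbatim inside a longer string aligns perfectly.
print(smith_waterman_score("smith", "john smith"))
print(smith_waterman_score("jon smith", "john smith"))
```

A matcher built on this would typically turn the alignment score (and related features) into inputs for a classifier, which is the role the abstract assigns to logistic regression.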

artificial intelligence, ethnicity, machine learning, (17 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

Asia > India > Madhya Pradesh > Bhopal (0.05)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
Europe > France (0.04)
North America > Canada > Quebec (0.04)

Genre:

Research Report > New Finding (0.56)
Research Report > Experimental Study (0.56)

Industry:

Health & Medicine (1.00)
Information Technology > Services (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)


Classification of Sparse Time Series via Supervised Matrix Factorization

Grabocka, Josif (University of Hildesheim) | Nanopoulos, Alexandros (University of Hildesheim) | Schmidt-Thieme, Lars (University of Hildesheim)

AAAI Conferences, Jul-21-2012

Data sparsity is an emerging real-world problem observed in various domains ranging from sensor networks to medical diagnosis. Consequently, numerous machine learning methods have been designed to treat missing values. Nevertheless, sparsity, defined as missing segments, has not been thoroughly investigated in the context of time series classification. We propose a novel principle for classifying time series which, in contrast to existing approaches, avoids reconstructing the missing segments in time series and operates solely on the observed ones. Based on the proposed principle, we develop a method that avoids adding the noise incurred during the reconstruction of the original time series. Our method adapts supervised matrix factorization by projecting time series into a latent space through stochastic learning. Furthermore, the projected data is built in a supervised fashion via logistic regression. Abundant experiments on a large collection of 37 data sets demonstrate the superiority of our method, which in the majority of cases outperforms a set of baselines that do not follow our proposed principle.

artificial intelligence, factorization, machine learning, (16 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country: Europe > Germany (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)


Improved brain pattern recovery through ranking approaches

Pedregosa, Fabian, Gramfort, Alexandre, Varoquaux, Gaël, Thirion, Bertrand, Pallier, Christophe, Cauvet, Elodie

arXiv.org Machine Learning, Jul-15-2012

The prediction of behavioral information or cognitive states from brain activation images such as those obtained with fMRI can be used to assess the specificity of several brain regions for certain cognitive or perceptual functions. This kind of analysis is implemented by learning a classifier or regression function that fits a given target variable given fMRI activations. The accuracy of this prediction depends on whether it uses the relevant variables, i.e., the correct brain regions. Recovering the truly predictive pattern has proven to be challenging from a statistical point of view: the high dimensionality of the data together with the limited number of images makes the problem of brain pattern recovery an ill-posed problem. So far, the approaches proposed to address this issue have relied on linear models, with univariate, i.e., voxel-based, ANOVA (analysis of variance) for hypothesis testing, or, for predictive modeling, with the choice of a regularizer using a priori domain-specific knowledge, such as the l

artificial intelligence, loss function, machine learning, (17 more...)

arXiv.org Machine Learning

1207.352

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology (0.74)
Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.48)
