AITopics | Statistical Learning

Statistical Learning

Online Identification and Tracking of Subspaces from Highly Incomplete Information

Balzano, Laura, Nowak, Robert, Recht, Benjamin

arXiv.org Machine Learning, Jul-12-2011

This work presents GROUSE (Grassmannian Rank-One Update Subspace Estimation), an efficient online algorithm for tracking subspaces from highly incomplete observations. GROUSE requires only basic linear algebraic manipulations at each iteration, and each subspace update can be performed in linear time in the dimension of the subspace. The algorithm is derived by analyzing incremental gradient descent on the Grassmannian manifold of subspaces. With a slight modification, GROUSE can also be used as an online incremental algorithm for the matrix completion problem of imputing missing entries of a low-rank matrix. GROUSE performs exceptionally well in practice both in tracking subspaces and as an online algorithm for matrix completion.
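As a rough illustration of the kind of update the abstract describes, the sketch below performs one incremental step for the simplest case, a one-dimensional subspace spanned by a unit vector `u`, using only the observed coordinates of an incoming vector. The function name `grouse_step_rank1`, the argument names, and the step size `eta` are hypothetical and chosen for this example; the general algorithm tracks a d-dimensional subspace and solves a small least-squares problem in place of the scalar division used here.

```python
import math

def grouse_step_rank1(u, omega, v_obs, eta):
    """One rank-one subspace update for a 1-D subspace (illustrative sketch).

    u      -- current unit-norm basis vector (list of floats)
    omega  -- indices of the observed coordinates
    v_obs  -- observed values, aligned with omega
    eta    -- step size (hypothetical name)
    """
    # Least-squares weight computed from the observed coordinates only.
    num = sum(u[i] * x for i, x in zip(omega, v_obs))
    den = sum(u[i] * u[i] for i in omega)
    if den == 0.0:
        return list(u)
    w = num / den

    # Residual on the observed set, zero-filled elsewhere.
    r = [0.0] * len(u)
    for i, x in zip(omega, v_obs):
        r[i] = x - u[i] * w
    r_norm = math.sqrt(sum(x * x for x in r))
    p_norm = abs(w)  # norm of the prediction u * w, since u has unit norm
    if r_norm == 0.0 or p_norm == 0.0:
        return list(u)  # observation already explained; nothing to update

    # Rotate u toward the residual direction along a geodesic.
    sigma = r_norm * p_norm
    c, s = math.cos(sigma * eta), math.sin(sigma * eta)
    sign = 1.0 if w > 0 else -1.0
    return [c * ui + s * sign * ri / r_norm for ui, ri in zip(u, r)]

# Fully observed example: the basis [1, 0] is rotated onto the direction of [1, 1].
u_new = grouse_step_rank1([1.0, 0.0], [0, 1], [1.0, 1.0], math.pi / 4)
```

Because the residual is orthogonal to `u` by construction, the updated vector keeps unit norm, so no re-orthonormalization is needed in this one-dimensional case.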

artificial intelligence, machine learning, subspace, (14 more...)

arXiv.org Machine Learning

1006.4046

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Transfer Learning by Reusing Structured Knowledge

Yang, Qiang (Hong Kong University of Science and Technology) | Zheng, Vincent W. (Hong Kong University of Science and Technology) | Li, Bin (Institute TELECOM SudParis) | Zhuo, Hankz Hankui (Sun Yat-sen University)

AI Magazine, Jul-9-2011

Transfer learning aims to solve new learning problems by extracting and making use of the common knowledge found in related domains. A key element of transfer learning is to identify structured knowledge to enable the knowledge transfer. Structured knowledge comes in different forms, depending on the nature of the learning problem and characteristics of the domains. In this article, we describe three of our recent works on transfer learning, ordered by increasing sophistication of the structured knowledge being transferred. We show that optimization methods and techniques inspired by the concerns of data reuse can be applied to extract and transfer deep structural knowledge between a variety of source and target problems. In our examples, this knowledge spans explicit data labels, model parameters, relations between data clusters, and relational action descriptions.

action model, knowledge, proceedings, (16 more...)

AI Magazine

Country:

North America > United States > New York (0.05)
North America > United States > California > San Mateo County > Menlo Park (0.05)
Asia > China > Hong Kong (0.04)
(9 more...)

Genre: Research Report (0.68)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Loss-sensitive Training of Probabilistic Conditional Random Fields

Volkovs, Maksims N., Larochelle, Hugo, Zemel, Richard S.

arXiv.org Machine Learning, Jul-9-2011

We consider the problem of training probabilistic conditional random fields (CRFs) in the context of a task where performance is measured using a specific loss function. While maximum likelihood is the most common approach to training CRFs, it ignores the inherent structure of the task's loss function. We describe alternatives to maximum likelihood which take that loss into account. These include a novel adaptation of a loss upper bound from the structured SVMs literature to the CRF context, as well as a new loss-inspired KL divergence objective which relies on the probabilistic nature of CRFs. These loss-sensitive objectives are compared to maximum likelihood using ranking as a benchmark task. This comparison confirms the importance of incorporating loss information in the probabilistic training of CRFs, with the loss-inspired KL outperforming all other objectives.

artificial intelligence, machine learning, objective, (18 more...)

arXiv.org Machine Learning

1107.1805

Country:

North America > United States (0.93)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.77)

A Preliminary Evaluation of Machine Learning in Algorithm Selection for Search Problems

Kotthoff, Lars (University of St. Andrews) | Gent, Ian P. (University of St. Andrews) | Miguel, Ian (University of St. Andrews)

AAAI Conferences, Jul-5-2011

Machine learning is an established method of selecting algorithms to solve hard search problems. Despite this, to date no systematic comparison and evaluation of the different techniques has been performed and the performance of existing systems has not been critically compared to other approaches. We compare machine learning techniques for algorithm selection on real-world data sets of hard search problems. In addition to well-established approaches, for the first time we also apply statistical relational learning to this problem. We demonstrate that most machine learning techniques and existing systems perform less well than one might expect. To guide practitioners, we close by giving clear recommendations as to which machine learning techniques are likely to perform well based on our experiments.

algorithm, majority predictor, portfolio, (15 more...)

AAAI Conferences

Fourth Annual Symposium on Combinatorial Search

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Proximal Methods for Hierarchical Sparse Coding

Jenatton, Rodolphe, Mairal, Julien, Obozinski, Guillaume, Bach, Francis

arXiv.org Machine Learning, Jul-5-2011

Sparse coding consists in representing signals as sparse linear combinations of atoms selected from a dictionary. We consider an extension of this framework where the atoms are further assumed to be embedded in a tree. This is achieved using a recently introduced tree-structured sparse regularization norm, which has proven useful in several applications. This norm leads to regularized problems that are difficult to optimize, and we propose in this paper efficient algorithms for solving them. More precisely, we show that the proximal operator associated with this norm is computable exactly via a dual approach that can be viewed as the composition of elementary proximal operators. Our procedure has a complexity linear, or close to linear, in the number of atoms, and allows the use of accelerated gradient techniques to solve the tree-structured sparse approximation problem at the same computational cost as traditional ones using the L1-norm. Our method is efficient and scales gracefully to millions of variables, which we illustrate in two types of applications: first, we consider fixed hierarchical dictionaries of wavelets to denoise natural images. Then, we apply our optimization tools in the context of dictionary learning, where learned dictionary elements naturally organize in a prespecified arborescent structure, leading to a better performance in reconstruction of natural image patches. When applied to text documents, our method learns hierarchies of topics, thus providing a competitive alternative to probabilistic topic models.
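The composition of elementary proximal operators mentioned in the abstract can be sketched as follows, assuming an L2 group norm on each tree node and assuming the caller lists the groups from leaves to root (each group appears before any group that contains it). The names `group_soft_threshold`, `tree_prox`, `weights`, and `lam` are hypothetical and chosen for this example.

```python
import math

def group_soft_threshold(w, group, threshold):
    """Shrink the coordinates of w indexed by group toward zero, in place."""
    norm = math.sqrt(sum(w[i] * w[i] for i in group))
    # The whole group is zeroed when its norm falls below the threshold.
    scale = 0.0 if norm <= threshold else 1.0 - threshold / norm
    for i in group:
        w[i] *= scale

def tree_prox(v, groups, weights, lam):
    """Apply one group shrinkage per tree node, ordered leaves to root.

    groups  -- list of index lists; each group is a node plus its descendants
    weights -- per-group weights
    lam     -- regularization strength
    """
    w = list(v)
    for group, eta in zip(groups, weights):
        group_soft_threshold(w, group, lam * eta)
    return w

# Tree with root variable 0 and two leaf children 1 and 2.
groups = [[1], [2], [0, 1, 2]]
result = tree_prox([3.0, 4.0, 0.5], groups, [1.0, 1.0, 1.0], 1.0)
```

Each variable is touched once per ancestor group, which is why a single pass over the tree is enough in this sketch; the small leaf value 0.5 is zeroed at its own node and stays zero after the root shrinkage.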

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

1009.2139

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(2 more...)

Law of Connectivity in Machine Learning

Dundas, Jitesh

arXiv.org Artificial Intelligence, Jul-1-2011

We present in this paper our law that there is always a connection present between two entities, with a self-connection being present at least in each node. An entity is an object, physical or imaginary, that is connected by a path (or connection) and which is important for achieving the desired result of the scenario. In machine learning, we state that for any scenario, a subject entity is always, directly or indirectly, connected to and affected by single or multiple independent or dependent entities, and their impact on the subject entity is dependent on various factors falling into the categories such as the existenc

artificial intelligence, interaction, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1107.0194

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms

Derbeko, P., El-Yaniv, R., Meir, R.

arXiv.org Artificial Intelligence, Jun-30-2011

Inductive learning is based on inferring a general rule from a finite data set and using it to label new data. In transduction one attempts to solve the problem of using a labeled training set to label a set of unlabeled points, which are given to the learner prior to learning. Although transduction seems at the outset to be an easier task than induction, there have not been many provably useful algorithms for transduction. Moreover, the precise relation between induction and transduction has not yet been determined. The main theoretical developments related to transduction were presented by Vapnik more than twenty years ago. One of Vapnik's basic results is a rather tight error bound for transductive classification based on an exact computation of the hypergeometric tail. While tight, this bound is given implicitly via a computational routine. Our first contribution is a somewhat looser but explicit characterization of a slightly extended PAC-Bayesian version of Vapnik's transductive bound. This characterization is obtained using concentration inequalities for the tail of sums of random variables obtained by sampling without replacement. We then derive error bounds for compression schemes such as (transductive) support vector machines and for transduction algorithms based on clustering. The main observation used for deriving these new error bounds and algorithms is that the unlabeled test points, which in the transductive setting are known in advance, can be used in order to construct useful data dependent prior distributions over the hypothesis space.
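The hypergeometric tail that the abstract refers to arises because the training set is drawn without replacement from a fixed pool of points. The sketch below computes that tail exactly for a given setting; it illustrates the underlying quantity only and is not the bound from the paper. The function name `hypergeom_cdf` and its argument names are hypothetical.

```python
from math import comb

def hypergeom_cdf(k, N, K, n):
    """Probability of at most k marked items in a sample of size n.

    The sample is drawn without replacement from N items, K of which are
    marked (for example, points that a fixed hypothesis misclassifies).
    """
    total = comb(N, n)
    lowest = max(0, n - (N - K))   # fewest marked items the sample can contain
    highest = min(k, K, n)         # most marked items counted in the tail
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(lowest, highest + 1))
    return hits / total

# Pool of 10 points with 3 errors; chance that 5 sampled points show no error.
p_zero = hypergeom_cdf(0, 10, 3, 5)
```

Inverting this tail over the unknown number of errors `K` is what requires a computational routine, which is the implicitness the abstract contrasts with an explicit bound.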

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1417

1107.0046

Country:

North America > United States (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

The Rate of Convergence of AdaBoost

Mukherjee, Indraneel, Rudin, Cynthia, Schapire, Robert E.

arXiv.org Artificial Intelligence, Jun-29-2011

The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration $t$, the exponential loss of AdaBoost's computed parameter vector will be at most $\epsilon$ more than that of any parameter vector of $\ell_1$-norm bounded by $B$ in a number of rounds that is at most a polynomial in $B$ and $1/\epsilon$. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\epsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\epsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\epsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\epsilon)$ rounds are necessary to achieve within $\epsilon$ of the optimal exponential loss.
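To make the quantity being analyzed concrete, here is a minimal sketch of AdaBoost viewed as coordinate descent on the exponential loss, recording the loss after every round. It assumes a finite pool of weak hypotheses given as vectors of +1/-1 predictions and assumes no hypothesis has a weighted edge of exactly +1 or -1; the names `adaboost_exp_loss`, `H`, and `rounds` are hypothetical.

```python
import math

def adaboost_exp_loss(H, y, rounds):
    """Run AdaBoost over a fixed hypothesis pool and track the exponential loss.

    H      -- list of hypotheses, each a list of +1/-1 predictions per example
    y      -- list of +1/-1 labels
    rounds -- number of boosting rounds
    """
    m = len(y)
    lam = [0.0] * len(H)      # combination weights, one per hypothesis
    margins = [0.0] * m       # y_i * F(x_i) for the current combination F
    losses = []
    for _ in range(rounds):
        # Example weights are proportional to exp(-margin).
        raw = [math.exp(-mg) for mg in margins]
        total = sum(raw)
        D = [w / total for w in raw]
        # Edge of each hypothesis under the current weights.
        edges = [sum(D[i] * y[i] * h[i] for i in range(m)) for h in H]
        j = max(range(len(H)), key=lambda k: abs(edges[k]))
        edge = edges[j]
        # Step that exactly minimizes the exponential loss along coordinate j
        # (assumes |edge| < 1, otherwise the logarithm is undefined).
        alpha = 0.5 * math.log((1.0 + edge) / (1.0 - edge))
        lam[j] += alpha
        margins = [margins[i] + alpha * y[i] * H[j][i] for i in range(m)]
        losses.append(sum(math.exp(-mg) for mg in margins) / m)
    return lam, losses

# Three weak hypotheses, each wrong on exactly one of four examples.
y = [1, 1, -1, -1]
H = [[1, 1, 1, -1], [1, -1, -1, -1], [-1, 1, -1, -1]]
lam, losses = adaboost_exp_loss(H, y, 20)
```

In this toy pool the majority vote of the three hypotheses classifies every example correctly, so the exponential loss can be driven toward zero only by letting the weights grow without bound. That is the situation the abstract has in mind when it says the minimizers need not be finite.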

adaboost, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1106.6024

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

A Dirty Model for Multiple Sparse Regression

Jalali, Ali, Ravikumar, Pradeep, Sanghavi, Sujay

arXiv.org Machine Learning, Jun-28-2011

Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of $\ell_1/\ell_q$ norm block-regularizations with $q>1$ for such problems; however, these could actually perform worse in sample complexity -- vis-a-vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. The idea is very simple: we decompose the parameters into two components and regularize these differently. We show, both theoretically and empirically, that our method strictly and noticeably outperforms both $\ell_1$ and $\ell_1/\ell_q$ methods, over the entire range of possible overlaps (except at boundary cases, where we match the best method). We also provide theoretical guarantees that the method performs well under high-dimensional scaling.
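The decomposition idea can be illustrated with a small penalty calculation. The abstract does not say which regularizers are applied to the two components, so the sketch below assumes a row-wise block penalty (sum over features of the largest absolute value across tasks) on a shared component `B` and an entry-wise L1 penalty on a task-specific component `S`. All function names, the toy matrix, and the weights `lam_b` and `lam_s` are hypothetical.

```python
def l1_norm(M):
    """Sum of absolute values of all entries."""
    return sum(abs(x) for row in M for x in row)

def l1_linf_norm(M):
    """Sum over rows (features) of the largest absolute entry across tasks."""
    return sum(max(abs(x) for x in row) for row in M)

def dirty_penalty(B, S, lam_b, lam_s):
    """Penalty for parameters written as B + S, each part regularized differently."""
    return lam_b * l1_linf_norm(B) + lam_s * l1_norm(S)

# Two features (rows), three tasks (columns): row 0 is shared by all tasks,
# row 1 is used by a single task.
theta = [[1.0, 1.0, 1.0], [0.0, 2.0, 0.0]]
zeros = [[0.0] * 3 for _ in range(2)]
B_split = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
S_split = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0]]

cost_block_only = dirty_penalty(theta, zeros, 2.0, 1.0)   # everything in B
cost_l1_only = dirty_penalty(zeros, theta, 2.0, 1.0)      # everything in S
cost_split = dirty_penalty(B_split, S_split, 2.0, 1.0)    # shared row in B, rest in S
```

With these toy weights the split representation is cheaper than either pure choice, which is the intuition behind not paying a penalty when sharing is only partial.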

artificial intelligence, machine learning, probability, (17 more...)

arXiv.org Machine Learning

1106.5826

Country: North America > United States > Texas (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Proceedings of the 2011 New York Workshop on Computer, Earth and Space Science

Way, Michael J., Naud, Catherine

arXiv.org Machine Learning, Jun-27-2011

The purpose of the New York Workshop on Computer, Earth and Space Sciences is to bring together the New York area's finest Astronomers, Statisticians, Computer Scientists, Space and Earth Scientists to explore potential synergies between their respective fields. The 2011 edition (CESS2011) was a great success, and we would like to thank all of the presenters and participants for attending. This year was also special as it included authors from the upcoming book titled "Advances in Machine Learning and Data Mining for Astronomy". Over two days, the latest advanced techniques used to analyze the vast amounts of information now available for the understanding of our universe and our planet were presented. These proceedings attempt to provide a small window into what the current state of research is in this vast interdisciplinary field and we'd like to thank the speakers who spent the time to contribute to this volume.

new york, upstream oil & gas, us government, (24 more...)

arXiv.org Machine Learning

1104.158

Country:

North America > United States > California (0.28)
North America > United States > New York > New York County > New York City (0.15)
North America > Mexico (0.14)
(8 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Space Agency (0.69)
Health & Medicine (0.67)
Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
(3 more...)
