AITopics

doi: 10.1214/09-AOS776

0808.0711

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

McDonald, Daniel J., Shalizi, Cosma Rohilla, Schervish, Mark

Estimating $\beta$-mixing coefficients

arXiv.org Machine LearningMar-4-2011

The literature on statistical learning for time series assumes the asymptotic independence or ``mixing' of the data-generating process. These mixing assumptions are never tested, nor are there methods for estimating mixing rates from data. We give an estimator for the $\beta$-mixing rate based on a single stationary sample path and show it is $L_1$-risk consistent.

coefficient, estimator, sequence, (16 more...)

1103.0941

Country:

North America > United States > New York (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Sprechmann, Pablo, Ramírez, Ignacio, Sapiro, Guillermo, Eldar, Yonina

C-HiLasso: A Collaborative Hierarchical Sparse Modeling Framework

arXiv.org Machine LearningMar-4-2011

Sparse modeling is a powerful framework for data analysis and processing. Traditionally, encoding in this framework is performed by solving an L1-regularized linear regression problem, commonly referred to as Lasso or Basis Pursuit. In this work we combine the sparsity-inducing property of the Lasso model at the individual feature level, with the block-sparsity property of the Group Lasso model, where sparse groups of features are jointly encoded, obtaining a sparsity pattern hierarchically structured. This results in the Hierarchical Lasso (HiLasso), which shows important practical modeling advantages. We then extend this approach to the collaborative case, where a set of simultaneously coded signals share the same sparsity pattern at the higher (group) level, but not necessarily at the lower (inside the group) level, obtaining the collaborative HiLasso model (C-HiLasso). Such signals then share the same active groups, or classes, but not necessarily the same active set. This model is very well suited for applications such as source identification and separation. An efficient optimization procedure, which guarantees convergence to the global optimum, is developed for these new models. The underlying presentation of the new framework and optimization approach is complemented with experimental examples and theoretical results regarding recovery guarantees for the proposed models.

artificial intelligence, c-hilasso, machine learning, (19 more...)

doi: 10.1109/TSP.2011.2157912

1006.1346

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Kloft, Marius, Blanchard, Gilles

The Local Rademacher Complexity of Lp-Norm Multiple Kernel Learning

arXiv.org Machine LearningMar-3-2011

We derive an upper bound on the local Rademacher complexity of $\ell_p$-norm multiple kernel learning, which yields a tighter excess risk bound than global approaches. Previous local approaches aimed at analyzed the case $p=1$ only while our analysis covers all cases $1\leq p\leq\infty$, assuming the different feature mappings corresponding to the different kernels to be uncorrelated. We also show a lower bound that shows that the bound is tight, and derive consequences regarding excess loss, namely fast convergence rates of the order $O(n^{-\frac{\alpha}{1+\alpha}})$, where $\alpha$ is the minimum eigenvalue decay rate of the individual kernels.

complexity, kernel, rademacher complexity, (12 more...)

1103.079

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Germany > Brandenburg > Potsdam (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Zhou, Tianyi, Tao, Dacheng

Multi-label Learning via Structured Decomposition and Group Sparsity

In multi-label learning, each sample is associated with several labels. Existing works indicate that exploring correlations between labels improve the prediction performance. However, embedding the label correlations into the training process significantly increases the problem size. Moreover, the mapping of the label structure in the feature space is not clear. In this paper, we propose a novel multi-label learning method "Structured Decomposition + Group Sparsity (SDGS)". In SDGS, we learn a feature subspace for each label from the structured decomposition of the training data, and predict the labels of a new sample from its group sparse representation on the multi-subspace obtained from the structured decomposition. In particular, in the training stage, we decompose the data matrix $X\in R^{n\times p}$ as $X=\sum_{i=1}^kL^i+S$, wherein the rows of $L^i$ associated with samples that belong to label $i$ are nonzero and consist a low-rank matrix, while the other rows are all-zeros, the residual $S$ is a sparse matrix. The row space of $L_i$ is the feature subspace corresponding to label $i$. This decomposition can be efficiently obtained via randomized optimization. In the prediction stage, we estimate the group sparse representation of a new sample on the multi-subspace via group \emph{lasso}. The nonzero representation coefficients tend to concentrate on the subspaces of labels that the sample belongs to, and thus an effective prediction can be obtained. We evaluate SDGS on several real datasets and compare it with popular methods. Results verify the effectiveness and efficiency of SDGS.

classification, decomposition, sdg, (11 more...)

1103.0102

Country:

Oceania > Australia (0.04)
North America > United States > Arizona (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Nott, David J., Tran, Minh-Ngoc, Leng, Chenlei

Variational approximation for heteroscedastic linear models and matching pursuit algorithms

Modern statistical applications involving large data sets have focused attention on statistical methodologies which are both efficient computationally and able to deal with the screening of large numbers of different candidate models. Here we consider computationally efficient variational Bayes approaches to inference in high-dimensional heteroscedastic linear regression, where both the mean and variance are described in terms of linear functions of the predictors and where the number of predictors can be larger than the sample size. We derive a closed form variational lower bound on the log marginal likelihood useful for model selection, and propose a novel fast greedy search algorithm on the model space which makes use of one step optimization updates to the variational lower bound in the current model for screening large numbers of candidate predictor variables for inclusion/exclusion in a computationally thrifty way. We show that the model search strategy we suggest is related to widely used orthogonal matching pursuit algorithms for model search but yields a framework for potentially extending these algorithms to more complex models. The methodology is applied in simulations and in two real examples involving prediction for food constituents using NIR technology and prediction of disease progression in diabetes.

artificial intelligence, bayesian inference, machine learning, (20 more...)

doi: 10.1007/s11222-011-9243-2

1011.4832

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Tomioka, Ryota, Hayashi, Kohei, Kashima, Hisashi

Estimation of low-rank tensors via convex optimization

In this paper, we propose three approaches for the estimation of the Tucker decomposition of multi-way arrays (tensors) from partial observations. All approaches are formulated as convex minimization problems. Therefore, the minimum is guaranteed to be unique. The proposed approaches can automatically estimate the number of factors (rank) through the optimization. Thus, there is no need to specify the rank beforehand. The key technique we employ is the trace norm regularization, which is a popular approach for the estimation of low-rank matrices. In addition, we propose a simple heuristic to improve the interpretability of the obtained factorization. The advantages and disadvantages of three proposed approaches are demonstrated through numerical experiments on both synthetic and real world datasets. We show that the proposed convex optimization based approaches are more accurate in predictive performance, faster, and more reliable in recovering a known multilinear structure than conventional approaches.

data mining, machine learning, tensor, (16 more...)

1010.0789

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Stochastic Stepwise Ensembles for Variable Selection

Xin, Lu, Zhu, Mu

The ensemble approach for statistical modelling was first made popular by such algorithms as boosting (Freund and Schapire 1996; Friedman et al. 2000), bagging (Breiman 1996), random forest (Breiman 2001), and the gradient boosting machine (Friedman 2001). They are powerful algorithms for solving prediction problems. This article is concerned with using the ensemble approach for a different problem, variable selection. We shall use the terms "prediction ensemble" and "variableselection ensemble" to differentiate ensembles used for these different purposes.

artificial intelligence, machine learning, selection, (18 more...)

1003.593

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial IntelligenceMar-1-2011

A hybrid model for bankruptcy prediction using genetic algorithm, fuzzy c-means and mars

Martin, A., Gayathri, V., Saranya, G., Gayathri, P., Venkatesan, Prasanna

Bankruptcy prediction is very important for all the organization since it affects the economy and rise many social problems with high costs. There are large number of techniques have been developed to predict the bankruptcy, which helps the decision makers such as investors and financial analysts. One of the bankruptcy prediction models is the hybrid model using Fuzzy C-means clustering and MARS, which uses static ratios taken from the bank financial statements for prediction, which has its own theoretical advantages. The performance of existing bankruptcy model can be improved by selecting the best features dynamically depend on the nature of the firm. This dynamic selection can be accomplished by Genetic Algorithm and it improves the performance of prediction model..

evolutionary algorithm, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

1103.211

Country:

Asia > India (0.15)
North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Rigollet, Philippe, Tong, Xin

Neyman-Pearson classification, convexity and stochastic constraints

arXiv.org Machine LearningFeb-28-2011

Motivated by problems of anomaly detection, this paper implements the Neyman-Pearson paradigm to deal with asymmetric errors in binary classification with a convex loss. Given a finite collection of classifiers, we combine them and obtain a new classifier that satisfies simultaneously the two following properties with high probability: (i) its probability of type I error is below a pre-specified level and (ii), it has probability of type II error close to the minimum possible. The proposed classifier is obtained by solving an optimization problem with an empirical objective and an empirical constraint. New techniques to handle such problems are developed and have consequences on chance constrained programming.

classifier, data mining, machine learning, (17 more...)

1102.575

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
(2 more...)