AITopics

Many modern big data applications feature large scale in both numbers of responses and predictors. Better statistical efficiency and scientific insights can be enabled by understanding the large-scale response-predictor association network structures via layers of sparse latent factors ranked by importance. Yet sparsity and orthogonality have been two largely incompatible goals. To accommodate both features, in this paper we suggest the method of sparse orthogonal factor regression (SOFAR) via the sparse singular value decomposition with orthogonality constrained optimization to learn the underlying association networks, with broad applications to both unsupervised and supervised learning tasks such as biclustering with sparse singular value decomposition, sparse principal component analysis, sparse factor analysis, and spare vector autoregression analysis. Exploiting the framework of convexity-assisted nonconvex optimization, we derive nonasymptotic error bounds for the suggested procedure characterizing the theoretical advantages. The statistical guarantees are powered by an efficient SOFAR algorithm with convergence property. Both computational and theoretical advantages of our procedure are demonstrated with several simulation and real data examples.

artificial intelligence, data mining, machine learning, (20 more...)

1704.08349

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Pruning variable selection ensembles

Zhang, Chunxia, Wu, Yilei, Zhu, Mu

In the context of variable selection, ensemble learning has gained increasing interest due to its great potential to improve selection accuracy and to reduce false discovery rate. A novel ordering-based selective ensemble learning strategy is designed in this paper to obtain smaller but more accurate ensembles. In particular, a greedy sorting strategy is proposed to rearrange the order by which the members are included into the integration process. Through stopping the fusion process early, a smaller subensemble with higher selection accuracy can be obtained. More importantly, the sequential inclusion criterion reveals the fundamental strength-diversity trade-off among ensemble members. By taking stability selection (abbreviated as StabSel) as an example, some experiments are conducted with both simulated and real-world data to examine the performance of the novel algorithm. Experimental results demonstrate that pruned StabSel generally achieves higher selection accuracy and lower false discovery rates than StabSel and several other benchmark methods.

artificial intelligence, ensemble, machine learning, (16 more...)

1704.08265

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)

Exploiting random projections and sparsity with random forests and gradient boosting methods -- Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity

Joly, Arnaud

artificial intelligence, computational statistic & data analysis, machine learning, (20 more...)

Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system, from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested $if-then-else$ questions, the testing nodes, leading to a set of predictions, the leaf nodes. Several of such trees are often combined together for state-of-the-art performance: random forest ensembles average the predictions of randomized decision trees trained independently in parallel, while tree boosting ensembles train decision trees sequentially to refine the predictions made by the previous ones. The emergence of new applications requires scalable supervised learning algorithms in terms of computational power and memory space with respect to the number of inputs, outputs, and observations without sacrificing accuracy. In this thesis, we identify three main areas where decision tree methods could be improved for which we provide and evaluate original algorithmic solutions: (i) learning over high dimensional output spaces, (ii) learning with large sample datasets and stringent memory constraints at prediction time and (iii) learning over high dimensional sparse input spaces.

1704.08067

Country:

Europe (1.00)
North America > United States > California (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.92)
Education (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
(3 more...)

Javanmard, Adel, Lee, Jason D.

A Flexible Framework for Hypothesis Testing in High-dimensions

Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p> n$) and assume that the high-dimensional parameters vector is $s_0$ sparse. We develop a general and flexible $\ell_\infty$ projection statistic for hypothesis testing in this model. Our framework encompasses testing whether the parameter lies in a convex cone, testing the signal strength, testing arbitrary functionals of the parameter, and testing adaptive hypothesis. We show that the proposed procedure controls the type I error under the standard assumption of $s_0 (\log p)/\sqrt{n}\to 0$, and also analyze the power of the procedure. Our numerical experiments confirms our theoretical findings and demonstrate that we control false positive rate (type I error) near the nominal level, and have high power.

artificial intelligence, confidence interval, machine learning, (16 more...)

1704.07971

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

International Business TimesApr-25-2017, 17:32:01 GMT

WWE Payback 2017: Predictions, Match Card For First 'Monday Night Raw' PPV After WrestleMania

WWE's first pay-per-view since WrestleMania 33 is set for Sunday night in San Jose, California. Payback 2017 will feature mostly members of the "Monday Night Raw," roster, though a couple of "SmackDown Live" wrestlers are on the card. WWE Champion Randy Orton is still with the blue brand after the Superstar Shake-up, but he's concluding his feud with Bray Wyatt at Payback. Kevin Owens has appeared on "SmackDown Live" twice since WrestleMania 33, and he'll put his United States Championship on the line against Chris Jericho. Living up to its name, Payback features several rematches.

artificial intelligence, machine learning, smackdown live, (14 more...)

International Business Times

Country:

North America > United States > California > Santa Clara County > San Jose (0.25)
North America > United States > New York (0.05)

Industry: Leisure & Entertainment > Sports > Martial Arts (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.45)

arXiv.org Machine LearningApr-25-2017

Fisher consistency for prior probability shift

Tasche, Dirk

We introduce Fisher consistency in the sense of unbiasedness as a desirable property for estimators of class prior probabilities. Lack of Fisher consistency could be used as a criterion to dismiss estimators that are unlikely to deliver precise estimates in test datasets under prior probability and more general dataset shift. The usefulness of this unbiasedness concept is demonstrated with three examples of classifiers used for quantification: Adjusted Classify & Count, EM-algorithm and CDE-Iterate. We find that Adjusted Classify & Count and EM-algorithm are Fisher consistent. A counter-example shows that CDE-Iterate is not Fisher consistent and, therefore, cannot be trusted to deliver reliable estimates of class probabilities.

artificial intelligence, dataset shift, machine learning, (17 more...)

1701.05512

Country:

North America > United States (0.28)
Europe > Austria (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Epidemiology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Pananjady, Ashwin, Wainwright, Martin J., Courtade, Thomas A.

Denoising Linear Models with Permuted Data

arXiv.org Machine LearningApr-24-2017

The multivariate linear regression model with shuffled data and additive Gaussian noise arises in various correspondence estimation and matching problems. Focusing on the denoising aspect of this problem, we provide a characterization the minimax error rate that is sharp up to logarithmic factors. We also analyze the performance of two versions of a computationally efficient estimator, and establish their consistency for a large range of input parameters. Finally, we provide an exact algorithm for the noiseless problem and demonstrate its performance on an image point-cloud matching task. Our analysis also extends to datasets with outliers.

estimator, inequality, matrix, (16 more...)

1704.07461

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Anirudh, Rushil, Thiagarajan, Jayaraman J.

Bootstrapping Graph Convolutional Neural Networks for Autism Spectrum Disorder Classification

arXiv.org Machine LearningApr-24-2017

Using predictive models to identify patterns that can act as biomarkers for different neuropathoglogical conditions is becoming highly prevalent. In this paper, we consider the problem of Autism Spectrum Disorder (ASD) classification. While non-invasive imaging measurements, such as the rest state fMRI, are typically used in this problem, it can be beneficial to incorporate a wide variety of non-imaging features, including personal and socio-cultural traits, into predictive modeling. We propose to employ a graph-based approach for combining both types of feature, where a contextual graph encodes the traits of a larger population while the brain activity patterns are defined as a multivariate function at the nodes of the graph. Since the underlying graph dictates the performance of the resulting predictive models, we explore the use of different graph construction strategies. Furthermore, we develop a bootstrapped version of graph convolutional neural networks (G-CNNs) that utilizes an ensemble of weakly trained G-CNNs to avoid overfitting and also reduce the sensitivity of the models on the choice of graph construction. We demonstrate its effectiveness on the Autism Brain Imaging Data Exchange (ABIDE) dataset and show that the proposed approach outperforms state-of-the-art approaches for this problem.

artificial intelligence, graph, machine learning, (16 more...)

1704.07487

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)

Cai, Weixin, Hejazi, Nima S., Hubbard, Alan E.

Data-adaptive statistics for multiple hypothesis testing in high-dimensional settings

arXiv.org Machine LearningApr-23-2017

Current statistical inference problems in areas like astronomy, genomics, and marketing routinely involve the simultaneous testing of thousands -- even millions -- of null hypotheses. For high-dimensional multivariate distributions, these hypotheses may concern a wide range of parameters, with complex and unknown dependence structures among variables. In analyzing such hypothesis testing procedures, gains in efficiency and power can be achieved by performing variable reduction on the set of hypotheses prior to testing. We present in this paper an approach using data-adaptive multiple testing that serves exactly this purpose. This approach applies data mining techniques to screen the full set of covariates on equally sized partitions of the whole sample via cross-validation. This generalized screening procedure is used to create average ranks for covariates, which are then used to generate a reduced (sub)set of hypotheses, from which we compute test statistics that are subsequently subjected to standard multiple testing corrections. The principal advantage of this methodology lies in its providing valid statistical inference without the \textit{a priori} specifying which hypotheses will be tested. Here, we present the theoretical details of this approach, confirm its validity via a simulation study, and exemplify its use by applying it to the analysis of data on microRNA differential expression.

st 0, statistics, test statistics, (16 more...)

1704.07008

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre:

Research Report > Experimental Study (0.74)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Pawlowski, Nick, Jaques, Miguel, Glocker, Ben

Efficient variational Bayesian neural network ensembles for outlier detection

arXiv.org Machine LearningApr-22-2017

In this work we perform outlier detection using ensembles of neural networks obtained by variational approximation of the posterior in a Bayesian neural network setting. The variational parameters are obtained by sampling from the true posterior by gradient descent. We show our outlier detection results are comparable to those obtained using other efficient ensembling methods.

artificial intelligence, data mining, machine learning, (18 more...)

1703.06749

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)