AITopics

1505.01866

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.85)

arXiv.org Machine LearningMay-6-2015

Re-scale boosting for regression and classification

Lin, Shaobo, Wang, Yao, Xu, Lin

Boosting is a learning scheme that combines weak prediction rules to produce a strong composite estimator, with the underlying intuition that one can obtain accurate prediction rules by combining "rough" ones. Although boosting is proved to be consistent and overfitting-resistant, its numerical convergence rate is relatively slow. The aim of this paper is to develop a new boosting strategy, called the re-scale boosting (RBoosting), to accelerate the numerical convergence rate and, consequently, improve the learning performance of boosting. Our studies show that RBoosting possesses the almost optimal numerical convergence rate in the sense that, up to a logarithmic factor, it can reach the minimax nonlinear approximation rate. We then use RBoosting to tackle both the classification and regression problems, and deduce a tight generalization error estimate. The theoretical and experimental results show that RBoosting outperforms boosting in terms of generalization.

artificial intelligence, machine learning, rboosting, (18 more...)

1505.01371

Country: Asia > China (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Gay, Dominique, Guigourès, Romain, Boullé, Marc, Clérot, Fabrice

Cats & Co: Categorical Time Series Coclustering

arXiv.org Machine LearningMay-6-2015

We suggest a novel method of clustering and exploratory analysis of temporal event sequences data (also known as categorical time series) based on three-dimensional data grid models. A data set of temporal event sequences can be represented as a data set of three-dimensional points, each point is defined by three variables: a sequence identifier, a time value and an event value. Instantiating data grid models to the 3D-points turns the problem into 3D-coclustering. The sequences are partitioned into clusters, the time variable is discretized into intervals and the events are partitioned into clusters. The cross-product of the univariate partitions forms a multivariate partition of the representation space, i.e., a grid of cells and it also represents a nonparametric estimator of the joint distribution of the sequences, time and events dimensions. Thus, the sequences are grouped together because they have similar joint distribution of time and events, i.e., similar distribution of events along the time dimension. The best data grid is computed using a parameter-free Bayesian model selection approach. We also suggest several criteria for exploiting the resulting grid through agglomerative hierarchies, for interpreting the clusters of sequences and characterizing their components through insightful visualizations. Extensive experiments on both synthetic and real-world data sets demonstrate that data grid models are efficient, effective and discover meaningful underlying patterns of categorical time series data.

artificial intelligence, data mining, machine learning, (19 more...)

1505.013

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (0.84)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

arXiv.org Machine LearningMay-6-2015

Semi-Orthogonal Multilinear PCA with Relaxed Start

Shi, Qiquan, Lu, Haiping

Principal component analysis (PCA) is an unsupervised method for learning low-dimensional features with orthogonal projections. Multilinear PCA methods extend PCA to deal with multidimensional data (tensors) directly via tensor-to-tensor projection or tensor-to-vector projection (TVP). However, under the TVP setting, it is difficult to develop an effective multilinear PCA method with the orthogonality constraint. This paper tackles this problem by proposing a novel Semi-Orthogonal Multilinear PCA (SO-MPCA) approach. SO-MPCA learns low-dimensional features directly from tensors via TVP by imposing the orthogonality constraint in only one mode. This formulation results in more captured variance and more learned features than full orthogonality. For better generalization, we further introduce a relaxed start (RS) strategy to get SO-MPCA-RS by fixing the starting projection vectors, which increases the bias and reduces the variance of the learning model. Experiments on both face (2D) and gait (3D) data demonstrate that SO-MPCA-RS outperforms other competing algorithms on the whole, and the relaxed start strategy is also effective for other TVP-based PCA methods.

data mining, machine learning, variance, (17 more...)

1504.08142

Country: Asia > China (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Wei, Changshuai, Schaid, Daniel J., Lu, Qing

Trees Assembling Mann Whitney Approach for Detecting Genome-wide Joint Association among Low Marginal Effect loci

Common complex diseases are likely influenced by the interplay of hundreds, or even thousands, of genetic variants. Converging evidence shows that genetic variants with low marginal effects (LME) play an important role in disease development. Despite their potential significance, discovering LME genetic variants and assessing their joint association on high dimensional data (e.g., genome wide association studies) remain a great challenge. To facilitate joint association analysis among a large ensemble of LME genetic variants, we proposed a computationally efficient and powerful approach, which we call Trees Assembling Mann whitney (TAMW). Through simulation studies and an empirical data application, we found that TAMW outperformed multifactor dimensionality reduction (MDR) and the likelihood ratio based Mann whitney approach (LRMW) when the underlying complex disease involves multiple LME loci and their interactions. For instance, in a simulation with 20 interacting LME loci, TAMW attained a higher power (power=0.931) than both MDR (power=0.599) and LRMW (power=0.704). In an empirical study of 29 known Crohn's disease (CD) loci, TAMW also identified a stronger joint association with CD than those detected by MDR and LRMW. Finally, we applied TAMW to Wellcome Trust CD GWAS to conduct a genome wide analysis. The analysis of 459K single nucleotide polymorphisms was completed in 40 hours using parallel computing, and revealed a joint association predisposing to CD (p-value=2.763e-19). Further analysis of the newly discovered association suggested that 13 genes, such as ATG16L1 and LACC1, may play an important role in CD pathophysiological and etiological processes.

artificial intelligence, interaction, machine learning, (18 more...)

doi: 10.1002/gepi.21693

1505.01206

Country: North America > United States > Michigan > Ingham County (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.94)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Achieving a Hyperlocal Housing Price Index: Overcoming Data Sparsity by Bayesian Dynamical Modeling of Multiple Data Streams

Ren, You, Fox, Emily B., Bruce, Andrew

Understanding how housing values evolve over time is important to policy makers, consumers and real estate professionals. Existing methods for constructing housing indices are computed at a coarse spatial granularity, such as metropolitan regions, which can mask or distort price dynamics apparent in local markets, such as neighborhoods and census tracts. A challenge in moving to estimates at, for example, the census tract level is the sparsity of spatiotemporally localized house sales observations. Our work aims at addressing this challenge by leveraging observations from multiple census tracts discovered to have correlated valuation dynamics. Our proposed Bayesian nonparametric approach builds on the framework of latent factor models to enable a flexible, data-driven method for inferring the clustering of correlated census tracts. We explore methods for scalability and parallelizability of computations, yielding a housing valuation index at the level of census tract rather than zip code, and on a monthly basis rather than quarterly. Our analysis is provided on a large Seattle metropolitan housing dataset.

artificial intelligence, machine learning, modeling & simulation, (12 more...)

1505.01164

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Banking & Finance > Real Estate (1.00)
Government > Regional Government > North America Government > United States Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Modeling & Simulation (0.67)

Travis-Lumer, Yael, Goldberg, Yair

Support Vector Machines for Current Status Data

In this paper we aim to develop a general, model free, method for analyzing current status data using machine learning techniques. In particular, we propose a support vector machine (SVM) learning method for estimation of the failure time expectation for current status data. SVM was originally introduced by Vapnik in the 1990's and is firmly related to statistical learning theory (Vapnik, 1999). The choice of SVMs for current status data is motivated by the fact that SVMs can be implemented easily, have fast training speed, produce decision functions that have a strong generalization ability and can guarantee convergence to the optimal solution, under some weak assumptions (Shivaswamy et al., 2007). Current status data is a data format where the failure timeT is restricted to knowledge of whether or notT exceeds a random monitoring timeC . This data format is quite common and includes examples from various fields. Jewell and van der Laan (2004) mention a few examples including: studying the distribution of the age of a child at weaning given observation points; when conducting a partner study of HIV infection over a number of clinic visits; and when a tumor under investigation is occult and an animal is sacrificed at a certain time point in order to determine presence or absence of the tumor.

artificial intelligence, machine learning, tra vis-lumer and goldberg svm, (13 more...)

1505.00991

Country:

Asia (0.28)
North America > United States > New York (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Generalized Low Rank Models

Udell, Madeleine, Horn, Corinne, Zadeh, Reza, Boyd, Stephen

Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, $k$-means, $k$-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.

artificial intelligence, machine learning, optimization problem, (18 more...)

1410.0342

Country: North America > United States (1.00)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.45)

Industry:

Health & Medicine (1.00)
Education (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)

Tolpin, David, van de Meent, Jan Willem, Paige, Brooks, Wood, Frank

Output-Sensitive Adaptive Metropolis-Hastings for Probabilistic Programs

arXiv.org Artificial IntelligenceMay-5-2015

We introduce an adaptive output-sensitive Metropolis-Hastings algorithm for probabilistic models expressed as programs, Adaptive Lightweight Metropolis-Hastings (AdLMH). The algorithm extends Lightweight Metropolis-Hastings (LMH) by adjusting the probabilities of proposing random variables for modification to improve convergence of the program output. We show that AdLMH converges to the correct equilibrium distribution and compare convergence of AdLMH to that of LMH on several test problems to highlight different aspects of the adaptation scheme. We observe consistent improvement in convergence on the test problems.

artificial intelligence, machine learning, probabilistic program, (17 more...)

arXiv.org Artificial Intelligence

1501.05677

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Gaynanova, Irina, Booth, James, Wells, Martin T.

Penalized versus constrained generalized eigenvalue problems

arXiv.org Machine LearningMay-4-2015

We investigate the difference between using an $\ell_1$ penalty versus an $\ell_1$ constraint in generalized eigenvalue problems, such as principal component analysis and discriminant analysis. Our main finding is that an $\ell_1$ penalty may fail to provide very sparse solutions; a severe disadvantage for variable selection that can be remedied by using an $\ell_1$ constraint. Our claims are supported both by empirical evidence and theoretical analysis. Finally, we illustrate the advantages of an $\ell_1$ constraint in the context of discriminant analysis and principal component analysis.

artificial intelligence, machine learning, tibshirani, (16 more...)

1410.6131

Country: North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.75)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)