AITopics

1408.6218

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

arXiv.org Machine LearningAug-26-2014

LARSEN-ELM: Selective Ensemble of Extreme Learning Machines using LARS for Blended Data

Han, Bo, He, Bo, Nian, Rui, Ma, Mengmeng, Zhang, Shujing, Li, Minghui, Lendasse, Amaury

Extreme learning machine (ELM) as a neural network algorithm has shown its good performance, such as fast speed, simple structure etc, but also, weak robustness is an unavoidable defect in original ELM for blended data. We present a new machine learning framework called LARSEN-ELM for overcoming this problem. In our paper, we would like to show two key steps in LARSEN-ELM. In the first step, preprocessing, we select the input variables highly related to the output using least angle regression (LARS). In the second step, training, we employ Genetic Algorithm (GA) based selective ensemble and original ELM. In the experiments, we apply a sum of two sines and four datasets from UCI repository to verify the robustness of our approach. The experimental results show that compared with original ELM and other methods such as OP-ELM, GASEN-ELM and LSBoost, LARSEN-ELM significantly improve robustness performance while keeping a relatively high speed.

artificial intelligence, elm, machine learning, (19 more...)

doi: 10.1016/j.neucom.2014.01.069

1408.2003

Country:

Europe (0.46)
Asia > China (0.29)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Bowman, R. Sean, Heisterkamp, Douglas, Johnson, Jesse, O'Donnol, Danielle

An application of topological graph clustering to protein function prediction

arXiv.org Machine LearningAug-24-2014

We use a semisupervised learning algorithm based on a topological data analysis approach to assign functional categories to yeast proteins using similarity graphs. This new approach to analyzing biological networks yields results that are as good as or better than state of the art existing approaches.

artificial intelligence, bioinformatics, machine learning, (17 more...)

1408.5634

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.41)

arXiv.org Machine LearningAug-22-2014

Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time

Wang, Zhaoran, Lu, Huanran, Liu, Han

Sparse principal component analysis (PCA) involves nonconvex optimization for which the global solution is hard to obtain. To address this issue, one popular approach is convex relaxation. However, such an approach may produce suboptimal estimators due to the relaxation effect. To optimally estimate sparse principal subspaces, we propose a two-stage computational framework named "tighten after relax": Within the 'relax' stage, we approximately solve a convex relaxation of sparse PCA with early stopping to obtain a desired initial estimator; For the 'tighten' stage, we propose a novel algorithm called sparse orthogonal iteration pursuit (SOAP), which iteratively refines the initial estimator by directly solving the underlying nonconvex problem. A key concept of this two-stage framework is the basin of attraction. It represents a local region within which the `tighten' stage has desired computational and statistical guarantees. We prove that, the initial estimator obtained from the 'relax' stage falls into such a region, and hence SOAP geometrically converges to a principal subspace estimator which is minimax-optimal within a certain model class. Unlike most existing sparse PCA estimators, our approach applies to the non-spiked covariance models, and adapts to non-Gaussianity as well as dependent data settings. Moreover, through analyzing the computational complexity of the two stages, we illustrate an interesting phenomenon that larger sample size can reduce the total iteration complexity. Our framework motivates a general paradigm for solving many complex statistical problems which involve nonconvex optimization with provable guarantees.

artificial intelligence, machine learning, probability, (18 more...)

1408.5352

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry:

Semiconductors & Electronics (1.00)
Information Technology (1.00)
Health & Medicine (1.00)
Banking & Finance > Trading (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Duan, Leo L., Clancy, John P., Szczesniak, Rhonda D.

Joint Hierarchical Gaussian Process Model with Application to Forecast in Medical Monitoring

arXiv.org Machine LearningAug-22-2014

A novel extrapolation method is proposed for longitudinal forecasting. A hierarchical Gaussian process model is used to combine nonlinear population change and individual memory of the past to make prediction. The prediction error is minimized through the hierarchical design. The method is further extended to joint modeling of continuous measurements and survival events. The baseline hazard, covariate and joint effects are conveniently modeled in this hierarchical structure. The estimation and inference are implemented in fully Bayesian framework using the objective and shrinkage priors. In simulation studies, this model shows robustness in latent estimation, correlation detection and high accuracy in forecasting. The model is illustrated with medical monitoring data from cystic fibrosis (CF) patients. Estimation and forecasts are obtained in the measurement of lung function and records of acute respiratory events. Keyword: Extrapolation, Joint Model, Longitudinal Model, Hierarchical Gaussian Process, Cystic Fibrosis, Medical Monitoring

artificial intelligence, machine learning, modeling & simulation, (18 more...)

1408.466

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

A Case Study in Text Mining: Interpreting Twitter Data From World Cup Tweets

Godfrey, Daniel, Johns, Caley, Meyer, Carl, Race, Shaina, Sadek, Carol

Cluster analysis is a field of data analysis that extracts underlying patterns in data. One application of cluster analysis is in text-mining, the analysis of large collections of text to find similarities between documents. We used a collection of about 30,000 tweets extracted from Twitter just before the World Cup started. A common problem with real world text data is the presence of linguistic noise. In our case it would be extraneous tweets that are unrelated to dominant themes. To combat this problem, we created an algorithm that combined the DBSCAN algorithm and a consensus matrix. This way we are left with the tweets that are related to those dominant themes. We then used cluster analysis to find those topics that the tweets describe. We clustered the tweets using k-means, a commonly used clustering algorithm, and Non-Negative Matrix Factorization (NMF) and compared the results. The two algorithms gave similar results, but NMF proved to be faster and provided more easily interpreted results. We explored our results using two visualization tools, Gephi and Wordle.

artificial intelligence, machine learning, social media, (15 more...)

1408.5427

Country: North America > United States > North Carolina (0.46)

Genre: Research Report (0.70)

Industry:

Information Technology > Services (0.84)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Uniform Sampling for Matrix Approximation

Cohen, Michael B., Lee, Yin Tat, Musco, Cameron, Musco, Christopher, Peng, Richard, Sidford, Aaron

Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear regression directly, it is enough to compute a better approximation. This observation leads to simple iterative row sampling algorithms for matrix approximation that run in input-sparsity time and preserve row structure and sparsity at all intermediate steps. In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows.

artificial intelligence, leverage score, machine learning, (17 more...)

1408.5099

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Rudi, Alessandro, Canas, Guille D., Rosasco, Lorenzo

On the Sample Complexity of Subspace Learning

A large number of algorithms in machine learning, from principal component analysis (PCA), and its non-linear (kernel) extensions, to more recent spectral embedding and support estimation methods, rely on estimating a linear subspace from samples. In this paper we introduce a general formulation of this problem and derive novel learning error estimates. Our results rely on natural assumptions on the spectral properties of the covariance operator associated to the data distribu- tion, and hold for a wide class of metrics between subspaces. As special cases, we discuss sharp error estimates for the reconstruction properties of PCA and spectral support estimation. Key to our analysis is an operator theoretic approach that has broad applicability to spectral learning methods.

artificial intelligence, machine learning, subspace, (16 more...)

1408.5032

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Yun, Hyokun, Raman, Parameswaran, Vishwanathan, S. V. N.

Ranking via Robust Binary Classification and Parallel Parameter Estimation in Large-Scale Data

We propose RoBiRank, a ranking algorithm that is motivated by observing a close connection between evaluation metrics for learning to rank and loss functions for robust classification. It shows competitive performance on standard benchmark datasets against a number of other representative algorithms in the literature. We also discuss extensions of RoBiRank to large scale problems where explicit feature vectors and scores are not given. We show that RoBiRank can be efficiently parallelized across a large number of machines; for a task that requires 386, 133 49, 824, 519 pairwise interactions between items to be ranked, RoBi-Rank finds solutions that are of dramatically higher quality than that can be found by a state-of-the-art competitor algorithm, given the same amount of wall-clock time for computation.

algorithm, artificial intelligence, machine learning, (17 more...)

1402.2676

Country: North America > United States > Indiana > Tippecanoe County (0.46)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Barreto, A. M. S., Pineau, J., Precup, D.

Policy Iteration Based on Stochastic Factorization

Journal of Artificial Intelligence ResearchAug-20-2014

When a transition probability matrix is represented as the product of two stochastic matrices, one can swap the factors of the multiplication to obtain another transition matrix that retains some fundamental characteristics of the original. Since the derived matrix can be much smaller than its precursor, this property can be exploited to create a compact version of a Markov decision process (MDP), and hence to reduce the computational cost of dynamic programming. Building on this idea, this paper presents an approximate policy iteration algorithm called policy iteration based on stochastic factorization, or PISF for short. In terms of computational complexity, PISF replaces standard policy iteration's cubic dependence on the size of the MDP with a function that grows only linearly with the number of states in the model. The proposed algorithm also enjoys nice theoretical properties: it always terminates after a finite number of iterations and returns a decision policy whose performance only depends on the quality of the stochastic factorization. In particular, if the approximation error in the factorization is sufficiently small, PISF computes the optimal value function of the MDP. The paper also discusses practical ways of factoring an MDP and illustrates the usefulness of the proposed algorithm with an application involving a large-scale decision problem of real economical interest.

algorithm, factorization, mdp, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4301

AI Access Foundation

10897

Journal of Artificial Intelligence Research

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report (0.46)
Overview (0.46)

Industry: Energy > Power Industry (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)