AITopics | Boutsidis, Christos

Boutsidis, Christos

Optimal Sparse Linear Encoders and Sparse PCA

Magdon-Ismail, Malik, Boutsidis, Christos

Neural Information Processing Systems, Dec-31-2016

Principal components analysis (PCA) is the optimal linear encoder of data. Sparse linear encoders (e.g., sparse PCA) produce more interpretable features that can promote better generalization. (i) Given a level of sparsity, what is the best approximation to PCA? (ii) Are there efficient algorithms which can achieve this optimal combinatorial tradeoff? We answer both questions by providing the first polynomial-time algorithms to construct optimal sparse linear auto-encoders; additionally, we demonstrate the performance of our algorithms on real data.

algorithm, artificial intelligence, neural network, (18 more...)

Country: North America > United States > New York (0.14)

Industry: Government (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)

Optimal Sparse Linear Auto-Encoders and Sparse PCA

Magdon-Ismail, Malik, Boutsidis, Christos

arXiv.org Artificial Intelligence, Feb-23-2015

Principal components analysis (PCA) is the optimal linear auto-encoder of data, and it is often used to construct features. Enforcing sparsity on the principal components can promote better generalization, while improving the interpretability of the features. We study the problem of constructing optimal sparse linear auto-encoders. Two natural questions in such a setting are: i) Given a level of sparsity, what is the best approximation to PCA that can be achieved? ii) Are there low-order polynomial-time algorithms which can asymptotically achieve this optimal tradeoff between the sparsity and the approximation quality? In this work, we answer both questions by giving efficient low-order polynomial-time algorithms for constructing asymptotically optimal linear auto-encoders (in particular, sparse features with near-PCA reconstruction error) and demonstrate the performance of our algorithms on real data.
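The sparsity versus reconstruction-error tradeoff described in this abstract can be illustrated with a minimal NumPy sketch. It treats the top-k right singular vectors as the PCA encoder and compares it against a naive hard-thresholded encoder. The thresholding baseline, the function names, and the synthetic data are assumptions made for illustration only; this is not the algorithm from the paper.

```python
import numpy as np

def pca_encoder(X, k):
    """Return the d x k PCA encoder (top-k right singular vectors of X)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def reconstruction_error(X, H):
    """Frobenius error when the encoder H is paired with its best linear decoder."""
    Z = X @ H                      # encoded features, n x k
    G = np.linalg.pinv(Z) @ X      # least-squares decoder, k x d
    return np.linalg.norm(X - Z @ G, "fro")

def hard_threshold(H, r):
    """Naive baseline (hypothetical): keep the r largest-magnitude loadings per column."""
    Hs = np.zeros_like(H)
    for j in range(H.shape[1]):
        keep = np.argsort(np.abs(H[:, j]))[-r:]
        Hs[keep, j] = H[keep, j]
    return Hs

# Synthetic data, assumed purely for the demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20)) @ rng.normal(size=(20, 20))
X -= X.mean(axis=0)                # centre the columns

H = pca_encoder(X, k=3)
H_sparse = hard_threshold(H, r=5)  # at most 5 non-zero loadings per feature
err_pca = reconstruction_error(X, H)
err_sparse = reconstruction_error(X, H_sparse)
```

Because any rank-3 linear reconstruction is bounded below by the PCA error, `err_sparse` cannot fall below `err_pca`; the gap between the two is the price paid for sparsity under this particular baseline.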

algorithm, artificial intelligence, neural network, (18 more...)

1502.06626

Country: North America > United States > New York (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Provable Deterministic Leverage Score Sampling

Papailiopoulos, Dimitris, Kyrillidis, Anastasios, Boutsidis, Christos

arXiv.org Machine Learning, Jun-2-2014

We explain theoretically a curious empirical phenomenon: "Approximating a matrix by deterministically selecting a subset of its columns with the corresponding largest leverage scores results in a good low-rank matrix surrogate". To obtain provable guarantees, previous work requires randomized sampling of the columns with probabilities proportional to their leverage scores. In this work, we provide a novel theoretical analysis of deterministic leverage score sampling. We show that such deterministic sampling can be provably as accurate as its randomized counterparts, if the leverage scores follow a moderately steep power-law decay. We support this power-law assumption by providing empirical evidence that such decay laws are abundant in real-world data sets. We then demonstrate empirically the performance of deterministic leverage score sampling, which often matches or outperforms the state-of-the-art techniques.
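A minimal NumPy sketch of the selection rule quoted in this abstract: compute rank-k leverage scores from the top-k right singular vectors, keep the columns with the largest scores, and project the matrix onto their span. The function names, the fixed column budget `c`, and the random test matrix are assumptions for illustration; the paper's own stopping rule and guarantees are not reproduced here.

```python
import numpy as np

def deterministic_leverage_columns(A, k, c):
    """Pick the c columns of A with the largest rank-k leverage scores."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    # Leverage score of column i: squared norm of column i of the top-k rows of Vt.
    scores = np.sum(Vt[:k, :] ** 2, axis=0)
    idx = np.argsort(scores)[::-1][:c]   # deterministic: largest scores first
    return np.sort(idx), scores

def column_projection_error(A, idx):
    """Frobenius error of projecting A onto the span of the selected columns."""
    C = A[:, idx]
    return np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, "fro")

# Random matrix, assumed only for the demonstration.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 20))
idx, scores = deterministic_leverage_columns(A, k=5, c=8)
err = column_projection_error(A, idx)
```

The rank-k leverage scores always sum to k, so a steep decay means a few columns carry most of that mass, which is the regime the abstract identifies as favourable for deterministic selection.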

algorithm, artificial intelligence, health & medicine, (16 more...)

1404.153

Country: North America > United States (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Communications (0.68)
Information Technology > Data Science (0.68)

Random Projections for Linear Support Vector Machines

Paul, Saurabh, Boutsidis, Christos, Magdon-Ismail, Malik, Drineas, Petros

arXiv.org Machine Learning, Apr-17-2014

Let $X$ be a data matrix of rank $\rho$, whose rows represent $n$ points in $d$-dimensional space. The linear support vector machine constructs a hyperplane separator that maximizes the 1-norm soft margin. We develop a new oblivious dimension reduction technique which is precomputed and can be applied to any input matrix $X$. We prove that, with high probability, the margin and minimum enclosing ball in the feature space are preserved to within $\epsilon$-relative error, ensuring generalization comparable to that in the original space in the case of classification. For regression, we show that the margin is preserved to $\epsilon$-relative error with high probability. We present extensive experiments with real and synthetic data to support our theory.

artificial intelligence, health & medicine, matrix, (14 more...)

1211.6085

Country: North America > United States (0.67)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Sparse Features for PCA-Like Linear Regression

Boutsidis, Christos, Drineas, Petros, Magdon-Ismail, Malik

Neural Information Processing Systems, Dec-31-2011

Principal Components Analysis (PCA) is often used as a feature extraction procedure. Given a matrix $X \in \mathbb{R}^{n \times d}$, whose rows represent $n$ data points with respect to $d$ features, the top $k$ right singular vectors of $X$ (the so-called eigenfeatures) are arbitrary linear combinations of all available features. The eigenfeatures are very useful in data analysis, including the regularization of linear regression. Enforcing sparsity on the eigenfeatures, i.e., forcing them to be linear combinations of only a small number of actual features (as opposed to all available features), can promote better generalization error and improve the interpretability of the eigenfeatures. We present deterministic and randomized algorithms that construct such sparse eigenfeatures while provably achieving in-sample performance comparable to regularized linear regression. Our algorithms are relatively simple and practically efficient, and we demonstrate their performance on several data sets.
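As a point of reference for the dense eigenfeatures mentioned in this abstract, the following NumPy sketch regresses on the top-k right singular vectors of $X$ (ordinary principal components regression). It shows only the dense baseline that the sparse constructions are compared against; the function name and synthetic data are assumptions, and the paper's sparse algorithms are not implemented here.

```python
import numpy as np

def eigenfeature_regression(X, y, k):
    """Least squares restricted to the top-k eigenfeatures of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Vk = Vt[:k].T                  # d x k; each column mixes all d original features
    Z = X @ Vk                     # data expressed in the k eigenfeatures
    coef_z, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Vk @ coef_z, Vk         # weights mapped back to the original features

# Synthetic regression problem, assumed only for the demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=60)

w_k, Vk = eigenfeature_regression(X, y, k=4)       # regularized by truncation
w_full, _ = eigenfeature_regression(X, y, k=10)    # k = d recovers ordinary least squares
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Truncating to k eigenfeatures acts as the regularizer: the in-sample residual of `w_k` can only be at least that of `w_ols`, in exchange for a lower-dimensional model.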

algorithm, artificial intelligence, machine learning, (16 more...)

Country: North America > United States > New York (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.81)

Random Projections for $k$-means Clustering

Boutsidis, Christos, Zouzias, Anastasios, Drineas, Petros

Neural Information Processing Systems, Dec-31-2010

This paper discusses the topic of dimensionality reduction for $k$-means clustering. We prove that any set of $n$ points in $d$ dimensions (rows in a matrix $A \in \mathbb{R}^{n \times d}$) can be projected into $t = \Omega(k / \epsilon^2)$ dimensions, for any $\epsilon \in (0,1/3)$, in $O(n d \lceil \epsilon^{-2} k/ \log(d) \rceil )$ time, such that with constant probability the optimal $k$-partition of the point set is preserved within a factor of $2+\epsilon$. The projection is done by post-multiplying $A$ with a $d \times t$ random matrix $R$ having entries $+1/\sqrt{t}$ or $-1/\sqrt{t}$ with equal probability. A numerical implementation of our technique and experiments on a large face image dataset verify the speed and the accuracy of our theoretical results.
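The projection step described in this abstract is simple enough to sketch directly in NumPy: build a $d \times t$ matrix with entries $\pm 1/\sqrt{t}$ chosen with equal probability and post-multiply the data by it. The function name, the seed handling, and the example sizes (including the arbitrary choice `t=40`) are assumptions for illustration; the sketch does not implement the faster multiplication behind the stated running time.

```python
import numpy as np

def random_sign_projection(A, t, seed=0):
    """Project the rows of A (n x d) into t dimensions with a random sign matrix."""
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    # Each entry of R is +1/sqrt(t) or -1/sqrt(t) with equal probability.
    R = rng.choice([-1.0, 1.0], size=(d, t)) / np.sqrt(t)
    return A @ R, R

# Random points, assumed only for the demonstration.
rng = np.random.default_rng(1)
A = rng.normal(size=(50, 200))
A_proj, R = random_sign_projection(A, t=40)
```

Any $k$-means solver can then be run on `A_proj` instead of `A`; the scaling by $1/\sqrt{t}$ keeps squared row norms unchanged in expectation.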

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)

Random Projections for $k$-means Clustering

Boutsidis, Christos, Zouzias, Anastasios, Drineas, Petros

arXiv.org Artificial Intelligence, Nov-20-2010

algorithm, artificial intelligence, data mining, (18 more...)

1011.4632

Country: North America > Canada > Ontario > Toronto (0.14)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Unsupervised Feature Selection for the $k$-means Clustering Problem

Boutsidis, Christos, Drineas, Petros, Mahoney, Michael W.

Neural Information Processing Systems, Dec-31-2009

We present a novel feature selection algorithm for the $k$-means clustering problem. Our algorithm is randomized and, assuming an accuracy parameter $\epsilon \in (0,1)$, selects and appropriately rescales in an unsupervised manner $\Theta(k \log(k / \epsilon) / \epsilon^2)$ features from a dataset of arbitrary dimensions. We prove that, if we run any $\gamma$-approximate $k$-means algorithm ($\gamma \geq 1$) on the features selected using our method, we can find a $(1+(1+\epsilon)\gamma)$-approximate partition with high probability.

algorithm, health & medicine, oncology, (21 more...)

Country:

North America > United States > New York (0.14)
North America > United States > California > Santa Clara County (0.14)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
