Ngo, Hung Q.
Rk-means: Fast Clustering for Relational Data
Curtin, Ryan, Moseley, Ben, Ngo, Hung Q., Nguyen, XuanLong, Olteanu, Dan, Schleich, Maximilian
Conventional machine learning algorithms cannot be applied until a data matrix is available to process. When the data matrix needs to be obtained from a relational database via a feature extraction query, the computation cost can be prohibitive, as the data matrix may be (much) larger than the total input relation size. This paper introduces Rk-means, or relational k-means, an algorithm for clustering relational data tuples without having to access the full data matrix. As such, we avoid having to run the expensive feature extraction query and to store its output. Our algorithm leverages the underlying structure in relational data. It constructs a small \textit{grid coreset} of the data matrix for subsequent cluster construction. This gives a constant-factor approximation for the k-means objective, while having asymptotic runtime improvements over the standard approach of first running the database query and then clustering. Empirical results show orders-of-magnitude speedups, and Rk-means can run faster on the database than even just computing the data matrix.
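Since the abstract only sketches the grid-coreset idea, a minimal illustrative version may help. The Python sketch below assumes one plausible reading: run cheap one-dimensional k-means per column, snap points to the resulting grid, then run weighted k-means on the occupied grid points. It operates on a materialized matrix purely for illustration (the actual Rk-means works on the relational input without ever forming X), and names such as `grid_coreset_kmeans` and `centers_per_dim` are hypothetical.

```python
# Hedged sketch of the grid-coreset idea: cluster each dimension separately,
# snap every point to its nearest per-dimension center, and run weighted
# k-means on the resulting (small) set of occupied grid points.
# Illustrative reconstruction only, not the authors' implementation.
import numpy as np
from sklearn.cluster import KMeans

def grid_coreset_kmeans(X, k, centers_per_dim=8, seed=0):
    n, d = X.shape
    # 1. One-dimensional k-means per column (cheap relative to full k-means).
    dim_centers = []
    for j in range(d):
        km = KMeans(n_clusters=centers_per_dim, n_init=3, random_state=seed)
        km.fit(X[:, j].reshape(-1, 1))
        dim_centers.append(km.cluster_centers_.ravel())
    # 2. Snap each point to its nearest center in every dimension, yielding
    #    a cell of the implicit grid; count how many points land in each cell.
    cell_ids = np.stack(
        [np.argmin(np.abs(X[:, j][:, None] - dim_centers[j][None, :]), axis=1)
         for j in range(d)], axis=1)
    cells, weights = np.unique(cell_ids, axis=0, return_counts=True)
    # 3. Materialize only the occupied grid points (the coreset) ...
    coreset = np.stack([dim_centers[j][cells[:, j]] for j in range(d)], axis=1)
    # 4. ... and run weighted k-means on them.
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    km.fit(coreset, sample_weight=weights)
    return km.cluster_centers_

# Toy usage on a dense matrix (in Rk-means the coreset would be built
# from the relational input without computing X).
X = np.random.rand(10000, 3)
centers = grid_coreset_kmeans(X, k=5)
```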
On Optimality Conditions for Auto-Encoder Signal Recovery
Arpit, Devansh, Zhou, Yingbo, Ngo, Hung Q., Napp, Nils, Govindaraju, Venu
Auto-encoders are unsupervised models that aim to learn patterns from observed data by minimizing a reconstruction cost. The useful representations they learn are often found to be sparse and distributed. On the other hand, compressed sensing and sparse coding assume a data-generating process in which the observed data are generated from some true latent signal source, and they try to recover the corresponding signal from measurements. Looking at auto-encoders from this \textit{signal recovery perspective} enables a more coherent view of these techniques. In particular, we show in this paper that the \textit{true} hidden representation can be approximately recovered if the weight matrices are highly incoherent with unit $\ell^{2}$ row length and the bias vectors take values (approximately) equal to the negative of the data mean. The recovery also becomes more accurate as the hidden signals become sparser. Additionally, we empirically demonstrate that auto-encoders are capable of recovering the data-generating dictionary when only data samples are given.
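The stated conditions can be checked numerically. The sketch below makes illustrative assumptions that go beyond the abstract: data generated as $x = W^{\top}h$ from sparse non-negative latents, a random Gaussian $W$ with unit $\ell^{2}$ rows (highly incoherent with high probability), and an encoder bias derived from the data mean, here $b = -W\bar{x}$, which is one plausible reading of the bias condition rather than the paper's exact setup.

```python
# Hedged numerical sketch of the recovery conditions: random unit-row W is
# highly incoherent w.h.p., and the encoder bias is tied to the data mean.
# The generative model x = W^T h and the choice b = -W @ x_mean are
# illustrative assumptions, not the paper's exact construction.
import numpy as np

rng = np.random.default_rng(0)
d, m, n, s = 256, 64, 2000, 4          # data dim, hidden dim, samples, sparsity

# Incoherent dictionary: random Gaussian rows normalized to unit l2 length.
W = rng.standard_normal((m, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# Sparse non-negative latent signals and the observed data.
H = np.zeros((n, m))
for i in range(n):
    idx = rng.choice(m, size=s, replace=False)
    H[i, idx] = rng.uniform(0.5, 1.0, size=s)
X = H @ W                               # x = W^T h, row-wise

# Encoder with bias derived from the (negative) data mean.
b = -W @ X.mean(axis=0)
H_hat = np.maximum(X @ W.T + b, 0.0)    # ReLU(W x + b)

# Support recovery: do the top-s activations match the true support?
top = np.argsort(-H_hat, axis=1)[:, :s]
hits = np.mean([len(set(top[i]) & set(np.flatnonzero(H[i]))) / s
                for i in range(n)])
print(f"average support recovery: {hits:.2f}")
```

Increasing the sparsity (smaller s) in this toy setup should push the recovery rate toward 1, consistent with the abstract's claim that recovery improves as the hidden signals become sparser.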
Parallel Feature Selection Inspired by Group Testing
Zhou, Yingbo, Porwal, Utkarsh, Zhang, Ce, Ngo, Hung Q., Nguyen, XuanLong, Ré, Christopher, Govindaraju, Venu
This paper presents a parallel feature selection method for classification that scales to very high dimensions and large data sizes. Our method is inspired by group testing theory, under which the feature selection procedure consists of a collection of randomized tests performed in parallel. Each test corresponds to a subset of features, to which a scoring function is applied to measure the relevance of those features for a classification task. We develop a general theory providing sufficient conditions under which true features are guaranteed to be correctly identified. Superior performance of our method is demonstrated on a challenging relation extraction task over a very large data set that has both redundant features and a sample size on the order of millions. We present comprehensive comparisons with state-of-the-art feature selection methods on a range of data sets, on which our method exhibits competitive performance in terms of running time and accuracy. Moreover, it yields substantial speedups when used as a pre-processing step for most other existing methods.
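A minimal sketch of the group-testing view follows, with two illustrative stand-ins for the paper's design: the scoring function is cross-validated logistic-regression accuracy on each feature subset, and feature relevance is the average score of the tests containing that feature. Each test is independent, so the loop parallelizes trivially.

```python
# Hedged sketch of group-testing-inspired feature selection: many randomized
# "tests", each scoring a random feature subset with a cheap classifier;
# a feature's relevance aggregates over the tests it participated in.
# Scoring and aggregation here are illustrative choices, not the paper's.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def group_testing_select(X, y, n_tests=200, subset_frac=0.1, top_k=20, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    subset_size = max(1, int(subset_frac * d))
    scores = np.zeros(d)
    counts = np.zeros(d)
    for _ in range(n_tests):              # independent tests: parallelizable
        feats = rng.choice(d, size=subset_size, replace=False)
        # Test score: how well this feature subset predicts y.
        s = cross_val_score(LogisticRegression(max_iter=200),
                            X[:, feats], y, cv=3).mean()
        scores[feats] += s
        counts[feats] += 1
    relevance = scores / np.maximum(counts, 1)
    return np.argsort(-relevance)[:top_k]  # indices of top-scoring features

# Toy usage: only the first 5 of 100 features carry signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 100))
y = (X[:, :5].sum(axis=1) > 0).astype(int)
selected = group_testing_select(X, y, top_k=5)
print(sorted(selected))
```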