AITopics | bidmach

Collaborating Authors

bidmach

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Parallelizing Word2Vec in Shared and Distributed Memory

Ji, Shihao, Satish, Nadathur, Li, Sheng, Dubey, Pradeep

arXiv.org Machine LearningAug-8-2016

Word2Vec is a widely used algorithm for extracting low-dimensional vector representations of words. It generated considerable excitement in the machine learning and natural language processing (NLP) communities recently due to its exceptional performance in many NLP applications such as named entity recognition, sentiment analysis, machine translation and question answering. State-of-the-art algorithms including those by Mikolov et al. have been parallelized for multi-core CPU architectures but are based on vector-vector operations that are memory-bandwidth intensive and do not efficiently use computational resources. In this paper, we improve reuse of various data structures in the algorithm through the use of minibatching, hence allowing us to express the problem using matrix multiply operations. We also explore different techniques to distribute word2vec computation across nodes in a compute cluster, and demonstrate good strong scalability up to 32 nodes. In combination, these techniques allow us to scale up the computation near linearly across cores and nodes, and process hundreds of millions of words per second, which is the fastest word2vec implementation to the best of our knowledge.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1604.04661

Country: Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

BIDData/BIDMach

#artificialintelligenceApr-3-2016, 19:45:39 GMT

We recently ran some fresh benchmarks for Spark v1.1 and v1.2 and Graphlab clusters, and included some updated numbers from other recent published benchmarks. RCV1-v2 (Reuters news data, LYR2004 distribution) benchmarks are for OAA (One Against All) classification, since RCV1-v2 has 103 independent topic labels. RCV1-v2 is a small dataset (0.5 GB). BIDMach was run on a single Amazon g2.xlarge instance, while Spark was run on a cluster of m3.2xlarge high-memory instances. The other systems were run on an 8-code Intel E-2660 system.

artificial intelligence, bidmach, machine learning, (18 more...)

#artificialintelligence

Country:

North America > United States (0.15)
Asia > Middle East > Jordan (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Fast Parallel SAME Gibbs Sampling on General Discrete Bayesian Networks

Seita, Daniel, Chen, Haoyu, Canny, John

arXiv.org Machine LearningNov-19-2015

A fundamental task in machine learning and related fields is to perform inference on Bayesian networks. Since exact inference takes exponential time in general, a variety of approximate methods are used. Gibbs sampling is one of the most accurate approaches and provides unbiased samples from the posterior but it has historically been too expensive for large models. In this paper, we present an optimized, parallel Gibbs sampler augmented with state replication (SAME or State Augmented Marginal Estimation) to decrease convergence time. We find that SAME can improve the quality of parameter estimates while accelerating convergence. Experiments on both synthetic and real data show that our Gibbs sampler is substantially faster than the state of the art sampler, JAGS, without sacrificing accuracy. Our ultimate objective is to introduce the Gibbs sampler to researchers in many fields to expand their range of feasible inference problems.

artificial intelligence, machine learning, sampler, (18 more...)

arXiv.org Machine Learning

1511.06416

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry: Education (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback