AITopics

1410.7812

Country: North America > United States > Texas (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.38)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)

arXiv.org Machine Learning, Dec-4-2014

LightLDA: Big Topic Models on Modest Compute Clusters

Yuan, Jinhui, Gao, Fei, Ho, Qirong, Dai, Wei, Wei, Jinliang, Zheng, Xun, Xing, Eric P., Liu, Tie-Yan, Ma, Wei-Ying

When building large-scale machine learning (ML) programs, such as big topic models or deep neural nets, one usually assumes such tasks can only be attempted with industrial-sized clusters with thousands of nodes, which are out of reach for most practitioners or academic researchers. We consider this challenge in the context of topic modeling on web-scale corpora, and show that with a modest cluster of as few as 8 machines, we can train a topic model with 1 million topics and a 1-million-word vocabulary (for a total of 1 trillion parameters), on a document collection with 200 billion tokens -- a scale not yet reported even with thousands of machines. Our major contributions include: 1) a new, highly efficient O(1) Metropolis-Hastings sampling algorithm, whose running cost is (surprisingly) agnostic of model size, and empirically converges nearly an order of magnitude faster than current state-of-the-art Gibbs samplers; 2) a structure-aware model-parallel scheme, which leverages dependencies within the topic model, yielding a sampling strategy that is frugal on machine memory and network communication; 3) a differential data-structure for model storage, which uses separate data structures for high- and low-frequency words to allow extremely large models to fit in memory, while maintaining high inference speed; and 4) a bounded asynchronous data-parallel scheme, which allows efficient distributed processing of massive data via a parameter server. Our distribution strategy is an instance of the model-and-data-parallel programming model underlying the Petuum framework for general distributed ML, and was implemented on top of the Petuum open-source system. We provide experimental evidence showing how this development puts massive models within reach on a small cluster while still enjoying proportional time cost reductions with increasing cluster size, in comparison with alternative options.
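The claim that a Metropolis-Hastings sampler can have a per-step cost independent of model size can be illustrated with a toy example. The sketch below is not the LightLDA sampler: it assumes a plain uniform proposal over topics (the paper's own proposals are not described in this abstract), and the names `mh_step` and `run_chain` are hypothetical. Each step touches only two entries of the target weights, so its cost does not grow with the number of topics.

```python
import random


def mh_step(current, weights, rng):
    """One Metropolis-Hastings step over topic indices.

    Assumption: a uniform proposal, chosen only for illustration.
    `weights` are unnormalized, strictly positive target weights.
    """
    proposal = rng.randrange(len(weights))
    # A uniform proposal is symmetric, so the acceptance ratio reduces to
    # target(proposal) / target(current). Only two weights are read: O(1).
    accept_prob = min(1.0, weights[proposal] / weights[current])
    return proposal if rng.random() < accept_prob else current


def run_chain(weights, n_steps, seed=0):
    """Run the chain and return the empirical visit frequency of each topic."""
    rng = random.Random(seed)
    state = 0
    counts = [0] * len(weights)
    for _ in range(n_steps):
        state = mh_step(state, weights, rng)
        counts[state] += 1
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    # The frequencies should approach the normalized weights 0.1, 0.2, 0.3, 0.4.
    print(run_chain([1.0, 2.0, 3.0, 4.0], 200000, seed=1))
```

A uniform proposal mixes poorly when the target is very peaked, which is presumably why a practical sampler needs better-matched proposals; the point here is only the constant per-step cost.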

lightlda, machine learning, natural language

1412.1576

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Doshi-Velez, Finale, Wallace, Byron, Adams, Ryan

Graph-Sparse LDA: A Topic Model with Structured Sparsity

arXiv.org Machine Learning, Nov-21-2014

Originally designed to model text, topic modeling has become a powerful tool for uncovering latent structure in domains including medicine, finance, and vision. The goals for the model vary depending on the application: in some cases, the discovered topics may be used for prediction or some other downstream task. In other cases, the content of the topic itself may be of intrinsic scientific interest. Unfortunately, even using modern sparse techniques, the discovered topics are often difficult to interpret due to the high dimensionality of the underlying space. To improve topic interpretability, we introduce Graph-Sparse LDA, a hierarchical topic model that leverages knowledge of relationships between words (e.g., as encoded by an ontology). In our model, topics are summarized by a few latent concept-words from the underlying graph that explain the observed words. Graph-Sparse LDA recovers sparse, interpretable summaries on two real-world biomedical datasets while matching state-of-the-art prediction performance.

graph-sparse lda, machine learning, natural language

1410.451

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.49)

arXiv.org Machine Learning, Nov-9-2014

Model-Parallel Inference for Big Topic Models

Zheng, Xun, Kim, Jin Kyu, Ho, Qirong, Xing, Eric P.

In real-world industrial applications of topic modeling, the ability to capture a gigantic conceptual space by learning an ultra-high-dimensional topical representation, i.e., the so-called "big model", is becoming the next desideratum after the enthusiasm for "big data", especially for fine-grained downstream tasks such as online advertising, where good performance is usually achieved by regression-based predictors built on millions if not billions of input features. The conventional data-parallel approach to training gigantic topic models turns out to be rather inefficient in utilizing the power of parallelism, due to its heavy dependency on a centralized image of the "model". Big model size also poses a storage challenge, since the feasible model size is bounded by the smallest RAM among the nodes. To address these issues, we explore another type of parallelism, namely model-parallelism, which enables training of disjoint blocks of a big topic model in parallel. By integrating data-parallelism with model-parallelism, we show that dependencies between distributed elements can be handled seamlessly, achieving not only faster convergence but also an ability to tackle significantly bigger model sizes. We describe an architecture for model-parallel inference of LDA, and present a variant of the collapsed Gibbs sampling algorithm tailored for it. Experimental results demonstrate the ability of this system to handle topic modeling with an unprecedented 200 billion model variables on only a low-end cluster with very limited computational resources and bandwidth.
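For readers unfamiliar with the baseline that the model-parallel variant builds on, here is a minimal sketch of standard, serial collapsed Gibbs sampling for LDA. It is not the paper's model-parallel algorithm; the function name `gibbs_lda` and the hyperparameter defaults are assumptions made for illustration. The count tables `n_kw` and `n_k` are the shared "model" whose centralized image the abstract identifies as the bottleneck.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, sweeps=20, seed=0):
    """Serial collapsed Gibbs sampling for LDA on docs given as lists of word ids."""
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # document-topic counts
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                                # tokens per topic
    z = []                                              # topic of each token

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Conditional p(z = t | rest), up to a normalizing constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw, n_k
```

Every token update reads and writes `n_kw` and `n_k`, which suggests why partitioning the model into disjoint blocks matters once those tables no longer fit on one machine.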

artificial intelligence, machine learning, natural language

1411.2305

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Bansal, Trapit, Bhattacharyya, Chiranjib, Kannan, Ravindran

A provable SVD-based algorithm for learning topics in dominant admixture corpus

arXiv.org Machine Learning, Nov-4-2014

Topic models, such as Latent Dirichlet Allocation (LDA), posit that documents are drawn from admixtures of distributions over words, known as topics. The inference problem of recovering topics from admixtures is NP-hard. Assuming separability, a strong assumption, [4] gave the first provable algorithm for inference. For the LDA model, [6] gave a provable algorithm using tensor methods. But [4,6] do not learn topic vectors with bounded $l_1$ error (a natural measure for probability vectors). Our aim is to develop a model which makes intuitive and empirically supported assumptions and to design an algorithm with natural, simple components such as SVD, which provably solves the inference problem for the model with bounded $l_1$ error. A topic in LDA and other models is essentially characterized by a group of co-occurring words. Motivated by this, we introduce topic-specific Catchwords, groups of words which occur with strictly greater frequency in a topic than in any other topic individually and are required to have high frequency together rather than individually. A major contribution of the paper is to show that under this more realistic assumption, which is empirically verified on real corpora, a singular value decomposition (SVD) based algorithm with a crucial pre-processing step of thresholding can provably recover the topics from a collection of documents drawn from Dominant admixtures. Dominant admixtures are convex combinations of distributions in which one distribution has a significantly higher contribution than the others. Apart from the simplicity of the algorithm, the sample complexity has near-optimal dependence on $w_0$, the lowest probability that a topic is dominant, and is better than [4]. Empirical evidence shows that on several real-world corpora, both the Catchwords and Dominant admixture assumptions hold and the proposed algorithm substantially outperforms the state of the art [5].
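The two definitions in this abstract can be made concrete with a small sketch. It covers only the definitions, not the SVD-based algorithm, and it rests on assumptions: the abstract gives no numeric threshold for "significantly higher", so `margin` is a hypothetical parameter, and `candidate_catchwords` checks only the "strictly greater frequency" condition, not the requirement of high joint frequency. All function names are illustrative.

```python
def mix(topics, weights):
    """Convex combination of topic distributions (an admixture)."""
    vocab_size = len(topics[0])
    return [sum(w * t[j] for w, t in zip(weights, topics)) for j in range(vocab_size)]


def dominant_topic(weights, margin):
    """Return the index of the dominant topic, or None if there is none.

    Assumption: "dominant" means the largest weight exceeds the second
    largest by at least `margin`. Requires at least two topics.
    """
    order = sorted(range(len(weights)), key=lambda k: weights[k], reverse=True)
    top, runner_up = order[0], order[1]
    return top if weights[top] - weights[runner_up] >= margin else None


def candidate_catchwords(topics):
    """Map each topic to the words whose frequency is strictly highest in it."""
    n_topics = len(topics)
    result = {k: [] for k in range(n_topics)}
    for j in range(len(topics[0])):
        column = [t[j] for t in topics]
        best = max(range(n_topics), key=lambda k: column[k])
        # Ties are excluded: the frequency must be strictly greater.
        if all(column[best] > column[k] for k in range(n_topics) if k != best):
            result[best].append(j)
    return result


if __name__ == "__main__":
    topics = [[0.5, 0.3, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]]
    print(mix(topics, [0.8, 0.2]))
    print(dominant_topic([0.8, 0.2], margin=0.3))
    print(candidate_catchwords(topics))
```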

artificial intelligence, machine learning, natural language

1410.6991

Country:

North America > United States (0.93)
Asia > Middle East (0.93)
Europe (0.67)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (1.00)
Government > Regional Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Rahimtoroghi, Elahe (University of California, Santa Cruz) | Corcoran, Thomas (University of California, Santa Cruz) | Swanson, Reid (University of California, Santa Cruz) | Walker, Marilyn A. (University of California, Santa Cruz) | Sagae, Kenji (Institute for Creative Technologies, University of Southern California) | Gordon, Andrew (Institute for Creative Technologies, University of Southern California)

Minimal Narrative Annotation Schemes and Their Applications

AAAI Conferences, Nov-1-2014

The increased use of large corpora in narrative research has created new opportunities for empirical research and intelligent narrative technologies. To best exploit the value of these corpora, several research groups are eschewing complex discourse analysis techniques in favor of high-level minimalist narrative annotation schemes that can be quickly applied, achieve high inter-rater agreement, and are amenable to automation using machine-learning techniques. In this paper we compare different annotation schemes that have been employed by two groups of researchers to annotate large corpora of narrative text. Using a dual-annotation methodology, we investigate the correlation between narrative clauses distinguished by their structural role (orientation, action, evaluation), their subjectivity, and their narrative level within the discourse. We find that each simple narrative annotation scheme captures a structurally distinct characteristic of real-world narratives, and each combination of labels is evident in a corpus of 19 weblog narratives (951 narrative clauses). We discuss several potential applications of minimalist narrative annotation schemes, noting the combinations of labels across these two annotation schemes that best support each task.

annotation scheme, narrative, proceedings

Seventh Intelligent Narrative Technologies Workshop

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Freedman, Richard Gabriel (University of Massachusetts Amherst) | Jung, Hee-Tae (University of Massachusetts Amherst) | Zilberstein, Shlomo (University of Massachusetts Amherst)

Temporal and Object Relations in Plan and Activity Recognition for Robots Using Topic Models

AAAI Conferences, Nov-1-2014

For robots to effectively interact with human users, it is necessary that they recognize what people in the environment are doing. This is especially the case when robots are performing complementary tasks since the human users are not following any specific process. There is much uncertainty in how people act and the duration of time they need to perform their actions. In this work, we discuss the use of topic models for such plan and activity recognition tasks. We begin with the development of a domain-independent representation of human postural information obtained from RGB-D sensor data. This representation may be used with Latent Dirichlet Allocation (LDA) topic models as an integration of plan and activity recognition. This is followed by a proposition of extensions to LDA that allow temporal and object relational information to also be used in plan and activity recognition tasks.

artificial intelligence, natural language, proceedings

2014 AAAI Fall Symposium Series

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Asia > Middle East > Jordan (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > New York > New York County > New York City (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

AAAI Conferences, Nov-1-2014

Humanoid Robots and Spoken Dialog Systems for Brief Health Interventions

Abeyruwan, Saminda (University of Miami) | Baral, Ramesh (Florida International University) | Yasavur, Ugan (Florida International University) | Lisetti, Christine (Florida International University) | Visser, Ubbo (University of Miami)

We combined a spoken dialog system that we developed to deliver brief health interventions with the fully autonomous humanoid robot (NAO). The dialog system is based on a framework facilitating Markov decision processes (MDP). It is optimized using reinforcement learning (RL) algorithms with data we collected from real user interactions. The system begins to learn optimal dialog strategies for initiative selection and for the type of confirmations that it uses during the interaction. The health intervention, delivered by a 3D character instead of the NAO, has already been evaluated, with positive results in terms of task completion, ease of use, and future intention to use the system. The current spoken dialog system for the humanoid robot is a novelty and exists so far as a proof of concept.
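The abstract does not say which RL algorithm optimizes the dialog MDP. As one common possibility, the sketch below shows a tabular Q-learning update; this is a swapped-in generic technique, not the authors' method, and the state names, action names, and the function `q_update` are all hypothetical.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    `q` maps (state, action) pairs to estimated returns; missing entries
    are treated as 0.0. `alpha` is the learning rate, `gamma` the discount.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    # Hypothetical dialog actions: how the system confirms what it heard.
    actions = ["explicit_confirm", "implicit_confirm"]
    q = {}
    q_update(q, "slot_filled", "explicit_confirm", 1.0, "done", actions)
    q_update(q, "start", "implicit_confirm", 0.0, "slot_filled", actions)
    print(q)
```

In a setup like this, the reward would have to be derived from logged user interactions, for example task completion; that mapping is likewise an assumption here.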

artificial intelligence, machine learning, natural language

2014 AAAI Fall Symposium Series

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.16)
North America > United States > Florida > Miami-Dade County > Coral Gables (0.05)
North America > United States > Florida > Hillsborough County > University (0.05)

Industry: Health & Medicine > Therapeutic Area (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

AAAI Conferences, Oct-31-2014

Combining Non-Expert and Expert Crowd Work to Convert Web APIs to Dialog Systems

Huang, Ting-Hao K. (Carnegie Mellon University) | Lasecki, Walter S. (University of Rochester) | Ritter, Alan L. (The Ohio State University) | Bigham, Jeffrey P. (Carnegie Mellon University)

Thousands of web APIs expose data and services that would be useful to access with natural dialog, from weather and sports to Twitter and movies. The process of adapting each API to a robust dialog system is difficult and time-consuming, as it requires not only programming but also anticipating what is most likely to be asked and how it is likely to be asked. We present a crowd-powered system able to generate a natural language interface for arbitrary web APIs from scratch without domain-dependent training data or knowledge. Our approach combines two types of crowd workers: non-expert Mechanical Turk workers interpret the functions of the API and elicit information from the user, and expert oDesk workers provide a minimal sufficient scaffolding around the API to allow us to make general queries. We describe our multi-stage process and present results for each stage.

artificial intelligence, dialog system, natural language

Second AAAI Conference on Human Computation and Crowdsourcing

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.16)
North America > United States > Ohio > Franklin County > Columbus (0.05)
North America > United States > New York > Monroe County > Rochester (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.76)

Hoo, Wai Lam, Chan, Chee Seng

Zero-Shot Object Recognition System based on Topic Model

arXiv.org Machine Learning, Oct-14-2014

Object recognition systems usually require complete, manually labeled training data to train the classifier. In this paper, we study the problem of object recognition where the training samples are missing during the classifier learning stage, a task also known as zero-shot learning. We propose a novel zero-shot learning strategy that utilizes the topic model and hierarchical class concept. The advantage of our proposed method is that the cumbersome human annotation stage (i.e., attribute-based classification) is eliminated. We achieve performance comparable with state-of-the-art algorithms on four public datasets: PubFig (67.09%), Cifar-100 (54.85%), Caltech-256 (52.14%), and Animals with Attributes (49.65%) when unseen classes exist in the classification task.

codebook, dataset, hic concept

doi: 10.1109/THMS.2014.2358649

1410.3748

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports > Tennis (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)