AITopics | Discourse & Dialogue

Discourse & Dialogue

Understanding Language in Conversations

"The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).

A Topic Model for Melodic Sequences

Spiliopoulou, Athina, Storkey, Amos

arXiv.org Machine Learning, Jun-27-2012

We examine the problem of learning a probabilistic model for melody directly from musical sequences belonging to the same genre. This is a challenging task as one needs to capture not only the rich temporal structure evident in music, but also the complex statistical dependencies among different music components. To address this problem we introduce the Variable-gram Topic Model, which couples the latent topic formalism with a systematic model for contextual information. We evaluate the model on next-step prediction. Additionally, we present a novel method of model evaluation, in which we directly compare model samples with data sequences using the Maximum Mean Discrepancy of string kernels, to assess how close the model distribution is to the data distribution. We show that the model has the highest performance under both evaluation measures when compared to LDA, the Topic Bigram and related non-topic models.
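The sample-versus-data comparison described in the abstract above can be illustrated with a small sketch. The abstract does not say which string kernel the authors use, so the k-spectrum kernel here (counting shared length-k subsequences) is an assumption, as are the function names and the toy pitch sequences. The estimator is the standard biased estimate of squared Maximum Mean Discrepancy.

```python
from collections import Counter


def spectrum_features(seq, k=2):
    """Count every contiguous length-k subsequence (k-gram) of seq."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=2):
    """k-spectrum string kernel: inner product of k-gram count vectors.

    Assumption: this kernel is a stand-in; the abstract only says
    "string kernels" without naming one.
    """
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    return sum(count * fb[gram] for gram, count in fa.items())


def mmd2_biased(xs, ys, k=2):
    """Biased estimate of squared MMD between two sets of sequences."""
    def mean_kernel(p, q):
        total = sum(spectrum_kernel(a, b, k) for a in p for b in q)
        return total / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)


# Hypothetical toy data: melodies as lists of pitch numbers.
data_melodies = [[60, 62, 64, 62], [62, 64, 62, 60]]
model_samples = [[60, 62, 64, 65], [67, 65, 64, 62]]
print(mmd2_biased(data_melodies, model_samples))
```

A value near zero suggests the two sets of sequences have similar k-gram statistics; larger values indicate a larger mismatch under the chosen kernel.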

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:1206.6441

Genre: Research Report (0.50)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Rethinking Collapsed Variational Bayes Inference for LDA

Sato, Issei, Nakagawa, Hiroshi

arXiv.org Machine Learning, Jun-27-2012

We propose a novel interpretation of the collapsed variational Bayes inference with a zero-order Taylor expansion approximation, called CVB0 inference, for latent Dirichlet allocation (LDA). We clarify the properties of the CVB0 inference by using the alpha-divergence. We show that the CVB0 inference is composed of two different divergence projections: alpha=1 and -1. This interpretation will help shed light on how CVB0 works.
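For readers unfamiliar with CVB0, the per-token update it is usually described with can be sketched as follows. This is the commonly cited form of the update for LDA, not something taken from the abstract above, and the function and parameter names are hypothetical. All counts are expected counts that exclude the token being updated.

```python
def cvb0_responsibilities(doc_topic, topic_word, topic_total,
                          alpha, beta, vocab_size):
    """Return normalized topic responsibilities for a single token.

    doc_topic[k]   : expected count of topic k in the token's document
    topic_word[k]  : expected count of the token's word type under topic k
    topic_total[k] : expected total count of tokens assigned to topic k
    All three exclude the current token (the "minus i" counts).
    alpha and beta are symmetric Dirichlet hyperparameters (an assumption).
    """
    weights = [
        (doc_topic[k] + alpha) * (topic_word[k] + beta)
        / (topic_total[k] + vocab_size * beta)
        for k in range(len(doc_topic))
    ]
    norm = sum(weights)
    return [w / norm for w in weights]


# Hypothetical two-topic example: the word has only been seen under topic 0.
print(cvb0_responsibilities([1.0, 0.0], [2.0, 0.0], [5.0, 5.0],
                            alpha=0.1, beta=0.01, vocab_size=10))
```

A full inference loop would subtract a token's old responsibilities from the counts, call this update, and add the new responsibilities back, sweeping over all tokens until the values stabilize.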

artificial intelligence, machine learning, natural language, (15 more...)

arXiv:1206.6435

Country: Asia (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.35)

Sparse Stochastic Inference for Latent Dirichlet allocation

Mimno, David, Hoffman, Matt, Blei, David

arXiv.org Machine Learning, Jun-27-2012

We present a hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs sampling with the scalability of online stochastic inference. We used our algorithm to analyze a corpus of 1.2 million books (33 billion words) with thousands of topics. Our approach reduces the bias of variational inference and generalizes to many Bayesian hidden-variable models.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv:1206.6425

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Nonparametric Bayes Pachinko Allocation

Li, Wei, Blei, David, McCallum, Andrew

arXiv.org Machine Learning, Jun-20-2012

Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset. In this paper, we propose a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP). Although the HDP can capture topic correlations defined by nested data structure, it does not automatically discover such correlations from unstructured data. By assuming an HDP-based prior for PAM, we are able to learn both the number of topics and how the topics are correlated. We evaluate our model on synthetic and real-world text datasets, and show that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.

dirichlet process, machine learning, natural language, (20 more...)

arXiv:1206.527

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.56)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data

Kim, Dongwoo, Kim, Suin, Oh, Alice

arXiv.org Machine Learning, Jun-18-2012

We describe a nonparametric topic model for labeled data. The model uses a mixture of random measures (MRM) as a base distribution of the Dirichlet process (DP) of the HDP framework, so we call it the DP-MRM. To model labeled data, we define a DP distributed random measure for each label, and the resulting model generates an unbounded number of topics for each label. We apply DP-MRM on single-labeled and multi-labeled corpora of documents and compare the performance on label prediction with MedLDA, LDA-SVM, and Labeled-LDA. We further enhance the model by incorporating ddCRP and modeling multi-labeled images for image segmentation and object labeling, comparing the performance with nCuts and rddCRP.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv:1206.4658

Country: Asia (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)

Latent Topic Models for Hypertext

Gruber, Amit, Rosen-Zvi, Michal, Weiss, Yair

arXiv.org Machine Learning, Jun-13-2012

Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collections such as the Internet, there has also been great interest in extending these approaches to hypertext [6, 9]. These approaches typically model links in an analogous fashion to how they model words: the document-link co-occurrence matrix is modeled in the same way that the document-word co-occurrence matrix is modeled in standard topic models. In this paper we present a probabilistic generative model for hypertext document collections that explicitly models the generation of links. Specifically, links from a word w to a document d depend directly on how frequent the topic of w is in d, in addition to the in-degree of d. We show how to perform EM learning on this model efficiently. By not modeling links as analogous to words, we end up using far fewer free parameters and obtain better link prediction results.
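The link dependence described in the abstract above can be illustrated with a toy scoring function. The abstract states only that a link from a word to a document depends on how frequent the word's topic is in the target document and on the target's in-degree; combining the two by multiplication, along with every name and number below, is an assumption made for illustration rather than the authors' stated model.

```python
def link_scores(word_topic, doc_topic_mix, doc_importance):
    """Score each candidate target document for a link from one word.

    word_topic     : index of the topic assigned to the source word
    doc_topic_mix  : {doc_id: list of topic proportions for that document}
    doc_importance : {doc_id: non-negative weight standing in for in-degree}
    Assumption: score = importance * proportion of the word's topic.
    """
    return {
        doc_id: doc_importance[doc_id] * mix[word_topic]
        for doc_id, mix in doc_topic_mix.items()
    }


# Hypothetical example with two topics and two candidate target pages.
mixes = {"page_a": [0.9, 0.1], "page_b": [0.2, 0.8]}
importance = {"page_a": 1.0, "page_b": 1.0}
scores = link_scores(word_topic=0, doc_topic_mix=mixes, doc_importance=importance)
best = max(scores, key=scores.get)
print(best)
```

With equal importance weights, the word linked from topic 0 prefers the page where topic 0 is more prevalent; raising a page's importance weight raises its score regardless of topic.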

artificial intelligence, information management, natural language, (16 more...)

arXiv:1206.3254

Country:

Asia > Middle East > Israel (0.29)
North America > United States > Pennsylvania (0.28)
North America > United States > California > San Francisco County > San Francisco (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Recognizing Effective and Student-Adaptive Tutor Moves in Task-Oriented Tutorial Dialogue

Mitchell, Christopher Michael (North Carolina State University) | Ha, Eun Young (North Carolina State University) | Boyer, Kristy Elizabeth (North Carolina State University) | Lester, James C. (North Carolina State University)

AAAI Conferences, May-20-2012

One-on-one tutoring is significantly more effective than traditional classroom instruction. In recent years, automated tutoring systems have been approaching that level of effectiveness by engaging students in rich natural language dialogue that contributes to learning. A promising approach for further improving the effectiveness of tutorial dialogue systems is to model the differential effectiveness of tutorial strategies, identifying which dialogue moves or combinations of dialogue moves are associated with learning. It is also important to model the ways in which experienced tutors adapt to learner characteristics. This paper takes a corpus-based approach to these modeling tasks, presenting the results of a study in which task-oriented, textual tutorial dialogue was collected from remote one-on-one human tutoring sessions. The data reveal patterns of dialogue moves that are correlated with learning, and can directly inform the design of student-adaptive tutorial dialogue management systems.

dialogue, student, tutor, (13 more...)

AAAI Conferences

Twenty-Fifth International FLAIRS Conference

Country:

North America > United States > North Carolina > Wake County > Raleigh (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)

SenticNet 2: A Semantic and Affective Resource for Opinion Mining and Sentiment Analysis

Cambria, Erik (National University of Singapore) | Havasi, Catherine (MIT Media Lab) | Hussain, Amir (University of Stirling)

AAAI Conferences, May-20-2012

Web 2.0 has changed the ways people communicate, collaborate, and express their opinions and sentiments. But despite social data on the Web being perfectly suitable for human consumption, they remain hardly accessible to machines. To bridge the cognitive and affective gap between word-level natural language data and the concept-level sentiments conveyed by them, we developed SenticNet 2, a publicly available semantic and affective resource for opinion mining and sentiment analysis. SenticNet 2 is built by means of sentic computing, a new paradigm that exploits both AI and Semantic Web techniques to better recognize, interpret, and process natural language opinions. By providing the semantics and sentics (that is, the cognitive and affective information) associated with over 14,000 concepts, SenticNet 2 represents one of the most comprehensive semantic resources for the development of affect-sensitive applications in fields such as social data mining, multimodal affective HCI, and social media marketing.

information, proceedings, senticnet 2, (16 more...)

AAAI Conferences

Twenty-Fifth International FLAIRS Conference

Country:

Europe > United Kingdom (0.29)
Asia > Singapore (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Health & Medicine > Health Care Providers & Services (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Multilingual Topic Models for Unaligned Text

Boyd-Graber, Jordan, Blei, David

arXiv.org Machine Learning, May-9-2012

We develop the multilingual topic model for unaligned text (MuTo), a probabilistic model of text that is designed to analyze corpora composed of documents in two languages. From these documents, MuTo uses stochastic EM to simultaneously discover both a matching between the languages and multilingual latent topics. We demonstrate that MuTo is able to find shared topics on real-world multilingual corpora, successfully pairing related documents across languages. MuTo provides a new framework for creating multilingual topic models without needing carefully curated parallel corpora and allows applications built using the topic model formalism to be applied to a much wider class of corpora. Topic models are a powerful formalism for unsupervised analysis of corpora [1, 8].

artificial intelligence, computational linguistic, natural language, (15 more...)

arXiv:1205.2657

Country:

Asia (0.68)
North America > United States > Ohio (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Variable Selection for Latent Dirichlet Allocation

Kim, Dongwoo, Chung, Yeonseung, Oh, Alice

arXiv.org Machine Learning, May-3-2012

In latent Dirichlet allocation (LDA), topics are multinomial distributions over the entire vocabulary. However, the vocabulary usually contains many words that are not relevant in forming the topics. We adopt a variable selection method widely used in statistical modeling as a dimension reduction tool and combine it with LDA. In this variable selection model for LDA (vsLDA), topics are multinomial distributions over a subset of the vocabulary, and by excluding words that are not informative for finding the latent topic structure of the corpus, vsLDA finds topics that are more robust and discriminative. We compare three models, vsLDA, LDA with symmetric priors, and LDA with asymmetric priors, on held-out likelihood, MCMC chain consistency, and document classification. The performance of vsLDA is better than symmetric LDA for likelihood and classification, better than asymmetric LDA for consistency and classification, and about the same in the other comparisons.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:1205.1053

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
