AITopics | Discourse & Dialogue

Discourse & Dialogue

Understanding Language in Conversations

"The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).

Multilingual Topic Models

Krstovski, Kriste, Kurtz, Michael J., Smith, David A., Accomazzi, Alberto

arXiv.org Machine Learning | Dec-18-2017

Scientific publications have evolved several features for mitigating vocabulary mismatch when indexing, retrieving, and computing similarity between articles. These mitigation strategies range from simply focusing on high-value article sections, such as titles and abstracts, to assigning keywords, often from controlled vocabularies, either manually or through automatic annotation. Various document representation schemes possess different cost-benefit tradeoffs. In this paper, we propose to model different representations of the same article as translations of each other, all generated from a common latent representation in a multilingual topic model. We start with a methodological overview on latent variable models for parallel document representations that could be used across many information science tasks. We then show how solving the inference problem of mapping diverse representations into a shared topic space allows us to evaluate representations based on how topically similar they are to the original article. In addition, our proposed approach provides means to discover where different concept vocabularies require improvement.

information retrieval, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

1712.06704

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)

A survey of available corpora for building data-driven dialogue systems

@machinelearnbot | Dec-17-2017, 00:03:33 GMT

Bear with me, it's more interesting than it sounds :). Yes, this (46-page) paper does include a catalogue of data sets with dialogues from different domains, but it also includes a high-level survey of techniques used in building dialogue systems (aka chatbots). In particular, it focuses on data-driven systems, i.e. those that incorporate some kind of learning from data. The paper concentrates on corpus-based learning, where you have built up, or have access to, a data set on which you can train your models. If you want to build a defensible machine-learning-based business, having access to quality sources of data that your competitors lack is a good start.

artificial intelligence, machine learning, natural language, (18 more...)

@machinelearnbot

Genre: Overview (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)

Smart Business: automated sentiments analysis on top

@machinelearnbot | Dec-14-2017, 23:35:49 GMT

The modern world seems fast and dynamic, with a multitude of new products being launched. Marketing agencies are making a fortune by monitoring the markets and delivering reports on consumers' opinions. Today, feedback analysis is a separate area, arguably a growing industry with an array of products and services. And the prices for those services are exorbitant. So, do vendors have a chance to cut down expenses?

sentiment analysis, smart business, vendor, (6 more...)

@machinelearnbot

Industry:

Marketing (0.51)
Information Technology (0.33)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.55)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.55)

Signals Build, Train, & Monetise Cryptotrading Strategies

#artificialintelligence | Dec-14-2017, 02:40:37 GMT

No knowledge of machine learning is required to use the Signals model builder. Just choose from a variety of indicators, ranging from traditional technical analysis to deep learning or sentiment analysis based on media monitoring, and combine them. However, if you happen to be a developer or a data scientist, you can develop new trading indicators from scratch and monetize your data science skills through the Signals indicator marketplace.

deep learning, monetise cryptotrading strategy, natural language, (4 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.38)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

TensorFlow for Short-Term Stocks Prediction

@machinelearnbot | Dec-13-2017, 20:00:11 GMT

News items were de-duplicated based on the title. Finally, the TICKER, PUBLICATION_DATE and SUMMARY columns were kept. Sentiment analysis was performed on the SUMMARY column using the Loughran and McDonald Financial Sentiment Dictionary, as implemented in the pysentiment Python library. This library offers both a tokenizer, which also performs stemming and stop-word removal, and a method to score a tokenized text.
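The preprocessing and scoring steps described in this excerpt can be illustrated with a minimal, dependency-free sketch. It does not use pysentiment or the real Loughran and McDonald word lists: the tiny lexicon, the stop-word set, the `TITLE` column name, and the function names `deduplicate`, `tokenize` and `score` are all assumptions made for illustration, and stemming is left out for brevity.

```python
import re

# Hypothetical miniature word lists for illustration only;
# these are NOT the Loughran and McDonald dictionary.
POSITIVE = {"gain", "profit", "strong", "improve"}
NEGATIVE = {"loss", "decline", "weak", "lawsuit"}
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "on", "to", "after"}


def deduplicate(rows):
    """Keep the first row per TITLE, then retain only the three columns of interest."""
    seen, kept = set(), []
    for row in rows:
        if row["TITLE"] in seen:
            continue
        seen.add(row["TITLE"])
        kept.append({k: row[k] for k in ("TICKER", "PUBLICATION_DATE", "SUMMARY")})
    return kept


def tokenize(text):
    """Lowercase, split on non-letters, and drop stop words (no stemming here)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def score(tokens):
    """Count lexicon hits and derive a polarity between -1 and 1."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    # The small constant avoids division by zero when no lexicon word matches.
    polarity = (pos - neg) / (pos + neg + 1e-6)
    return {"positive": pos, "negative": neg, "polarity": polarity}
```

Calling `score(tokenize(row["SUMMARY"]))` on each row returned by `deduplicate` yields one polarity value per news item; a real pipeline would substitute the full financial dictionary and a proper stemmer.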

artificial intelligence, machine learning, natural language, (16 more...)

@machinelearnbot

Industry: Banking & Finance > Trading (0.51)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.48)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)

Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge

Gallagher, Ryan J., Reing, Kyle, Kale, David, Steeg, Greg Ver

arXiv.org Machine Learning | Dec-3-2017

While generative models such as Latent Dirichlet Allocation (LDA) have proven fruitful in topic modeling, they often require detailed assumptions and careful specification of hyperparameters. Such model complexity issues only compound when trying to generalize generative models to incorporate human input. We introduce Correlation Explanation (CorEx), an alternative approach to topic modeling that does not assume an underlying generative model, and instead learns maximally informative topics through an information-theoretic framework. This framework naturally generalizes to hierarchical and semi-supervised extensions with no additional modeling assumptions. In particular, word-level domain knowledge can be flexibly incorporated within CorEx through anchor words, allowing topic separability and representation to be promoted with minimal human intervention. Across a variety of datasets, metrics, and experiments, we demonstrate that CorEx produces topics that are comparable in quality to those produced by unsupervised and semi-supervised variants of LDA.

corex, nephrology, vascular disease, (31 more...)

arXiv.org Machine Learning

1611.10277

Country:

North America > United States > California (0.28)
North America > United States > Missouri (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.53)

SHINE: Signed Heterogeneous Information Network Embedding for Sentiment Link Prediction

Wang, Hongwei, Zhang, Fuzheng, Hou, Min, Xie, Xing, Guo, Minyi, Liu, Qi

arXiv.org Machine Learning | Dec-3-2017

In online social networks people often express attitudes towards others, which forms massive sentiment links among users. Predicting the sign of sentiment links is a fundamental task in many areas such as personal advertising and public opinion analysis. Previous work mainly focuses on textual sentiment classification; however, text information can only disclose the "tip of the iceberg" of users' true opinions, most of which are unobserved but implied by other sources of information such as social relations and user profiles. To address this problem, in this paper we investigate how to predict possibly existing sentiment links in the presence of heterogeneous information. First, due to the lack of explicit sentiment links in mainstream social networks, we establish a labeled heterogeneous sentiment dataset, which consists of users' sentiment relations, social relations and profile knowledge, using an entity-level sentiment extraction method. Then we propose a novel and flexible end-to-end Signed Heterogeneous Information Network Embedding (SHINE) framework to extract users' latent representations from heterogeneous networks and predict the sign of unobserved sentiment links. SHINE utilizes multiple deep autoencoders to map each user into a low-dimensional feature space while preserving the network structure. We demonstrate the superiority of SHINE over state-of-the-art baselines on link prediction and node recommendation in two real-world datasets. The experimental results also prove the efficacy of SHINE in the cold-start scenario.

autoencoder, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1145/3159652.3159666

1712.00732

Country:

Asia > China (0.69)
North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.57)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(3 more...)

Sentiment Classification using Images and Label Embeddings

Graesser, Laura, Gupta, Abhinav, Sharma, Lakshay, Bakhturina, Evelina

arXiv.org Machine Learning | Dec-3-2017

In this project we analysed how much semantic information images carry, and how much value image data can add to sentiment analysis of the text associated with the images. To better understand the contribution from images, we compared models which only made use of image data, models which only made use of text data, and models which combined both data types. We also analysed if this approach could help sentiment classifiers generalize to unknown sentiments.

machine learning, natural language, relu, (20 more...)

arXiv.org Machine Learning

1712.00725

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Prediction-Constrained Topic Models for Antidepressant Recommendation

Hughes, Michael C., Hope, Gabriel, Weiner, Leah, McCoy, Thomas H., Perlis, Roy H., Sudderth, Erik B., Doshi-Velez, Finale

arXiv.org Machine Learning | Dec-1-2017

Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals: faithful generative explanations of high-dimensional data and accurate prediction of associated class labels. Existing approaches fail to balance these goals by not properly handling a fundamental asymmetry: the intended task is always predicting labels from data, not data from labels. Our new prediction-constrained objective trains models that predict labels from heldout data well while also producing good generative likelihoods and interpretable topic-word parameters. In a case study on predicting depression medications from electronic health records, we demonstrate improved recommendations compared to previous supervised topic models and high-dimensional logistic regression from words alone.
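One way to write down a constrained objective of the kind this abstract describes is sketched below. The notation is an assumption rather than something taken from the abstract: $x_d$ and $y_d$ stand for the words and label of document $d$, $\phi$ for the topic-model parameters, $\eta$ for the label-prediction parameters, $\epsilon$ for a tolerance on prediction loss, and $\lambda$ for a multiplier.

```latex
\[
\min_{\phi,\eta} \; -\sum_{d=1}^{D} \log p(x_d \mid \phi)
\quad \text{subject to} \quad
-\sum_{d=1}^{D} \log p(y_d \mid x_d, \phi, \eta) \le \epsilon
\]

% An unconstrained form with a multiplier \lambda > 0 on the prediction term:
\[
\min_{\phi,\eta} \; -\sum_{d=1}^{D}
\Bigl[ \log p(x_d \mid \phi) + \lambda \log p(y_d \mid x_d, \phi, \eta) \Bigr]
\]
```

The asymmetry mentioned in the abstract shows up in the constraint: it involves only the prediction of labels from data, $p(y_d \mid x_d, \cdot)$, and never the reverse direction.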

machine learning, natural language, prediction, (17 more...)

arXiv.org Machine Learning

1712.00499

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Uncertainty Estimates for Efficient Neural Network-based Dialogue Policy Optimisation

Tegho, Christopher, Budzianowski, Paweł, Gašić, Milica

arXiv.org Machine Learning | Nov-30-2017

In statistical dialogue management, the dialogue manager learns a policy that maps a belief state to an action for the system to perform. Efficient exploration is key to successful policy optimisation. Current deep reinforcement learning methods are very promising but rely on epsilon-greedy exploration, thus subjecting the user to a random choice of action during learning. Alternative approaches such as Gaussian Process SARSA (GPSARSA) estimate uncertainties and are sample efficient, leading to better user experience, but at the expense of greater computational complexity. This paper examines approaches to extract uncertainty estimates from deep Q-networks (DQN) in the context of dialogue management. We perform an extensive benchmark of deep Bayesian methods to extract uncertainty estimates, namely Bayes-By-Backprop, dropout, its concrete variation, bootstrapped ensemble and alpha-divergences, combining them with the DQN algorithm.
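The contrast between epsilon-greedy exploration and uncertainty-driven exploration can be made concrete with a small sketch. This is not the paper's implementation: the function names, the plain-list representation of Q-values, and the use of a sampled ensemble member as a stand-in for an uncertainty estimate are all assumptions chosen to keep the example short.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sample_head_greedy(ensemble_q_values, rng=random):
    """Pick one ensemble member at random and act greedily on its Q-values.

    Where the members disagree, different actions get tried; where they
    agree, the choice is effectively greedy.
    """
    head = ensemble_q_values[rng.randrange(len(ensemble_q_values))]
    return max(range(len(head)), key=lambda a: head[a])
```

With `epsilon_greedy`, exploration ignores what the model already knows, so the user may see an arbitrary action at any turn. With `sample_head_greedy`, exploration concentrates on actions whose value estimates are still contested across ensemble members.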

machine learning, natural language, reinforcement learning, (14 more...)

arXiv.org Machine Learning

1711.11486

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)
