AITopics | Mueller, Jonas

Plotting

Mueller, Jonas

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Latent Space Secrets of Denoising Text-Autoencoders

Shen, Tianxiao, Mueller, Jonas, Barzilay, Regina, Jaakkola, Tommi

arXiv.org Machine LearningMay-29-2019

While neural language models have recently demonstrated impressive performance in unconditional text generation, controllable generation and manipulation of text remain challenging. Latent variable generative models provide a natural approach for control, but their application to text has proven more difficult than to images. Models such as variational autoencoders may suffer from posterior collapse or learning an irregular latent geometry. We propose to instead employ adversarial autoencoders (AAEs) and add local perturbations by randomly replacing/removing words from input sentences during training. Within the prior enforced by the adversary, structured perturbations in the data space begin to carve and organize the latent space. Theoretically, we prove that perturbations encourage similar sentences to map to similar latent representations. Experimentally, we investigate the trade-off between text-generation and autoencoder-reconstruction capabilities. Our straightforward approach significantly improves over regular AAEs as well as other autoencoders, and enables altering the tense/sentiment of sentences through simple addition of a fixed vector offset to their latent representation.

deep learning, neural network, perturbation, (20 more...)

arXiv.org Machine Learning

1905.12777

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Unsupervised Text Style Transfer via Iterative Matching and Translation

Jin, Zhijing, Jin, Di, Mueller, Jonas, Matthews, Nicholas, Santus, Enrico

arXiv.org Artificial IntelligenceJan-31-2019

Text style transfer seeks to learn how to automatically rewrite sentences from a source domain to the target domain in different styles, while simultaneously preserving their semantic contents. A major challenge in this task stems from the lack of parallel data that connects the source and target styles. Existing approaches try to disentangle content and style, but this is quite difficult and often results in poor content-preservation and grammaticality. In contrast, we propose a novel approach by first constructing a pseudo-parallel resource that aligns a subset of sentences with similar content between source and target corpus. And then a standard sequence-to-sequence model can be applied to learn the style transfer. Subsequently, we iteratively refine the learned style transfer function while improving upon the imperfections in our original alignment. Our method is applied to the tasks of sentiment modification and formality transfer, where it outperforms state-of-the-art systems by a large margin. As an auxiliary contribution, we produced a publicly-available test set with human-generated style transfers for future community use.

dataset, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

1901.11333

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Add feedback

What made you do this? Understanding black-box decisions with sufficient input subsets

Carter, Brandon, Mueller, Jonas, Jain, Siddhartha, Gifford, David

arXiv.org Machine LearningOct-9-2018

Local explanation frameworks aim to rationalize particular decisions made by a black-box prediction model. Existing techniques are often restricted to a specific type of predictor or based on input saliency, which may be undesirably sensitive to factors unrelated to the model's decision making process. We instead propose sufficient input subsets that identify minimal subsets of features whose observed values alone suffice for the same decision to be reached, even if all other input feature values are missing. General principles that globally govern a model's decision-making can also be revealed by searching for clusters of such input patterns across many data points. Our approach is conceptually straightforward, entirely model-agnostic, simply implemented using instance-wise backward selection, and able to produce more concise rationales than existing techniques. We demonstrate the utility of our interpretation method on various neural network models trained on text, image, and genomic data.

deep learning, neural network, prediction, (21 more...)

arXiv.org Machine Learning

1810.03805

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Low-rank Bandit Methods for High-dimensional Dynamic Pricing

Mueller, Jonas, Syrgkanis, Vasilis, Taddy, Matt

arXiv.org Machine LearningJan-30-2018

We consider high dimensional dynamic multi-product pricing with an evolving but low-dimensional linear demand model. Assuming the temporal variation in cross-elasticities exhibits low-rank structure based on fixed (latent) features of the products, we show that the revenue maximization problem reduces to an online bandit convex optimization with side information given by the observed demands. We design dynamic pricing algorithms whose revenue approaches that of the best fixed price vector in hindsight, at a rate that only depends on the intrinsic rank of the demand model and not the number of products. Our approach applies a bandit convex optimization algorithm in a projected low-dimensional space spanned by the latent product features, while simultaneously learning this span via online singular value decomposition of a carefully-crafted matrix containing the observed demands.

algorithm, artificial intelligence, optimization problem, (17 more...)

arXiv.org Machine Learning

1801.10242

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Learning Optimal Interventions

Mueller, Jonas, Reshef, David N., Du, George, Jaakkola, Tommi

arXiv.org Machine LearningFeb-22-2017

Our goal is to identify beneficial interventions from observational data. We consider interventions that are narrowly focused (impacting few covariates) and may be tailored to each individual or globally enacted over a population. For applications where harmful intervention is drastically worse than proposing no change, we propose a conservative definition of the optimal intervention. Assuming the underlying relationship remains invariant under intervention, we develop efficient algorithms to identify the optimal intervention policy from limited data and provide theoretical guarantees for our approach in a Gaussian Process setting. Although our methods assume covariates can be precisely adjusted, they remain capable of improving outcomes in misspecified settings where interventions incur unintentional downstream effects. Empirically, our approach identifies good interventions in two practical applications: gene perturbation and writing improvement.

health & medicine, intervention, optimization problem, (19 more...)

arXiv.org Machine Learning

1606.05027

Country:

North America > United States (0.28)
North America > Canada > Alberta (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Modeling & Simulation (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Siamese Recurrent Architectures for Learning Sentence Similarity

Mueller, Jonas (Massachusetts Institute of Technology) | Thyagarajan, Aditya (M. S. Ramaiah Institute of Technology)

AAAI ConferencesApr-19-2016

We present a siamese adaptation of the Long Short-Term Memory (LSTM) network for labeled data comprised of pairs of variable-length sequences. Our model is applied to assess semantic similarity between sentences, where we exceed state of the art, outperforming carefully handcrafted features and recently proposed neural network systems of greater complexity. For these applications, we provide word-embedding vectors supplemented with synonymic information to the LSTMs, which use a fixed size vector to encode the underlying meaning expressed in a sentence (irrespective of the particular wording/syntax). By restricting subsequent operations to rely on a simple Manhattan metric, we compel the sentence representations learned by our model to form a highly structured space whose geometry reflects complex semantic relationships. Our results are the latest in a line of findings that showcase LSTMs as powerful language models capable of tasks requiring intricate understanding.

deep learning, neural network, representation, (17 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Principal Differences Analysis: Interpretable Characterization of Differences between Distributions

Mueller, Jonas, Jaakkola, Tommi

arXiv.org Machine LearningOct-29-2015

We introduce principal differences analysis (PDA) for analyzing differences between high-dimensional distributions. The method operates by finding the projection that maximizes the Wasserstein divergence between the resulting univariate populations. Relying on the Cramer-Wold device, it requires no assumptions about the form of the underlying distributions, nor the nature of their inter-class differences. A sparse variant of the method is introduced to identify features responsible for the differences. We provide algorithms for both the original minimax formulation as well as its semidefinite relaxation. In addition to deriving some convergence results, we illustrate how the approach may be applied to identify differences between cell populations in the somatosensory cortex and hippocampus as manifested by single cell RNA-seq. Our broader framework extends beyond the specific choice of Wasserstein divergence.

health & medicine, neurology, wasserstein distance, (20 more...)

arXiv.org Machine Learning

1510.08956

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback