
Collaborating Authors

 Gehrmann, Sebastian


Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics

arXiv.org Machine Learning

De novo therapeutic design is challenged by a vast chemical repertoire and multiple constraints such as high broad-spectrum potency and low toxicity. We propose CLaSS (Controlled Latent attribute Space Sampling) - a novel and efficient computational method for attribute-controlled generation of molecules, which leverages guidance from classifiers trained on an informative latent space of molecules modeled using a deep generative autoencoder. We further screen the generated molecules by using a set of deep learning classifiers in conjunction with novel physicochemical features derived from high-throughput molecular simulations. The proposed approach is employed for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency, which are emerging drug candidates for tackling antibiotic resistance. Synthesis and wet lab testing of only twenty designed sequences identified two novel and minimalist AMPs with high potency against diverse Gram-positive and Gram-negative pathogens, including the hard-to-treat multidrug-resistant K. pneumoniae, as well as low in vitro and in vivo toxicity. The proposed approach thus presents a viable path for faster discovery of potent and selective broad-spectrum antimicrobials with a higher success rate than state-of-the-art methods.
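The core sampling step can be illustrated with a small sketch: draw latent vectors from the autoencoder's prior and keep only those that attribute classifiers score as both antimicrobial and non-toxic. The snippet below is a hedged, self-contained toy; the classifiers and latent dimensionality are stand-ins, not the paper's trained CLaSS models, and the decoder that would map accepted latents back to peptide sequences is only indicated in a comment.

```python
# Toy sketch of attribute-conditioned rejection sampling in a latent space.
# The attribute classifiers below are random stand-ins, not the trained
# classifiers from the paper, and no decoder is included.
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 32

def attribute_probability(z, w, b):
    """Toy linear classifier head over a latent vector z (placeholder)."""
    return 1.0 / (1.0 + np.exp(-(z @ w + b)))

# Stand-in parameters for two attributes: "antimicrobial" and "toxic".
w_amp, b_amp = rng.normal(size=LATENT_DIM), 0.0
w_tox, b_tox = rng.normal(size=LATENT_DIM), 0.0

def sample_conditioned(n_samples, threshold=0.6, max_tries=100_000):
    """Rejection-sample latent points whose classifier scores satisfy both attributes."""
    accepted = []
    for _ in range(max_tries):
        z = rng.normal(size=LATENT_DIM)                    # draw from the latent prior
        p_amp = attribute_probability(z, w_amp, b_amp)
        p_nontoxic = 1.0 - attribute_probability(z, w_tox, b_tox)
        if p_amp > threshold and p_nontoxic > threshold:
            accepted.append(z)                             # decode(z) would yield a candidate peptide
            if len(accepted) == n_samples:
                break
    return np.stack(accepted) if accepted else np.empty((0, LATENT_DIM))

candidates = sample_conditioned(10)
print(candidates.shape)
```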


Visual Interaction with Deep Learning Models through Collaborative Semantic Inference

arXiv.org Artificial Intelligence

Automation of tasks can have critical consequences when humans lose agency over decision processes. Deep learning models are particularly susceptible since current black-box approaches lack explainable reasoning. We argue that both the visual interface and model structure of deep learning systems need to take interaction design into account. We propose a framework of collaborative semantic inference (CSI) for the co-design of interactions and models to enable visual collaboration between humans and algorithms. The approach exposes the intermediate reasoning process of a model and allows semantic interactions with the visual metaphors of a problem, so that a user can both understand and control parts of the model's reasoning process. We demonstrate the feasibility of CSI with a co-designed case study of a document summarization system.


GLTR: Statistical Detection and Visualization of Generated Text

arXiv.org Artificial Intelligence

The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection-rate of fake text from 54% to 72% without any prior training.

(Figure 1: The top-k overlay within GLTR. It is easy to distinguish sampled from written text. The real text is from the Wikipedia page of The Great British Bake Off, the fake from GPT-2 large with temperature 0.7.)
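The per-token statistic behind the top-k overlay can be reproduced in a few lines: score each token by its rank under a language model's predictive distribution and bucket the ranks. The sketch below assumes the Hugging Face transformers package and the public GPT-2 small checkpoint; GLTR itself adds an interactive interface and further statistics (probability, entropy) on top of this idea.

```python
# Sketch of GLTR-style per-token rank statistics using GPT-2 small.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def token_ranks(text):
    """Return (token, rank) pairs: the rank of each observed token under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                # [1, seq_len, vocab]
    ranks = []
    for pos in range(1, ids.size(1)):             # token at `pos` is predicted from its prefix
        scores = logits[0, pos - 1]
        rank = int((scores > scores[ids[0, pos]]).sum().item()) + 1
        ranks.append((tokenizer.decode(int(ids[0, pos])), rank))
    return ranks

def bucket(rank):
    """Map a rank to GLTR-style top-k buckets (the colors of the overlay)."""
    if rank <= 10:
        return "top-10"
    if rank <= 100:
        return "top-100"
    if rank <= 1000:
        return "top-1000"
    return ">1000"

for tok, rank in token_ranks("The Great British Bake Off is a baking competition."):
    print(f"{tok!r:>15} rank={rank:<6} {bucket(rank)}")
```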


End-to-End Content and Plan Selection for Data-to-Text Generation

arXiv.org Artificial Intelligence

Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG. This problem can be challenging when the form of the structured data varies between examples. This paper presents a survey of several extensions to sequence-to-sequence models to account for the latent content selection process, particularly variants of copy attention and coverage decoding. We further propose a training method based on diverse ensembling to encourage models to learn distinct sentence templates during training. An empirical evaluation of these techniques shows an increase in the quality of generated text across five automated metrics, as well as in human evaluation.
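As a rough illustration of the copy-attention variants surveyed, the pointer-generator style mixture below combines the decoder's vocabulary distribution with probabilities copied from source tokens via attention. Shapes and names are illustrative, and the coverage and diverse-ensembling components are omitted; this is a sketch of the general mechanism, not the paper's exact model.

```python
# Sketch of a copy-attention (pointer-generator style) output distribution.
import torch

def copy_distribution(vocab_logits, attn_weights, src_ids, p_gen):
    """Mix the decoder's vocabulary softmax with probabilities copied from source tokens.

    vocab_logits: [batch, vocab]   decoder output scores
    attn_weights: [batch, src_len] attention over source positions (sums to 1)
    src_ids:      [batch, src_len] vocabulary ids of the source tokens
    p_gen:        [batch, 1]       probability of generating vs. copying
    """
    gen_probs = torch.softmax(vocab_logits, dim=-1) * p_gen
    copy_probs = torch.zeros_like(gen_probs)
    copy_probs.scatter_add_(1, src_ids, attn_weights * (1.0 - p_gen))
    return gen_probs + copy_probs

# Tiny usage example: a 10-word vocabulary and a 4-token source record.
vocab_logits = torch.randn(1, 10)
attn = torch.softmax(torch.randn(1, 4), dim=-1)
src = torch.tensor([[2, 5, 5, 7]])
p_gen = torch.tensor([[0.6]])
print(copy_distribution(vocab_logits, attn, src, p_gen).sum())  # ~1.0, a valid distribution
```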


Bottom-Up Abstractive Summarization

arXiv.org Artificial Intelligence

Neural network-based methods for abstractive summarization produce outputs that are more fluent than other techniques, but which can be poor at content selection. This work proposes a simple technique for addressing this issue: use a data-efficient content selector to over-determine phrases in a source document that should be part of the summary. We use this selector as a bottom-up attention step to constrain the model to likely phrases. We show that this approach improves the ability to compress text, while still generating fluent summaries. This two-step process is both simpler and higher performing than other end-to-end content selection models, leading to significant improvements on ROUGE for both the CNN-DM and NYT corpus. Furthermore, the content selector can be trained with as little as 1,000 sentences, making it easy to transfer a trained summarizer to a new domain.
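The bottom-up attention step can be sketched as a simple masking operation: a separately trained content selector assigns each source token a selection probability, and copy attention is restricted to tokens above a threshold before renormalizing. The selector in the snippet is a hand-set toy stand-in for the tagger described in the paper.

```python
# Sketch of bottom-up attention: mask copy attention with content-selector probabilities.
import torch

def bottom_up_mask(attn_weights, selector_probs, threshold=0.5, eps=1e-10):
    """Zero out copy attention on tokens the content selector did not pick, then renormalize.

    attn_weights:   [batch, src_len] copy attention from the summarizer
    selector_probs: [batch, src_len] per-token selection probabilities
    """
    mask = (selector_probs >= threshold).float()
    masked = attn_weights * mask
    return masked / (masked.sum(dim=-1, keepdim=True) + eps)

attn = torch.softmax(torch.randn(1, 6), dim=-1)
selector = torch.tensor([[0.9, 0.2, 0.7, 0.1, 0.8, 0.3]])  # toy selector output
print(bottom_up_mask(attn, selector))
```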


Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models

arXiv.org Artificial Intelligence

Neural Sequence-to-Sequence models have proven to be accurate and robust for many sequence prediction tasks, and have become the standard approach for automatic translation of text. The models work in a five-stage black-box process that involves encoding a source sequence into a vector space and then decoding out a new target sequence. This process is now standard, but like many deep learning methods it remains quite difficult to understand or debug. In this work, we present a visual analysis tool that allows interaction with a trained sequence-to-sequence model through each stage of the translation process. The aim is to identify which patterns have been learned and to detect model errors. We demonstrate the utility of our tool through several real-world large-scale sequence-to-sequence use cases.
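For intuition, the kind of per-stage artifacts such a debugger needs, encoder states, decoder states, attention weights, and top-k candidate predictions, can be collected from any attention-based model. The toy GRU model below is only a stand-in for illustrating such a trace, not the Seq2Seq-Vis system or its data format.

```python
# Sketch: collect per-stage artifacts from a toy attention-based seq2seq model.
import torch
import torch.nn as nn

VOCAB, HID = 50, 16
embed = nn.Embedding(VOCAB, HID)
encoder = nn.GRU(HID, HID, batch_first=True)
decoder = nn.GRU(HID, HID, batch_first=True)
out_proj = nn.Linear(HID, VOCAB)

def trace_translation(src_ids, tgt_ids, top_k=5):
    """Run one (source, target) pair and record what each stage produced."""
    enc_out, h = encoder(embed(src_ids))                              # stage 1: encode
    dec_out, _ = decoder(embed(tgt_ids), h)                           # stage 2: decode
    attn = torch.softmax(dec_out @ enc_out.transpose(1, 2), dim=-1)   # stage 3: attend
    logits = out_proj(dec_out)                                        # stage 4: predict
    topk = torch.topk(logits, k=top_k, dim=-1)                        # stage 5: candidates for search
    return {
        "encoder_states": enc_out.detach(),
        "decoder_states": dec_out.detach(),
        "attention": attn.detach(),
        "topk_ids": topk.indices.detach(),
    }

trace = trace_translation(torch.randint(0, VOCAB, (1, 7)), torch.randint(0, VOCAB, (1, 5)))
print({k: tuple(v.shape) for k, v in trace.items()})
```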


LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks

arXiv.org Artificial Intelligence

Recurrent neural networks, and in particular long short-term memory (LSTM) networks, are a remarkably effective tool for sequence modeling that learn a dense black-box hidden representation of their sequential input. Researchers interested in better understanding these models have studied the changes in hidden state representations over time and noticed some interpretable patterns but also significant noise. In this work, we present LSTMVIS, a visual analysis tool for recurrent neural networks with a focus on understanding these hidden state dynamics. The tool allows users to select a hypothesis input range to focus on local state changes, to match these state changes to similar patterns in a large data set, and to align these results with structural annotations from their domain. We show several use cases of the tool for analyzing specific hidden state properties on datasets containing nesting, phrase structure, and chord progressions, and demonstrate how the tool can be used to isolate patterns for further statistical analysis. We characterize the domain, the different stakeholders, and their goals and tasks.
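The hypothesis-selection workflow can be approximated in a few lines: pick the hidden dimensions that stay active over a user-selected span, then scan the sequence for other spans where the same dimensions are jointly active. The hidden-state matrix below is synthetic with an injected pattern; in practice it would come from a trained LSTM.

```python
# Sketch of LSTMVis-style hypothesis selection and pattern matching on hidden states.
import numpy as np

rng = np.random.default_rng(0)
states = rng.uniform(-1, 1, size=(200, 64))   # [time_steps, hidden_dim], synthetic stand-in
states[40:60, [3, 7]] = 0.9                   # inject a fake "phrase" pattern for the demo

def select_dims(states, start, end, threshold=0.3):
    """Hidden dimensions above `threshold` at every step of the selected range."""
    return np.where((states[start:end] > threshold).all(axis=0))[0]

def match_spans(states, dims, length, threshold=0.3):
    """Other spans of the same length where all selected dimensions are active."""
    hits = []
    for t in range(states.shape[0] - length):
        if (states[t:t + length, dims] > threshold).all():
            hits.append((t, t + length))
    return hits

dims = select_dims(states, start=40, end=45)
print("selected dimensions:", dims)
print("matching spans:", match_spans(states, dims, length=5)[:5])
```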


Comparing Rule-Based and Deep Learning Models for Patient Phenotyping

arXiv.org Machine Learning

Objective: We investigate whether deep learning techniques for natural language processing (NLP) can be used efficiently for patient phenotyping. Patient phenotyping is a classification task for determining whether a patient has a medical condition, and is a crucial part of secondary analysis of healthcare data. We assess the performance of deep learning algorithms and compare them with classical NLP approaches. Materials and Methods: We compare convolutional neural networks (CNNs), n-gram models, and approaches based on cTAKES that extract pre-defined medical concepts from clinical notes and use them to predict patient phenotypes. The performance is tested on 10 different phenotyping tasks using 1,610 discharge summaries extracted from the MIMIC-III database. Results: CNNs outperform the other phenotyping algorithms in all 10 tasks. The average F1-score of our model is 76 (PPV of 83, sensitivity of 71), up to 37 points higher than alternative approaches. We additionally assess the interpretability of our model by presenting a method that extracts the most salient phrases for a particular prediction. Conclusion: We show that NLP methods based on deep learning improve the performance of patient phenotyping. Our CNN-based algorithm automatically learns the phrases associated with each patient phenotype. As such, it reduces the annotation complexity for clinical domain experts, who are normally required to develop task-specific annotation rules and identify relevant phrases. Our method performs well in terms of both performance and interpretability, which indicates that deep learning is an effective approach to patient phenotyping based on clinicians' notes.
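A minimal version of such a CNN classifier, word embeddings, 1-D convolutions over n-gram windows, max-pooling, and a binary phenotype prediction, is sketched below. Vocabulary size, filter widths, and dimensions are illustrative; the MIMIC-III data and the paper's exact architecture are not reproduced here.

```python
# Sketch of a convolutional text classifier for binary phenotype prediction.
import torch
import torch.nn as nn

class PhenotypeCNN(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, num_filters=64, widths=(2, 3, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, kernel_size=w) for w in widths]
        )
        self.classifier = nn.Linear(num_filters * len(widths), 1)

    def forward(self, token_ids):
        x = self.embed(token_ids).transpose(1, 2)           # [batch, embed_dim, seq_len]
        pooled = [conv(x).relu().max(dim=-1).values for conv in self.convs]
        return self.classifier(torch.cat(pooled, dim=-1))   # phenotype logit

model = PhenotypeCNN()
notes = torch.randint(0, 5000, (2, 300))    # two toy "discharge summaries" of 300 tokens
print(torch.sigmoid(model(notes)))          # probability that the phenotype is present
```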