Sacaleanu, Bogdan
CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models
Feng, Steven Y., Khetan, Vivek, Sacaleanu, Bogdan, Gershman, Anatole, Hovy, Eduard
We motivate and introduce CHARD: Clinical Health-Aware Reasoning across Dimensions, to investigate the capability of text generation models to act as implicit clinical knowledge bases and generate free-flow textual explanations about various health-related conditions across several dimensions. We collect and present an associated dataset, CHARDat, consisting of explanations about 52 health conditions across three clinical dimensions. We conduct extensive experiments using BART and T5 along with data augmentation, and perform automatic, human, and qualitative analyses. We show that while our models can perform decently, CHARD is very challenging with strong potential for further exploration.
Cross-Domain Reasoning via Template Filling
Rajagopal, Dheeraj, Khetan, Vivek, Sacaleanu, Bogdan, Gershman, Anatole, Fano, Andrew, Hovy, Eduard
In this paper, we explore the ability of sequence-to-sequence models to perform cross-domain reasoning. To this end, we present a prompt-template-filling approach that enables sequence-to-sequence models to perform cross-domain reasoning. We also present a case study with the commonsense and health and well-being domains, where we study how prompt-template-filling enables pretrained sequence-to-sequence models to reason across domains. Our experiments across several pretrained encoder-decoder models show that cross-domain reasoning is challenging for current models. We also present an in-depth error analysis and avenues for future research on reasoning across domains.
Risk Event and Probability Extraction for Modeling Medical Risks
Jochim, Charles (IBM Research – Ireland) | Sacaleanu, Bogdan (IBM Research – Ireland) | Deleris, Léa A. (IBM Research – Ireland)
In this paper we address the task of extracting risk events and probabilities from free text, focusing in particular on the biomedical domain. While our initial motivation is to enable the determination of the parameters of a Bayesian belief network, our approach is not specific to that use case. We are the first to investigate this task as a sequence tagging problem where we label spans of text as events A or B that are then used to construct probability statements of the form P(A|B)=x. We show that our approach significantly outperforms an entity extraction baseline on a new annotated medical risk event corpus. We also explore semi-supervised methods that lead to modest improvement, encouraging further work in this direction.
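To make the tagging formulation above concrete, the following is a minimal sketch of how tagged spans could be turned into a statement of the form P(A|B)=x. It is not the authors' implementation: the BIO tagging scheme, the `PROB` label, the function names, and the toy sentence are all assumptions introduced here for illustration, and the tags are written by hand rather than predicted by a model.

```python
from typing import Dict, List, Optional


def collect_spans(tokens: List[str], tags: List[str]) -> Dict[str, List[str]]:
    """Group BIO-tagged tokens into {label: [span text, ...]} (assumed scheme)."""
    spans: Dict[str, List[str]] = {}
    label: Optional[str] = None
    current: List[str] = []

    def flush() -> None:
        # Store the span collected so far, if any.
        if label is not None:
            spans.setdefault(label, []).append(" ".join(current))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            label, current = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            current.append(token)
        else:
            # An "O" tag (or an inconsistent "I-" tag) closes the open span.
            flush()
            label, current = None, []
    flush()
    return spans


def probability_statement(tokens: List[str], tags: List[str]) -> Optional[str]:
    """Build 'P(A | B) = x' from the first A, B, and PROB spans, if all exist."""
    spans = collect_spans(tokens, tags)
    if not all(name in spans for name in ("A", "B", "PROB")):
        return None
    return f"P({spans['A'][0]} | {spans['B'][0]}) = {spans['PROB'][0]}"


# Invented toy sentence with hand-written tags (not taken from the corpus).
tokens = ["10%", "of", "patients", "with", "condition", "X", "develop", "outcome", "Y"]
tags = ["B-PROB", "O", "O", "O", "B-B", "I-B", "O", "B-A", "I-A"]
print(probability_statement(tokens, tags))
```

In this sketch the conditioning event B and the outcome event A are simply the first spans found with each label; handling sentences with several events or probabilities would need a pairing step that the abstract does not describe.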