Mazaitis, Kathryn
Computationally Assisted Quality Control for Public Health Data Streams
Joshi, Ananya, Mazaitis, Kathryn, Rosenfeld, Roni, Wilder, Bryan
Irregularities in public health data streams (like COVID-19 cases) hamper data-driven decision-making for public health stakeholders. A real-time, computer-generated list of the most important, outlying data points from thousands of daily-updated public health data streams could assist an expert reviewer in identifying these irregularities. However, existing outlier detection frameworks perform poorly on this task because they account for neither the data volume nor the statistical properties of public health streams. Accordingly, we developed FlaSH (Flagging Streams in public Health), a practical outlier detection framework for public health data users that uses simple, scalable models to capture these statistical properties explicitly. In an experiment where human experts evaluate FlaSH and existing methods (including deep learning approaches), FlaSH scales to the data volume of this task, matches or exceeds these other methods in mean accuracy, and identifies the outlier points that users empirically rate as more helpful. Based on these results, FlaSH has been deployed on data streams used by public health stakeholders.
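To make the idea of "simple, scalable models" concrete, here is a minimal sketch of one such check: score each new point by its deviation from an exponentially weighted moving average (EWMA) forecast, then surface the top-k outliers across all streams. The EWMA model, function names, and parameters here are illustrative assumptions, not FlaSH's actual API.

import heapq
import math

def ewma_outlier_score(history, new_value, alpha=0.3, eps=1e-8):
    """Return a |z|-like score of new_value vs. an EWMA forecast of history."""
    mean, var = history[0], 0.0
    for x in history[1:]:
        err = x - mean
        mean += alpha * err                            # EWMA forecast update
        var = (1 - alpha) * (var + alpha * err * err)  # EW variance of residuals
    return abs(new_value - mean) / math.sqrt(var + eps)

def top_k_outliers(streams, k=10):
    """streams: {stream_id: list of daily counts}; rank today's points."""
    scored = ((ewma_outlier_score(v[:-1], v[-1]), sid)
              for sid, v in streams.items() if len(v) > 2)
    return heapq.nlargest(k, scored)

# Example: a sudden spike in one stream should rank first.
streams = {"county_a": [10, 12, 11, 13, 12, 90],
           "county_b": [50, 52, 49, 51, 50, 53]}
print(top_k_outliers(streams, k=2))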
Conversational Neuro-Symbolic Commonsense Reasoning
Arabshahi, Forough, Lee, Jennifer, Gawarecki, Mikayla, Mazaitis, Kathryn, Azaria, Amos, Mitchell, Tom
One aspect of human commonsense reasoning is the ability to make presumptions about daily experiences, activities, and social interactions with others. We propose a new commonsense reasoning benchmark where the task is to uncover commonsense presumptions implied by imprecisely stated natural language commands in the form of if-then-because statements. For example, in the command "If it snows at night then wake me up early because I don't want to be late for work", the speaker relies on the listener's commonsense reasoning to infer the implicit presumption that it must snow enough to cause traffic slowdowns. Such if-then-because commands are particularly important when users instruct conversational agents. We release a benchmark data set for this task, collected from humans and annotated with commonsense presumptions. We develop a neuro-symbolic theorem prover that extracts multi-hop reasoning chains and apply it to this problem. We further develop an interactive conversational framework that elicits commonsense knowledge from humans to complete reasoning chains.
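As a rough illustration of how missing presumptions can fall out of a multi-hop reasoning chain, here is a toy backward-chaining sketch over hand-written Horn rules: any premise that is neither a stated fact nor derivable is returned as an implicit presumption. The rules and predicate names are invented for the example and do not reflect the paper's theorem prover.

# Toy rule base: head <- body (all predicate names are made up).
RULES = [
    ("late_for_work", ["traffic_slowdown", "leave_at_usual_time"]),
    ("traffic_slowdown", ["snow_at_night", "snow_is_heavy"]),
]

def prove(goal, facts, rules):
    """Return (chain of rule heads used, missing presumptions) for goal."""
    if goal in facts:
        return [], []
    for head, body in rules:
        if head == goal:
            chain, missing = [head], []
            for premise in body:
                c, m = prove(premise, facts, rules)
                chain += c
                missing += m
            return chain, missing
    return [], [goal]  # neither a fact nor derivable: an implicit presumption

# "If it snows at night ... because I don't want to be late for work":
chain, presumed = prove("late_for_work", {"snow_at_night"}, RULES)
print(chain)     # ['late_for_work', 'traffic_slowdown']
print(presumed)  # ['snow_is_heavy', 'leave_at_usual_time']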
Bootstrapping Distantly Supervised IE Using Joint Learning and Small Well-Structured Corpora
Bing, Lidong (Tencent Inc.) | Dhingra, Bhuwan (Carnegie Mellon University) | Mazaitis, Kathryn (Carnegie Mellon University) | Park, Jong Hyuk (Carnegie Mellon University) | Cohen, William W. (Carnegie Mellon University)
We propose a framework to improve the performance of distantly-supervised relation extraction by jointly learning to solve two related tasks: concept-instance extraction and relation extraction. We further extend this framework to make novel use of document structure: in some small, well-structured corpora, sections can be identified that correspond to relation arguments, and distantly-labeled examples from such sections tend to have good precision. Using these as seeds, we extract additional relation examples by applying label propagation on a graph composed of noisy examples extracted from a large unstructured testing corpus. Combined with the soft constraint that concept examples should have the same type as the second argument of the relation, we obtain significant improvements over several state-of-the-art approaches to distantly-supervised relation extraction, and reasonable extraction performance even with a very small set of distant labels.
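The label-propagation step can be sketched as follows, assuming a similarity graph over candidate extractions: seeds from well-structured sections keep their labels clamped while scores diffuse to noisy candidates. The function, toy graph, and parameter values are illustrative, not the paper's implementation.

import numpy as np

def label_propagation(W, seed_scores, seed_mask, alpha=0.8, iters=50):
    """W: symmetric affinity matrix; seed_scores in [0, 1]; seed_mask: bool array."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))  # symmetric normalization
    f = seed_scores.copy()
    for _ in range(iters):
        f = alpha * S @ f + (1 - alpha) * seed_scores
        f[seed_mask] = seed_scores[seed_mask]  # clamp trusted seeds
    return f

# Toy graph: node 0 is a trusted seed; nodes 1-2 are noisy candidates.
W = np.array([[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.9],
              [0.2, 0.9, 0.0]])
seeds = np.array([1.0, 0.0, 0.0])
mask = np.array([True, False, False])
print(label_propagation(W, seeds, mask))  # node 1 scores higher than node 2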
A Soft Version of Predicate Invention Based on Structured Sparsity
Wang, William Yang (Carnegie Mellon University) | Mazaitis, Kathryn (Carnegie Mellon University) | Cohen, William W. (Carnegie Mellon University)
In predicate invention (PI), new predicates are introduced into a logical theory, usually by rewriting a group of closely-related rules to use a common invented predicate as a "subroutine". PI is difficult, since a poorly-chosen invented predicate may lead to error cascades. Here we suggest a "soft" version of predicate invention: instead of explicitly creating new predicates, we implicitly group closely-related rules by using structured sparsity to regularize their parameters together. We show that soft PI, unlike hard PI, consistently improves over previous strong baselines for structure-learning on two large-scale tasks.
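A minimal sketch of the "soft grouping" idea: a group-lasso penalty ties the weights of closely-related rules together, shrinking whole groups toward zero instead of creating an explicit invented predicate. The proximal-gradient framing, grouping, and numbers below are illustrative assumptions, not the paper's code.

import numpy as np

def group_lasso_prox(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 (block soft-thresholding)."""
    w = w.copy()
    for g in groups:  # g: indices of one group of closely-related rules
        norm = np.linalg.norm(w[g])
        w[g] = 0.0 if norm <= lam else w[g] * (1 - lam / norm)
    return w

# Rules 0-2 are near-duplicates (one group); rule 3 stands alone.
w = np.array([0.05, 0.04, 0.06, 0.9])
groups = [[0, 1, 2], [3]]
print(group_lasso_prox(w, groups, lam=0.1))
# -> the near-duplicate group is zeroed out together; rule 3 survives.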
Never-Ending Learning
Mitchell, Tom M. (Carnegie Mellon University) | Cohen, William (Carnegie Mellon University) | Hruschka, Estevam (Federal University of Sao Carlos) | Talukdar, Partha (Indian Institute of Science) | Betteridge, Justin (Carnegie Mellon University) | Carlson, Andrew (Google) | Mishra, Bhavana Dalvi (Carnegie Mellon University) | Gardner, Matthew (Carnegie Mellon University) | Kisiel, Bryan (Carnegie Mellon University) | Krishnamurthy, Jayant (Carnegie Mellon University) | Lao, Ni (Google) | Mazaitis, Kathryn (Carnegie Mellon University) | Mohamed, Thahir (Carnegie Mellon University) | Nakashole, Ndapa (Carnegie Mellon University) | Platanios, Emmanouil Antonios (Carnegie Mellon University) | Ritter, Alan (Ohio State University) | Samadi, Mehdi (Carnegie Mellon University) | Settles, Burr (Duolingo) | Wang, Richard (Carnegie Mellon University) | Wijaya, Derry (Carnegie Mellon University) | Gupta, Abhinav (Carnegie Mellon University) | Chen, Xinlei (Carnegie Mellon University) | Saparov, Abulhair (Carnegie Mellon University) | Greaves, Malcolm (Alpine Data Lab) | Welling, Joel (Pittsburgh Supercomputing Center)
Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning performed by humans. As a case study, we describe the Never-Ending Language Learner (NELL), which achieves some of the desired properties of a never-ending learner, and we discuss lessons learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g., servedWith(tea, biscuits)). NELL has also learned millions of features and parameters that enable it to read these beliefs from the web. Additionally, it has learned to reason over these beliefs to infer new beliefs, and is able to extend its ontology by synthesizing new relational predicates. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on Twitter at @CMUNELL.
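A toy sketch of what a confidence-weighted knowledge base and one inference step over it might look like, in the spirit of the servedWith(tea, biscuits) example; the predicates, the hand-written rule, and the min-confidence combination are assumptions made for illustration, not NELL's actual inference machinery.

# Beliefs as (relation, arg1, arg2) triples mapped to confidences.
beliefs = {
    ("servedWith", "tea", "biscuits"): 0.92,
    ("isA", "tea", "beverage"): 0.98,
}

# Hand-written rule: isA(X, beverage) & servedWith(X, Y) => isA(Y, food)
def infer_is_food(kb):
    derived = {}
    for (rel, x, y), conf in kb.items():
        if rel == "servedWith":
            c_bev = kb.get(("isA", x, "beverage"), 0.0)
            if c_bev > 0.0:
                derived[("isA", y, "food")] = min(conf, c_bev)  # conservative combine
    return derived

print(infer_is_food(beliefs))  # {('isA', 'biscuits', 'food'): 0.92}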
ProPPR: Efficient First-Order Probabilistic Logic Programming for Structure Discovery, Parameter Learning, and Scalable Inference
Wang, William Yang (Carnegie Mellon University) | Mazaitis, Kathryn (Carnegie Mellon University) | Cohen, William W. (Carnegie Mellon University)
A key challenge in statistical relational learning is to develop a semantically rich formalism that supports efficient probabilistic reasoning over large collections of extracted information. This paper presents a new, scalable probabilistic logic called ProPPR, which extends stochastic logic programs (SLPs) to a framework that enables efficient learning and inference on graphs. Using an abductive second-order probabilistic logic, we show that first-order theories can be automatically generated via parameter learning; that weight learning can be performed using parallel stochastic gradient descent with a supervised personalized PageRank algorithm; and, most importantly, that queries can be approximately grounded with a small graph, making inference independent of the size of the database.
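The local-grounding intuition can be sketched with an approximate personalized PageRank "push" that touches only a small neighborhood of the query node, so the grounded graph stays small regardless of database size. The graph, tolerance, and push variant below are illustrative; this is not ProPPR's implementation.

from collections import defaultdict, deque

def approx_ppr(graph, seed, alpha=0.15, eps=1e-4):
    """graph: {node: [neighbors]}; returns sparse PPR estimates from seed."""
    p, r = defaultdict(float), defaultdict(float)  # estimates and residuals
    r[seed] = 1.0
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg = len(graph.get(u, []))
        if deg == 0 or r[u] / deg < eps:
            continue  # residual too small to push further
        p[u] += alpha * r[u]
        share = (1 - alpha) * r[u] / deg
        r[u] = 0.0
        for v in graph[u]:
            r[v] += share
            if r[v] / max(len(graph.get(v, [])), 1) >= eps:
                queue.append(v)
    return dict(p)

graph = {"q": ["a", "b"], "a": ["q", "c"], "b": ["q"], "c": ["a"]}
print(approx_ppr(graph, "q"))  # mass concentrated near the query node "q"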