Although data-sharing is crucial for making the best use of genetic data in diagnosing disease, many individuals who might donate data are concerned about privacy. Jagadeesh et al. describe a solution that combines a protocol from modern cryptography with frequency-based clinical genetics used to diagnose causal disease mutations in patients with monogenic disorders. This framework correctly identified the causal gene in cases involving actual patients, while protecting more than 99% of individual participants' most private variants.
Luo, Zhiyi (Shanghai Jiao Tong University) | Sha, Yuchen (Shanghai Jiao Tong University) | Zhu, Kenny Q. (Shanghai Jiao Tong University) | Hwang, Seung-Won (Yonsei University) | Wang, Zhongyuan (Microsoft Research Asia)
Commonsense causal reasoning is the process of capturing and understanding the causal dependencies amongst events and actions. Such events and actions can be expressed as terms, phrases, or sentences in natural language text. Therefore, one possible way of obtaining causal knowledge is by extracting causal relations between terms or phrases from a large text corpus. However, causal relations in text are sparse, ambiguous, and sometimes implicit, and thus difficult to obtain. This paper attacks the problem of commonsense causal reasoning between short texts (phrases and sentences) using a data-driven approach. We propose a framework that automatically harvests a network of cause-effect terms from a large web corpus. Backed by this network, we propose a novel and effective metric to properly model the causality strength between terms. We show that these signals can be aggregated for causal reasoning between short texts, including sentences and phrases. In particular, our approach outperforms all previously reported results on the standard SEMEVAL COPA task by substantial margins.
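To make the idea of term-level causal signals aggregated over short texts concrete, here is a minimal sketch. It is not the metric proposed in the paper: the pair list is invented toy data, the scoring function is a plain directional pointwise mutual information, and the names `term_strength` and `text_strength` are hypothetical.

```python
import math
from collections import Counter

# Toy (cause, effect) term pairs, standing in for pairs harvested from a corpus.
# These counts are invented for illustration only.
pairs = [
    ("rain", "flood"), ("rain", "flood"), ("rain", "wet"),
    ("storm", "flood"),
    ("fire", "smoke"), ("fire", "smoke"), ("fire", "burn"),
]

pair_counts = Counter(pairs)
cause_counts = Counter(cause for cause, _ in pairs)
effect_counts = Counter(effect for _, effect in pairs)
total = len(pairs)


def term_strength(cause, effect):
    """Directional PMI of (cause -> effect), floored at 0; unseen pairs score 0."""
    n = pair_counts[(cause, effect)]
    if n == 0:
        return 0.0
    pmi = math.log(n * total / (cause_counts[cause] * effect_counts[effect]))
    return max(pmi, 0.0)


def text_strength(premise, hypothesis):
    """Average term-pair strength over all (premise word, hypothesis word) pairs."""
    p_words = premise.lower().split()
    h_words = hypothesis.lower().split()
    if not p_words or not h_words:
        return 0.0
    score = sum(term_strength(c, e) for c in p_words for e in h_words)
    return score / (len(p_words) * len(h_words))


def choose_effect(premise, alternatives):
    """COPA-style choice: pick the alternative with the highest causal score."""
    return max(alternatives, key=lambda alt: text_strength(premise, alt))
```

Because the score is directional, `term_strength("rain", "flood")` is positive while `term_strength("flood", "rain")` is zero in this toy data, which is the property that separates a causal score from a symmetric association score.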
Undetected errors in the expression measurements from high-throughput DNA microarrays and protein spectroscopy could seriously affect diagnostic reliability in disease detection. In addition to being highly resilient against such errors, diagnostic models need to be more comprehensible so that a deeper understanding of the causal interactions among biological entities like genes and proteins may be possible. In this paper, we introduce a robust knowledge discovery approach that addresses these challenges. First, the causal interactions among the genes and proteins in the noisy expression data are discovered automatically through Bayesian network learning. Then, the diagnosis of a disease based on the network is performed using a novel error-handling procedure, which automatically identifies the noisy measurements and accounts for their uncertainties during diagnosis. An application to the problem of ovarian cancer detection shows that the approach effectively discovers causal interactions among cancer-specific proteins. With the proposed error-handling procedure, the network perfectly distinguishes between cancer and normal patients.
The probabilistic causal graph model for representing, reasoning with, and learning causal structure using Bayesian networks is briefly described. It is then argued that this model is closely related to how humans reason with and learn causal structure. It is shown that studies in psychology on discounting (reasoning concerning how the presence of one cause of an effect makes another cause less probable) support the hypothesis that humans reach the same judgments as algorithms for doing inference in Bayesian networks. Next, it is shown how studies by Piaget indicate that humans learn causal structure by observing the same independencies and dependencies as those used by certain algorithms for learning the structure of a Bayesian network. Based on this indication, a subjective definition of causality is put forward. Finally, methods for further testing the accuracy of these claims are discussed.
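Discounting, as defined above, can be illustrated with a small worked example. The sketch below assumes a three-node network with two independent causes A and B and a common effect E; the probabilities are invented for illustration and the function names are hypothetical. Inference is done by brute-force enumeration of the joint distribution rather than by any particular algorithm from the paper.

```python
from itertools import product

# Assumed toy parameters (not taken from the source).
P_A = 0.1  # prior probability that cause A is present
P_B = 0.1  # prior probability that cause B is present
# P(E = 1 | A, B): the effect is likely if either cause is present.
P_E_GIVEN = {(0, 0): 0.01, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}


def joint(a, b, e):
    """Joint probability P(A=a, B=b, E=e) for the network A -> E <- B."""
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pe = P_E_GIVEN[(a, b)] if e else 1 - P_E_GIVEN[(a, b)]
    return pa * pb * pe


def posterior_a(evidence):
    """P(A=1 | evidence), where evidence maps 'b' and/or 'e' to 0 or 1."""
    numerator = 0.0
    denominator = 0.0
    for a, b, e in product((0, 1), repeat=3):
        world = {"a": a, "b": b, "e": e}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this world contradicts the evidence
        p = joint(a, b, e)
        denominator += p
        if a:
            numerator += p
    return numerator / denominator


p_effect_only = posterior_a({"e": 1})            # effect observed
p_effect_and_b = posterior_a({"e": 1, "b": 1})   # effect and the other cause observed
```

With these assumed numbers, observing the effect alone raises the probability of A from 0.1 to about 0.50, and additionally observing B brings it back down to about 0.12. That drop is the discounting pattern the abstract refers to.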