Causal Inference through a Witness Protection Program Machine Learning

One of the most fundamental problems in causal inference is the estimation of a causal effect when variables are confounded. This is difficult in an observational study, because one has no direct evidence that all confounders have been adjusted for. We introduce a novel approach for estimating causal effects that exploits observational conditional independencies to suggest "weak" paths in a unknown causal graph. The widely used faithfulness condition of Spirtes et al. is relaxed to allow for varying degrees of "path cancellations" that imply conditional independencies but do not rule out the existence of confounding causal paths. The outcome is a posterior distribution over bounds on the average causal effect via a linear programming approach and Bayesian inference. We claim this approach should be used in regular practice along with other default tools in observational studies.

Estimating the Size of a Large Network and its Communities from a Random Sample

Neural Information Processing Systems

Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V;E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.

Fuzzy expert system for prediction of prostate cancer Artificial Intelligence

A fuzzy expert system (FES) for the prediction of prostate cancer (PC) is prescribed in this article. Age, prostate-specific antigen (PSA), prostate volume (PV) and $\%$ Free PSA ($\%$FPSA) are fed as inputs into the FES and prostate cancer risk (PCR) is obtained as the output. Using knowledge based rules in Mamdani type inference method the output is calculated. If PCR $\ge 50\%$, then the patient shall be advised to go for a biopsy test for confirmation. The efficacy of the designed FES is tested against a clinical data set. The true prediction for all the patients turns out to be $68.91\%$ whereas only for positive biopsy cases it rises to $73.77\%$. This simple yet effective FES can be used as supportive tool for decision making in medical diagnosis.

A rational decision making framework for inhibitory control

Neural Information Processing Systems

Intelligent agents are often faced with the need to choose actions with uncertain consequences, and to modify those actions according to ongoing sensory processing and changing task demands. The requisite ability to dynamically modify or cancel planned actions is known as inhibitory control in psychology. We formalize inhibitory control as a rational decision-making problem, and apply to it to the classical stop-signal task. Using Bayesian inference and stochastic control tools, we show that the optimal policy systematically depends on various parameters of the problem, such as the relative costs of different action choices, the noise level of sensory inputs, and the dynamics of changing environmental demands. Our normative model accounts for a range of behavioral data in humans and animals in the stop-signal task, suggesting that the brain implements statistically optimal, dynamically adaptive, and reward-sensitive decision-making in the context of inhibitory control problems.

PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data

Neural Information Processing Systems

We propose a "soft greedy" learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a nontrivial tradeoff between sparsity (the number of rays used) and the magnitude ofthe separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets.