poisson
Performance of weakly-supervised electronic health record-based phenotyping methods in rare-outcome settings
Hong, Yunjing, Nelson, Jennifer C., Williamson, Brian D.
Accurately identifying patients with specific medical conditions is a key challenge when using clinical data from electronic health records. Our objective was to comprehensively assess when weakly-supervised prediction methods, which use silver-standard labels (proxy measures of the true outcome) rather than gold-standard true labels, perform well in rare-outcome settings like vaccine safety studies. We compared three methods (PheNorm, MAP, and sureLDA) that combine structured features and features derived from clinical text using natural language processing, through an extensive simulation study with data-generating mechanisms ranging from simple to complex, varying outcome rates, and varying degrees of informative silver labels. We also considered using predicted probabilities to design a chart review validation study. No single method dominated the other across all prediction performance metrics. Probability-guided sampling selected a cohort enriched for patients with more mentions of important concepts in chart notes. SureLDA, the most complex of the three algorithms we considered, often performed well in simulations. Performance depended greatly on selected tuning parameters. Care should be taken when using weakly-supervised prediction methods in rare-outcome settings, particularly if the probabilities will be used in downstream analysis, but these methods can work well when silver labels are strong predictors of true outcomes.
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Alaska (0.04)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Health Care Technology > Medical Record (1.00)
A Power-Weighted Noncentral Complex Gaussian Distribution
The complex Gaussian distribution has been widely used as a fundamental spectral and noise model in signal processing and communication. However, its Gaussian structure often limits its ability to represent the diverse amplitude characteristics observed in individual source signals. On the other hand, many existing non-Gaussian amplitude distributions derived from hyperspherical models achieve good empirical fit due to their power-law structures, while they do not explicitly account for the complex-plane geometry inherent in complex-valued observations. In this paper, we propose a new probabilistic model for complex-valued random variables, which can be interpreted as a power-weighted noncentral complex Gaussian distribution. Unlike conventional hyperspherical amplitude models, the proposed model is formulated directly on the complex plane and preserves the geometric structure of complex-valued observations while retaining a higher-dimensional interpretation. The model introduces a nonlinear phase diffusion through a single shape parameter, enabling continuous control of the distributional geometry from arc-shaped diffusion along the phase direction to concentration of probability mass toward the origin. We formulate the proposed distribution and analyze the statistical properties of the induced amplitude distribution. The derived amplitude and power distributions provide a unified framework encompassing several widely used distributions in signal modeling, including the Rice, Nakagami, and gamma distributions. Experimental results on speech power spectra demonstrate that the proposed model consistently outperforms conventional distributions in terms of log-likelihood.
Exponential Family Discriminant Analysis: Generalizing LDA-Style Generative Classification to Non-Gaussian Models
We introduce Exponential Family Discriminant Analysis (EFDA), a unified generative framework that extends classical Linear Discriminant Analysis (LDA) beyond the Gaussian setting to any member of the exponential family. Under the assumption that each class-conditional density belongs to a common exponential family, EFDA derives closed-form maximum-likelihood estimators for all natural parameters and yields a decision rule that is linear in the sufficient statistic, recovering LDA as a special case and capturing nonlinear decision boundaries in the original feature space. We prove that EFDA is asymptotically calibrated and statistically efficient under correct specification, and we generalise it to $K \geq 2$ classes and multivariate data. Through extensive simulation across five exponential-family distributions (Weibull, Gamma, Exponential, Poisson, Negative Binomial), EFDA matches the classification accuracy of LDA, QDA, and logistic regression while reducing Expected Calibration Error (ECE) by $2$-$6\times$, a gap that is structural: it persists for all $n$ and across all class-imbalance levels, because misspecified models remain asymptotically miscalibrated. We further prove and empirically confirm that EFDA's log-odds estimator approaches the Cramér-Rao bound under correct specification, and is the only estimator in our comparison whose mean squared error converges to zero. Complete derivations are provided for nine distributions. Finally, we formally verify all four theoretical propositions in Lean 4, using Aristotle (Harmonic) and OpenGauss (Math, Inc.) as proof generators, with all outputs independently machine-checked by AXLE (Axiom).
- North America > United States > Virginia (0.04)
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
- Europe > France (0.04)
- North America (0.13)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- North America > United States (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (4 more...)
A Normative Theory for Causal Inference and Bayes Factor Computation in Neural Circuits
Wenhao Zhang, Si Wu, Brent Doiron, Tai Sing Lee
This study provides a normative theory for how Bayesian causal inference can be implemented in neural circuits. In both cognitive processes such as causal reasoning and perceptual inference such ascue integration, the nervous systems need to choose different models representing the underlying causal structures when making inferences on external stimuli.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Jordan (0.04)
- Oceania > Australia (0.04)
- North America > United States > Utah (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)