prevalence
Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments
Fong, Edwin; James, Lancelot F.; Lee, Juho
Modeling sparse count data, which arise across numerous scientific fields, presents significant statistical challenges. This chapter addresses these challenges in the context of infectious disease prediction, with a focus on predicting outbreaks in geographic regions that have historically reported zero cases. To this end, we present the detailed computational framework and experimental application of the Poisson Hierarchical Indian Buffet Process (PHIBP), which has demonstrated success in handling sparse count data in microbiome and ecological studies. The PHIBP's architecture, grounded in the concept of absolute abundance, systematically borrows statistical strength from related regions and circumvents the known sensitivities of relative-rate methods to zero counts. Through a series of experiments on infectious disease data, we show that this principled approach provides a robust foundation for generating coherent predictive distributions and for the effective use of comparative measures such as alpha and beta diversity. The chapter emphasizes algorithmic implementation, and its experimental results confirm that this unified framework delivers both accurate outbreak predictions and meaningful epidemiological insights in data-sparse settings.
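As a rough illustration of the comparative measures mentioned in the abstract, the sketch below computes a plug-in alpha diversity (Shannon entropy) and a beta diversity (Bray-Curtis dissimilarity) directly from raw count vectors. It is not the PHIBP and does not reproduce anything from the chapter: the region names, the counts, and the choice of these two particular indices are assumptions made for the example. It mainly shows why a region with all-zero counts is awkward for naive relative-abundance estimators.

```python
import math

def shannon_alpha(counts):
    """Plug-in Shannon entropy (natural log) of one region's count vector.

    Returns 0.0 for an all-zero vector, where relative abundances are undefined.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis_beta(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length.

    Returns None when both vectors are all zero (the ratio is 0/0).
    """
    denom = sum(a) + sum(b)
    if denom == 0:
        return None
    return sum(abs(x - y) for x, y in zip(a, b)) / denom

# Hypothetical case counts per disease (columns) for three regions (rows).
counts = {
    "region_A": [12, 3, 0, 5],
    "region_B": [10, 0, 1, 4],
    "region_C": [0, 0, 0, 0],  # a region that has reported zero cases so far
}

for name, vec in counts.items():
    print(name, "alpha:", round(shannon_alpha(vec), 3))

print("beta(A, B):", round(bray_curtis_beta(counts["region_A"], counts["region_B"]), 3))
print("beta(A, C):", bray_curtis_beta(counts["region_A"], counts["region_C"]))
```

For the zero-count region, the plug-in alpha collapses to 0 and its dissimilarity to any non-empty region is always 1, regardless of how similar the regions may truly be. This is the kind of degeneracy that motivates borrowing strength across regions in a hierarchical model.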
Provable Recovery of Locally Important Signed Features and Interactions from Random Forest
Vuk, Kata; Ihlo, Nicolas Alexander; Behr, Merle
Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spike Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.
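The idea of counting co-occurrences of signed features along decision paths can be sketched in a few lines. The code below is only a toy illustration under stated assumptions, not the authors' method: it ignores the combination with global patterns and carries none of the recovery guarantees. The function name `frequent_signed_sets`, the path encoding (one list of `(feature, sign)` pairs per tree, with `"+"` meaning the test point fell above the split threshold), and the example paths are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_signed_sets(paths, max_size=2, min_fraction=0.5):
    """Return signed-feature sets that appear on at least `min_fraction` of paths.

    `paths` holds one decision path per tree for a single test point; each path
    is a list of (feature, sign) pairs. Sets of size 1..max_size are counted
    once per path, and the result maps each frequent set to its path fraction.
    """
    counts = Counter()
    for path in paths:
        signed = sorted(set(path))  # de-duplicate repeated splits on a path
        for k in range(1, max_size + 1):
            for combo in combinations(signed, k):
                counts[combo] += 1
    n = len(paths)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_fraction}

# Hypothetical decision paths of one test point through four trees.
paths = [
    [("x1", "+"), ("x2", "-"), ("x5", "+")],
    [("x1", "+"), ("x2", "-")],
    [("x1", "+"), ("x3", "+")],
    [("x2", "-"), ("x1", "+"), ("x4", "-")],
]

result = frequent_signed_sets(paths)
for combo, fraction in sorted(result.items(), key=lambda kv: -kv[1]):
    print(combo, fraction)
```

In this toy input, large `x1` appears on every path and co-occurs with small `x2` on three of the four, so the pair surfaces as a candidate signed interaction for this test point.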
Are generative AI text annotations systematically biased?
Stolwijk, Sjoerd B.; Boukes, Mark; Trilling, Damian
This paper investigates bias in GLLM annotations by conceptually replicating the manual annotations of Boukes (2024). We use various GLLMs (Llama3.1:8b, Llama3.3:70b, GPT4o, Qwen2.5:72b) in combination with five different prompts for five concepts (political content, interactivity, rationality, incivility, and ideology). We find that GLLMs perform adequately in terms of F1 scores, but differ from manual annotations in terms of prevalence, yield substantively different downstream results, and display systematic bias in that they overlap more with each other than with manual annotations. Differences in F1 scores fail to account for the degree of bias.
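A small worked example may help show how an adequate F1 score can coexist with a clearly different prevalence estimate. The labels below are invented for illustration and are unrelated to the paper's data; `gold` stands in for manual annotations and `pred` for a model's annotations of the same 100 items.

```python
def f1_score(gold, pred):
    """F1 for the positive class (label 1) of two equal-length binary label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def prevalence(labels):
    """Share of items labeled positive."""
    return sum(labels) / len(labels)

# Hypothetical data: 20 true positives and 80 true negatives in the manual coding.
gold = [1] * 20 + [0] * 80
# The model finds 18 of the 20 positives but also flags 12 of the negatives.
pred = [1] * 18 + [0] * 2 + [1] * 12 + [0] * 68

print("F1:", round(f1_score(gold, pred), 2))           # 0.72
print("manual prevalence:", prevalence(gold))           # 0.2
print("model prevalence:", prevalence(pred))            # 0.3
```

Here precision is 18/30 = 0.6 and recall is 18/20 = 0.9, giving F1 = 0.72, yet the model puts prevalence at 30% against 20% in the manual coding. Any downstream analysis that relies on how often a concept occurs would inherit that gap.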