fdr
Beyond Fixed False Discovery Rates: Post-Hoc Conformal Selection with E-Variables
Conformal selection (CS) uses calibration data to identify test inputs whose unobserved outcomes are likely to satisfy a pre-specified minimal quality requirement, while controlling the false discovery rate (FDR). Existing methods fix the target FDR level before observing data, which prevents the user from adapting the balance between number of selected test inputs and FDR to downstream needs and constraints based on the available data. For example, in genomics or neuroimaging, researchers often inspect the distribution of test statistics, and decide how aggressively to pursue candidates based on observed evidence strength and available follow-up resources. To address this limitation, we introduce {post-hoc CS} (PH-CS), which generates a path of candidate selection sets, each paired with a data-driven false discovery proportion (FDP) estimate. PH-CS lets the user select any operating point on this path by maximizing a user-specified utility, arbitrarily balancing selection size and FDR. Building on conformal e-variables and the e-Benjamini-Hochberg (e-BH) procedure, PH-CS is proved to provide a finite-sample post-hoc reliability guarantee whereby the ratio between estimated FDP level and true FDP is, on average, upper bounded by $1$, so that the average estimated FDP is, to first order, a valid upper bound on the true FDR. PH-CS is extended to control quality defined in terms of a general risk. Experiments on synthetic and real-world datasets demonstrate that, unlike CS, PH-CS can consistently satisfy user-imposed utility constraints while producing reliable FDP estimates and maintaining competitive FDR control.
- Asia > Middle East > Jordan (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- North America > United States > Utah > Summit County > Park City (0.04)
- North America > Greenland (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (0.34)
- Research Report > Experimental Study (0.34)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
- North America > United States > New York (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > Washington > King County > Seattle (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
- Health & Medicine > Therapeutic Area > Immunology (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- North America > United States > Illinois (0.04)
- Asia > Middle East > Oman (0.04)
AMDP: An Adaptive Detection Procedure for False Discovery Rate Control in High-Dimensional Mediation Analysis
High-dimensional mediation analysis is often associated with a multiple testing problem for detecting significant mediators. Assessing the uncertainty of this detecting process via false discovery rate (FDR) has garnered great interest. To control the FDR in multiple testing, two essential steps are involved: ranking and selection. Existing approaches either construct p-values without calibration or disregard the joint information across tests, leading to conservation in FDR control or non-optimal ranking rules for multiple hypotheses. In this paper, we develop an adaptive mediation detection procedure (referred to as AMDP) to identify relevant mediators while asymptotically controlling the FDR in high-dimensional mediation analysis. AMDP produces the optimal rule for ranking hypotheses and proposes a data-driven strategy to determine the threshold for mediator selection. This novel method captures information from the proportions of composite null hypotheses and the distribution of p-values, which turns the high dimensionality into an advantage instead of a limitation. The numerical studies on synthetic and real data sets illustrate the performances of AMDP compared with existing approaches.