Information-Theoretic Limits of Safety Verification for Self-Improving Systems
Can a safety gate permit unbounded beneficial self-modification while maintaining bounded cumulative risk? We formalize this question through dual conditions -- requiring sum delta_n < infinity (bounded risk) and sum TPR_n = infinity (unbounded utility) -- and establish a theory of their (in)compatibility. Classification impossibility (Theorem 1): For power-law risk schedules delta_n = O(n^{-p}) with p > 1, any classifier-based gate under overlapping safe/unsafe distributions satisfies TPR_n <= C_alpha * delta_n^beta via Holder's inequality, forcing sum TPR_n < infinity. This impossibility is exponent-optimal (Theorem 3). A second independent proof via the NP counting method (Theorem 4) yields a 13% tighter bound without Holder's inequality. Universal finite-horizon ceiling (Theorem 5): For any summable risk schedule, the exact maximum achievable classifier utility is U*(N, B) = N * TPR_NP(B/N), growing as exp(O(sqrt(log N))) -- subpolynomial. At N = 10^6 with budget B = 1.0, a classifier extracts at most U* ~ 87 versus a verifier's ~500,000. Verification escape (Theorem 2): A Lipschitz ball verifier achieves delta = 0 with TPR > 0, escaping the impossibility. Formal Lipschitz bounds for pre-LayerNorm transformers under LoRA enable LLM-scale verification. The separation is strict. We validate on GPT-2 (d_LoRA = 147,456): conditional delta = 0 with TPR = 0.352. Comprehensive empirical validation is in the companion paper [D2].
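The finite-horizon ceiling U*(N, B) = N * TPR_NP(B/N) can be illustrated with a small numerical sketch. The abstract does not state which safe/unsafe score distributions define TPR_NP, so the code below assumes, purely for illustration, an equal-variance Gaussian model: unsafe scores drawn from N(0, 1) and safe scores from N(d, 1). The separation `d` and the function names are hypothetical and not taken from the paper.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inverse cdf


def tpr_np(alpha: float, d: float) -> float:
    """Neyman-Pearson TPR at false-acceptance rate alpha.

    ASSUMPTION (not from the paper): unsafe scores ~ N(0, 1), safe scores
    ~ N(d, 1), and the gate accepts when the score exceeds a threshold.
    Under that model the optimal ROC curve is Phi(Phi^{-1}(alpha) + d).
    """
    if alpha <= 0.0:
        return 0.0
    if alpha >= 1.0:
        return 1.0
    return _STD.cdf(_STD.inv_cdf(alpha) + d)


def utility_ceiling(n_steps: int, budget: float, d: float) -> float:
    """U*(N, B) = N * TPR_NP(B / N): the risk budget B spread evenly over N steps."""
    return n_steps * tpr_np(budget / n_steps, d)


if __name__ == "__main__":
    # Hypothetical separation d = 1.0; the ceiling grows far more slowly than N.
    for n in (10**2, 10**4, 10**6):
        print(n, utility_ceiling(n, budget=1.0, d=1.0))
```

Under this assumed model the per-step acceptance rate U*/N shrinks as N grows, which is the qualitative behaviour the subpolynomial ceiling describes; the exact values depend entirely on the assumed distributions.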
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
- North America > United States > Vermont (0.05)
- Europe > United Kingdom > England (0.04)
- Asia > Singapore (0.04)
- Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance (1.00)
'What is the mission?' With Iran, California military families fear another 'forever war'
Shalena Critchlow, at the Oceanside Pier, holds a photo of her son Cpl. Saiveon Critchlow, who recently completed his service with the U.S. Marines.
- Asia > Middle East > Iraq (0.06)
- North America > United States > California > San Diego County > San Diego (0.06)
- Europe > Middle East (0.05)
- Transportation > Air (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military > Air Force (0.95)
- Europe > France (0.04)
- Asia > Azerbaijan (0.04)
- North America > United States > New York (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Education (1.00)
- Government > Military (0.94)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Iowa (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology (0.93)
- Food & Agriculture > Agriculture (0.69)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France (0.04)
- North America > United States > Michigan (0.04)
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.67)
Fair Multiple Decision Making Through Soft Interventions
Ensuring fairness in algorithmic decision-making models is an important task in machine learning [12,15]. In recent years, many researchers have worked on designing fair classification algorithms with respect to a pre-defined protected attribute, such as race or sex, and a decision task/model, such as hiring [1,11,24]. In particular, one line of work incorporates fairness constraints into classic learning algorithms to build fair classifiers from potentially biased data [4,13,29,31-33]. Most previous research focuses on a single decision model.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Diagnostic Medicine (0.93)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Africa > Zambia > Lusaka Province > Lusaka (0.04)
- Leisure & Entertainment (0.46)
- Energy (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Communications > Social Media (0.93)