Attack-Resistant Uniform Fairness for Linear and Smooth Contextual Bandits

arXiv.org Machine Learning

Modern systems, such as digital platforms and service systems, increasingly rely on contextual bandits for online decision-making; however, their deployment can inadvertently create unfair exposure among arms, undermining long-term platform sustainability and supplier trust. This paper studies the contextual bandit problem under a uniform $(1-\delta)$-fairness constraint, and addresses its unique vulnerabilities to strategic manipulation. The fairness constraint ensures that preferential treatment is strictly justified by an arm's actual reward across all contexts and time horizons, using uniformity to prevent statistical loopholes. We develop novel algorithms that achieve (nearly) minimax-optimal regret for both linear and smooth reward functions, while maintaining strong $(1-\tilde{O}(1/T))$-fairness guarantees, and further characterize the theoretically inherent yet asymptotically marginal "price of fairness". However, we reveal that such merit-based fairness becomes uniquely susceptible to signal manipulation. We show that an adversary with a minimal $\tilde{O}(1)$ budget can not only degrade overall performance as in traditional attacks, but also selectively induce insidious fairness-specific failures while leaving conspicuous regret measures largely unaffected. To counter this, we design robust variants incorporating corruption-adaptive exploration and error-compensated thresholding. Our approach yields the first minimax-optimal regret bounds under $C$-budgeted attack while preserving $(1-\tilde{O}(1/T))$-fairness. Numerical experiments and a real-world case demonstrate that our algorithms sustain both fairness and efficiency.


Theory of Optimal Learning Rate Schedules and Scaling Laws for a Random Feature Model

arXiv.org Machine Learning

Setting the learning rate for a deep learning model is a critical part of successful training, yet choosing this hyperparameter is often done empirically with trial and error. In this work, we explore a solvable model of optimal learning rate schedules for a powerlaw random feature model trained with stochastic gradient descent (SGD). We consider the optimal schedule $\eta_T^\star(t)$ where $t$ is the current iterate and $T$ is the total training horizon. This schedule is computed both numerically and analytically (when possible) using optimal control methods. Our analysis reveals two regimes which we term the easy phase and hard phase. In the easy phase the optimal schedule is a polynomial decay $\eta_T^\star(t) \simeq T^{-\xi} (1-t/T)^\delta$ where $\xi$ and $\delta$ depend on the properties of the features and task. In the hard phase, the optimal schedule resembles warmup-stable-decay with constant (in $T$) initial learning rate and annealing performed over a vanishing (in $T$) fraction of training steps. We investigate joint optimization of learning rate and batch size, identifying a degenerate optimality condition. Our model also predicts the compute-optimal scaling laws (where model size and training steps are chosen optimally) in both easy and hard regimes. Going beyond SGD, we consider optimal schedules for the momentum $\beta(t)$, where speedups in the hard phase are possible. We compare our optimal schedule to various benchmarks in our task including (1) optimal constant learning rates $\eta_T(t) \sim T^{-\xi}$ (2) optimal power laws $\eta_T(t) \sim T^{-\xi} t^{-\chi}$, finding that our schedule achieves better rates than either of these. Our theory suggests that learning rate transfer across training horizon depends on the structure of the model and task. We explore these ideas in simple experimental pretraining setups.
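
As a rough illustration of the three schedule families named in this abstract, the Python sketch below evaluates the easy-phase polynomial decay, the constant-rate benchmark, and the power-law benchmark. The function names and the exponent values in the usage note are assumptions chosen for illustration only; the abstract gives no numerical exponents, and all prefactors hidden by the asymptotic notation are set to 1 here.

```python
def polynomial_decay_lr(t, T, xi, delta):
    """Easy-phase shape: T**(-xi) * (1 - t/T)**delta for 0 <= t <= T.

    xi and delta are task-dependent exponents in the abstract; any values
    passed here are illustrative assumptions.
    """
    if not 0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    return T ** (-xi) * (1.0 - t / T) ** delta


def constant_lr(T, xi):
    """Benchmark (1): a rate that is constant in t but scales as T**(-xi)."""
    return T ** (-xi)


def power_law_lr(t, T, xi, chi):
    """Benchmark (2): T**(-xi) * t**(-chi), defined for t >= 1."""
    if t < 1:
        raise ValueError("t must be at least 1 for the power-law schedule")
    return T ** (-xi) * t ** (-chi)
```

For example, with the assumed values `T=100`, `xi=0.5`, `delta=1.0`, the polynomial schedule starts at `T**(-0.5)` and decays to zero at `t = T`, whereas the constant benchmark stays at `T**(-0.5)` throughout.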


Why Are Some Women Training for Pregnancy Like It's a Marathon?

WIRED

A growing legion of "zero trimester" influencers are convincing followers that healthy pregnancies are a choice--and that raw milk, watching sunsets, and pricey specialized courses can help. Three years ago, Esther Rohr and her husband decided to start thinking about pregnancy. The 26-year-old Oregon-based wedding photographer made small but intentional lifestyle changes--going to bed earlier, drinking more water and less alcohol, dialing in her fitness, loading up on protein, and taking supplements like beef organ capsules and Vitamin D3. They started charging their phones in the kitchen for better sleep and unplugging their Wi-Fi at night, because her research suggested it might affect cellular health. Concerned about their exposure to reproductive toxins, Rohr began the slow, painstaking task of swapping out all their synthetic workout clothes, nonstick pans, and scented personal care products that might contain phthalates or other endocrine-disrupting chemicals. She bought an air purifier and hopes to eventually replace their LED bulbs with incandescents, because she worries they might be affecting her circadian rhythm.


Preference-based Conditional Treatment Effects and Policy Learning

arXiv.org Machine Learning

We introduce a new preference-based framework for conditional treatment effect estimation and policy learning, built on the Conditional Preference-based Treatment Effect (CPTE). CPTE requires only that outcomes be ranked under a preference rule, unlocking flexible modeling of heterogeneous effects with multivariate, ordinal, or preference-driven outcomes. This unifies applications such as conditional probability of necessity and sufficiency, conditional Win Ratio, and Generalized Pairwise Comparisons. Despite the intrinsic non-identifiability of comparison-based estimands, CPTE provides interpretable targets and delivers new identifiability conditions for previously unidentifiable estimands. We present estimation strategies via matching, quantile, and distributional regression, and further design efficient influence-function estimators to correct plug-in bias and maximize policy value. Synthetic and semi-synthetic experiments demonstrate clear performance gains and practical impact.
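
To make the idea of ranking outcomes under a preference rule concrete, the Python sketch below computes the standard unconditional win ratio and net benefit from all treated-control pairs. It is not the CPTE estimator from the paper; the function names, the lexicographic preference rule, and the toy data in the usage note are all assumptions for illustration.

```python
from itertools import product


def lexicographic_prefer(y_treated, y_control):
    """Hypothetical preference rule: compare outcome tuples component by
    component, higher is better. Returns 1 (win), -1 (loss), or 0 (tie)."""
    if y_treated > y_control:
        return 1
    if y_treated < y_control:
        return -1
    return 0


def pairwise_counts(treated, control, prefer=lexicographic_prefer):
    """Count wins, losses, and ties over every treated-control pair."""
    wins = losses = ties = 0
    for y1, y0 in product(treated, control):
        result = prefer(y1, y0)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    return wins, losses, ties


def win_ratio(treated, control, prefer=lexicographic_prefer):
    """Number of wins divided by number of losses."""
    wins, losses, _ = pairwise_counts(treated, control, prefer)
    if losses == 0:
        raise ZeroDivisionError("win ratio is undefined with zero losses")
    return wins / losses


def net_benefit(treated, control, prefer=lexicographic_prefer):
    """(wins - losses) divided by the total number of pairs."""
    wins, losses, ties = pairwise_counts(treated, control, prefer)
    return (wins - losses) / (wins + losses + ties)
```

With the toy outcomes `treated = [(1, 3), (1, 5), (0, 2)]` and `control = [(1, 4), (0, 1), (0, 2)]`, where each tuple is an assumed (primary, secondary) outcome, the nine pairs split into 6 wins, 2 losses, and 1 tie.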


Learning Better Certified Models from Empirically-Robust Teachers

arXiv.org Machine Learning

Adversarial training attains strong empirical robustness to specific adversarial attacks by training on concrete adversarial perturbations, but it produces neural networks that are not amenable to strong robustness certificates through neural network verification. On the other hand, earlier certified training schemes directly train on bounds from network relaxations to obtain models that are certifiably robust, but display sub-par standard performance. Recent work has shown that state-of-the-art trade-offs between certified robustness and standard performance can be obtained through a family of losses combining adversarial outputs and neural network bounds. Nevertheless, unlike empirical robustness, verifiability still comes at a significant cost in standard performance. In this work, we propose to leverage empirically-robust teachers to improve the performance of certifiably-robust models through knowledge distillation. Using a versatile feature-space distillation objective, we show that distillation from adversarially-trained teachers consistently improves on the state-of-the-art in certified training for ReLU networks across a series of robust computer vision benchmarks.


Universal One-third Time Scaling in Learning Peaked Distributions

arXiv.org Machine Learning

Training large language models (LLMs) is computationally expensive, partly because the loss exhibits slow power-law convergence whose origin remains debatable. Through systematic analysis of toy models and empirical evaluation of LLMs, we show that this behavior can arise intrinsically from the use of softmax and cross-entropy. When learning peaked probability distributions, e.g., next-token distributions, these components yield power-law vanishing losses and gradients, creating a fundamental optimization bottleneck. This ultimately leads to power-law time scaling of the loss with a universal exponent of $1/3$. Our results provide a mechanistic explanation for observed neural scaling and suggest new directions for improving LLM training efficiency.
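
The Python sketch below is a toy illustration of the ingredients this abstract names, not a reproduction of its $1/3$ exponent. Assuming a one-hot target and a hypothetical five-token vocabulary, it shows that the softmax cross-entropy loss and the norm of its gradient with respect to the logits both shrink as the margin of the correct logit grows. All function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_and_grad(logits, target):
    """Cross-entropy against a one-hot target, plus its gradient with
    respect to the logits (softmax probabilities minus the one-hot vector)."""
    probs = softmax(logits)
    loss = -math.log(probs[target])
    grad = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return loss, grad


def margin_sweep(margins, vocab_size=5):
    """For each margin m, set the correct logit to m and all others to 0,
    then record (margin, loss, gradient norm)."""
    rows = []
    for m in margins:
        logits = [float(m)] + [0.0] * (vocab_size - 1)
        loss, grad = cross_entropy_and_grad(logits, target=0)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        rows.append((m, loss, grad_norm))
    return rows
```

Calling `margin_sweep([0, 2, 4, 8])` gives a loss of `log(5)` at margin 0, with both the loss and the gradient norm decreasing at each larger margin, so the training signal weakens exactly as the prediction becomes more peaked.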


An 'Intimacy Crisis' Is Driving the Dating Divide

WIRED

In his book, sex and relationships researcher Justin Garcia says people have miscalculated their need for human intimacy, which is the real issue at the root of the loneliness epidemic. In the US, nearly half of adults are single. A quarter of men suffer from loneliness. Rates of depression are on the rise. And one in four Gen Z adults--the so-called kinkiest generation, according to one study--have never had partnered sex. In an age of endless connection, where hooking up happens with the ease of a swipe and nontraditional relationship structures like polyamory are celebrated, why are people seemingly so disconnected and alone?



JONATHAN TURLEY: When elites cheer the mob, history warns that revolutions devour their own

FOX News

The American Revolution created a lasting democracy while the French Revolution became blood-soaked tyranny. But today's armchair revolutionaries echo similar calls.


Education advocates praise Texas A&M decision to wind down Women's and Gender Studies certificate

FOX News

Texas A&M eliminates Women's and Gender Studies certificate program after reviewing 5,400 course syllabi, canceling six courses representing 0.11% of total offerings.