Goto

Collaborating Authors

 deficiency


Yes, eating carrots can help your eyesight. But it's not a cure-all.

Popular Science

Yes, eating carrots can help your eyesight. The World War II propaganda that touted the veggie wasn't totally wrong, but carrots still won't give you night vision. Carrots' beta-carotene pigment can help support retina health, but the root vegetable still has its limits. Breakthroughs, discoveries, and DIY tips sent six days a week. In a British propaganda poster from World War II, an illustration in shadowy tones captures a dramatic nighttime scene: a woman and young girl peer around a black automobile, as if looking for a quick escape.


Le Cam Distortion: A Decision-Theoretic Framework for Robust Transfer Learning

Akdemir, Deniz

arXiv.org Machine Learning

Distribution shift is the defining challenge of real-world machine learning. The dominant paradigm--Unsupervised Domain Adaptation (UDA)--enforces feature invariance, aligning source and target representations via symmetric divergence minimization [Ganin et al., 2016]. We demonstrate that this approach is fundamentally flawed: when domains are unequally informative (e.g., high-quality vs degraded sensors), strict invariance necessitates information destruction, causing "negative transfer" that can be catastrophic in safety-critical applications [Wang et al., 2019]. We propose a decision-theoretic framework grounded in Le Cam's theory of statistical experiments [Le Cam, 1986], using constructive approximations to replace symmetric invariance with directional simulability. We introduce Le Cam Distortion, quantified by the Deficiency Distance $δ(E_1, E_2)$, as a rigorous upper bound for transfer risk conditional on simulability. Our framework enables transfer without source degradation by learning a kernel that simulates the target from the source. Across five experiments (genomics, vision, reinforcement learning), Le Cam Distortion achieves: (1) near-perfect frequency estimation in HLA genomics (correlation $r=0.999$, matching classical methods), (2) zero source utility loss in CIFAR-10 image classification (81.2% accuracy preserved vs 34.7% drop for CycleGAN), and (3) safe policy transfer in RL control where invariance-based methods suffer catastrophic collapse. Le Cam Distortion provides the first principled framework for risk-controlled transfer learning in domains where negative transfer is unacceptable: medical imaging, autonomous systems, and precision medicine.


Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

Neural Information Processing Systems

We study risk-sensitive reinforcement learning (RL) based on the entropic risk measure. Although existing works have established non-asymptotic regret guarantees for this problem, they leave open an exponential gap between the upper and lower bounds. We identify the deficiencies in existing algorithms and their analysis that result in such a gap. To remedy these deficiencies, we investigate a simple transformation of the risk-sensitive Bellman equations, which we call the exponential Bellman equation. The exponential Bellman equation inspires us to develop a novel analysis of Bellman backup procedures in risk-sensitive RL algorithms, and further motivates the design of a novel exploration mechanism. We show that these analytic and algorithmic innovations together lead to improved regret upper bounds over existing ones.


Bridging the Gap: Toward Cognitive Autonomy in Artificial Intelligence

Golilarz, Noorbakhsh Amiri, Penchala, Sindhuja, Rahimi, Shahram

arXiv.org Artificial Intelligence

Artificial intelligence has advanced rapidly across perception, language, reasoning, and multimodal domains. Yet despite these achievements, modern AI systems remain fundamentally limited in their ability to self-monitor, self-correct, and regulate their behavior autonomously in dynamic contexts. This paper identifies and analyzes seven core deficiencies that constrain contemporary AI models: the absence of intrinsic self-monitoring, lack of meta-cognitive awareness, fixed and non-adaptive learning mechanisms, inability to restructure goals, lack of representational maintenance, insufficient embodied feedback, and the absence of intrinsic agency. Alongside identifying these limitations, we also outline a forward-looking perspective on how AI may evolve beyond them through architectures that mirror neurocognitive principles. We argue that these structural limitations prevent current architectures, including deep learning and transformer-based systems, from achieving robust generalization, lifelong adaptability, and real-world autonomy. Drawing on a comparative analysis of artificial systems and biological cognition [7], and integrating insights from AI research, cognitive science, and neuroscience, we outline how these capabilities are absent in current models and why scaling alone cannot resolve them. We conclude by advocating for a paradigmatic shift toward cognitively grounded AI (cognitive autonomy) capable of self-directed adaptation, dynamic representation management, and intentional, goal-oriented behavior, paired with reformative oversight mechanisms [8] that ensure autonomous systems remain interpretable, governable, and aligned with human values.


We're learning more about what vitamin D does to our bodies

MIT Technology Review

We're learning more about what vitamin D does to our bodies The sunshine vitamin could affect your immune system and heart health. It has started to get really wintry here in London over the last few days. The mornings are frosty, the wind is biting, and it's already dark by the time I pick my kids up from school. The darkness in particular has got me thinking about vitamin D, a.k.a. the sunshine vitamin. At a checkup a few years ago, a doctor told me I was deficient in vitamin D. But he wouldn't write me a prescription for supplements, simply because, as he put it, in the UK is deficient. Putting the entire population on vitamin D supplements would be too expensive for the country's national health service, he told me.


A Proof of Lemma

Neural Information Processing Systems

Recall that the algorithm of Section 3.2 uses two black-boxes that are impractical and have hidden To implement this algorithm for edge-generation, we must be able to compute a bidirectional mapping between an input point and the grid cell that contains the point. Using this, we will show the following lemma. Proof: We begin by proving (i) and (ii) and then use it to prove the lemma. No weights are assigned in the HK algorithm. The LR has a preprocessing step that computes the maximum cardinality matching for each piece.


Beyond Elicitation: Provision-based Prompt Optimization for Knowledge-Intensive Tasks

Xu, Yunzhe, Zhang, Zhuosheng, Liu, Zhe

arXiv.org Artificial Intelligence

Abstract--While prompt optimization has emerged as a critical technique for enhancing language model performance, existing approaches primarily focus on elicitation-based strategies that search for optimal prompts to activate models' capabilities. These methods exhibit fundamental limitations when addressing knowledge-intensive tasks, as they operate within fixed parametric boundaries rather than providing the factual knowledge, terminology precision, and reasoning patterns required in specialized domains. T o address these limitations, we propose Knowledge-Provision-based Prompt Optimization (KPPO), a framework that reformulates prompt optimization as systematic knowledge integration rather than potential elicitation. KPPO introduces three key innovations: 1) a knowledge gap filling mechanism for knowledge gap identification and targeted remediation; 2) a batch-wise candidate evaluation approach that considers both performance improvement and distributional stability; 3) an adaptive knowledge pruning strategy that balances performance and token efficiency, reducing up to 29% token usage. Extensive evaluation on 15 knowledge-intensive benchmarks from various domains demonstrates KPPO's superiority over elicitation-based methods, with an average performance improvement of ~6% over the strongest baseline while achieving comparable or lower token consumption. Large Language Models (LLMs) have achieved unprecedented performance across diverse natural language processing tasks through sophisticated prompt engineering techniques [1]. The field has evolved from manual prompt design approaches [2], [3] to automated optimization frameworks [4]- [7], where optimizer LLMs iteratively refine prompts to maximize task performance. These automated approaches, collectively termed elicitation-based optimization, operate under the fundamental assumption that optimal prompts can unlock latent capabilities within pre-trained model parameters through strategic reformulation of instructions, exemplars, or reasoning templates.


Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations

Liu, Chengzhi, Yang, Yuzhe, Zhou, Kaiwen, Zhang, Zhen, Fan, Yue, Xie, Yanan, Qi, Peng, Wang, Xin Eric

arXiv.org Artificial Intelligence

The promotion of academic papers has become an important means of enhancing research visibility. However, existing automated methods struggle limited storytelling, insufficient aesthetic quality, and constrained self-adjustment, making it difficult to achieve efficient and engaging dissemination. At the heart of those challenges is a simple principle: \emph{there is no way to improve it when you cannot evaluate it right}. To address this, we introduce \textbf{EvoPresent}, a self-improvement agent framework that unifies coherent narratives, aesthetic-aware designs, and realistic presentation delivery via virtual characters. Central to EvoPresent is \textbf{PresAesth}, a multi-task reinforcement learning (RL) aesthetic model that provides reliable aesthetic scoring, defect adjustment, and comparative feedback, enabling iterative self-improvement even under limited aesthetic training data. To systematically evaluate the methods, we introduce \textbf{EvoPresent Benchmark}, a comprehensive benchmark comprising: \textit{Presentation Generation Quality}, built on 650 top-tier AI conference papers with multimodal resources (slides, videos and scripts) to assess both content and design; and \textit{Aesthetic Awareness}, consisting of 2,000 slide pairs with varying aesthetic levels, supporting joint training and evaluation on scoring, defect adjustment, and comparison. Our findings highlight that (i) High-quality feedback is essential for agent self-improvement, while initial capability alone does not guarantee effective self-correction. (ii) Automated generation pipelines exhibit a trade-off between visual design and content construction. (iii) Multi-task RL training shows stronger generalization in aesthetic awareness tasks.


Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining

Subramanyam, Anirudh, Chen, Yuxin, Grossman, Robert L.

arXiv.org Artificial Intelligence

Scaling laws for language model training traditionally characterize how performance scales with model size and dataset volume. Prior work has explored architecture variants and data treatments such as dataset filtering and noise injection in language model pretraining; however, these studies have not formalized data quality within a principled scaling law. We introduce a dimensionless data-quality parameter Q, and propose a quality-aware scaling law extending the Chinchilla framework to predict loss as a joint function of model size, data volume, and data quality. The law is motivated by an effective-sample-size and information-theoretic view of noisy or redundant corpora, and it admits two practical estimators for Q: (i) a corruption rate proxy and (ii) a deficiency measure. Through synthetic experiments in neural machine translation and autoregressive modeling -- where we systematically control data quality via multiple levels of noise injection and coverage variation -- we show that loss scales predictably with data quality and that higher-quality data can substantially reduce model size and hence compute requirements. Our results demonstrate a sublinear decay of effective data with quality and robustness to moderate data corruption; out-of-sample evaluations further validate the predictive form of the law. Unlike prior empirical analyses, our work establishes an explicit, generalizable law for data quality, offering concrete guidance for balancing data curation effort and model scale in large-scale pretraining.


Are LLMs Better GNN Helpers? Rethinking Robust Graph Learning under Deficiencies with Iterative Refinement

Wang, Zhaoyan, Gao, Zheng, Kharel, Arogya, Ko, In-Young

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) are widely adopted in Web-related applications, serving as a core technique for learning from graph-structured data, such as text-attributed graphs. Yet in real-world scenarios, such graphs exhibit deficiencies that substantially undermine GNN performance. While prior GNN-based augmentation studies have explored robustness against individual imperfections, a systematic understanding of how graph-native and Large Language Models (LLMs) enhanced methods behave under compound deficiencies is still missing. Specifically, there has been no comprehensive investigation comparing conventional approaches and recent LLM-on-graph frameworks, leaving their merits unclear. To fill this gap, we conduct the first empirical study that benchmarks these two lines of methods across diverse graph deficiencies, revealing overlooked vulnerabilities and challenging the assumption that LLM augmentation is consistently superior. Building on empirical findings, we propose Robust Graph Learning via Retrieval-Augmented Contrastive Refinement (RoGRAD) framework. Unlike prior one-shot LLM-as-Enhancer designs, RoGRAD is the first iterative paradigm that leverages Retrieval-Augmented Generation (RAG) to inject retrieval-grounded augmentations by supplying class-consistent, diverse augmentations and enforcing discriminative representations through iterative graph contrastive learning. It transforms LLM augmentation for graphs from static signal injection into dynamic refinement. Extensive experiments demonstrate RoGRAD's superiority over both conventional GNN- and LLM-enhanced baselines, achieving up to 82.43% average improvement.