AITopics

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Dhawan, Nikita, Paruthi, Arnav, Kim, Andrew, Gondara, Lovedeep, Novikova, Jekaterina, Maddison, Chris J.

Causal Risk Minimization for High-Dimensional Treatments

arXiv.org Machine LearningMay-27-2026

Predicting the effect of interventions with many possible variations, e.g., therapeutic content that affects mental health outcomes or an earnings call transcript that drives movement in share price, is useful across several domains. However, classical causal estimators tend to assume that all possible interventions are observed, which is infeasible when interventions vary widely, for instance, in the space of all text strings. We adapt a well-known approach of recasting causal inference as a learning problem, to address high-dimensional treatment spaces. Specifically, under standard assumptions like no unobserved confounding, we show that causal error decomposes into a series of moment-balancing errors of increasing order, and design objectives that directly improve causal estimation. We also show how to project the effect of a high-dimensional treatment onto lower-dimensional treatment attributes, which allows a single model to answer several causal questions without additional attribute-specific training. We empirically evaluate our estimators in settings with high-dimensional continuous, discrete, and text treatments, the last of which used a semi-synthetic dataset of Amazon Reviews. Our experiments demonstrate the benefit of higher-order balance error optimization and competitive performance of projected causal estimates with attribute-specific estimators.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2605.27281

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-11-2026, 21:42:02 GMT

Supplementary Materials Di erentially Private Samplingfrom Distributions

If A is (",0)- di erentiallyprivate, thenA is " Yis (", )- di B : Y anyrandomB Ais (", )- di erA is thenthealgoB Ais - z CDP. Proof.Algorithm1 isthedesiredsamplerAPo.Itis (", ) - di erentiallyprivatesinceA is (", ) - di erentially private.

artificial intelligence, machine learning, thatis, (16 more...)

Country:

North America > United States > Arizona > Maricopa County > Phoenix (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Neural Information Processing SystemsDec-24-2025, 01:28:39 GMT

Amortized Proximal Optimization

We propose a framework for online meta-optimization of parameters that govern optimization, called Amortized Proximal Optimization (APO). We first interpret various existing neural network optimizers as approximate stochastic proximal point methods which trade off the current-batch loss with proximity terms in both function space and weight space. The idea behind APO is to amortize the minimization of the proximal point objective by meta-learning the parameters of an update rule. We show how APO can be used to adapt a learning rate or a structured preconditioning matrix. Under appropriate assumptions, APO can recover existing optimizers such as natural gradient descent and KFAC. It enjoys low computational overhead and avoids expensive and numerically sensitive operations required by some second-order optimizers, such as matrix inverses.

amortized proximal optimization, apo, name change, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceOct-31-2025

Human-assisted Robotic Policy Refinement via Action Preference Optimization

Xia, Wenke, Yang, Yichu, Wu, Hongtao, Ma, Xiao, Kong, Tao, Hu, Di

Establishing a reliable and iteratively refined robotic system is essential for deploying real-world applications. While Vision-Language-Action (VLA) models are widely recognized as the foundation model for such robotic deployment, their reliance on offline expert demonstrations critically limits their capacity for post-deployment refinement. To mitigate this limitation, we introduce Action Preference Optimization (APO), a method designed to refine VLA models by human-assisted preference alignment gathered through interaction with environments. This method begins with a human-robot collaboration framework for reliable failure correction and interaction trajectory collection through human intervention. However, directly leveraging these interaction trajectories for preference optimization is non-trivial due to the challenges of irreversible robotic actions and token distribution mismatch. To solve this, APO proposes an adaptive reweighting algorithm with binary desirability signals derived from interaction, empowering VLA models effectively suppress failure-prone actions while enhancing corrective action adaptation. Ultimately, APO equips VLA models with the crucial capability to learn from failure, paving the way for their iterative refinement and reliable deployment in dynamic environments. The experiments conducted in simulation and real-world scenarios prove superior generalization and robustness of our human-assisted framework across a variety of manipulation tasks. We believe this work could bring insights for efficient and stable optimization of VLA models through human-robot collaboration. The code and dataset are released at https://github.com/GeWu-Lab/Action-Preference-Optimization

machine learning, reinforcement learning, trajectory, (15 more...)

2506.07127

Genre: Research Report (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

arXiv.org Artificial IntelligenceSep-4-2025

Communication Efficient Robotic Mixed Reality with Gaussian Splatting Cross-Layer Optimization

Liu, Chenxuan, Li, He, Li, Zongze, Wang, Shuai, Xu, Wei, Ye, Kejiang, Ng, Derrick Wing Kwan, Xu, Chengzhong

Realizing low-cost communication in robotic mixed reality (RoboMR) systems presents a challenge, due to the necessity of uploading high-resolution images through wireless channels. This paper proposes Gaussian splatting (GS) RoboMR (GSMR), which enables the simulator to opportunistically render a photo-realistic view from the robot's pose by calling ``memory'' from a GS model, thus reducing the need for excessive image uploads. However, the GS model may involve discrepancies compared to the actual environments. To this end, a GS cross-layer optimization (GSCLO) framework is further proposed, which jointly optimizes content switching (i.e., deciding whether to upload image or not) and power allocation (i.e., adjusting to content profiles) across different frames by minimizing a newly derived GSMR loss function. The GSCLO problem is addressed by an accelerated penalty optimization (APO) algorithm that reduces computational complexity by over $10$x compared to traditional branch-and-bound and search algorithms. Moreover, variants of GSCLO are presented to achieve robust, low-power, and multi-robot GSMR. Extensive experiments demonstrate that the proposed GSMR paradigm and GSCLO method achieve significant improvements over existing benchmarks on both wheeled and legged robots in terms of diverse metrics in various scenarios. For the first time, it is found that RoboMR can be achieved with ultra-low communication costs, and mixture of data is useful for enhancing GS performance in dynamic scenarios.

artificial intelligence, constraint, machine learning, (18 more...)

2508.08624

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
(2 more...)

Neural Information Processing SystemsAug-14-2025, 07:03:03 GMT

3af25aa3de8b7b02ddbd1b6be5031be8-Paper-Conference.pdf

arxiv preprint arxiv, optimization, optimizer, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Connecticut (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)

arXiv.org Artificial IntelligenceJun-11-2025

Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search

Han, Dongge, Xia, Menglin, Diaz, Daniel Madrigal, Kessler, Samuel, Mallick, Ankur, Zhang, Xuchao, Garcia, Mirian Del Carmen Hipolito, Xu, Jin, Rühle, Victor, Rajmohan, Saravan

Small language models (SLMs) offer promising and efficient alternatives to large language models (LLMs). However, SLMs' limited capacity restricts their reasoning capabilities and makes them sensitive to prompt variations. To address these challenges, we propose a novel framework that enhances SLM reasoning capabilities through LLM generated blueprints. The blueprints provide structured, high-level reasoning guides that help SLMs systematically tackle related problems. Furthermore, our framework integrates a prompt template search mechanism to mitigate the SLMs' sensitivity to prompt variations. Our framework demonstrates improved SLM performance across various tasks, including math (GSM8K), coding (MBPP), and logic reasoning (BBH). Our approach improves the reasoning capabilities of SLMs without increasing model size or requiring additional training, offering a lightweight and deployment-friendly solution for on-device or resource-constrained environments.

large language model, machine learning, natural language, (16 more...)

2506.08669

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Neural Information Processing SystemsMay-27-2025, 00:02:37 GMT

Amortized Proximal Optimization

amortized proximal optimization, artificial intelligence, machine learning, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceMay-13-2025

Purity Law for Generalizable Neural TSP Solvers

Liu, Wenzhao, Li, Haoran, Han, Congying, Zhang, Zicheng, Li, Anqi, Guo, Tiande

Achieving generalization in neural approaches across different scales and distributions remains a significant challenge for the Traveling Salesman Problem~(TSP). A key obstacle is that neural networks often fail to learn robust principles for identifying universal patterns and deriving optimal solutions from diverse instances. In this paper, we first uncover Purity Law (PuLa), a fundamental structural principle for optimal TSP solutions, defining that edge prevalence grows exponentially with the sparsity of surrounding vertices. Statistically validated across diverse instances, PuLa reveals a consistent bias toward local sparsity in global optima. Building on this insight, we propose Purity Policy Optimization~(PUPO), a novel training paradigm that explicitly aligns characteristics of neural solutions with PuLa during the solution construction process to enhance generalization. Extensive experiments demonstrate that PUPO can be seamlessly integrated with popular neural solvers, significantly enhancing their generalization performance without incurring additional computational overhead during inference.

artificial intelligence, machine learning, purity law, (18 more...)

2505.04558

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)