details and add more discussions on related works in the camera-ready version

Neural Information Processing Systems

We thank all reviewers for their valuable comments. Entropy is used to measure sufficiency, compactness, and uniqueness; the use of variance to approximate entropy was discussed in L203. Therefore, the performance deteriorates dramatically. We will run our algorithm in more environments and provide the results in the Appendix.


Statistical Inference for Matching Decisions via Matrix Completion under Dependent Missingness

Duan, Congyuan, Ma, Wanteng, Xia, Dong, Xu, Kan

arXiv.org Machine Learning

In contrast to the independent sampling assumed in the classical matrix completion literature, the observed entries, which arise from past matching data, are constrained by matching capacity. This matching-induced dependence poses new challenges for both estimation and inference in the matrix completion framework. We propose a non-convex algorithm based on Grassmannian gradient descent and establish near-optimal entrywise convergence rates for three canonical mechanisms, i.e., one-to-one matching, one-to-many matching with one-sided random arrival, and two-sided random arrival. To facilitate valid uncertainty quantification and hypothesis testing on matching decisions, we further develop a general debiasing and projection framework for arbitrary linear forms of the reward matrix, deriving asymptotic normality with finite-sample guarantees under matching-induced dependent sampling. Our empirical experiments demonstrate that the proposed approach provides accurate estimation, valid confidence intervals, and efficient evaluation of matching policies.
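As a toy illustration of the underlying estimation problem, the sketch below recovers a low-rank matrix from a sparse set of observed entries using plain factored gradient descent on the observation mask. This is a simplification, not the paper's Grassmannian gradient descent, and it uses an i.i.d. random mask rather than matching-induced dependent sampling; all dimensions and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth low-rank "reward" matrix (rank 2), with a sparse observation
# mask standing in for entries revealed by past matching decisions.
n, m, r = 60, 50, 2
M = rng.normal(size=(n, r)) @ rng.normal(size=(r, m))
mask = rng.random((n, m)) < 0.3          # ~30% of entries observed

# Plain factored gradient descent on the observed entries only
# (a simpler stand-in for the paper's Grassmannian gradient descent).
U = rng.normal(scale=0.1, size=(n, r))
V = rng.normal(scale=0.1, size=(m, r))
lr = 0.005
for _ in range(5000):
    R = mask * (U @ V.T - M)             # residual on observed entries
    U, V = U - lr * R @ V, V - lr * R.T @ U

# Relative recovery error over ALL entries, including unobserved ones.
rel_err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```

With an exactly low-rank target and this sampling rate, the relative error drops well below 10%; the paper's contribution is precisely to control such errors entrywise when the mask is *not* independent of the matching process.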






RD²: Reward Decomposition with Representation Decomposition

Neural Information Processing Systems

Reward decomposition, which aims to decompose the full reward into multiple sub-rewards, has been proven beneficial for improving sample efficiency in reinforcement learning. Existing works on discovering reward decompositions are mostly policy-dependent, which constrains the diversity and disentanglement of behaviors induced by different sub-rewards. In this work, we propose a set of novel reward decomposition principles that constrain the uniqueness and compactness of the state features/representations relevant to each sub-reward. Our principles encourage sub-rewards with minimal relevant features while maintaining the uniqueness of each sub-reward. We derive a deep learning algorithm based on these principles and term our method RD², since it learns reward decomposition and representation decomposition jointly.
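The compactness/uniqueness idea can be sketched in a toy setting: score how strongly each state feature moves the conditional mean of a sub-reward, then check compactness (few relevant features per sub-reward) and uniqueness (no feature shared across sub-rewards). The binning scheme, threshold, and all names below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: 6 state features; each "true" sub-reward depends on a disjoint
# feature subset (0-1 and 2-3), while features 4-5 are pure noise.
X = rng.normal(size=(1000, 6))
sub_r1 = X[:, 0] + X[:, 1]
sub_r2 = X[:, 2] - X[:, 3]

def relevance(sub_reward, X):
    """Variance-based relevance proxy: how much does the conditional mean
    of the sub-reward vary across deciles of each feature?"""
    scores = []
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, 11))
        idx = np.clip(np.digitize(X[:, j], edges) - 1, 0, 9)
        cond_means = np.array([sub_reward[idx == b].mean() for b in range(10)])
        scores.append(cond_means.var())   # near-zero => irrelevant feature
    return np.array(scores)

rel1, rel2 = relevance(sub_r1, X), relevance(sub_r2, X)
mask1, mask2 = rel1 > 0.1, rel2 > 0.1     # relevant-feature masks
overlap = int((mask1 & mask2).sum())      # uniqueness: shared features
compact = int(mask1.sum() + mask2.sum())  # compactness: total features used
```

In this toy case the masks come out disjoint (`overlap == 0`) and each sub-reward keeps only its two true features (`compact == 4`), which is the behavior the decomposition principles reward.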


RD2Bench: Toward Data-Centric Automatic R&D

Chen, Haotian, Shen, Xinjie, Ye, Zeqi, Yang, Xiao, Yang, Xu, Liu, Weiqing, Bian, Jiang

arXiv.org Artificial Intelligence

The progress of humanity is driven by successful discoveries accompanied by countless failed experiments. Researchers often identify potential research directions by reading the literature and then verify them through experiments, a process that imposes a significant burden on researchers. Over the past decade, data-driven black-box deep learning methods have demonstrated their effectiveness in a wide range of real-world scenarios, which further exacerbates researchers' experimental burden and thus leaves potential successful discoveries veiled. Automating such a research and development (R&D) process is therefore an urgent need. In this paper, we make the first effort to formalize this goal by proposing a Real-world Data-centric automatic R&D Benchmark, namely RD2Bench. RD2Bench benchmarks all the operations in data-centric automatic R&D (D-CARD) as a whole to navigate future work directly toward this goal. We focus on evaluating the interaction and synergistic effects of various model capabilities and on aiding the selection of well-performing, trustworthy models. Although RD2Bench remains very challenging even for the state-of-the-art (SOTA) large language model (LLM) GPT-4, indicating ample research opportunities and the need for further effort, LLMs show promising potential to advance D-CARD: they are able to implement some simple methods without any additional techniques. We call on future work to develop techniques for tackling automatic R&D, opening the opportunity of a potentially revolutionary upgrade to human productivity.


High-dimensional Varying Index Coefficient Models via Stein's Identity

Na, Sen, Yang, Zhuoran, Wang, Zhaoran, Kolar, Mladen

arXiv.org Machine Learning

We study the parameter estimation problem for a varying index coefficient model in high dimensions. Unlike most existing works, which simultaneously estimate the parameters and link functions, we propose, based on the generalized Stein's identity, computationally efficient estimators for the high-dimensional parameters that do not require estimating the link functions. We consider two setups, in which we either estimate each sparse parameter vector individually or estimate the parameters jointly as a sparse or low-rank matrix. In all these cases, our estimators are shown to achieve optimal statistical rates of convergence (up to logarithmic terms in the low-rank setting). Moreover, throughout our analysis, we only require the covariates to satisfy certain moment conditions, which is significantly weaker than the Gaussian or elliptically symmetric assumptions commonly made in the existing literature. Finally, we conduct extensive numerical experiments to corroborate the theoretical results.
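The core trick can be sketched for the simplest special case, a sparse single-index model with Gaussian covariates: by the first-order Stein identity, E[y·x] is proportional to the index parameter, so its direction is recoverable without ever fitting the link function. The paper's estimators generalize well beyond this Gaussian setting; the `tanh` link, threshold, and dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Single-index model y = f(<x, beta>) + noise with Gaussian covariates.
# First-order Stein identity: E[y * x] = E[f'(<x, beta>)] * beta,
# i.e., the score vector is proportional to beta, for any smooth f.
n, d = 2000, 100
beta = np.zeros(d)
beta[:5] = 1.0 / np.sqrt(5)                # sparse, unit-norm parameter
X = rng.normal(size=(n, d))
y = np.tanh(X @ beta) + 0.1 * rng.normal(size=n)   # f = tanh is never used below

score = (y[:, None] * X).mean(axis=0)      # empirical E[y * x]

# Soft-thresholding exploits sparsity; the paper's estimators refine this
# idea and relax the Gaussian assumption to moment conditions.
lam = 0.05
est = np.sign(score) * np.maximum(np.abs(score) - lam, 0.0)
est /= np.linalg.norm(est) + 1e-12

cosine = abs(est @ beta)                   # alignment with the true direction
```

Even though the link function never enters the estimator, the thresholded score vector aligns closely with the true parameter direction, which is the "estimate parameters without estimating link functions" phenomenon the abstract describes.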