AITopics

Country: Europe > United Kingdom > England (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.93)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsApr-24-2026, 13:55:00 GMT

Self-Consistent Models and Values

Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment. In particular, models enable planning, i.e. using more computation to improve value functions or policies, without requiring additional environment interactions. In this work, we investigate a way of augmenting model-based RL, by additionally encouraging a learned model and value function to be jointly self-consistent. Our approach differs from classic planning methods such as Dyna, which only update values to be consistent with the model. We propose multiple self-consistency updates, evaluate these in both tabular and function approximation settings, and find that, with appropriate choices, self-consistency helps both policy evaluation and control.

machine learning, reinforcement learning, self-consistency update, (12 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)

Neural Information Processing SystemsFeb-7-2026, 09:44:11 GMT

08f0efebb1c51aada9430a089a2050cc-Paper.pdf

arxiv preprint arxiv, objective, self-consistency update, (11 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
(3 more...)

Ma, Xuehui, Zhang, Shiliang, Sun, Zhiyong

Fault Tolerant Control of Mecanum Wheeled Mobile Robots

arXiv.org Artificial IntelligenceDec-9-2025

Mecanum wheeled mobile robots (MWMRs) are highly susceptible to actuator faults that degrade performance and risk mission failure. Current fault tolerant control (FTC) schemes for MWMRs target complete actuator failures like motor stall, ignoring partial faults e.g., in torque degradation. We propose an FTC strategy handling both fault types, where we adopt posterior probability to learn real-time fault parameters. We derive the FTC law by aggregating probability-weighed control laws corresponding to predefined faults. This ensures the robustness and safety of MWMR control despite varying levels of fault occurrence. Simulation results demonstrate the effectiveness of our FTC under diverse scenarios.

artificial intelligence, dyna, mobile robot, (15 more...)

2512.06444

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.64)

Neural Information Processing SystemsOct-3-2025, 18:18:02 GMT

a36b598abb934e4528412e5a2127b931-AuthorFeedback.pdf

We thank the reviewer for pointing this out.

algorithm, clarify, objective, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.51)
Information Technology > Artificial Intelligence > Machine Learning (0.50)

Zhan, Huixin, Zhang, Zijun

DYNA: Disease-Specific Language Model for Variant Pathogenicity

arXiv.org Artificial IntelligenceMay-31-2024

Clinical variant classification of pathogenic versus benign genetic variants remains a challenge in clinical genetics. Recently, the proposition of genomic foundation models has improved the generic variant effect prediction (VEP) accuracy via weakly-supervised or unsupervised training. However, these VEPs are not diseasespecific, limiting their adaptation at the point of care. To address this problem, we propose DYNA: Disease-specificity fine-tuning via a Siamese neural network broadly applicable to all genomic foundation models for more effective variant effect predictions in disease-specific contexts. We evaluate DYNA in two distinct diseaserelevant tasks. For coding VEPs, we focus on various cardiovascular diseases, where gene-disease relationships of loss-of-function vs. gain-of-function dictate disease-specific VEP. For non-coding VEPs, we apply DYNA to an essential posttranscriptional regulatory axis of RNA splicing, the most common non-coding pathogenic mechanism in established clinical VEP guidelines. The DYNA fine-tuned models show superior performance in the held-out rare variant testing set and are further replicated in large, clinically-relevant variant annotations in ClinVAR. Thus, DYNA offers a potent disease-specific variant effect prediction method, excelling in intra-gene generalization and generalization to unseen genetic variants, making it particularly valuable for disease associations and clinical applicability. Clinical variant interpretation is transforming precision medicine, yet limitations exist that prevent its further adaptations and utilities [1]. Following a disease diagnosis, the identification and classification of pathogenic vs benign genetic variant has important clinical implications. The outcome of clinical variant interpretation provides a basis for clinical screening [2, 3] and genetic testing of first-degree family members [4], and may serve as a prognostic marker for the affected patient [5, 6]. Currently, the utility of genetic testing is limited by the fact that a substantial proportion (30-50%) of yielded variants are classified as variant of uncertain significance (VUS) according to the ACMG guidelines [7].

dataset, sequence, variant, (13 more...)

2406.00164

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Choenni, Rochelle, Garrette, Dan, Shutova, Ekaterina

Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks

arXiv.org Artificial IntelligenceOct-31-2022

Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this paper, we propose novel methods for using language-specific subnetworks, which control cross-lingual parameter sharing, to reduce conflicts and increase positive transfer during fine-tuning. We introduce dynamic subnetworks, which are jointly updated with the model, and we combine our methods with meta-learning, an established, but complementary, technique for improving cross-lingual transfer. Finally, we provide extensive analyses of how each of our methods affects the models.

machine learning, natural language, subnetwork, (20 more...)

2211.00106

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Hungary > Csongrád-Csanád County > Szeged (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

arXiv.org Machine LearningOct-25-2021

Self-Consistent Models and Values

Farquhar, Gregory, Baumli, Kate, Marinho, Zita, Filos, Angelos, Hessel, Matteo, van Hasselt, Hado, Silver, David

Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment. In particular, models enable planning, i.e. using more computation to improve value functions or policies, without requiring additional environment interactions. In this work, we investigate a way of augmenting model-based RL, by additionally encouraging a learned model and value function to be jointly \emph{self-consistent}. Our approach differs from classic planning methods such as Dyna, which only update values to be consistent with the model. We propose multiple self-consistency updates, evaluate these in both tabular and function approximation settings, and find that, with appropriate choices, self-consistency helps both policy evaluation and control.

experiment, objective, self-consistency update, (11 more...)

arXiv.org Machine Learning

2110.1284

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)

#artificialintelligenceFeb-19-2019, 09:13:16 GMT

[ Archived Post ] Reinforcement Learning: A Survey – Jae Duk Seo – Medium

This paper overview of RL even covers the history, a good summary of a different area of studies. RL has a long history relates to statistic, computer science, and neuroscience. RL agent learns via trial and error it gathers the training data on its own. The standard RL model an agent that learns uses dynamic programming and statistic Not yet clear which method is better overall. For each time stamp, the agent receives some env, reward and more over time optimize the amount of reward it gets over one period.

agent, algorithm, value function, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial IntelligenceJun-12-2018

Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains

Pan, Yangchen, Zaheer, Muhammad, White, Adam, Patterson, Andrew, White, Martha

Model-based strategies for control are critical to obtain sample efficient learning. Dyna is a planning paradigm that naturally interleaves learning and planning, by simulating one-step experience to update the action-value function. This elegant planning strategy has been mostly explored in the tabular setting. The aim of this paper is to revisit sample-based planning, in stochastic and continuous domains with learned models. We first highlight the flexibility afforded by a model over Experience Replay (ER). Replay-based methods can be seen as stochastic planning methods that repeatedly sample from a buffer of recent agent-environment interactions and perform updates to improve data efficiency. We show that a model, as opposed to a replay buffer, is particularly useful for specifying which states to sample from during planning, such as predecessor states that propagate information in reverse from a state more quickly. We introduce a semi-parametric model learning approach, called Reweighted Experience Models (REMs), that makes it simple to sample next states or predecessors. We demonstrate that REM-Dyna exhibits similar advantages over replay-based methods in learning in continuous state problems, and that the performance gap grows when moving to stochastic domains, of increasing size.

machine learning, reinforcement learning, transition, (16 more...)

1806.04624

Country:

North America > Canada > Alberta (0.14)
North America > United States > Indiana (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)