Maximum-Entropy Exploration with Future State-Action Visitation Measures
Bolland, Adrien, Lambrechts, Gaspard, Ernst, Damien
Maximum entropy reinforcement learning motivates agents to explore states and actions to maximize the entropy of some distribution, typically by providing additional intrinsic rewards proportional to that entropy function. In this paper, we study intrinsic rewards proportional to the entropy of the discounted distribution of state-action features visited during future time steps. This approach is motivated by two results. First, we show that the expected sum of these intrinsic rewards is a lower bound on the entropy of the discounted distribution of state-action features visited in trajectories starting from the initial states, which we relate to an alternative maximum entropy objective. Second, we show that the distribution used in the intrinsic reward definition is the fixed point of a contraction operator and can therefore be estimated off-policy. Experiments highlight that the new objective leads to improved visitation of features within individual trajectories, in exchange for slightly reduced visitation of features in expectation over different trajectories, as suggested by the lower bound. It also leads to improved convergence speed for learning exploration-only agents. Control performance remains similar across most methods on the considered benchmarks.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Portugal > Braga > Braga (0.04)
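The intrinsic-reward idea in the abstract above can be illustrated with a minimal sketch. Here a discrete feature index stands in for the paper's state-action features, and the empirical discounted visitation distribution over a single trajectory replaces the off-policy fixed-point estimate described in the paper; the function names and this simplification are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def discounted_visitation(features, gamma=0.99):
    """Empirical discounted distribution over discrete feature indices
    visited along a single trajectory (illustrative simplification)."""
    weights = np.zeros(max(features) + 1)
    for t, f in enumerate(features):
        weights[f] += gamma ** t
    return weights / weights.sum()

def entropy_bonus(features, gamma=0.99):
    """Intrinsic reward proportional to the entropy of the
    discounted visitation distribution over future features."""
    p = discounted_visitation(features, gamma)
    p = p[p > 0]  # entropy is computed over the support only
    return -np.sum(p * np.log(p))
```

A trajectory that visits many distinct features receives a larger bonus than one that revisits the same feature, which is the exploration pressure the paper's objective formalizes.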
Locally Linear Continual Learning for Time Series based on VC-Theoretical Generalization Bounds
Ferreira, Yan V. G., Lima, Igor B., Mapa, Pedro H. G. S., Campos, Felipe V., Braga, Antonio P.
Most machine learning methods assume fixed probability distributions, limiting their applicability in nonstationary real-world scenarios. While continual learning methods address this issue, current approaches often rely on black-box models or require extensive user intervention for interpretability. We propose SyMPLER (Systems Modeling through Piecewise Linear Evolving Regression), an explainable model for time series forecasting in nonstationary environments based on dynamic piecewise-linear approximations. Unlike other locally linear models, SyMPLER uses generalization bounds from Statistical Learning Theory to automatically determine when to add new local models based on prediction errors, eliminating the need for explicit clustering of the data. Experiments show that SyMPLER can achieve performance comparable to both black-box and existing explainable models while maintaining a human-interpretable structure that reveals insights about the system's behavior. In this sense, our approach reconciles accuracy and interpretability, offering a transparent and adaptive solution for forecasting nonstationary time series.
- Europe > Portugal > Braga > Braga (0.41)
- South America > Brazil > Minas Gerais > Belo Horizonte (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
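The "add a new local model when prediction error exceeds a bound" mechanism described in the SyMPLER abstract can be sketched minimally as follows. The fixed error threshold below is a stand-in assumption for the paper's VC-theoretical generalization bound, and the class and parameter names are hypothetical.

```python
class PiecewiseLinearForecaster:
    """Minimal sketch of an evolving piecewise-linear forecaster:
    a fixed error threshold replaces the paper's VC-based bound."""

    def __init__(self, threshold=1.0, lr=0.1):
        self.threshold = threshold
        self.lr = lr
        self.models = []  # list of (w, b) local linear models

    def predict(self, x):
        if not self.models:
            return 0.0
        # for brevity, always use the most recently added local model
        w, b = self.models[-1]
        return w * x + b

    def update(self, x, y):
        err = abs(y - self.predict(x))
        if not self.models or err > self.threshold:
            # error exceeds the bound: spawn a new local linear model
            self.models.append((0.0, y))
        else:
            # otherwise, take an online gradient step on the active model
            w, b = self.models[-1]
            e = y - (w * x + b)
            self.models[-1] = (w + self.lr * e * x, b + self.lr * e)
        return err
```

Because local models are only spawned when the error bound is violated, the resulting set of linear pieces stays small and each piece remains individually inspectable, which is the interpretability property the abstract emphasizes.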
Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments
We propose a Bayesian decision-making framework for control of Markov Decision Processes (MDPs) with unknown dynamics and large, possibly continuous, state, action, and parameter spaces in data-poor environments. Most existing adaptive controllers for MDPs with unknown dynamics are based on the reinforcement learning framework and rely on large data sets acquired by sustained direct interaction with the system or via a simulator. This is not feasible in many applications due to ethical, economic, and physical constraints. The proposed framework addresses the data poverty issue by decomposing the problem into an offline planning stage that does not rely on sustained direct interaction with the system or simulator and an online execution stage. In the offline stage, parallel Gaussian process temporal difference (GPTD) learning techniques are employed for near-optimal Bayesian approximation of the expected discounted reward over a sample drawn from the prior distribution of unknown parameters. In the online stage, the action with the maximum expected return with respect to the posterior distribution of the parameters is selected. This is achieved by approximating the posterior distribution using a Markov Chain Monte Carlo (MCMC) algorithm, followed by constructing multiple Gaussian processes over the parameter space for efficient prediction of the means of the expected return at the MCMC samples. The effectiveness of the proposed framework is demonstrated using a simple dynamical system model with continuous state and action spaces, as well as a more complex model for a metastatic melanoma gene regulatory network observed through noisy synthetic gene expression data.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.81)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.59)
- North America > United States > New Jersey (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Africa > Mali (0.04)
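The offline/online decomposition in the abstract above can be illustrated with a toy sketch. The expected-return table is synthetic (in the paper it is built offline with GPTD learning), and the Gaussian likelihood reweighting below is a simple importance-sampling stand-in for the paper's MCMC posterior approximation; all names and numbers here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline stage (sketch): draw parameter samples from the prior and
# assume expected returns per action have been precomputed for each.
prior_samples = rng.normal(0.0, 1.0, size=200)   # unknown parameter theta
actions = np.array([-1.0, 0.0, 1.0])
# hypothetical expected return: higher when the action is close to theta
returns = -np.abs(actions[None, :] - prior_samples[:, None])

def select_action(observation_mean, observation_std=0.5):
    """Online stage (sketch): reweight the prior samples by a Gaussian
    likelihood of the observed data and pick the action with the
    highest posterior-expected return."""
    log_w = -0.5 * ((prior_samples - observation_mean) / observation_std) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                      # normalized posterior weights
    posterior_returns = w @ returns   # posterior-expected return per action
    return actions[np.argmax(posterior_returns)]
```

The expensive return approximation happens entirely offline; online execution reduces to a reweighting and an argmax, which is what makes the approach viable when further interaction with the system is not possible.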
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Austria > Vienna (0.14)
- Europe > Russia (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
- Europe > North Sea (0.04)
- Atlantic Ocean > North Atlantic Ocean > North Sea (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance > Trading (0.67)
- North America > Canada > Ontario > Toronto (0.14)
- Africa > Mali (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (0.94)
- Research Report > Experimental Study (0.67)