Constrainedepisodicreinforcementlearningin concave-convexandknapsacksettings

Feb-10-2026, 02:57:31 GMT–Neural Information Processing Systems

Our approach relies on the principle ofoptimism under uncertaintyto efficiently explore. Our learning algorithms optimizetheiractions withrespect toamodel based ontheempirical statistics, while optimistically overestimating rewards and underestimating the resource consumption (i.e., overestimating the distance from the constraint).

constraint, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Feb-10-2026, 02:57:31 GMT

Conferences PDF

Add feedback

Country:
- North America
  - United States > Illinois
    - Cook County > Chicago (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)

Duplicate Docs Excel Report

Title
bc6d753857fe3dd4275dff707dedf329-Paper.pdf

Similar Docs Excel Report more

Title	Similarity	Source
None found