Model-free Low-Rank Reinforcement Learning via Leveraged Entry-wise Matrix Estimation
–Neural Information Processing Systems
In the latter, the algorithm estimates the low-rank matrix corresponding to the (state, action) value function of the current policy using the following two-phase procedure.
Neural Information Processing Systems
Feb-11-2026, 03:11:28 GMT
- Country:
- South America > Chile
- North America > United States
- Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Sweden
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report > Experimental Study (0.93)
- Technology: