
Neural Information Processing Systems 

For supervised learning, [7] showed that gradually increasing the entropy of the training distribution helped. In RL, however, breaking a task down into sub-problems that can be ordered by difficulty is non-trivial [2]. For video games, [4] adapted the concept with a starting state placed increasingly further from the end of a demonstration. Thus, contrary to [1, 3, 4], we do not "reverse time" to artificially build a sequence of tasks that start further and further from a goal state and are subsequently harder to solve, in the hope of learning how to reach this goal from all possible starting states. Instead, we stack new optimization problems on top of previous ones, which gradually increases the computational complexity of the task, in order to learn to act optimally in optimization problems with an increasing number of levels. Thus, contrary to most problems in RL, here we are faced with a task naturally constituted of a hierarchy of sub-problems ordered by their position in the Polynomial Hierarchy, which motivates a curriculum.
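To make the contrast concrete, the reverse-time scheme of [4] that we do *not* follow can be sketched as follows: start states are sampled from a demonstration, moving further back from the goal as the curriculum stage advances. This is only an illustrative sketch; the function name, the `stage` indexing, and the `window` knob are our assumptions, not details from [4].

```python
import random

def reverse_curriculum_start(demonstration, stage, window=5):
    """Sample a start state from a demonstration trajectory.

    The last state of `demonstration` is the goal. Stage 0 samples
    states just before the goal; higher stages sample states further
    back in time, until stage windows reach the trajectory's start.
    (`window` controls how many candidate states each stage spans;
    it is an assumed hyperparameter for this sketch.)
    """
    n = len(demonstration)
    hi = n - 2 - stage * window          # closest-to-goal candidate index
    lo = max(0, hi - window + 1)         # furthest candidate this stage
    hi = max(lo, hi)                     # clamp once the start is reached
    return demonstration[random.randint(lo, hi)]
```

By contrast, our curriculum never reorders time within a single task: each stage is a fresh optimization problem with one more level stacked on top of the previous ones.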
