AITopics | phyre

PHYRE: A New Benchmark for Physical Reasoning

Neural Information Processing Systems, Dec-25-2025, 07:31:39 GMT

Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learning algorithms on PHYRE and find that these algorithms fall short in solving the puzzles efficiently. We expect that PHYRE will encourage the development of novel sample-efficient agents that learn efficient but useful models of physics. For code and to play PHYRE for yourself, please visit https://player.phyre.ai.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.66)

PHYRE: A New Benchmark for Physical Reasoning

Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

Neural Information Processing Systems, Nov-16-2025, 05:47:52 GMT

Neural Information Processing Systems http://nips.cc/

agent, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
(2 more...)

4191ef5f6c1576762869ac49281130c9-AuthorFeedback.pdf

Neural Information Processing Systems, Nov-16-2025, 05:47:37 GMT

agent, artificial intelligence, phyre, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.49)

PHYRE: A New Benchmark for Physical Reasoning

Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

Neural Information Processing Systems, Oct-2-2025, 15:07:35 GMT

Neural Information Processing Systems http://nips.cc/

agent, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
(2 more...)

new environment for benchmarking aspects of physical reasoning in which agents are challenged to solve 2D physics

Neural Information Processing Systems, Oct-2-2025, 15:07:19 GMT

We thank the reviewers for their detailed and constructive comments. Overall, the reviewers were positive about this contribution and liked the submission: "I generally [...] The task is compelling and the benchmark is well thought out." [R1]; "I like this paper, as it presents [...]" [R2]; "The benchmark is designed to encourage physical [...]" The reviewers also raised concerns, which we will address next. [...] For example, in CLEVR it now seems likely that some models (e.g., Relation Networks) have found shortcut "cheats" [...] It is difficult to characterize what constitutes "intrinsic" difficulty, but by [...] As a whole, the community must "go for recall" since [...] By releasing PHYRE to the public, we hope to see rapid exploration of these good suggestions. We will attempt to improve the clarity.

agent, artificial intelligence, physical reasoning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.49)

Reviews: PHYRE: A New Benchmark for Physical Reasoning

Neural Information Processing Systems, Jan-23-2025, 06:06:14 GMT

The authors introduce a new game-style benchmark for physical reasoning, PHYRE, which contains a set of puzzles in a 2D physical environment using a set of parameterized task templates and variations on each template. The paper also presents baseline agents based on a non-parametric memorization strategy, DQN, and online learning variants of these agents. Reviewers are concerned that there is not enough visual complexity (shapes, textures, etc.), that the domain of physical reasoning is quite limited, and that the evaluations can be improved with more rigorous baselines. Although two reviewers see the work as marginally below threshold, all reviewers think an "accept" is reasonable.

artificial intelligence, machine learning, physical reasoning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.72)

PHYRE: A New Benchmark for Physical Reasoning

Neural Information Processing Systems, Oct-9-2024, 22:13:22 GMT

Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learning algorithms on PHYRE and find that these algorithms fall short in solving the puzzles efficiently. We expect that PHYRE will encourage the development of novel sample-efficient agents that learn efficient but useful models of physics.

artificial intelligence, machine learning, physical reasoning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Forward Prediction for Physical Reasoning

Girdhar, Rohit, Gustafson, Laura, Adcock, Aaron, van der Maaten, Laurens

arXiv.org Artificial Intelligence, Jun-18-2020

Physical reasoning requires forward prediction: the ability to forecast what will happen next given some initial world state. We study the performance of state-of-the-art forward-prediction models in complex physical-reasoning tasks. We do so by incorporating models that operate on object or pixel-based representations of the world, into simple physical-reasoning agents. We find that forward-prediction models improve the performance of physical-reasoning agents, particularly on complex tasks that involve many objects. However, we also find that these improvements are contingent on the training tasks being similar to the test tasks, and that generalization to different tasks is more challenging. Surprisingly, we observe that forward predictors with better pixel accuracy do not necessarily lead to better physical-reasoning performance. Nevertheless, our best models set a new state-of-the-art on the PHYRE benchmark for physical reasoning.

artificial intelligence, forward-prediction model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2006.10734

Country: North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

PHYRE: A New Benchmark for Physical Reasoning

Bakhtin, Anton, van der Maaten, Laurens, Johnson, Justin, Gustafson, Laura, Girshick, Ross

Neural Information Processing Systems, Mar-18-2020, 22:32:26 GMT

Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learning algorithms on PHYRE and find that these algorithms fall short in solving the puzzles efficiently. We expect that PHYRE will encourage the development of novel sample-efficient agents that learn efficient but useful models of physics.

artificial intelligence, machine learning, physical reasoning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.84)

Why Setting A Benchmark For Physical Reasoning In AI Matters

#artificialintelligence, Sep-7-2019, 13:16:45 GMT

The machines of the modern world can now be taught to learn, adapt and improvise. Asking a robot to run, do a cartwheel or throw a pitch would have sounded like a chapter from a generic sci-fi novel a few years ago. But with advances in hardware acceleration and the optimisation of machine learning algorithms, techniques like Reinforcement Learning are now being put into practical use. Hard-coding a robot to perform even mundane skills, and to perform them poorly, takes a lot of computational heavy lifting. Making the robot perform decently in unstructured, real-world situations additionally takes some ingenious constraint assumptions.

artificial intelligence, machine learning, phyre, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
