 spatial-temporal reasoning




MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models

Neural Information Processing Systems

To perform systematic evaluations, we set up 32 diverse tasks, including improvements to existing multimodal tasks, extensions of text-only tasks to multimodal scenarios, and novel methods for risk assessment, all of which focus on models' basic performance in settings of practical significance.


Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning

Aghzal, Mohamed, Plaku, Erion, Yao, Ziyu

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved remarkable success across a wide spectrum of tasks; however, they still face limitations in scenarios that demand long-term planning and spatial reasoning. To facilitate this line of research, we propose a new benchmark, termed Path Planning from Natural Language (PPNL). Our benchmark evaluates LLMs' spatial-temporal reasoning by formulating "path planning" tasks that require an LLM to navigate to target locations while avoiding obstacles and adhering to constraints. Leveraging this benchmark, we systematically investigate LLMs, including GPT-4 via different few-shot prompting methodologies, and BART and T5 of various sizes via fine-tuning. Our experimental results show the promise of few-shot GPT-4 in spatial reasoning when it is prompted to interleave reasoning and acting, although it still fails at long-term temporal reasoning. In contrast, while fine-tuned LLMs achieved impressive results on in-distribution reasoning tasks, they struggled to generalize to larger environments or environments with more obstacles.
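
As a rough illustration of the kind of task the abstract describes, the sketch below sets up a small grid with obstacles, computes a reference path with breadth-first search, and checks whether a proposed action sequence reaches the goal without hitting an obstacle. The grid encoding, the action names (`up`, `down`, `left`, `right`), and the functions `shortest_path` and `is_valid_plan` are assumptions made for this example; they are not taken from the PPNL benchmark, whose actual format is not given here.

```python
from collections import deque

# Hypothetical action vocabulary; the benchmark's own format may differ.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def shortest_path(size, obstacles, start, goal):
    """Breadth-first search on a size x size grid.

    Returns a list of action names leading from start to goal,
    or None if the goal cannot be reached.
    """
    queue = deque([start])
    parent = {start: None}  # cell -> (previous cell, action that led here)
    while queue:
        cell = queue.popleft()
        if cell == goal:
            actions = []
            # Walk back to the start, collecting actions in reverse order.
            while parent[cell] is not None:
                cell, action = parent[cell]
                actions.append(action)
            return actions[::-1]
        for action, (dr, dc) in MOVES.items():
            nxt = (cell[0] + dr, cell[1] + dc)
            in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if in_bounds and nxt not in obstacles and nxt not in parent:
                parent[nxt] = (cell, action)
                queue.append(nxt)
    return None


def is_valid_plan(size, obstacles, start, goal, actions):
    """Check that a proposed plan stays on the grid, avoids obstacles, and ends at the goal."""
    cell = start
    for action in actions:
        if action not in MOVES:
            return False
        dr, dc = MOVES[action]
        cell = (cell[0] + dr, cell[1] + dc)
        in_bounds = 0 <= cell[0] < size and 0 <= cell[1] < size
        if not in_bounds or cell in obstacles:
            return False
    return cell == goal


if __name__ == "__main__":
    obstacles = {(1, 1)}
    plan = shortest_path(3, obstacles, (0, 0), (2, 2))
    print(plan, is_valid_plan(3, obstacles, (0, 0), (2, 2), plan))
```

A checker like `is_valid_plan` could be applied to a model's generated action sequence, while `shortest_path` gives a reference length against which to compare it.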


Does a Cartoon Penguin Make Math Education Great Again? - Facts So Romantic

Nautilus

Matthew Peterson is a pretty inspirational guy. As a dyslexic child he found math class difficult, so as an adult he resolved to totally change the way math is taught. After completing his studies in biology, electrical engineering, and Chinese language and literature at the University of California, Irvine, Peterson co-founded the nonprofit MIND Research Institute and set about developing "Spatial Temporal (ST) Math," a computer game-based method of teaching that doesn't rely on language as a medium. Instead it uses spatial-temporal reasoning--the ability to move stuff around in your mind and work out how it fits together. Proponents point to recent findings in neuroscience and education research--showing that early music training can enhance spatial-temporal reasoning, for example--as justification for this shift.


Effects of Representation on Solving Complex Spatial-Temporal Problems

Wetzel, Baylor (University of Minnesota)

AAAI Conferences

We present a study of how humans represent space when solving Tower Defense puzzles, a complex spatial reasoning task requiring the subject to protect locations by arranging a set of defense towers at strategic positions. We have discovered that the representation humans use is significantly more complex than what is needed to describe the spatial situation. Strategy and spatial representations are tightly intertwined, with spatial representations forgoing objective, atomically defined spatial features in favor of context-sensitive, goal-oriented spatial affordances. Spatial relationships exist not only between objects but also between an object's properties, second-order properties, joint spatial properties, and temporal properties.