Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

Neural Information Processing Systems

Limitations in either perception or reasoning can impede the overall performance of a VLM. A systematic evaluation of both capabilities is crucial for providing valuable insights for future model optimization.




xLSTM: Extended Long Short-Term Memory

Maximilian Beck

Neural Information Processing Systems

Since then, LSTMs have stood the test of time and contributed to numerous deep learning success stories; in particular, they constituted the first Large Language Models (LLMs). However, the advent of Transformer technology, with parallelizable self-attention at its core, marked the dawn of a new era, outpacing LSTMs at scale.



Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees

Sijia Chen, Yibo Wang, Yi-Feng Wu, Qing-Guo Chen

Neural Information Processing Systems

Tool-augmented large language models (LLMs) leverage tools, often in the form of APIs, to improve their reasoning capabilities on complex tasks. This enables them to act as intelligent agents interacting with the real world. The recently introduced ToolLLaMA model by Qin et al. [2023] utilizes the depth-first search-based decision tree (DFSDT) mechanism for multi-step reasoning with 16,000+ real-world APIs, effectively enhancing the performance of tool-augmented LLMs compared to traditional chain reasoning mechanisms. However, their approach only employs successful paths from decision trees (also called inference trees) for supervised fine-tuning (SFT), missing out on the potential learning opportunities from failed paths. Inspired by this, we propose an inference trajectory optimization framework based on preference learning to address this limitation.