Goto

Collaborating Authors

 Large Language Model


Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Zihao Wang

Neural Information Processing Systems

These additional behavior tokens will be augmented to the vocabulary of pretrained Multimodal Language Models. With this encoder, we then pack long-term multimodal interactions involving task instructions, memories, thoughts, observations, textual responses, behavior trajectories, etc .






Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time

Neural Information Processing Systems

Large language models(LLMs) have sparked a new wave of exciting AI applications. Hosting these models at scale requires significant memory resources.