
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding

Wu, Zhaoxuan, Zhou, Zijian, Verma, Arun, Prakash, Alok, Rus, Daniela, Low, Bryan Kian Hsiang

arXiv.org Artificial Intelligence

We propose TETRIS, a novel method that optimizes the total throughput of batch speculative decoding in multi-request settings. Unlike existing methods that optimize for a single request or a group of requests as a whole, TETRIS actively selects the most promising draft tokens (for every request in a batch) to be accepted when verified in parallel, resulting in fewer rejected tokens and hence less wasted computing resources. Such effective resource utilization, which enables fast inference in large language models (LLMs), is especially important to service providers with limited inference capacity. Compared to baseline speculative decoding, TETRIS yields a consistently higher acceptance rate and more effective utilization of the limited inference capacity. We show theoretically and empirically that TETRIS outperforms baseline speculative decoding and existing methods that dynamically select draft tokens, leading to more efficient batch inference in LLMs.
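The core idea of selecting the most promising draft tokens across a batch can be sketched as a greedy allocation under a shared verification budget. The sketch below is illustrative, not the paper's algorithm: the function name `select_draft_tokens` and the assumption that per-token acceptance-probability estimates are available are hypothetical. Since a draft token is only useful if every earlier token in its request is accepted, the marginal value of extending a request is the cumulative product of its acceptance probabilities, which is non-increasing, so a greedy heap-based selection is natural.

```python
import heapq

def select_draft_tokens(accept_probs, budget):
    """Greedy draft-token selection for batch verification (illustrative).

    accept_probs: accept_probs[r][i] is the estimated probability that the
        i-th draft token of request r is accepted, given all earlier tokens
        of request r were accepted.
    budget: total number of draft tokens the verifier can process in one pass.

    Returns counts, where counts[r] is the number of draft tokens selected
    for request r, chosen to maximize the expected number of accepted tokens.
    """
    counts = [0] * len(accept_probs)
    cum = [1.0] * len(accept_probs)  # prob. that all selected tokens so far are accepted
    # Max-heap (via negated keys) over the marginal expected gain of the
    # next unselected token of each request.
    heap = [(-probs[0], r) for r, probs in enumerate(accept_probs) if probs]
    heapq.heapify(heap)
    for _ in range(budget):
        if not heap:
            break
        _, r = heapq.heappop(heap)
        counts[r] += 1
        cum[r] *= accept_probs[r][counts[r] - 1]
        if counts[r] < len(accept_probs[r]):
            # Gain of the next token = prob. that it and all its prefix are accepted.
            heapq.heappush(heap, (-cum[r] * accept_probs[r][counts[r]], r))
    return counts
```

For example, with a budget of 3 over two requests whose draft chains have acceptance probabilities [0.9, 0.8] and [0.5, 0.4], the greedy rule spends two slots on the first request (gains 0.9, then 0.72) before the second request's first token (gain 0.5) is picked.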


SMiRL: Surprise Minimizing RL in Dynamic Environments

Berseth, Glen, Geng, Daniel, Devin, Coline, Finn, Chelsea, Jayaraman, Dinesh, Levine, Sergey

arXiv.org Artificial Intelligence

All living organisms struggle against the forces of nature to carve out niches where they can maintain homeostasis. We propose that such a search for order amidst chaos might offer a unifying principle for the emergence of useful behaviors in artificial agents. We formalize this idea into an unsupervised reinforcement learning method called surprise minimizing RL (SMiRL). SMiRL trains an agent with the objective of maximizing the probability of observed states under a model trained on previously seen states. The resulting agents can acquire proactive behaviors that seek out and maintain stable conditions, such as balancing and damage avoidance, that are closely tied to an environment's prevailing sources of entropy, such as wind, earthquakes, and other agents. We demonstrate that our surprise minimizing agents can successfully play Tetris and Doom, control a humanoid to avoid falls, and navigate to escape enemy agents, all without any task-specific reward supervision. We further show that SMiRL can be used together with a standard task reward to accelerate reward-driven learning.
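The objective above — rewarding the agent for states that are likely under a model fit to its past states — can be sketched with a simple density model. The class name `SurpriseMinimizer` and the choice of a diagonal Gaussian (updated online via Welford's algorithm) are illustrative assumptions; the paper's agents use richer state-density models.

```python
import math

class SurpriseMinimizer:
    """Illustrative SMiRL-style intrinsic reward: the log-likelihood of a
    state under a running diagonal-Gaussian model of previously seen states.
    Higher reward means the state is familiar (low surprise)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations (Welford)

    def update(self, state):
        """Fold one observed state into the density model."""
        self.n += 1
        for i, x in enumerate(state):
            d = x - self.mean[i]
            self.mean[i] += d / self.n
            self.m2[i] += d * (x - self.mean[i])

    def reward(self, state):
        """log p(state) under the current model; the agent maximizes this,
        so it is driven toward states resembling those it has already seen."""
        logp = 0.0
        for i, x in enumerate(state):
            var = self.m2[i] / self.n if self.n > 1 else 1.0
            var = max(var, 1e-6)  # floor to keep the density well-defined
            logp -= 0.5 * (math.log(2 * math.pi * var)
                           + (x - self.mean[i]) ** 2 / var)
        return logp
```

In use, the agent alternates `update` on each visited state with `reward` as the (intrinsic) return signal: a state near the running mean scores strictly higher than an outlier, which is exactly the pressure toward stable, homeostatic behavior the abstract describes.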


From AI World: Why Leveraging Edge Data Requires A Lot of Energy - RTInsights

#artificialintelligence

A fireside chat at AI World discussed the issues encountered in extracting data from the edge. One may be forgiven for assuming manufacturers are ready to crack open their various production systems and begin leveraging the data that has been trapped within. The reality can be much more complicated. The issues encountered in extracting data from the edge were explored in a fireside chat with Joseph Etris, engineering project manager with Continental Automotive Systems, joined by John Auld, regional sales director at Zededa, Inc. I had the opportunity to moderate the session, part of the recent AI World conference in Boston.