sprinter
Speeding up Speculative Decoding via Approximate Verification
Zhong, Meiyu, Teku, Noel, Tandon, Ravi
Speculative Decoding (SD) is a recently proposed technique for faster inference using Large Language Models (LLMs). SD operates by using a smaller draft LLM to autoregressively generate a sequence of tokens and a larger target LLM to verify them in parallel, ensuring statistical consistency. However, periodic parallel calls to the target LLM for verification prevent SD from achieving even lower latencies. We propose SPRINTER, which utilizes a low-complexity verifier trained to predict whether tokens generated by a draft LLM would be accepted by the target LLM. By performing approximate sequential verification, SPRINTER does not require verification by the target LLM, which is invoked only when a token is deemed unacceptable. This reduces the number of calls to the larger LLM and can yield further speedups. We present a theoretical analysis of SPRINTER, examining the statistical properties of the generated tokens, as well as the expected reduction in latency as a function of the verifier. We evaluate SPRINTER on several datasets and model pairs, demonstrating that approximate verification can still maintain high-quality generation while further reducing latency. For instance, on the Wiki-Summaries dataset, SPRINTER achieves a 1.7x latency speedup and requires 8.3x fewer FLOPs relative to SD, while still generating high-quality responses when using GPT2-Small and GPT2-XL as draft/target models.
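The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `approx_verified_decode`, the three callables, and the toy stand-in models are all hypothetical. It also assumes that, on rejection, the target model simply supplies the replacement token; the abstract does not specify what happens after a token is deemed unacceptable.

```python
from typing import Callable, List, Tuple


def approx_verified_decode(
    draft_next: Callable[[List[int]], int],
    target_next: Callable[[List[int]], int],
    verifier_accepts: Callable[[List[int], int], bool],
    prompt: List[int],
    max_new_tokens: int,
) -> Tuple[List[int], int]:
    """Generate tokens with a draft model, checking each one with a cheap
    verifier and calling the target model only on rejection.

    Returns the full token sequence and the number of target-model calls.
    """
    tokens = list(prompt)
    target_calls = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        candidate = draft_next(tokens)
        if verifier_accepts(tokens, candidate):
            # Verifier predicts the target would accept: keep the draft token.
            tokens.append(candidate)
        else:
            # Assumption: the target model supplies the replacement token.
            tokens.append(target_next(tokens))
            target_calls += 1
    return tokens, target_calls


# Toy stand-ins (hypothetical): tokens are small integers.
def toy_draft(context: List[int]) -> int:
    return (context[-1] + 1) % 10


def toy_target(context: List[int]) -> int:
    return (context[-1] + 2) % 10


def toy_verifier(context: List[int], candidate: int) -> bool:
    # Rejects any candidate that is a multiple of 3.
    return candidate % 3 != 0


tokens, calls = approx_verified_decode(
    toy_draft, toy_target, toy_verifier, prompt=[0], max_new_tokens=6
)
```

In this toy run the target model is called only for the two rejected candidates out of six generated tokens, which is the source of the savings the abstract describes: the fewer tokens the verifier rejects, the fewer calls reach the larger model.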
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Arizona > Pima County > Tucson (0.14)
- Europe > Spain > Galicia > Madrid (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
An AI Was Taught to Play the World's Hardest Video Game and Still Couldn't Set a New Record
What's the hardest video game you've ever played? If it wasn't QWOP, then let me tell you right now that you don't know how truly difficult a game can be. The deceptively simple running game is so challenging to master that even an AI trained using machine learning only mustered a top 10 score instead of shattering the record. If you've never played QWOP before, you owe it to yourself to give it a try and see if you can even get your sprinter off the starting line. Developed by Bennett Foddy back in 2008, QWOP was inspired by an '80s arcade game called Track & Field that requires players to mindlessly mash buttons to win a race.
Anthropometric clusters of competitive cyclists and their sprint and endurance performance
Do athletes specialize towards sports disciplines that are well aligned with their anthropometry? Novel machine-learning algorithms now enable scientists to cluster athletes based on their individual anthropometry while integrating multiple anthropometric dimensions, which may provide new perspectives on anthropometry-dependent sports specialization. We aimed to identify clusters of competitive cyclists based on their individual anthropometry using multiple anthropometric measures, and to evaluate whether athletes with a similar anthropometry also competed in the same cycling discipline. Additionally, we assessed differences in sprint and endurance performance between the anthropometric clusters. Twenty-four nationally and internationally competitive male cyclists were included from sprint, pursuit and road disciplines.
Digital doping: Are big data, AI and virtual reality creating an uneven playing field?
Watching elite athletes run, leap and score, it's hard to imagine there's much room for improvement, but the Internet of Things, Big Data and virtual reality are shaving milliseconds from sprinters, extending the jumps of Olympians – and helping your favourite striker put the ball in the net. That's before bionics change sports forever, with predictions that the sprinters at the next summer Olympics could be outperformed by an athlete at the Paralympics. It all starts with data collection, which shouldn't be a shock. The film Moneyball was based on the true story of Oakland Athletics' team manager outwitting rivals with data science – and that was in 2002. Fast forward to 2018 and the combination of always-on sensors and connectivity takes the idea several leaps forward.
- North America > United States > Oregon (0.05)
- Europe > United Kingdom (0.05)