

Sequoia: Scalable and Robust Speculative Decoding

Neural Information Processing Systems

As the usage of large language models (LLMs) grows, it becomes increasingly important to serve them quickly and efficiently. While speculative decoding has recently emerged as a promising direction for accelerating LLM serving, existing methods are limited in their ability to scale to larger speculation budgets and adapt to different hyperparameters. This paper introduces Sequoia, a scalable and robust algorithm for speculative decoding. To improve scalability, Sequoia introduces a dynamic programming algorithm to find an optimal tree structure for the speculated tokens. To achieve robust speculative decoding, Sequoia uses a novel sampling and verification method that outperforms prior work across different decoding temperatures. Sequoia improves the decoding speed of Llama2-7B, Llama2-13B, and Vicuna-33B on an A100 GPU by up to $4.04\times$, $3.73\times$, and $2.27\times$. To serve Llama3-70B-Instruct on a single L40 GPU through offloading, Sequoia reduces the per-token decoding latency to 0.60 s/token, $9.5\times$ faster than DeepSpeed-Zero-Inference.
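The verification step the abstract refers to builds on the standard speculative-sampling accept/reject rule. The sketch below shows that baseline rule for a single drafted token over a toy vocabulary; it is not Sequoia's tree-specific verification algorithm, and all names here are illustrative.

```python
import random

def speculative_verify(p, q, drafted, rng=random.random):
    """Accept or reject a token proposed by a draft model.

    p, q: target and draft distributions over a toy vocabulary (lists of floats).
    drafted: index of the token proposed by the draft model.
    Returns (accepted, token): the drafted token if accepted, otherwise a
    token resampled from the residual distribution max(p - q, 0), normalized.
    """
    accept_prob = min(1.0, p[drafted] / q[drafted])
    if rng() < accept_prob:
        return True, drafted
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    total = sum(residual)
    # If the draft exactly matches the target, the residual is all zeros;
    # fall back to sampling from the target distribution itself.
    weights = residual if total > 0 else p
    return False, random.choices(range(len(p)), weights=weights)[0]
```

This rule guarantees the output token is distributed exactly according to the target model `p`, which is why speculative decoding is lossless.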




Reinforcement learning with combinatorial actions for coupled restless bandits

Xu, Lily, Wilder, Bryan, Khalil, Elias B., Tambe, Milind

arXiv.org Artificial Intelligence

Reinforcement learning (RL) has increasingly been applied to solve real-world planning problems, with progress in handling large state spaces and time horizons. However, a key bottleneck in many domains is that RL methods cannot accommodate large, combinatorially structured action spaces. In such settings, even representing the set of feasible actions at a single step may require a complex discrete optimization formulation. We leverage recent advances in embedding trained neural networks into optimization problems to propose SEQUOIA, an RL algorithm that directly optimizes for long-term reward over the feasible action space. Our approach embeds a Q-network into a mixed-integer program to select a combinatorial action in each timestep. Here, we focus on planning over restless bandits, a class of planning problems which capture many real-world examples of sequential decision-making. We consider coupled restless bandits, a broader class of restless bandits with combinatorial actions that cannot be decoupled across the arms of the restless bandit, requiring direct solving over the joint, exponentially large action space. Our approach significantly outperforms existing methods--which cannot address sequential planning and combinatorial selection simultaneously--by an average of 24.8% on these difficult instances. Reinforcement learning (RL) has made tremendous progress in recent years to solve a wide range of practical problems (Treloar et al., 2020; Marot et al., 2021; Silvestro et al., 2022; Degrave et al., 2022). While successful at dealing with large or infinite state spaces, RL struggles with discrete, combinatorial action spaces. This limitation is pertinent to many real-world sequential decision-making problems, where resource constraints frequently lead to combinatorial action spaces (Dulac-Arnold et al., 2020). Consider, for example, a sequential resource allocation problem in which public health workers are dispatched to visit patients.
The workers each have a limited daily budget to maximize patient well-being. These requirements give rise to an exponentially large combinatorial action space to optimize over, even when the number of workers and patients is small.
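To make the combinatorial-action bottleneck concrete, the sketch below scores every budget-feasible subset of arms with a Q-value oracle by brute force. This is only a tractable stand-in for the paper's mixed-integer-program solve (which scales to exponentially large action spaces); the function and parameter names are illustrative, not the paper's API.

```python
from itertools import combinations

def best_combinatorial_action(q_value, n_arms, budget):
    """Exhaustively score every budget-feasible subset of arms and return
    the best one. Tractable only for tiny instances; SEQUOIA instead embeds
    the Q-network in a mixed-integer program to search this space.

    q_value: callable mapping a tuple of selected arm indices to a float
             (e.g. a trained Q-network evaluated on the joint action).
    """
    best_action, best_q = (), float("-inf")
    for k in range(budget + 1):
        for subset in combinations(range(n_arms), k):
            q = q_value(subset)
            if q > best_q:
                best_action, best_q = subset, q
    return best_action, best_q
```

Even with 3 arms and a budget of 2 there are 7 feasible actions; with 50 arms and a budget of 10 there are over 10 billion, which is why a direct optimization formulation is needed.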


Google Maps changed the way we get around. It all began in a spare bedroom in Sydney

The Guardian

Stephen Ma has every right to claim bragging rights for helping to hatch the world's most popular online mapping platform. Instead, for the past two decades Ma, one of the four co-founders of Google Maps, has buried himself in a big black hole of anonymity. But not because of any shame or regret – it's just that he isn't one to blow his own trumpet. "I tend to be a very private person," Ma says in a rare interview. "I find the limelight uncomfortable."


DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure

Xiong, Yunfan, Zhang, Ruoyu, Li, Yanzeng, Wu, Tianhao, Zou, Lei

arXiv.org Artificial Intelligence

While speculative decoding has recently appeared as a promising direction for accelerating the inference of large language models (LLMs), the speedup and scalability are strongly bounded by the token acceptance rate. Prevalent methods usually organize predicted tokens as independent chains or fixed token trees, which fail to generalize to diverse query distributions. In this paper, we propose DySpec, a faster speculative decoding algorithm with a novel dynamic token tree structure. We begin by bridging the draft distribution and acceptance rate from intuitive and empirical clues, and successfully show that the two variables are strongly correlated. Based on this, we employ a greedy strategy to dynamically expand the token tree at runtime. Theoretically, we show that our method can achieve optimal results under mild assumptions. Empirically, DySpec yields a higher acceptance rate and speedup than fixed trees. DySpec can drastically improve the throughput and reduce the latency of token generation across various data distributions and model sizes, significantly outperforming strong competitors, including Specinfer and Sequoia. Under a low-temperature setting, DySpec can improve the throughput by up to 9.1$\times$ and reduce the latency by up to 9.4$\times$ on Llama2-70B. Under a high-temperature setting, DySpec can still improve the throughput by up to 6.21$\times$, despite the increasing difficulty of speculating more than one token per step for the draft model.
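The greedy expansion strategy described above can be sketched with a priority queue: repeatedly expand the frontier node whose path probability under the draft model (the proxy for acceptance rate) is highest. This is a minimal sketch under assumed interfaces, not DySpec's actual implementation; `draft_children` is a hypothetical draft-model hook.

```python
import heapq

def expand_token_tree(draft_children, root, budget):
    """Greedily grow a speculation tree of at most `budget` nodes.

    draft_children(token): hypothetical hook returning (child_token, prob)
    pairs from the draft model's next-token distribution.
    Returns a list of (path, token, path_prob) for every node added.
    """
    # Max-heap keyed on negated path probability; a counter breaks ties
    # so heapq never compares tokens directly.
    heap = [(-1.0, 0, root, ())]
    counter = 1
    tree = []
    while heap and len(tree) < budget:
        neg_p, _, token, path = heapq.heappop(heap)
        path_prob = -neg_p
        tree.append((path, token, path_prob))
        for child, prob in draft_children(token):
            heapq.heappush(heap, (-(path_prob * prob), counter, child, path + (token,)))
            counter += 1
    return tree
```

Because expansion is driven by run-time draft probabilities rather than a fixed template, a confident draft yields a deep chain while an uncertain one yields a wide tree.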


LinkedIn Has Answers to Questions You've Never Had

Slate

"What does a teacher do?" "What does a barber do?" "What are recent developments in Swiftonomics?" I pondered these questions only after LinkedIn prompted me to do so. Suddenly, I found myself contemplating the very essence of my own reality. How did I learn what I know? How does my hair go from long to short every five weeks?


All the Top New Features Coming to MacOS Sequoia

WIRED

Apple has officially unveiled the latest version of its operating system for Mac. This time around, Apple stuck to its "California places" naming convention and went with macOS Sequoia. Also known as macOS 15, the new OS packs a ton of new capabilities onto the desktop, including a password management app, video conferencing tools, and updates to Safari, as well as all the features that come with Apple Intelligence--the company's new artificial intelligence–powered system. Below, we break down all these new features that will become available in macOS Sequoia when it ships this fall. Be sure to also check out our iOS 18 and iPadOS 18 feature roundup for all the new features coming to your iPhone and iPad, and our look at what's new in watchOS 11.


Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

Chen, Zhuoming, May, Avner, Svirschevski, Ruslan, Huang, Yuhsun, Ryabinin, Max, Jia, Zhihao, Chen, Beidi

arXiv.org Artificial Intelligence

As the usage of large language models (LLMs) grows, performing efficient inference with these models becomes increasingly important. While speculative decoding has recently emerged as a promising direction for speeding up inference, existing methods are limited in their ability to scale to larger speculation budgets, and to adapt to different hyperparameters and hardware. This paper introduces Sequoia, a scalable, robust, and hardware-aware algorithm for speculative decoding. To attain better scalability, Sequoia introduces a dynamic programming algorithm to find the optimal tree structure for the speculated tokens. To achieve robust speculative performance, Sequoia uses a novel sampling and verification method that outperforms prior work across different decoding temperatures. Finally, Sequoia introduces a hardware-aware tree optimizer that maximizes speculative performance by automatically selecting the token tree size and depth for a given hardware platform. Evaluation shows that Sequoia improves the decoding speed of Llama2-7B, Llama2-13B, and Vicuna-33B on an A100 by up to $4.04\times$, $3.73\times$, and $2.27\times$. In the offloading setting on an L40, Sequoia achieves as low as 0.56 s/token for exact Llama2-70B inference, which is $9.96\times$ faster than our optimized offloading system (5.6 s/token), $9.7\times$ faster than DeepSpeed-Zero-Inference, and $19.5\times$ faster than Huggingface Accelerate.
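The hardware-aware tree optimizer can be pictured as a search over candidate (size, depth) configurations, scoring each by expected accepted tokens per unit of measured verification time. The sketch below is a simple grid search under assumed profiling hooks; `hardware_cost` and `expected_accepted` are illustrative stand-ins for Sequoia's profiler and acceptance model, not its actual interfaces.

```python
def choose_tree(hardware_cost, expected_accepted, sizes, depths):
    """Pick the (size, depth) pair maximizing speculated tokens per second.

    hardware_cost(size): measured wall-clock cost of verifying a tree of
    `size` tokens on the target hardware (assumed profiling hook).
    expected_accepted(size, depth): predicted accepted tokens per step.
    """
    best, best_rate = None, 0.0
    for size in sizes:
        for depth in depths:
            rate = expected_accepted(size, depth) / hardware_cost(size)
            if rate > best_rate:
                best, best_rate = (size, depth), rate
    return best, best_rate
```

The key design point is that the same tree shape is not optimal everywhere: on hardware where verification cost grows slowly with tree size, larger trees pay off, while on memory-bound setups (such as offloading) the optimizer favors smaller, deeper trees.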


Amazon says its new AI-powered robots reduce fulfilment time by 25 percent

Engadget

Amazon is integrating a new robotics system into its warehouses to improve delivery times, safety and general operations. The AI-powered technology, known as Sequoia, could improve the speed of finding and storing products by up to 75 percent and order fulfillment by up to 25 percent, the Wall Street Journal reports. The system was already introduced in one of Amazon's Houston-based warehouses. Sequoia involves vehicles transporting totes of products to a sorting machine. It uses robotic arms and computer vision to identify the inventory before sending it to employees for delivery.


Sequoia: A Software Framework to Unify Continual Learning Research

Normandin, Fabrice, Golemo, Florian, Ostapenko, Oleksiy, Rodriguez, Pau, Riemer, Matthew D, Hurtado, Julio, Khetarpal, Khimya, Lindeborg, Ryan, Cecchi, Lucas, Lesort, Timothée, Charlin, Laurent, Rish, Irina, Caccia, Massimo

arXiv.org Artificial Intelligence

The field of Continual Learning (CL) seeks to develop algorithms that accumulate knowledge and skills over time through interaction with non-stationary environments. In practice, a plethora of evaluation procedures (settings) and algorithmic solutions (methods) exist, each with their own potentially disjoint set of assumptions. This variety makes measuring progress in CL difficult. We propose a taxonomy of settings, where each setting is described as a set of assumptions. A tree-shaped hierarchy emerges from this view, where more general settings become the parents of those with more restrictive assumptions. This makes it possible to use inheritance to share and reuse research, as developing a method for a given setting also makes it directly applicable to any of its children. We instantiate this idea as a publicly available software framework called Sequoia, which features a wide variety of settings from both the Continual Supervised Learning (CSL) and Continual Reinforcement Learning (CRL) domains. Sequoia also includes a growing suite of methods which are easy to extend and customize, in addition to more specialized methods from external libraries. We hope that this new paradigm and its first implementation can help unify and accelerate research in CL. You can help us grow the tree by visiting www.github.com/lebrice/Sequoia.
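The inheritance idea maps naturally onto a class hierarchy: each subclass adds assumptions, and a setting's full assumption set is the union along its ancestry, so a method written against a parent class applies to every child. The sketch below illustrates the pattern only; the class names loosely mirror the taxonomy and are not Sequoia's actual API.

```python
class Setting:
    """Root of the settings taxonomy. Each subclass adds assumptions,
    so a method developed for a parent applies to all of its children.
    """
    assumptions: set = set()

    @classmethod
    def all_assumptions(cls):
        # Union of assumptions declared along the inheritance chain.
        out = set()
        for klass in cls.__mro__:
            out |= getattr(klass, "assumptions", set())
        return out

class ContinualSupervisedLearning(Setting):
    assumptions = {"supervised labels"}

class TaskIncremental(ContinualSupervisedLearning):
    assumptions = {"task identity at test time"}
```

Under this scheme, a method benchmarked on `ContinualSupervisedLearning` can be run unchanged on `TaskIncremental`, since the child only restricts, never relaxes, the parent's assumptions.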