AITopics | Stutsman, Ryan

Stutsman, Ryan

Packrat: Automatic Reconfiguration for Latency Minimization in CPU-based DNN Serving

Bhardwaj, Ankit, Phanishayee, Amar, Narayanan, Deepak, Tarta, Mihail, Stutsman, Ryan

arXiv.org Artificial Intelligence, Nov-29-2023

In this paper, we investigate how to push the performance limits of serving Deep Neural Network (DNN) models on CPU-based servers. Specifically, we observe that while intra-operator parallelism across multiple threads is an effective way to reduce inference latency, it provides diminishing returns. Our primary insight is that instead of running a single instance of a model with all available threads on a server, running multiple instances, each with a smaller batch size and fewer threads for intra-op parallelism, can provide lower inference latency. However, the right configuration is hard to determine manually because it depends on both the workload (the DNN model and the batch size used by the serving system) and the deployment (the number of CPU cores on the server). We present Packrat, a new serving system for online inference that, given a model and batch size ($B$), algorithmically picks the number of instances ($i$), the number of threads each should be allocated ($t$), and the batch sizes each should operate on ($b$) so as to minimize latency. Packrat is built as an extension to TorchServe and supports online reconfiguration to avoid serving downtime. Averaged across a range of batch sizes, Packrat improves inference latency by 1.43$\times$ to 1.83$\times$ on a range of commonly used DNNs.
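
The configuration choice described in the abstract can be illustrated with a minimal sketch. It is not Packrat's algorithm or API: it assumes every instance gets the same number of threads, that the batch is split as evenly as possible, and that a per-instance latency estimate is available as a function of threads and batch size. The names `Config`, `pick_config`, and `toy_latency` are hypothetical, and the cost model is invented purely for illustration.

```python
import math
from typing import Callable, NamedTuple


class Config(NamedTuple):
    instances: int   # i: number of model instances
    threads: int     # t: threads given to each instance
    batch: int       # b: largest per-instance batch size
    latency: float   # estimated latency of the slowest instance


def pick_config(total_threads: int, total_batch: int,
                latency_fn: Callable[[int, int], float]) -> Config:
    """Brute-force search over uniform splits (an assumption, not Packrat's method)."""
    if total_threads < 1 or total_batch < 1:
        raise ValueError("need at least one thread and one batch item")
    best = None
    for i in range(1, min(total_threads, total_batch) + 1):
        t = total_threads // i           # threads per instance; leftovers stay idle
        b = math.ceil(total_batch / i)   # the fullest instance handles this many items
        candidate = Config(i, t, b, latency_fn(t, b))
        if best is None or candidate.latency < best.latency:
            best = candidate
    return best


def toy_latency(threads: int, batch: int) -> float:
    """Hypothetical cost model: serial work, parallel work, per-thread overhead."""
    return batch * (0.2 + 0.8 / threads) + 0.05 * threads
```

Calling `pick_config(16, 16, toy_latency)` returns the uniform split with the lowest estimated latency under this toy model. In a real deployment the estimates would presumably come from measurements of the model on the target server rather than from a formula.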

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2311.18174

Country:

Asia (1.00)
Europe (0.67)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Washington > King County (0.14)

Genre: Research Report (0.64)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Efficient Linearizability Checking for Actor-based Systems

Al-Mahfoudh, Mohammed S., Stutsman, Ryan, Gopalakrishnan, Ganesh

arXiv.org Artificial Intelligence, Jul-19-2023

Recent demand for distributed software has led to a surge in the popularity of actor-based frameworks. However, even with the stylized message passing model of actors, writing correct distributed software is still difficult. We present our work on linearizability checking in DS2, an integrated framework for specifying, synthesizing, and testing distributed actor systems. The key insight of our approach is that subcomponents of distributed actor systems often represent common algorithms or data structures (e.g., a distributed hash table or tree) that can be validated against a simple sequential model of the system. This makes it easy for developers to validate their concurrent actor systems without complex specifications. DS2 automatically explores the concurrent schedules that the system could arrive at, and it compares the observed output of the system to ensure it is equivalent to what the sequential implementation could have produced. We describe DS2's linearizability checking and test it on several concurrent replication algorithms from the literature. We explore in detail how different algorithms for enumerating the model schedule space fare in finding bugs in actor systems, and we present our own refinements on algorithms for exploring actor system schedules that we show are effective in finding bugs.
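
The idea of validating observed behavior against a simple sequential model can be sketched as a brute-force linearizability check. This is not DS2's implementation or API; it is the textbook definition applied to a small key-value history, and the names `Op`, `respects_real_time`, `matches_sequential_model`, and `is_linearizable` are hypothetical. The search tries every ordering of the operations, so it is only practical for very short histories.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Any, Sequence


@dataclass(frozen=True)
class Op:
    kind: str      # "put" or "get"
    key: str
    value: Any     # argument of a put, or the result observed by a get
    invoked: int   # timestamp at which the operation was invoked
    returned: int  # timestamp at which its response arrived


def respects_real_time(order: Sequence[Op]) -> bool:
    """An operation that finished before another began must come first."""
    for idx, later in enumerate(order):
        for earlier in order[:idx]:
            if later.returned < earlier.invoked:
                return False
    return True


def matches_sequential_model(order: Sequence[Op]) -> bool:
    """Replay the order on a plain dict, the sequential model of the store."""
    store = {}
    for op in order:
        if op.kind == "put":
            store[op.key] = op.value
        elif store.get(op.key) != op.value:
            return False
    return True


def is_linearizable(history: Sequence[Op]) -> bool:
    """True if some ordering is both real-time consistent and model consistent."""
    return any(respects_real_time(order) and matches_sequential_model(order)
               for order in permutations(history))
```

For example, a `get` that overlaps a `put` may observe either the old or the new value, while a `get` invoked after the `put` has returned must observe the new value for the history to pass.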

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2110.06407

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Workflow (0.67)
Research Report (0.63)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture (1.00)
(3 more...)
