 chiron


Hierarchical Autoscaling for Large Language Model Serving with Chiron

Patke, Archit, Reddy, Dhemath, Jha, Saurabh, Narayanaswami, Chandra, Kalbarczyk, Zbigniew, Iyer, Ravishankar

arXiv.org Artificial Intelligence

Large language model (LLM) serving is becoming an increasingly important workload for cloud providers. Based on performance SLO requirements, LLM inference requests can be divided into (a) interactive requests that have tight SLOs on the order of seconds, and (b) batch requests that have relaxed SLOs on the order of minutes to hours. These SLOs can degrade based on the arrival rates, multiplexing, and configuration parameters, thus necessitating the use of resource autoscaling on serving instances and their batch sizes. However, previous autoscalers for LLM serving do not consider request SLOs, leading to unnecessary scaling and resource under-utilization. To address these limitations, we introduce Chiron, an autoscaler that uses the idea of hierarchical backpressure estimated using queue size, utilization, and SLOs. Our experiments show that Chiron achieves up to 90% higher SLO attainment and improves GPU efficiency by up to 70% compared to existing solutions.


CHIRON: Rich Character Representations in Long-Form Narratives

Gurung, Alexander, Lapata, Mirella

arXiv.org Artificial Intelligence

Characters are integral to long-form narratives, but are poorly understood by existing story analysis and generation systems. While prior work has simplified characters via graph-based methods and brief character descriptions, we aim to better tackle the problem of representing complex characters by taking inspiration from advice given to professional writers. We propose CHIRON, a new 'character sheet' based representation that organizes and filters textual information about characters. We construct CHIRON sheets in two steps: a Generation Module that prompts an LLM for character information via question-answering and a Validation Module that uses automated reasoning and a domain-specific entailment model to eliminate false facts about a character. We validate CHIRON via the downstream task of masked-character prediction, where our experiments show CHIRON is better and more flexible than comparable summary-based baselines. We also show that metrics derived from CHIRON can be used to automatically infer character-centricity in stories, and that these metrics align with human judgments.


Top AI Research Advances For Machine Learning Infrastructure

#artificialintelligence

As deep learning models become more and more popular in real-world business applications and training datasets grow very large, machine learning (ML) infrastructure is becoming a critical issue in many companies. To help you stay aware of the latest research advances in ML infrastructure, we've summarized some of the most important research papers recently introduced in this area. As you read these summaries, you will be able to learn from the experience of the leading tech companies, including Google, Microsoft, and LinkedIn. The papers we've selected cover data labeling and data validation frameworks, different approaches to distributed training of ML models, a novel approach to tracking ML model performance in production, and more.


How Researchers Are Building Models To Safeguard Private Data In Machine Learning

#artificialintelligence

Machine learning applications are increasingly permeating the tech ecosystem, and the data that goes into ML systems is drawn from all sorts of sources, regardless of its sensitivity. ML algorithms are not aware of sensitivity: they treat data as a way to establish and learn patterns, without regard to whom the data belongs. Miscreants might take advantage of this and circumvent the ML systems themselves, which can have devastating effects. If that happens, the purpose of ML is defeated. To counter this and establish a safe and secure ML environment, researchers are working towards building privacy into ML models.


Bloodhound engineers reveal it has only been tested virtually - until now

Daily Mail - Science & tech

It was a staggering feat, a car that went faster than the speed of sound. Two decades on, that record remains unchallenged. Back in 2007, a small team of British engineers headed up by Richard Noble and Andy Green decided to have a pop at the world land speed record once more. A rocket scientist was brought in to design the largest hybrid rocket system ever developed in the UK, a structural engineer was brought in to design the car's internal structure and I was invited to join the team along with Ron Ayers to ensure that this car would, indeed, remain a car and stay firmly planted on the ground.