HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units

arXiv.org Machine Learning

Deep learning models have achieved expert-level performance in healthcare with an exclusive focus on training accurate models. However, in many clinical environments such as the intensive care unit (ICU), real-time model serving is equally if not more important than accuracy, because in the ICU patient care is simultaneously more urgent and more expensive. Clinical decisions and their timeliness, therefore, directly affect both the patient outcome and the cost of care. To make timely decisions, we argue the underlying serving system must be latency-aware. To compound the challenge, health analytic applications often require a combination of models instead of a single model, to better specialize individual models for different targets, multi-modal data, different prediction windows, and potentially personalized predictions. To address these challenges, we propose HOLMES, an online model ensemble serving framework for healthcare applications. HOLMES dynamically identifies the best-performing set of models to ensemble for the highest accuracy, while also satisfying sub-second latency constraints on end-to-end prediction. We demonstrate that HOLMES is able to navigate the accuracy/latency tradeoff efficiently, compose the ensemble, and serve the model ensemble pipeline, scaling to simultaneously streaming data from 100 patients, each producing waveform data at 250 Hz. HOLMES outperforms conventional offline batch-processed inference for the same clinical task in terms of accuracy and latency (by an order of magnitude). HOLMES is tested on a risk prediction task on pediatric cardio ICU data, with above 95% prediction accuracy and sub-second latency on a 64-bed simulation.
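
The accuracy/latency tradeoff described in this abstract can be illustrated with a small sketch. This is not the HOLMES algorithm, which the abstract does not specify; it is a brute-force search written under stated assumptions: the model names, per-model latencies, and the toy scoring function are all hypothetical, and ensemble latency is assumed to be the sum of member latencies (sequential execution). The last lines work out the aggregate sample rate implied by the figures in the abstract (100 patients at 250 Hz).

```python
from itertools import combinations
from math import prod
from typing import Callable, Dict, FrozenSet, Tuple


def select_ensemble(
    latency_ms: Dict[str, float],
    score: Callable[[FrozenSet[str]], float],
    budget_ms: float,
) -> Tuple[FrozenSet[str], float]:
    """Return the highest-scoring subset of models whose summed latency
    fits within budget_ms. Exhaustive search: only viable for a handful
    of candidate models."""
    best_subset: FrozenSet[str] = frozenset()
    best_score = float("-inf")
    names = sorted(latency_ms)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            # Assumption: members run one after another, so latencies add up.
            if sum(latency_ms[n] for n in combo) > budget_ms:
                continue
            s = score(frozenset(combo))
            if s > best_score:
                best_subset, best_score = frozenset(combo), s
    return best_subset, best_score


# Hypothetical candidates; none of these names or numbers come from the paper.
latency = {"ecg_cnn": 400.0, "vitals_gbm": 150.0, "labs_lr": 50.0}
solo_quality = {"ecg_cnn": 0.90, "vitals_gbm": 0.80, "labs_lr": 0.70}


def toy_score(subset: FrozenSet[str]) -> float:
    # Stand-in for a validation metric: adding a model never hurts,
    # with diminishing returns. A real system would measure this on data.
    return 1.0 - prod(1.0 - solo_quality[m] for m in subset)


chosen, _ = select_ensemble(latency, toy_score, budget_ms=500.0)
print(sorted(chosen))

# Aggregate input rate implied by the abstract: 100 patients, 250 Hz each.
samples_per_second = 100 * 250
print(samples_per_second)
```

With a 500 ms budget, the two largest models together (550 ms) do not fit, so the search settles on the pairing that does. Under these toy numbers, tightening or loosening the budget changes which ensemble is served.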


Three ways to fix DRAM's latency problem

ZDNet

In a brilliant PhD thesis, Understanding and Improving the Latency of DRAM-Based Memory Systems, Kevin K. Chang of CMU tackles the DRAM issue, and suggests some novel architectural enhancements to make substantial improvements in DRAM latency.


Brazilian gamers see improvement in broadband latency and speed

ZDNet

Brazilians have seen recent improvements in fixed broadband latency as demand for online gaming rises during the Covid-19 outbreak, a new study has found. Latency - the reaction time of a connection - varies between countries across Latin America, particularly when it comes to fixed broadband. Latency is a key metric in gaming and determines much of the user's experience, since low latency means less lag during gameplay. According to data from Ookla's Speedtest Intelligence, gamers in Brazil had the lowest mean latency on fixed broadband, the connection type relevant for PC and console games, at 19 ms during Q2 2020, down from 23 ms in the same period in 2019. By comparison, Colombia had the highest fixed broadband latency at 43 ms.


New InfiniteIO Platform Reduces Latency and Accelerates Performance for Machine Learning, AI and Analytics

#artificialintelligence

AUSTIN, Texas--(BUSINESS WIRE)--InfiniteIO, the world's fastest metadata platform to reduce application latency, today announced the new Application Accelerator, which delivers dramatic performance improvements for critical applications by processing file metadata independently from on-premises storage or cloud systems. The new platform provides organizations across industries the lowest possible latency for their mission-critical applications, such as AI/machine learning, HPC and genomics, while minimizing disruption to IT teams. "Bandwidth and I/O challenges have been largely overcome, yet reducing latency remains a significant barrier to improving application performance," said Henry Baltazar, vice president of research at 451 Research. "Metadata requests are a large part of file system latency, making up the vast majority of requests to a storage system or cloud. InfiniteIO's approach to abstracting metadata from file data offers IT managers a nondisruptive way to immediately accelerate application performance."

