 control knob


Turning AI Data Centers into Grid-Interactive Assets: Results from a Field Demonstration in Phoenix, Arizona

Colangelo, Philip, Coskun, Ayse K., Megrue, Jack, Roberts, Ciaran, Sengupta, Shayan, Sivaram, Varun, Tiao, Ethan, Vijaykar, Aroon, Williams, Chris, Wilson, Daniel C., MacFarland, Zack, Dreiling, Daniel, Morey, Nathan, Ratnayake, Anuja, Vairamohan, Baskar

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is fueling exponential electricity demand growth, threatening grid reliability, raising prices for communities paying for new energy infrastructure, and stunting AI innovation as data centers wait for interconnection to constrained grids. This paper presents the first field demonstration, in collaboration with major corporate partners, of a software-only approach, Emerald Conductor, that transforms AI data centers into flexible grid resources that can efficiently and immediately harness existing power systems without massive infrastructure buildout. Conducted at a 256-GPU cluster running representative AI workloads within a commercial, hyperscale cloud data center in Phoenix, Arizona, the trial achieved a 25% reduction in cluster power usage for three hours during peak grid events while maintaining AI quality of service (QoS) guarantees. By orchestrating AI workloads based on real-time grid signals without hardware modifications or energy storage, this platform reimagines data centers as grid-interactive assets that enhance grid reliability, advance affordability, and accelerate AI's development.


Towards Real-Time Neural Volumetric Rendering on Mobile Devices: A Measurement Study

Wang, Zhe, Zhu, Yifei

arXiv.org Artificial Intelligence

Neural Radiance Fields (NeRF) is an emerging technique to synthesize 3D objects from 2D images with a wide range of potential applications. However, rendering existing NeRF models is extremely computation-intensive, making it challenging to support real-time interaction on mobile devices. In this paper, we take the first initiative to examine the state-of-the-art real-time NeRF rendering technique from a system perspective. We first define the entire working pipeline of the NeRF serving system. We then identify possible control knobs that are critical to the system from the communication, computation, and visual performance perspectives. Furthermore, an extensive measurement study is conducted to reveal the effects of these control knobs on system performance. Our measurement results reveal that different control knobs contribute differently towards improving the system performance, with the mesh granularity being the most effective knob and the quantization being the least effective knob. In addition, diverse hardware device settings and network conditions have to be considered to fully unleash the benefit of operating under the appropriate knobs.


Does cognitive computing offer the next wave of analytics beyond data science?

#artificialintelligence

"AI" may be a hot buzzword, and a global market expected to grow to nearly $310 billion by 2026, but what exactly does artificial intelligence mean? The definition of AI can be a slippery one to pin down because its applications are so broad and vary so widely in complexity, scope, algorithmic underpinnings and methodologies. For these reasons, there is a growing call for a more advanced, specific definition of AI beyond "the simulation of human intelligence processes by machines." Some consider the next step for AI, and its ultimate evolution, to be cognitive computing. "It's such a vast amorphic term that AI has come to mean right now," said Stephen DeAngelis, founder and CEO of Enterra Solutions.


Zero-Shot Controlled Generation with Encoder-Decoder Transformers

Hazarika, Devamanyu, Namazifar, Mahdi, Hakkani-Tür, Dilek

arXiv.org Artificial Intelligence

Controlling neural network-based models for natural language generation (NLG) has broad applications in numerous areas such as machine translation, document summarization, and dialog systems. Approaches that enable such control in a zero-shot manner would be of great importance as, among other reasons, they remove the need for additional annotated data and training. In this work, we propose novel approaches for controlling encoder-decoder transformer-based NLG models in a zero-shot manner. This is done by introducing three control knobs, namely, attention biasing, decoder mixing, and context augmentation, that are applied to these models at generation time. These knobs control the generation process by directly manipulating trained NLG models (e.g., biasing cross-attention layers) to realize the desired attributes in the generated outputs. We show that not only are these NLG models robust to such manipulations, but also their behavior could be controlled without an impact on their generation performance. These results, to the best of our knowledge, are the first of their kind. Through these control knobs, we also investigate the role of the transformer decoder's self-attention module and show strong evidence that its primary role is maintaining fluency of sentences generated by these models. Based on this hypothesis, we show that alternative architectures for transformer decoders could be viable options. We also study how this hypothesis could lead to more efficient ways for training encoder-decoder transformer models.
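
As a rough illustration of what "biasing cross-attention" can mean at generation time, the sketch below adds a per-position bias to the attention logits of a single decoder query before the softmax. This is a minimal toy under stated assumptions, not the authors' implementation: the function name `biased_cross_attention`, the additive form of the bias, and the example vectors are all hypothetical and are not taken from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def biased_cross_attention(query, keys, bias):
    """Attention weights of one decoder query over encoder keys.

    `bias` holds one additive term per encoder position (an assumed,
    illustrative form of attention biasing); a positive value shifts
    attention mass toward that position without retraining anything.
    """
    scale = math.sqrt(len(query))
    logits = [
        sum(q * k for q, k in zip(query, key)) / scale + b
        for key, b in zip(keys, bias)
    ]
    return softmax(logits)


# Toy example: one query, three encoder positions (hypothetical values).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

plain = biased_cross_attention(query, keys, [0.0, 0.0, 0.0])
steered = biased_cross_attention(query, keys, [0.0, 2.0, 0.0])
```

In this toy setting, the weight on the second encoder position is larger in `steered` than in `plain`, while both weight vectors still sum to one. The model parameters are untouched, which is the sense in which such a knob operates in a zero-shot manner.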