object storage
Bridging the Clinical Expertise Gap: Development of a Web-Based Platform for Accessible Time Series Forecasting and Analysis
Mullen, Aaron D., Harris, Daniel R., Slavova, Svetla, Bumgardner, V. K. Cody
Time series forecasting has applications across domains and industries, especially in healthcare, but the technical expertise required to analyze data, build models, and interpret results can be a barrier to using these techniques. This article presents a web platform that makes the process of analyzing and plotting data, training forecasting models, and interpreting and viewing results accessible to researchers and clinicians. Users can upload data and generate plots to showcase their variables and the relationships between them. The platform supports multiple forecasting models and training techniques which are highly customizable according to the user's needs. Additionally, recommendations and explanations can be generated from a large language model that can help the user choose appropriate parameters for their data and understand the results for each model. The goal is to integrate this platform into learning health systems for continuous data collection and inference from clinical pipelines.
AI Factories: It's time to rethink the Cloud-HPC divide
Lopez, Pedro Garcia, Pons, Daniel Barcelona, Copik, Marcin, Hoefler, Torsten, Quiñones, Eduardo, Malawski, Maciej, Pietzutch, Peter, Marti, Alberto, Timoudas, Thomas Ohlson, Slominski, Aleksander
The strategic importance of artificial intelligence is driving a global push toward Sovereign AI initiatives. National governments are increasingly developing dedicated infrastructures, called AI Factories (AIF), to achieve technological autonomy and secure the resources necessary to sustain robust local digital ecosystems. In Europe, the EuroHPC Joint Undertaking is investing hundreds of millions of euros into several AI Factories, built atop existing high-performance computing (HPC) supercomputers. However, while HPC systems excel in raw performance, they are not inherently designed for usability, accessibility, or serving as public-facing platforms for AI services such as inference or agentic applications. In contrast, AI practitioners are accustomed to cloud-native technologies like Kubernetes and object storage, tools that are often difficult to integrate within traditional HPC environments. This article advocates for a dual-stack approach within supercomputers: integrating both HPC and cloud-native technologies. Our goal is to bridge the divide between HPC and cloud computing by combining high performance and hardware acceleration with ease of use and service-oriented front-ends. This convergence allows each paradigm to amplify the other. To this end, we will study the cloud challenges of HPC (Serverless HPC) and the HPC challenges of cloud technologies (High-performance Cloud).
The Use of Object Storage to Transform ML Infrastructure
Machine learning infrastructure is arguably one of the most important things to get right when building machine learning models. Integrating machine learning into a company's existing computational infrastructure remains a challenge for which robust industry standards do not yet exist. However, organizations increasingly understand that an infrastructure supporting the consistent training, testing, and deployment of models at enterprise scale is as essential to long-term viability as the models themselves. Small organizations, however, struggle to compete with large companies that have the resources to staff the large, modular teams and internal tool-development processes that are often needed to create strong ML pipelines. At present, data scientists, who should be focusing on meaningful AI work, have to do a great deal of DevOps work before they can do what they do best: working with data and algorithms.
An Architecture for Artificial Intelligence Storage
As we've talked about in the past, the focus on data – how much is being generated, where it's being created, the tools needed to take advantage of it, the shortage of skilled talent to manage it, and so on – is rapidly changing the way enterprises operate both in the datacenter and in the cloud, and is dictating many of the product roadmaps being developed by tech vendors. Automation, analytics, artificial intelligence (AI) and machine learning, and the ability to easily move applications and data between on-premises and cloud environments are the focus of much of what OEMs and other tech players are doing. All of this is being accelerated by the COVID-19 pandemic, which is speeding up enterprise movement to the cloud and forcing enterprises to adapt to a suddenly widely distributed workforce. These trends won't be changing any time soon as the coronavirus outbreak tightens its grip, particularly in the United States. Over the past several months, OEMs have been particularly aggressive in expanding their offerings in the storage sector, which is playing a central role in helping enterprises bridge the space between the datacenter, the cloud and the network edge and deal with the vast amounts of structured and – in particular – unstructured data being created. That can be seen in announcements that some of the larger vendors have made over the past few months.
IBM Ramps Up AI, Analytics Via New File, Object Storage
IBM Thursday introduced new storage hardware and software aimed at placing its storage at the center of large-scale data requirements for artificial intelligence and analytics workloads. The new offerings are aimed at helping to build the kind of information architecture needed to get the most out of businesses' fast-changing data, said Eric Herzog, IBM's chief marketing officer and vice president of worldwide storage channels. "The new stuff is all about storage solutions for AI, big data and business analytics," Herzog told CRN. "IBM thinks customers need an information architecture to build AI before they can collect and analyze their data and feed it into their AI systems." IBM storage technology has always been an important part of customers' high-performance computing, artificial intelligence and machine-learning infrastructures, said John Zawistowski, global systems solutions executive at Sycomp, a Foster City, Calif.-based solution provider and IBM channel partner. "Why IBM? It's the way they integrated the AI software platform and storage," Zawistowski told CRN. "And the way IBM understands the importance of doing that. And the way IBM technology performs."
How bigger data is activating analytics
It wasn't so long ago that business analytics operated on a months-long cycle. For most of the twentieth century, the main interaction between a company and its data was a regular review of its most easily quantifiable measures, in the form of annual or quarterly financial assessments. Today, interacting with data this infrequently would be unimaginable in even a small business. As data availability and transfer speeds have grown at exponential rates, the time lag between intake and analysis of data has shortened to the point that, today, real-time data analytics is often part of an organisation's standard operating procedure. There are few industries which have not been lifted up by this rising tide of data.
Machine Learning on Autonomous Database: A Practical Example
The dataset used to build a network intrusion detection classifier is the classic KDD dataset, which you can download here. It was first released for the 1999 KDD Cup and has 125,973 records in the training set. It was built for the DARPA Intrusion Detection Evaluation Program by MIT Lincoln Laboratory. The dataset is already split into training and test sets. The training set contains 23 sub-classes: 22 attack types and one "normal" class for allowed traffic. The list of attacks and their associations with the four categories reported above is held in this file.
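The association between attack sub-classes and categories described above can be sketched in a few lines of Python. This is only an illustration, not the article's own code: the category names (DoS, Probe, R2L, U2R) and the label-to-category table follow the grouping commonly documented for the KDD Cup 1999 training data and are an assumption here, since the referenced file is not reproduced in this excerpt. The function names and the sample labels are hypothetical.

```python
from collections import Counter

# Assumed mapping of the 22 training-set attack labels to four categories,
# following the grouping commonly documented for KDD Cup 1999.
# The source's own mapping file is not shown, so treat this as illustrative.
ATTACK_CATEGORY = {
    # Denial of service
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    # Probing / surveillance
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    # Remote-to-local (unauthorized access from a remote machine)
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
    # User-to-root (privilege escalation)
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
}


def categorize(label: str) -> str:
    """Map a raw label to its category; "normal" stays "normal"."""
    # Some distributions of the data end each label with a trailing dot.
    cleaned = label.strip().rstrip(".")
    if cleaned == "normal":
        return "normal"
    return ATTACK_CATEGORY.get(cleaned, "unknown")


def category_counts(labels):
    """Count how many records fall into each category."""
    return Counter(categorize(label) for label in labels)


# Hypothetical sample of labels, standing in for the real label column.
sample = ["normal", "neptune", "smurf", "nmap", "guess_passwd", "rootkit", "normal."]
counts = category_counts(sample)
print(dict(counts))
```

Collapsing the sub-classes this way turns a 23-class problem into a five-class one (four attack categories plus normal traffic), which is a common simplification when the rarer attack types have too few examples to learn individually.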
IBM/FfDL
This repository contains the core services of the FfDL (Fabric for Deep Learning) platform. FfDL is an operating system "fabric" for Deep Learning. Once installed, use the command make minikube to start Minikube and set up local network routes. The minimum recommended capacity for FfDL is 4 GB of memory and 2 CPUs. If you already have a FfDL deployment up and running, you can jump to the FfDL User Guide to use FfDL for training your deep learning models. If you are getting started and want to set up your own FfDL deployment, please follow the steps below.