 hpc


AI Factories: It's time to rethink the Cloud-HPC divide

Lopez, Pedro Garcia, Pons, Daniel Barcelona, Copik, Marcin, Hoefler, Torsten, Quiñones, Eduardo, Malawski, Maciej, Pietzuch, Peter, Marti, Alberto, Timoudas, Thomas Ohlson, Slominski, Aleksander

arXiv.org Artificial Intelligence

The strategic importance of artificial intelligence is driving a global push toward Sovereign AI initiatives. National governments are increasingly developing dedicated infrastructures, called AI Factories (AIF), to achieve technological autonomy and secure the resources necessary to sustain robust local digital ecosystems. In Europe, the EuroHPC Joint Undertaking is investing hundreds of millions of euros into several AI Factories, built atop existing high-performance computing (HPC) supercomputers. However, while HPC systems excel in raw performance, they are not inherently designed for usability, accessibility, or serving as public-facing platforms for AI services such as inference or agentic applications. In contrast, AI practitioners are accustomed to cloud-native technologies like Kubernetes and object storage, tools that are often difficult to integrate within traditional HPC environments. This article advocates for a dual-stack approach within supercomputers: integrating both HPC and cloud-native technologies. Our goal is to bridge the divide between HPC and cloud computing by combining high performance and hardware acceleration with ease of use and service-oriented front-ends. This convergence allows each paradigm to amplify the other. To this end, we will study the cloud challenges of HPC (Serverless HPC) and the HPC challenges of cloud technologies (High-performance Cloud).


crowd-hpo: Realistic Hyperparameter Optimization and Benchmarking for Learning from Crowds with Noisy Labels

Herde, Marek, Lührs, Lukas, Huseljic, Denis, Sick, Bernhard

arXiv.org Artificial Intelligence

Crowdworking is a cost-efficient solution for acquiring class labels. Since these labels are subject to noise, various approaches to learning from crowds have been proposed. Typically, these approaches are evaluated with default hyperparameter configurations, resulting in unfair and suboptimal performance, or with hyperparameter configurations tuned via a validation set with ground truth class labels, representing an often unrealistic scenario. Moreover, both setups can produce different approach rankings, complicating study comparisons. Therefore, we introduce crowd-hpo as a framework for evaluating approaches to learning from crowds in combination with criteria to select well-performing hyperparameter configurations with access only to noisy crowd-labeled validation data. Extensive experiments with neural networks demonstrate that these criteria select hyperparameter configurations that improve the generalization performance of learning-from-crowds approaches, measured on separate test sets with ground truth labels. Hence, incorporating such criteria into experimental studies is essential for enabling fairer and more realistic benchmarking.


HPC-AI Coupling Methodology for Scientific Applications

Lu, Yutong, Huang, Dan, Chen, Pin

arXiv.org Artificial Intelligence

Artificial intelligence (AI) technologies have fundamentally transformed numerical-based high-performance computing (HPC) applications with data-driven approaches and endeavored to address existing challenges, e.g. high computational intensity, in various scientific domains. In this study, we explore the scenarios of coupling HPC and AI (HPC-AI) in the context of emerging scientific applications, presenting a novel methodology that incorporates three patterns of coupling: surrogate, directive, and coordinate. Each pattern exemplifies a distinct coupling strategy, AI-driven prerequisite, and typical HPC-AI ensembles. Through case studies in materials science, we demonstrate the application and effectiveness of these patterns. The study highlights technical challenges, performance improvements, and implementation details, providing insight into promising perspectives of HPC-AI coupling. The proposed coupling patterns are applicable not only to materials science but also to other scientific domains, offering valuable guidance for future HPC-AI ensembles in scientific discovery.


Survey of HPC in US Research Institutions

Shu, Peng, Chen, Junhao, Liu, Zhengliang, Zhao, Huaqin, Li, Xinliang, Liu, Tianming

arXiv.org Artificial Intelligence

The rapid growth of AI, data-intensive science, and digital twin technologies has driven an unprecedented demand for high-performance computing (HPC) across the research ecosystem. While national laboratories and industrial hyperscalers have invested heavily in exascale and GPU-centric architectures, university-operated HPC systems remain comparatively under-resourced. This survey presents a comprehensive assessment of the HPC landscape across U.S. universities, benchmarking their capabilities against Department of Energy (DOE) leadership-class systems and industrial AI infrastructures. We examine over 50 premier research institutions, analyzing compute capacity, architectural design, governance models, and energy efficiency. Our findings reveal that university clusters, though vital for academic research, exhibit significantly lower growth trajectories (CAGR ≈ 18%) than their national (≈ 43%) and industrial (≈ 78%) counterparts. The increasing skew toward GPU-dense AI workloads has widened the capability gap, highlighting the need for federated computing, idle-GPU harvesting, and cost-sharing models. We also identify emerging paradigms, such as decentralized reinforcement learning, as promising opportunities for democratizing AI training within campus environments. Ultimately, this work provides actionable insights for academic leaders, funding agencies, and technology partners to ensure more equitable and sustainable HPC access in support of national research priorities.
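
To make the gap between these growth rates concrete, the following minimal sketch compounds each reported CAGR over a fixed horizon. Only the three rates come from the abstract above; the five-year horizon and the helper names `growth_multiple` and `cagr` are illustrative assumptions, not part of the survey.

```python
def growth_multiple(rate: float, years: int) -> float:
    """Capacity multiple after compounding `rate` annually for `years` years."""
    return (1.0 + rate) ** years


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by growing from `start` to `end`."""
    return (end / start) ** (1.0 / years) - 1.0


# Rates as reported in the abstract; the 5-year horizon is an assumption
# chosen only for illustration.
REPORTED_CAGR = {"university": 0.18, "national": 0.43, "industrial": 0.78}
HORIZON_YEARS = 5

if __name__ == "__main__":
    for sector, rate in REPORTED_CAGR.items():
        multiple = growth_multiple(rate, HORIZON_YEARS)
        print(f"{sector:>10}: x{multiple:.2f} after {HORIZON_YEARS} years")
```

Under these assumptions, capacity grows by roughly 2.3x for university systems, 6.0x for national systems, and 17.9x for industrial systems over five years, which illustrates how quickly a difference in annual growth rate turns into a large capability gap.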


Reviews: A Zero-Positive Learning Approach for Diagnosing Software Performance Regressions

Neural Information Processing Systems

This paper describes a system for detecting the source of performance regressions in source code. The idea is to measure hardware performance counters (HPCs) at a per-function level of the code, and then when a performance regression is detected, it is localized by looking for the function with the most anomalous performance counters. The anomaly detection is done by training autoencoders on the HPCs, and there is a further idea to cluster functions with similar behavior profiles to avoid the need for learning an autoencoder for every function in a large code base. This is a controversial paper because there is little methodological novelty. R1 gave the lowest score and asks whether we want to allow this kind of paper in NeurIPS, worrying that if we accept any application of ML, then NeurIPS risks becoming too broad.


Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features

Li, Yupei, Milling, Manuel, Specia, Lucia, Schuller, Björn W.

arXiv.org Artificial Intelligence

The availability of high-quality APIs for Large Language Models (LLMs) has facilitated the widespread creation of Machine-Generated Content (MGC), posing challenges such as academic plagiarism and the spread of misinformation. Existing MGC detectors often focus solely on surface-level information, overlooking implicit and structural features. This makes them susceptible to deception by surface-level sentence patterns, particularly for longer texts and in texts that have been subsequently paraphrased. To overcome these challenges, we introduce novel methodologies and datasets. Besides the publicly available dataset Plagbench, we developed the paraphrased Long-Form Question and Answer (paraLFQA) and paraphrased Writing Prompts (paraWP) datasets using GPT and DIPPER, a discourse paraphrasing tool, by extending artifacts from their original versions. To address the challenge of detecting highly similar paraphrased texts, we propose MhBART, an encoder-decoder model designed to emulate human writing style while incorporating a novel difference score mechanism. This model outperforms strong classifier baselines and identifies deceptive sentence patterns. To better capture the structure of longer texts at document level, we propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features. It results in substantial performance gains across both datasets: 15.5% absolute improvement on paraLFQA, 4% absolute improvement on paraWP, and 1.5% absolute improvement on M4 compared to SOTA approaches.


The Landscape and Challenges of HPC Research and LLMs

Chen, Le, Ahmed, Nesreen K., Dutta, Akash, Bhattacharjee, Arijit, Yu, Sixing, Mahmud, Quazi Ishtiaque, Abebe, Waqwoya, Phan, Hung, Sarkar, Aishwarya, Butler, Branden, Hasabnis, Niranjan, Oren, Gal, Vo, Vy A., Munoz, Juan Pablo, Willke, Theodore L., Mattson, Tim, Jannesari, Ali

arXiv.org Artificial Intelligence

Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language model-based techniques for tasks in high-performance computing (HPC) would be very beneficial. This study presents our reasoning behind the aforementioned position and highlights how existing ideas can be improved and adapted for HPC tasks.


Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unlabeled, unannotated pathology slides

Quiros, Adalberto Claudio, Coudray, Nicolas, Yeaton, Anna, Yang, Xinyu, Liu, Bojing, Le, Hortense, Chiriboga, Luis, Karimkhan, Afreen, Narula, Navneet, Moore, David A., Park, Christopher Y., Pass, Harvey, Moreira, Andre L., Quesne, John Le, Tsirigos, Aristotelis, Yuan, Ke

arXiv.org Artificial Intelligence

Definitive cancer diagnosis and management depend upon the extraction of information from microscopy images by pathologists. These images contain complex information requiring time-consuming expert human interpretation that is prone to human bias. Supervised deep learning approaches have proven powerful for classification tasks, but they are inherently limited by the cost and quality of annotations used for training these models. To address this limitation of supervised methods, we developed Histomorphological Phenotype Learning (HPL), a fully self-supervised methodology that requires no expert labels or annotations and operates via the automatic discovery of discriminatory image features in small image tiles. Tiles are grouped into morphologically similar clusters which constitute a library of histomorphological phenotypes, revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer tissues, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. We then demonstrate that these properties are maintained in a multi-cancer study. These results show the clusters represent recurrent host responses and modes of tumor growth emerging under natural selection. Code, pre-trained models, learned embeddings, and documentation are available to the community at https://github.com/AdalbertoCq/Histomorphological-Phenotype-Learning


A optimization framework for herbal prescription planning based on deep reinforcement learning

Yang, Kuo, Yu, Zecong, Su, Xin, He, Xiong, Wang, Ning, Zheng, Qiguang, Yu, Feidie, Liu, Zhuang, Wen, Tiancai, Zhou, Xuezhong

arXiv.org Artificial Intelligence

Treatment planning for chronic diseases is a critical task in medical artificial intelligence, particularly in traditional Chinese medicine (TCM). However, generating optimized sequential treatment strategies for patients with chronic diseases in different clinical encounters remains a challenging issue that requires further exploration. In this study, we proposed a TCM herbal prescription planning framework based on deep reinforcement learning for chronic disease treatment (PrescDRL). PrescDRL is a sequential herbal prescription optimization model that focuses on long-term effectiveness rather than achieving maximum reward at every step, thereby ensuring better patient outcomes. We constructed a high-quality benchmark dataset for sequential diagnosis and treatment of diabetes and evaluated PrescDRL against this benchmark. Our results showed that PrescDRL achieved a higher curative effect, with the single-step reward improving by 117% and 153% compared to doctors. Furthermore, PrescDRL outperformed the benchmark in prescription prediction, with precision improving by 40.5% and recall improving by 63%. Overall, our study demonstrates the potential of using artificial intelligence to improve clinical intelligent diagnosis and treatment in TCM.


Simulations Suggest Information Processing Roles for the Diverse Currents in Hippocampal Neurons

Neural Information Processing Systems

The model presently includes descriptions of eleven non-linear somatic currents of the HPC, and the electrotonic structure of the neuron is modelled with a soma/short-cable approximation. Model simulations qualitatively or quantitatively reproduce a wide range of somatic electrical behavior in HPCs, and demonstrate possible roles for the various currents in information processing. There are several substrates for neuronal computation, including connectivity, synapses, morphometrics of dendritic trees, linear parameters of cell membrane, as well as non-linear, time-varying membrane conductances, also referred to as currents or channels. In the classical description of neuronal function, the contribution of membrane channels is constrained to that of generating the action potential, setting firing threshold, and establishing the relationship between (steady-state) stimulus intensity and firing frequency. However, it is becoming clear that the role of these channels may be much more complex, resulting in a variety of novel "computational operators" that reflect the information processing occurring in the biological neural net.