Deploying AI Frameworks on Secure HPC Systems with Containers

Brayford, David; Vallecorsa, Sofia; Atanasov, Atanas; Baruffa, Fabio; Riviera, Walter

May-24-2019–arXiv.org Artificial Intelligence

The increasing interest from the research community and industry in using Artificial Intelligence (AI) techniques to tackle "real world" problems requires High Performance Computing (HPC) resources to efficiently compute and scale complex algorithms across thousands of nodes. Unfortunately, typical data scientists are not familiar with the unique requirements and characteristics of HPC environments. They usually develop their applications with high-level scripting languages or frameworks such as TensorFlow, and the installation process often requires connection to external systems to download open source software during the build. HPC environments, on the other hand, are often based on closed source applications that incorporate parallel and distributed computing APIs such as MPI and OpenMP, while users have restricted administrator privileges and face security restrictions such as not being allowed access to external systems. In this paper we discuss the issues associated with the deployment of AI frameworks in a secure HPC environment and how we successfully deploy AI frameworks on SuperMUC-NG with Charliecloud.

artificial intelligence, container, machine learning

Country:
- Europe
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - Switzerland > Geneva
    - Geneva (0.04)
  - United Kingdom > England
    - Wiltshire > Swindon (0.04)
- North America > United States
  - California > Los Angeles County
    - Long Beach (0.04)
  - District of Columbia > Washington (0.04)
  - Massachusetts
    - Middlesex County > Waltham (0.04)
    - Suffolk County > Boston (0.04)
  - New York > New York County
    - New York City (0.04)

Genre:
- Research Report (0.50)

Industry:
- Information Technology > Security & Privacy (0.46)

Technology:
- Information Technology
  - Architecture > Distributed Systems (1.00)
  - Artificial Intelligence
    - Applied AI (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
