 dockerfile


An LLM-based Agent for Reliable Docker Environment Configuration

Hu, Ruida, Peng, Chao, Wang, Xinchen, Gao, Cuiyun

arXiv.org Artificial Intelligence

Environment configuration is a critical yet time-consuming step in software development, especially when dealing with unfamiliar code repositories. While Large Language Models (LLMs) demonstrate the potential to accomplish software engineering tasks, existing methods for environment configuration often rely on manual efforts or fragile scripts, leading to inefficiencies and unreliable outcomes. We introduce Repo2Run, the first LLM-based agent designed to fully automate environment configuration and generate executable Dockerfiles for arbitrary Python repositories. We address two major challenges: (1) enabling the LLM agent to configure environments within isolated Docker containers, and (2) ensuring the successful configuration process is recorded and accurately transferred to a Dockerfile without error. To achieve this, we propose atomic configuration synthesis, featuring a dual-environment architecture (internal and external environment) with a rollback mechanism to prevent environment "pollution" from failed commands, guaranteeing atomic execution (execute fully or not at all) and a Dockerfile generator to transfer successful configuration steps into runnable Dockerfiles. We evaluate Repo2Run on our proposed benchmark of 420 recent Python repositories with unit tests, where it achieves an 86.0% success rate.
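The atomic-execution idea described above can be illustrated with a small Python sketch. This is not Repo2Run's implementation; it stands in for the container with a plain dictionary and for the snapshot/rollback machinery with a deep copy, and the `run` callback, base image, and command strings are all hypothetical:

```python
import copy

def atomic_configure(commands, run):
    """Sketch of atomic configuration synthesis: each command either fully
    applies to the environment or leaves no trace, and every successful
    command is recorded for Dockerfile generation."""
    env = {"installed": []}        # stands in for the container state
    recorded = []                  # successful steps, in order

    for cmd in commands:
        snapshot = copy.deepcopy(env)   # cheap stand-in for a container snapshot
        try:
            run(cmd, env)               # may mutate env, may raise on failure
        except Exception:
            env = snapshot              # rollback: a failed command leaves no "pollution"
            continue
        recorded.append(cmd)            # only successful steps reach the Dockerfile

    dockerfile = "FROM python:3.10\n" + "\n".join(f"RUN {c}" for c in recorded)
    return env, dockerfile
```

Because the snapshot is restored whenever a command fails partway through, the generated Dockerfile replays only steps that are known to succeed.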


Raiders of the Lost Dependency: Fixing Dependency Conflicts in Python using LLMs

Bartlett, Antony, Liem, Cynthia, Panichella, Annibale

arXiv.org Artificial Intelligence

Fixing Python dependency issues is a tedious and error-prone task for developers, who must manually identify and resolve environment dependencies and version constraints of third-party modules and Python interpreters. Researchers have attempted to automate this process by relying on large knowledge graphs and database lookup tables. However, these traditional approaches face limitations due to the variety of dependency error types, large sets of possible module versions, and conflicts among transitive dependencies. This study explores the potential of using large language models (LLMs) to automatically fix dependency issues in Python programs. We introduce PLLM (pronounced "plum"), a novel technique that employs retrieval-augmented generation (RAG) to help an LLM infer Python versions and required modules for a given Python file. PLLM builds a testing environment that iteratively (1) prompts the LLM for module combinations, (2) tests the suggested changes, and (3) provides feedback (error messages) to the LLM to refine the fix. This feedback cycle leverages natural language processing (NLP) to intelligently parse and interpret build error messages. We benchmark PLLM on the Gistable HG2.9K dataset, a collection of challenging single-file Python gists. We compare PLLM against two state-of-the-art automatic dependency inference approaches, namely PyEGo and ReadPyE, w.r.t. the ability to resolve dependency issues. Our results indicate that PLLM can fix more dependency issues than the two baselines, with +218 (+15.97%) more fixes over ReadPyE and +281 (+21.58%) over PyEGo. Our deeper analyses suggest that PLLM is particularly beneficial for projects with many dependencies and for specific third-party numerical and machine-learning modules. Our findings demonstrate the potential of LLM-based approaches to iteratively resolve Python dependency issues.
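The iterative prompt/test/feedback cycle that the abstract describes can be sketched as a short loop. This is a simplified illustration, not PLLM's code: the `suggest` and `build` callables, the retry budget, and the error strings are all assumptions standing in for the LLM call and the containerized build:

```python
def fix_dependencies(source, suggest, build, max_rounds=5):
    """Sketch of a PLLM-style feedback cycle: ask the model for a
    module/version combination, test it, and feed build errors back
    until the build succeeds or attempts run out."""
    feedback = None
    for _ in range(max_rounds):              # bounded number of refinement rounds
        modules = suggest(source, feedback)  # (1) prompt the LLM for module versions
        ok, error = build(modules)           # (2) test the suggested changes
        if ok:
            return modules
        feedback = error                     # (3) feed the error message back
    return None                              # give up after the budget is spent
```

The key design choice is that the build error itself becomes the next prompt's context, so each round narrows the search instead of sampling blindly.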


GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension

Lyu, Bohan, Cong, Xin, Yu, Heyang, Yang, Pan, Qin, Yujia, Ye, Yining, Lu, Yaxi, Zhang, Zhong, Yan, Yukun, Lin, Yankai, Liu, Zhiyuan, Sun, Maosong

arXiv.org Artificial Intelligence

While Large Language Models (LLMs) like ChatGPT and GPT-4 have demonstrated exceptional proficiency in natural language processing, their efficacy in addressing complex, multifaceted tasks remains limited. A growing area of research focuses on LLM-based agents equipped with external tools capable of performing diverse tasks. However, existing LLM-based agents support only a limited set of tools, which cannot cover the diverse range of user queries, especially those involving specialized domains. It remains a challenge for LLM-based agents to extend their tools autonomously when confronted with varied user queries. As GitHub hosts a multitude of repositories that can serve as a good resource for tools, a promising solution is for LLM-based agents to autonomously integrate GitHub repositories according to user queries to extend their tool set. In this paper, we introduce GitAgent, an agent capable of autonomous tool extension from GitHub. GitAgent follows a four-phase procedure to incorporate repositories, and it can learn from human experience by consulting GitHub Issues/PRs to solve problems encountered during the procedure. Experimental evaluation involving 30 user queries demonstrates GitAgent's effectiveness, achieving a 69.4% success rate on average.


An updated guide to Docker and ROS 2

Robohub

Since then, I've had the chance to use Docker more in my work and have picked up some new tricks. This post was long overdue, but I've finally collected my updated learnings here. Recently, I encountered an article titled "ROS Docker; 6 reasons why they are not a good fit", and I largely agree with it. However, the reality is that it's still quite difficult to ensure a reproducible ROS environment for people who haven't spent years fighting the ROS learning curve and aren't adept at debugging dependency and/or build errors… so Docker is still very much a crutch that we fall back on to get working demos (and sometimes products!). If the article above hasn't completely discouraged you from embarking on this Docker adventure, please enjoy reading.


Using docker to run old GPU-accelerated deep learning models

#artificialintelligence

Deep learning models are wonderful, and we always want to use the newest cutting-edge solutions to get the best results. But once in a while you stumble upon a whitepaper that looks relevant to the task at hand, even though it was written a few years ago. And a few years is an eternity for deep learning projects: old versions of frameworks, CUDA, Python, etc. -- none of that is easy to just install and launch on modern systems. The usual answer would be Anaconda, but it doesn't provide enough isolation when it comes to GPU-accelerated models. My way of dealing with this problem will be no surprise to most: containerisation, in other words, Docker.
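The approach boils down to starting from the image an old release was built against, instead of trying to install obsolete wheels on a modern host. A minimal sketch of generating such a Dockerfile follows; the base-image tag, the extra pinned packages, and the `train.py` entry point are illustrative assumptions, not specifics from the post:

```python
def legacy_gpu_dockerfile(base_image, requirements=()):
    """Build a Dockerfile string that pins the whole stack (framework,
    CUDA, cuDNN, Python) by starting from the image that shipped with
    the old release, so the host only needs a compatible GPU driver."""
    lines = [f"FROM {base_image}", "WORKDIR /app"]
    for req in requirements:                      # extra pinned pip packages
        lines.append(f"RUN pip install {req}")
    lines += ["COPY . /app", 'CMD ["python", "train.py"]']
    return "\n".join(lines)
```

The resulting image would then be run with GPU access via `docker run --gpus all`, which delegates CUDA compatibility to the container rather than the host environment.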


GitHub - Emekadavid/kitchenware-classification: A classification model built on six kitchenware items. Given a picture, it detects which of the six items it shows and outputs a probability for each.

#artificialintelligence

This is a project organized by DataTalks.Club. In this competition, one has to train a deep learning model in TensorFlow or PyTorch to classify kitchenware items. I used TensorFlow and Keras for this task. As an image classification model, when given an image of one of the above-listed kitchenware items, the model outputs probabilities for each of the six classes. The highest probability serves as the model's final classification.
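Turning the six class probabilities into a final label is a simple argmax over the model's softmax output. A minimal sketch follows; the class list and its ordering here are illustrative assumptions, not taken from the repository:

```python
def classify(probabilities):
    """Return the kitchenware class with the highest predicted probability,
    mirroring how a softmax output vector becomes a final label."""
    classes = ["cup", "fork", "glass", "knife", "plate", "spoon"]  # assumed order
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best], probabilities[best]
```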


How To Use Docker To Run Multiple ROS Distributions on the Same Machine

#artificialintelligence

The Robot Operating System (ROS) is widely used in Robotics. But, the different available ROS distributions can lead to software version conflicts. For example, Ubuntu 18.04 uses ROS Melodic, based on Python 2.7. Ubuntu 20.04 uses ROS Noetic, which is based on Python 3. Oftentimes, our ROS master needs to be an older ROS version to have specific hardware driver support. In this article, I will show you how to keep a ROS Melodic master together with more recent ROS distributions on the same machine.
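The setup above amounts to one `docker run` invocation per distribution, each sharing the host network so every node can reach the single master. A sketch of how those invocations could be assembled follows (expressed in Python for illustration); the container names and the default master URI are assumptions, while `ros:melodic` / `ros:noetic` are the official ROS image tags:

```python
def ros_container_cmd(distro, name, master_uri="http://localhost:11311"):
    """Build a `docker run` invocation that starts one ROS distribution
    in its own container while sharing the host network, so a Melodic
    master and a Noetic node can talk over the same ROS_MASTER_URI."""
    return [
        "docker", "run", "-it", "--rm",
        "--name", name,
        "--network", "host",                   # share the host network stack
        "-e", f"ROS_MASTER_URI={master_uri}",  # point every container at one master
        f"ros:{distro}",                       # official image, e.g. ros:melodic
    ]
```

Running `ros_container_cmd("melodic", "ros_master")` and `ros_container_cmd("noetic", "ros_node")` side by side gives the older master its Python 2.7 environment while newer nodes keep Python 3, without either touching the host install.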


GitHub - tensorchord/envd: 🏕 Development environment for AI/ML

#artificialintelligence

Development environments are full of Python and system dependencies, CUDA, Bash scripts, Dockerfiles, SSH configurations, Kubernetes YAMLs, and many other clunky things that are always breaking. Use the `include` function to import any Git repository; no more copy/pasting Dockerfile instructions, let's reuse them. BuildKit supports parallel builds and software caching, and you can enjoy these benefits without any knowledge of it.
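As a sketch of what this looks like in practice, an envd build file (`build.envd`) uses a Python-flavored syntax along the lines of the project's README examples; the OS, Python version, and package names below are illustrative:

```python
# build.envd -- envd's Python-flavored build definition
def build():
    base(os="ubuntu20.04", language="python3")
    install.python_packages(name=["numpy", "torch"])
    shell("zsh")
```

Running `envd up` in the project directory then builds the environment and drops you into it, with BuildKit handling caching behind the scenes.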


GitHub - allenai/tango: Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.

#artificialintelligence

AI2 Tango replaces messy directories and spreadsheets full of file versions by organizing experiments into discrete steps that can be cached and reused throughout the lifetime of a research project. Even though ai2-tango itself is quite small, installing everything will pull in a lot of dependencies. Don't be surprised if this takes a while! You can build a Docker image suitable for tango projects by using the official Dockerfile as a starting point for your own Dockerfile, or you can simply use one of our prebuilt images as a base image in your Dockerfile. Make sure to choose the right base image for your use case depending on the version of tango you're using and the CUDA version that your host machine supports.


Machine Learning Streaming with Kafka, Debezium, and BentoML

#artificialintelligence

Putting a Machine Learning project to life is not a simple task and, just like any other software product, it requires many different kinds of knowledge: infrastructure, business, data science, etc. I must confess that, for a long time, I just neglected the infrastructure part, letting my projects rest in peace inside Jupyter notebooks. But as soon as I started learning it, I realized that it is a very interesting topic. Machine learning is still a growing field and, in comparison with other IT-related areas like Web development, the community still has a lot to learn. Luckily, in recent years we have seen a lot of new technologies arise to help us build an ML application, like MLflow, Apache Spark's MLlib, and BentoML, explored in this post. In this post, a machine learning architecture is explored with some of these technologies to build a real-time price recommender system. To bring this concept to life, we needed not only ML-related tools (BentoML & Scikit-learn) but also other software pieces (Postgres, Debezium, Kafka). Of course, this is a simple project that doesn't even have a user interface, but the concepts explored in this post could be easily extended to many cases and real scenarios. I hope this post helped you somehow; I am not an expert in any of the subjects discussed, and I strongly recommend further reading (see some references below).