filesystem
Crash-Consistent Checkpointing for AI Training on macOS/APFS
Deep learning training relies on periodic checkpoints to recover from failures, but unsafe checkpoint installation can leave corrupted files on disk. This paper presents an experimental study of checkpoint installation protocols and integrity validation for AI training on macOS/APFS. We implement three write modes with increasing durability guarantees: unsafe (baseline, no fsync), atomic_nodirsync (file-level durability via fsync()), and atomic_dirsync (file + directory durability). We design a format-agnostic integrity guard using SHA-256 checksums with automatic rollback. Through controlled experiments including crash injection (430 unsafe-mode trials) and corruption injection (1,600 atomic-mode trials), we demonstrate that the integrity guard detects 99.8-100% of corruptions with zero false positives. Performance overhead is 56.5-108.4% for atomic_nodirsync and 84.2-570.6% for atomic_dirsync relative to the unsafe baseline. Our findings quantify the reliability-performance trade-offs and provide deployment guidance for production AI infrastructure.
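The three write modes and the checksum guard described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `write_checkpoint`, `verify_checkpoint`, and `load_with_rollback` are hypothetical, and the temp-file-plus-rename install step is an assumption about what "atomic" means here.

```python
import hashlib
import os
import tempfile


def write_checkpoint(path, payload, dirsync=True):
    """Install `payload` at `path` and return its SHA-256 hex digest.

    Hypothetical helper: temp file + fsync + rename, with an optional
    fsync of the parent directory (roughly "atomic_dirsync" when
    dirsync=True, "atomic_nodirsync" when dirsync=False).
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())      # file-level durability
        os.replace(tmp_path, path)    # atomic rename over the old checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    if dirsync:
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)          # make the rename itself durable
        finally:
            os.close(dir_fd)
    return hashlib.sha256(payload).hexdigest()


def verify_checkpoint(path, expected_digest):
    """Return True if the file at `path` hashes to `expected_digest`."""
    try:
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest() == expected_digest
    except OSError:
        return False


def load_with_rollback(candidates):
    """Return the first (path, digest) pair that validates, newest first.

    `candidates` is an assumed list of (path, expected_digest) tuples;
    returns None if no checkpoint passes the integrity check.
    """
    for path, digest in candidates:
        if verify_checkpoint(path, digest):
            return path, digest
    return None
```

Because the guard only hashes raw bytes, it does not depend on the checkpoint format; the unsafe baseline would correspond to a plain `open(path, "wb").write(payload)` with no fsync or rename.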
Securing AI Agent Execution
Bühler, Christoph, Biagiola, Matteo, Di Grazia, Luca, Salvaneschi, Guido
Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model Context Protocol (MCP) has become the de facto standard for connecting agents with such resources, but security has lagged behind: thousands of MCP servers execute with unrestricted access to host systems, creating a broad attack surface. In this paper, we introduce AgentBound, the first access control framework for MCP servers. AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that the policy enforcement engine introduces negligible overhead. Our contributions provide developers and project managers with a practical foundation for securing MCP servers while maintaining productivity, and enable researchers and tool builders to explore new directions for declarative access control and MCP security.
The Artificial Scientist -- in-transit Machine Learning of Plasma Simulations
Kelling, Jeffrey, Bolea, Vicente, Bussmann, Michael, Checkervarty, Ankush, Debus, Alexander, Ebert, Jan, Eisenhauer, Greg, Gutta, Vineeth, Kesselheim, Stefan, Klasky, Scott, Pausch, Richard, Podhorszki, Norbert, Poschel, Franz, Rogers, David, Rustamov, Jeyhun, Schmerler, Steve, Schramm, Ulrich, Steiniger, Klaus, Widera, Rene, Willmann, Anna, Chandrasekaran, Sunita
Increasing HPC cluster sizes and large-scale simulations that produce petabytes of data per run create massive I/O and storage challenges for analysis. Deep learning-based techniques, in particular, make use of these amounts of domain data to extract patterns that help build scientific understanding. Here, we demonstrate a streaming workflow in which simulation data is streamed directly to a machine-learning (ML) framework, circumventing the file system bottleneck. Data is transformed in transit, asynchronously to the simulation and the training of the model. With the presented workflow, data operations can be performed in common and easy-to-use programming languages, freeing the application user from adapting the application output routines. As a proof of concept, we consider a GPU-accelerated particle-in-cell (PIConGPU) simulation of the Kelvin-Helmholtz instability (KHI). We employ experience replay to avoid catastrophic forgetting in learning from this non-steady process in a continual manner. We detail challenges addressed while porting and scaling to the Frontier exascale system.
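As a rough illustration of experience replay on streamed data, the sketch below keeps a bounded buffer of past samples and mixes some of them into each new training batch. It is an assumption-laden toy: the class name `ReplayBuffer`, the FIFO eviction policy, and the uniform sampling are hypothetical choices, not details taken from the abstract.

```python
import random
from collections import deque


class ReplayBuffer:
    """Bounded store of past samples (hypothetical, FIFO eviction)."""

    def __init__(self, capacity, seed=None):
        self._items = deque(maxlen=capacity)  # oldest entries drop out when full
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._items)

    def mixed_batch(self, fresh, n_replay):
        """Return the fresh samples plus up to `n_replay` replayed ones.

        Replayed samples are drawn uniformly from what was seen before
        this call; the fresh samples are stored afterwards.
        """
        k = min(n_replay, len(self._items))
        replayed = self._rng.sample(list(self._items), k)
        self._items.extend(fresh)
        return list(fresh) + replayed
```

In a streaming setting, `fresh` would be the samples that just arrived from the simulation, so each training step also sees earlier states of the non-steady process.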
Tackling Execution-Based Evaluation for NL2Bash
Vo, Ngoc Phuoc An, Paulovicks, Brent, Sheinin, Vadim
Given recent advances in Large Language Models (LLMs), the task of translating natural language prompts into different programming languages (code generation) has attracted immense attention for its wide application in different domains. In particular, code generation for Bash (NL2Bash) is widely used to generate Bash scripts for automating tasks such as performance monitoring, compilation, system administration, and system diagnostics. Besides code generation, validating synthetic code is critical before using it in any application. Different methods for code validation have been proposed, both direct (execution evaluation) and indirect (e.g., exact/partial match, BLEU score). Among these, Execution-based Evaluation (EE) validates the predicted code by comparing the execution output of the model prediction with the expected output on the system. However, designing and implementing such an execution-based evaluation system for NL2Bash is not a trivial task. In this paper, we present machinery for execution-based evaluation for NL2Bash. We create a set of 50 prompts to evaluate some popular LLMs for NL2Bash. We also analyze several advantages and challenges of EE, such as syntactically different yet semantically equivalent Bash scripts generated by different LLMs, or syntactically correct but semantically incorrect Bash scripts, and how we capture and process them correctly.
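The core idea of execution-based evaluation, comparing what a predicted script does rather than how it is written, can be sketched as below. This is a simplified assumption of how such a check might look: `run_bash` and `execution_match` are hypothetical names, only the exit code and standard output are compared, and a real harness would run untrusted model output inside a sandbox or container.

```python
import subprocess
from typing import Tuple


def run_bash(script: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Run `script` with bash and return (exit code, stripped stdout)."""
    proc = subprocess.run(
        ["bash", "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout.strip()


def execution_match(predicted: str, reference: str) -> bool:
    """True if both scripts yield the same exit code and stdout."""
    try:
        return run_bash(predicted) == run_bash(reference)
    except subprocess.TimeoutExpired:
        return False  # a hanging script counts as a mismatch
```

Under this check, `echo $((2+3))` and `expr 2 + 3` count as equivalent despite differing syntax, while a syntactically valid `echo 6` does not match either of them.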
Docker on MacOS is slow and how to fix it · Paolo Mainardi
Thanks to DALL·E 2, we finally have a very nice graphic representation of the feelings of a Docker container inside a macOS environment; with this article I will try to bring this poor container safely to the coast. Docker engine, on macOS and Windows, needs a Linux kernel; there aren't any exceptions here. You do not see it, but it is there to do all the dirty jobs (HN: https://news.ycombinator.com/item?id Instead, Docker CLI and docker-compose are native binaries for all operating systems. Two things are worth mentioning here regarding Microsoft; the first one is that Windows (and this sometimes can lead to some confusion) natively supports Docker to run Windows containers. This implementation has been possible thanks to the joint effort of Microsoft and Docker in 2016 to create a container engine implementing the Docker specification on Windows; kudos to you, MS.
IBM's CodeFlare automates AI model development
IBM today announced a new serverless framework called CodeFlare that's designed to reduce the time developers spend preparing AI models for deployment in hybrid cloud environments. The company says it automates the training, processing, and scaling of models to enable engineers to focus on data insights. Data and machine learning analytics are proliferating across industries, with the tasks becoming increasingly complex.
Finding photos on Twitter using face recognition with TensorFlow.js
As a developer advocate, I spend a lot of time at developer conferences (talking about serverless). Upon returning from each trip, I need to compile a "trip report" on the event for my bosses. This helps demonstrate the value in attending events and that I'm not just accruing air miles and hotel points for fun… I always include any social media content people post about my talks in the trip report. This is usually tweets with photos of me on stage. If people are tweeting about your session, I assume they enjoyed it and wanted to share with their followers.
A Batch Job ML Model Deployment
This blog post continues the ideas started in three previous blog posts. The code in this blog post can be found in this GitHub repo. In previous blog posts I showed how to develop an ML model in a way that makes it easy to deploy, and I showed how to create a web app that is able to deploy any model that follows the same design pattern. However, not all ML models are deployed within web apps. In this blog post I deploy the same model used in the previous blog posts as an ETL job.
Data Retrieval pipeline at source{d}
Data collection and processing might be less sexy than Machine Learning, but it is nevertheless crucial for any progress, and it is also something that source{d} as a company was built upon and has invested a lot into. It was briefly highlighted at several conference talks (go-git, gitbase, gitbase indexes). Now it is time for a full-length blog post with the details. Before we begin, a small reminder: as with most of what we do at source{d}, all the tools described in this blog post are available as open source software and packaged in source{d}, our end user product. Most of the recent progress in ML, and Deep Learning in particular, is attributed to having an abundance of data and plenty of computing resources to use for training large Neural Network models.
What's New in H2O Machine Learning: Christmas 2018 - DZone AI
There were two releases shortly after one another. First, on December 21st, there was a minor (fix) release, 3.22.0.3, immediately followed by a more major release (still on the 3.22 branch), codenamed Xu after mathematician Jinchao Xu, whose work focuses on deep neural networks among many other fields of research. Of course, the new 3.22.1.1 release with codename Xu contains all the fixes present in the previous fix release, 3.22.0.3. The following points are highlights of the most impactful changes. For a complete list of changes, fixes, and improvements, please read the recent changes section.