 runtime environment


Agent-Oriented Visual Programming for the Web of Things

arXiv.org Artificial Intelligence

In this paper we introduce and discuss an approach for multi-agent-oriented visual programming. This aims at enabling individuals without programming experience but with knowledge in specific target domains to design and (re)configure autonomous software. We argue that, compared to procedural programming, it should be simpler for users to create programs when agent abstractions are employed. The underlying rationale is that these abstractions, and specifically the belief-desire-intention architecture that is aligned with human practical reasoning, match more closely with people's everyday experience in interacting with other agents and artifacts in the real world. On top of this, we designed and implemented a visual programming system for agents that hides the technicalities of agent-oriented programming using a blocks-based visual development environment that is built on the JaCaMo platform. To further validate the proposed solution, we integrate the Web of Things (WoT) to let users create autonomous behaviour on top of physical mashups of devices, following the trends in industrial end-user programming. Finally, we report on a pilot user study where we verified that novice users are indeed able to make use of this development environment to create multi-agent systems to solve simple automation tasks.


Training Language Model Agents to Find Vulnerabilities with CTF-Dojo

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated exceptional capabilities when trained within executable runtime environments, notably excelling at software engineering tasks through verified feedback loops. Yet, scalable and generalizable execution-grounded environments remain scarce, limiting progress in training more capable ML agents. We introduce CTF-Dojo, the first large-scale executable runtime tailored for training LLMs with verifiable feedback, featuring 658 fully functional Capture-The-Flag (CTF)-style challenges containerized in Docker with guaranteed reproducibility. To enable rapid scaling without manual intervention, we develop CTF-Forge, an automated pipeline that transforms publicly available artifacts into ready-to-use execution environments in minutes, eliminating weeks of expert configuration traditionally required. We trained LLM-based agents on just 486 high-quality, execution-verified trajectories from CTF-Dojo, achieving up to 11.6% absolute gains over strong baselines across three competitive benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best-performing 32B model reaches 31.9% Pass@1, establishing a new open-weight state-of-the-art that rivals frontier models like DeepSeek-V3-0324 and Gemini-2.5-Flash. By framing CTF-style tasks as a benchmark for executable-agent learning, CTF-Dojo demonstrates that execution-grounded training signals are not only effective but pivotal in advancing high-performance ML agents without dependence on costly proprietary systems.
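For readers unfamiliar with the Pass@1 figure quoted above, here is a minimal sketch of how such a score could be computed. It assumes the widely used unbiased pass@k estimator over n sampled attempts per task; the abstract does not say how the authors computed their number, and the function names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n attempts, c of which succeeded.

    Assumption: this is the common unbiased estimator 1 - C(n-c, k) / C(n, k),
    not necessarily the procedure used in the CTF-Dojo paper.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures: every size-k sample contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks, given (n, c) pairs, one per task."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("results must not be empty")
    return sum(scores) / len(scores)
```

With k = 1 the estimator reduces to the fraction of successful attempts per task, averaged over tasks, so `mean_pass_at_k([(10, 3), (10, 0)], k=1)` averages 0.3 and 0.0.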


PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking

arXiv.org Artificial Intelligence

Existing protein-ligand docking studies typically focus on the self-docking scenario, which is less practical in real applications. Moreover, some studies involve heavy frameworks requiring extensive training, posing challenges for convenient and efficient assessment of docking methods. To fill these gaps, we design PoseX, an open-source benchmark to evaluate both self-docking and cross-docking, enabling a practical and comprehensive assessment of algorithmic advances. Specifically, first, we curated a novel dataset comprising 718 entries for self-docking and 1,312 entries for cross-docking; second, we incorporated 23 docking methods in three methodological categories, including physics-based methods (e.g., Schrödinger Glide), AI docking methods (e.g., DiffDock) and AI co-folding methods (e.g., AlphaFold3); third, we developed a relaxation method for post-processing to minimize conformational energy and refine binding poses; fourth, we built a leaderboard to rank submitted models in real time. We derived some key insights and conclusions from extensive experiments: (1) AI approaches have consistently outperformed physics-based methods in overall docking success rate. (2) Most intra- and intermolecular clashes of AI approaches can be greatly alleviated with relaxation, which means combining AI modeling with physics-based post-processing could achieve excellent performance. (3) AI co-folding methods exhibit ligand chirality issues, except for Boltz-1x, which introduced physics-inspired potentials to fix hallucinations, suggesting that modeling stereochemistry markedly improves structural plausibility. (4) Specifying binding pockets significantly improves docking performance, indicating that pocket information can be leveraged adequately, particularly for AI co-folding methods, in future modeling efforts. The code, dataset, and leaderboard are released at https://github.com/CataAI/PoseX.


Towards Single-System Illusion in Software-Defined Vehicles -- Automated, AI-Powered Workflow

arXiv.org Artificial Intelligence

We propose a novel model- and feature-based approach to the development of vehicle software systems, where the end architecture is not explicitly defined. Instead, it emerges from an iterative process of search and optimization given certain constraints, requirements and hardware architecture, while retaining the property of single-system illusion, where applications run in a logically uniform environment. One of the key points of the presented approach is the inclusion of modern generative AI, specifically Large Language Models (LLMs), in the loop. With the recent advances in the field, we expect that LLMs will be able to assist in processing requirements and in generating formal system models, software deployment specifications and test code. The resulting pipeline is automated to a large extent, with feedback being generated at each step.


Methods Included

Communications of the ACM

Although workflows are very popular, prior to the CWL standards, all workflow systems were incompatible with each other. This means that users who do not use the CWL standards are required to express their computational workflows in a different way each time they use another workflow system, leading to local success but global unportability. The success of workflows is now their biggest drawback. Users are locked into a particular vendor, project, and often a specific hardware setup, hampering sharing and reuse. Even non-academics suffer from this situation, as the lack of standards, or of their adoption, hinders effective collaboration on computational methods within and between companies.


Four Questions You Might Get in a Data Science Interview

#artificialintelligence

As we enter a new realm of how we work in a post-pandemic world, you may have noticed that a lot of people are taking new opportunities that may not have been available before. I'm specifically referring to how the advent of remote work has opened up new opportunities for positions where location may have been a barrier before. There's also the unfortunate fact that some people may now be seeking new opportunities due to job loss caused by the pandemic. Having been through a data science interview myself, I can definitely relate to just how nerve-racking the interview process can be! The data science interview process is generally a multi-phase approach, often consisting of one or more coding assessments, a "culture fit" interview, and of course, a technical question-and-answer session.


How to do Deep Learning for Java?

#artificialintelligence

Deep Learning libraries like DL4J have come a long way, and this post shows how we can do the regular tasks of building a Java app and training and evaluating a model, easily and independently of platform!


Transparent FPGA Acceleration with TensorFlow

arXiv.org Artificial Intelligence

Today, artificial neural networks are one of the major innovators pushing the progress of machine learning. This has particularly affected the development of neural network accelerating hardware. However, since most of these architectures require specialized toolchains, there is a certain amount of additional effort for developers each time they want to make use of a new deep learning accelerator. Furthermore, the flexibility of the device is bound to the architecture itself, as well as to the functionality of the runtime environment. In this paper we propose a toolflow using TensorFlow as frontend, thus offering developers the opportunity of using a familiar environment. On the backend we use an FPGA, which is addressable via an HSA runtime environment. In this way we are able to hide the complexity of controlling new hardware from the user, while at the same time maintaining a high amount of flexibility. This can be achieved by our HSA toolflow, since the hardware is not statically configured with the structure of the network. Instead, it can be dynamically reconfigured during runtime with the respective kernels executed by the network and simultaneously from other sources, e.g., OpenCL/OpenMP.


Practical Applications for AI and ML in Embedded Systems - RTInsights

#artificialintelligence

Embedded development is often driven by the need to deploy highly optimized and efficient systems. AI is positioned to disrupt businesses either by enabling new approaches to solving complex problems or threatening the status quo for whole business sectors or types of jobs. Whether you understand what the excitement is all about and how it will be applied to your market, or you struggle to understand how you might take advantage of the technology, having some basic understanding of artificial intelligence and its potential applications has to be part of your strategic planning process. Despite the hype, it is sobering to remember that artificial intelligence is not a magic trick that can do anything. It's a tool with which a magician can do a few tricks.


Ten strategies to implement AI on the Cloud and Edge

#artificialintelligence

The deployment of Machine Learning and Deep Learning algorithms on Edge devices is a complex undertaking. In this post, I list the strategies for deploying AI to Edge devices end-to-end, i.e., for the full pipeline covering machine learning (building modules) and deployment (devops). I welcome your comments on additional ideas that could be included. In subsequent posts, I will elaborate on these ideas in detail, and ultimately this will become a free book on Data Science Central. I will take a use-case-based approach, i.e., each section will start with a use case. Many IoT applications are simple telemetry applications, i.e., data is captured using a single sensor and action is taken based on the data. In doing so, the data may be stored or visualised.
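The simple telemetry pattern described above (one sensor, data stored, action taken based on the data) can be sketched in a few lines. This is only an illustration under stated assumptions: the class `TelemetryLoop`, the threshold rule, and the `on_alert` callback are hypothetical and do not come from the post.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TelemetryLoop:
    """Hypothetical single-sensor telemetry handler (illustrative only)."""

    threshold: float                    # assumed rule: act when a reading exceeds this
    on_alert: Callable[[float], None]   # action to take, supplied by the caller
    history: List[float] = field(default_factory=list)  # stored readings

    def ingest(self, reading: float) -> bool:
        """Store one reading and trigger the action if it exceeds the threshold.

        Returns True when the action was triggered.
        """
        self.history.append(reading)
        if reading > self.threshold:
            self.on_alert(reading)
            return True
        return False


if __name__ == "__main__":
    alerts: List[float] = []
    loop = TelemetryLoop(threshold=30.0, on_alert=alerts.append)
    for value in [21.5, 31.2, 29.9]:
        loop.ingest(value)
    print("stored:", loop.history)
    print("alerts:", alerts)
```

Keeping storage (`history`) separate from the action callback mirrors the post's distinction between acting on the data and storing or visualising it.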