AITopics

2511.13158

Country:

Europe (0.68)
North America > United States > California (0.46)

Genre:

Research Report (0.82)
Questionnaire & Opinion Survey (0.54)

Industry:

Education (0.93)
Information Technology > Smart Houses & Appliances (0.47)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

arXiv.org Artificial Intelligence, Sep-24-2025

Training Language Model Agents to Find Vulnerabilities with CTF-Dojo

Zhuo, Terry Yue, Wang, Dingmin, Ding, Hantian, Kumar, Varun, Wang, Zijian

Large language models (LLMs) have demonstrated exceptional capabilities when trained within executable runtime environments, notably excelling at software engineering tasks through verified feedback loops. Yet, scalable and generalizable execution-grounded environments remain scarce, limiting progress in training more capable ML agents. We introduce CTF-Dojo, the first large-scale executable runtime tailored for training LLMs with verifiable feedback, featuring 658 fully functional Capture-The-Flag (CTF)-style challenges containerized in Docker with guaranteed reproducibility. To enable rapid scaling without manual intervention, we develop CTF-Forge, an automated pipeline that transforms publicly available artifacts into ready-to-use execution environments in minutes, eliminating weeks of expert configuration traditionally required. We trained LLM-based agents on just 486 high-quality, execution-verified trajectories from CTF-Dojo, achieving up to 11.6% absolute gains over strong baselines across three competitive benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best-performing 32B model reaches 31.9% Pass@1, establishing a new open-weight state-of-the-art that rivals frontier models like DeepSeek-V3-0324 and Gemini-2.5-Flash. By framing CTF-style tasks as a benchmark for executable-agent learning, CTF-Dojo demonstrates that execution-grounded training signals are not only effective but pivotal in advancing high-performance ML agents without dependence on costly proprietary systems.
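The abstract reports results as Pass@1 but does not say how the metric is computed. As a minimal sketch, assuming the commonly used unbiased pass@k estimator (1 - C(n-c, k)/C(n, k) for n sampled attempts with c successes per task), a benchmark score could be derived as follows. The function names and the per-challenge counts are hypothetical and are not taken from the paper.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task: n attempts sampled, c of them solved it."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int = 1) -> float:
    """Average the per-task estimate over (attempts, successes) pairs."""
    results = list(results)
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical (attempts, successes) counts for four challenges.
results = [(4, 2), (4, 0), (4, 4), (4, 1)]
score = benchmark_pass_at_k(results, k=1)
print(f"Pass@1 = {score:.1%}")
```

For k = 1 the estimator reduces to the fraction of successful attempts per task, averaged over tasks.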

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2508.1837

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.93)
Government > Military > Cyberwarfare (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, May-22-2025

PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking

Jiang, Yize, Li, Xinze, Zhang, Yuanyuan, Han, Jin, Xu, Youjun, Pandit, Ayush, Zhang, Zaixi, Wang, Mengdi, Wang, Mengyang, Liu, Chong, Yang, Guang, Choi, Yejin, Li, Wu-Jun, Fu, Tianfan, Wu, Fang, Liu, Junhong

Existing protein-ligand docking studies typically focus on the self-docking scenario, which is less practical in real applications. Moreover, some studies involve heavy frameworks requiring extensive training, posing challenges for convenient and efficient assessment of docking methods. To fill these gaps, we design PoseX, an open-source benchmark to evaluate both self-docking and cross-docking, enabling a practical and comprehensive assessment of algorithmic advances. Specifically, first, we curated a novel dataset comprising 718 entries for self-docking and 1,312 entries for cross-docking; second, we incorporated 23 docking methods in three methodological categories, including physics-based methods (e.g., Schrödinger Glide), AI docking methods (e.g., DiffDock) and AI co-folding methods (e.g., AlphaFold3); third, we developed a relaxation method for post-processing to minimize conformational energy and refine binding poses; fourth, we built a leaderboard to rank submitted models in real time. We derived some key insights and conclusions from extensive experiments: (1) AI approaches have consistently outperformed physics-based methods in overall docking success rate. (2) Most intra- and intermolecular clashes of AI approaches can be greatly alleviated with relaxation, which means combining AI modeling with physics-based post-processing could achieve excellent performance. (3) AI co-folding methods exhibit ligand chirality issues, except for Boltz-1x, which introduced physics-inspired potentials to fix hallucinations, suggesting that modeling stereochemistry markedly improves structural plausibility. (4) Specifying binding pockets significantly improves docking performance, indicating that pocket information can be leveraged adequately, particularly for AI co-folding methods, in future modeling efforts. The code, dataset, and leaderboard are released at https://github.com/CataAI/PoseX.

ai co-folding method, machine learning, natural language, (19 more...)

2505.017

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada (0.04)
Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Lebioda, Krzysztof, Vorobev, Viktor, Petrovic, Nenad, Pan, Fengjunjie, Zolfaghari, Vahid, Knoll, Alois

Towards Single-System Illusion in Software-Defined Vehicles -- Automated, AI-Powered Workflow

arXiv.org Artificial Intelligence, Mar-21-2024

We propose a novel model- and feature-based approach to development of vehicle software systems, where the end architecture is not explicitly defined. Instead, it emerges from an iterative process of search and optimization given certain constraints, requirements and hardware architecture, while retaining the property of single-system illusion, where applications run in a logically uniform environment. One of the key points of the presented approach is the inclusion of modern generative AI, specifically Large Language Models (LLMs), in the loop. With the recent advances in the field, we expect that the LLMs will be able to assist in processing of requirements, generation of formal system models, as well as generation of software deployment specification and test code. The resulting pipeline is automated to a large extent, with feedback being generated at each step.

constraint, requirement, specification, (15 more...)

2403.1446

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.05)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Workflow (1.00)

Industry: Automobiles & Trucks (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Communications of the ACM, May-21-2022, 06:10:34 GMT

Methods Included

Although workflows are very popular, prior to the CWL standards, all workflow systems were incompatible with each other. This means that users who do not use the CWL standards are required to express their computational workflows in a different way each time they use another workflow system, leading to local success but global unportability. The success of workflows is now their biggest drawback. Users are locked into a particular vendor, project, and often a specific hardware setup, hampering sharing and reuse. Even non-academics suffer from this situation, as the lack of standards, or their adoption, hinders effective collaboration on computational methods within and between companies.

cwl standard, implementation, workflow, (17 more...)

Communications of the ACM

AI-Alerts: 2022 > 2022-05 > AAAI AI-Alert for May 24, 2022 (1.00)

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
North America > United States > North Carolina (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)

Genre: Workflow (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

#artificialintelligence, Nov-5-2021, 06:16:04 GMT

Four Questions You Might Get in a Data Science Interview

As we enter a new realm of how we work in a post-pandemic world, you may have noticed that a lot of people are taking new opportunities that may not have been available before. I'm specifically referring to how the advent of remote work has opened up positions where location may previously have been a barrier. There's also the unfortunate fact that some people may now be seeking new opportunities because they lost their jobs during the pandemic. Having been through a data science interview myself, I can definitely relate to just how nerve-racking the interview process can be! The data science interview process is generally a multi-phase approach, often consisting of one or more coding assessments, a "culture fit" interview, and of course, a technical question-and-answer session.

data science interview, interview, potential answer, (10 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

#artificialintelligence, Jun-26-2021, 19:29:07 GMT

How to do Deep Learning for Java?

Deep Learning libraries like DL4J have come a long way, and this post shows how to handle the regular tasks of building a Java app and training and evaluating a model, easily and independently of platform!

execution, uber jar, valohai, (13 more...)

Genre: Workflow (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Pfenning, Simon, Holzinger, Philipp, Reichenbach, Marc

Transparent FPGA Acceleration with TensorFlow

arXiv.org Artificial Intelligence, Feb-2-2021

Today, artificial neural networks are one of the major innovators pushing the progress of machine learning. This has particularly affected the development of neural network accelerating hardware. However, since most of these architectures require specialized toolchains, there is a certain amount of additional effort for developers each time they want to make use of a new deep learning accelerator. Furthermore, the flexibility of the device is bound to the architecture itself, as well as to the functionality of the runtime environment. In this paper, we propose a toolflow using TensorFlow as the frontend, thus offering developers the opportunity to use a familiar environment. On the backend we use an FPGA, which is addressable via an HSA runtime environment. In this way we are able to hide the complexity of controlling new hardware from the user, while at the same time maintaining a high degree of flexibility. This can be achieved by our HSA toolflow, since the hardware is not statically configured with the structure of the network. Instead, it can be dynamically reconfigured during runtime with the respective kernels executed by the network and simultaneously from other sources, e.g., OpenCL/OpenMP.

flexibility, fpga, tensorflow, (16 more...)

2102.06018

Country: Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.05)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

#artificialintelligence, Nov-1-2020, 11:53:29 GMT

Practical Applications for AI and ML in Embedded Systems - RTInsights

Embedded development is often driven by the need to deploy highly optimized and efficient systems. AI is positioned to disrupt businesses either by enabling new approaches to solving complex problems or threatening the status quo for whole business sectors or types of jobs. Whether you understand what the excitement is all about and how it will be applied to your market, or you struggle to understand how you might take advantage of the technology, having some basic understanding of artificial intelligence and its potential applications has to be part of your strategic planning process. Despite the hype, it is sobering to remember that artificial intelligence is not a magic trick that can do anything. It's a tool with which a magician can do a few tricks.

application, artificial intelligence, machine learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

#artificialintelligence, Mar-5-2020, 20:48:07 GMT

Ten strategies to implement AI on the Cloud and Edge

The deployment of Machine Learning and Deep Learning algorithms on Edge devices is a complex undertaking. In this post, I list the strategies for deploying AI to Edge devices end-to-end, i.e., for the full pipeline covering machine learning (building modules) and deployment (devops). I welcome your comments on additional ideas that could be included. In subsequent posts, I will elaborate on these ideas in detail and, ultimately, this will be a free book on Data Science Central. I will take a use-case-based approach, i.e., each section will start with a use case. Many IoT applications are simple telemetry applications, i.e., data is captured using a single sensor and action is taken based on the data. In doing so, the data may be stored or visualised.
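The simple telemetry pattern described above (a single sensor, an action based on the reading, optional storage) can be illustrated with a minimal sketch. The class name, the threshold rule, and the sample readings are hypothetical and are not taken from the post.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TelemetryApp:
    """Hypothetical single-sensor telemetry loop: store each reading, act on a threshold."""
    threshold: float
    on_alert: Callable[[float], None]
    history: List[float] = field(default_factory=list)

    def ingest(self, reading: float) -> bool:
        # Store every reading so it can be visualised or analysed later.
        self.history.append(reading)
        # Take the action only when the reading exceeds the threshold.
        if reading > self.threshold:
            self.on_alert(reading)
            return True
        return False


# Illustrative readings from an assumed temperature sensor.
alerts: List[float] = []
app = TelemetryApp(threshold=30.0, on_alert=alerts.append)
for value in [21.5, 29.9, 31.2, 28.0, 35.7]:
    app.ingest(value)
print(alerts)
```

In a real deployment the `on_alert` callback would trigger a device action or send a message, and `history` would be replaced by persistent storage.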

application, deployment, platform, (9 more...)

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Industry: Information Technology > Services (0.33)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)