Goto

Collaborating Authors

 poc


PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts

Andersson, Vivi, Bobadilla, Sofia, Hobbelhagen, Harald, Monperrus, Martin

arXiv.org Artificial Intelligence

Smart contracts operate in a highly adversarial environment, where vulnerabilities can lead to substantial financial losses. Thus, smart contracts are subject to security audits. In auditing, proof-of-concept (PoC) exploits play a critical role by demonstrating to the stakeholders that the reported vulnerabilities are genuine, reproducible, and actionable. However, manually creating PoCs is time-consuming, error-prone, and often constrained by tight audit schedules. We introduce POCO, an agentic framework that automatically generates executable PoC exploits from natural-language vulnerability descriptions written by auditors. POCO autonomously generates PoC exploits in an agentic manner by interacting with a set of code-execution tools in a Reason-Act-Observe loop. It produces fully executable exploits compatible with the Foundry testing framework, ready for integration into audit reports and other security tools. We evaluate POCO on a dataset of 23 real-world vulnerability reports. POCO consistently outperforms the prompting and workflow baselines, generating well-formed and logically correct PoCs. Our results demonstrate that agentic frameworks can significantly reduce the effort required for high-quality PoCs in smart contract audits. Our contribution provides readily actionable knowledge for the smart contract security community.


CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale

Wang, Zhun, Shi, Tianneng, He, Jingxuan, Cai, Matthew, Zhang, Jialin, Song, Dawn

arXiv.org Artificial Intelligence

AI agents have significant potential to reshape cybersecurity, making a thorough assessment of their capabilities critical. However, existing evaluations fall short, because they are based on small-scale benchmarks and only measure static outcomes, failing to capture the full, dynamic range of real-world security challenges. To address these limitations, we introduce CyberGym, a large-scale benchmark featuring 1,507 real-world vulnerabilities across 188 software projects. Adjustable to different vulnerability analysis settings, CyberGym primarily tasks agents with generating a proof-of-concept test that reproduces a vulnerability, given only its text description and the corresponding codebase. Our extensive evaluation highlights that CyberGym effectively differentiates agents' and models' cybersecurity capabilities. Even the top-performing combinations only achieve a ~20% success rate, demonstrating the overall difficulty of CyberGym. Beyond static benchmarking, we show that CyberGym leads to the discovery of 35 zero-day vulnerabilities and 17 historically incomplete patches. These results underscore that CyberGym is not only a robust benchmark for measuring AI's progress in cybersecurity but also a platform for creating direct, real-world security impact.


Enhancing Hand Palm Motion Gesture Recognition by Eliminating Reference Frame Bias via Frame-Invariant Similarity Measures

Verduyn, Arno, Vochten, Maxim, De Schutter, Joris

arXiv.org Artificial Intelligence

The ability of robots to recognize human gestures facilitates a natural and accessible human-robot collaboration. However, most work in gesture recognition remains rooted in reference frame-dependent representations. This poses a challenge when reference frames vary due to different work cell layouts, imprecise frame calibrations, or other environmental changes. This paper investigated the use of invariant trajectory descriptors for robust hand palm motion gesture recognition under reference frame changes. First, a novel dataset of recorded Hand Palm Motion (HPM) gestures is introduced. The motion gestures in this dataset were specifically designed to be distinguishable without dependence on specific reference frames or directional cues. Afterwards, multiple invariant trajectory descriptor approaches were benchmarked to assess how their performances generalize to this novel HPM dataset. After this offline benchmarking, the best scoring approach is validated for online recognition by developing a real-time Proof of Concept (PoC). In this PoC, hand palm motion gestures were used to control the real-time movement of a manipulator arm. The PoC demonstrated a high recognition reliability in real-time operation, achieving an $F_1$-score of 92.3%. This work demonstrates the effectiveness of the invariant descriptor approach as a standalone solution. Moreover, we believe that the invariant descriptor approach can also be utilized within other state-of-the-art pattern recognition and learning systems to improve their robustness against reference frame variations.


Real-Time Energy Pricing in New Zealand: An Evolving Stream Analysis

Sun, Yibin, Gomes, Heitor Murilo, Pfahringer, Bernhard, Bifet, Albert

arXiv.org Artificial Intelligence

This paper introduces a group of novel datasets representing real-time time-series and streaming data of energy prices in New Zealand, sourced from the Electricity Market Information (EMI) website maintained by the New Zealand government. The datasets are intended to address the scarcity of proper datasets for streaming regression learning tasks. We conduct extensive analyses and experiments on these datasets, covering preprocessing techniques, regression tasks, prediction intervals, concept drift detection, and anomaly detection. Our experiments demonstrate the datasets' utility and highlight the challenges and opportunities for future research in energy price forecasting.


Sparse Tensor PCA via Tensor Decomposition for Unsupervised Feature Selection

Zheng, Junjing, Zhang, Xinyu, Jiang, Weidong

arXiv.org Artificial Intelligence

Recently, introducing Tensor Decomposition (TD) methods into unsupervised feature selection (UFS) has been a rising research point. A tensor structure is beneficial for mining the relations between different modes and helps relieve the computation burden. However, while existing methods exploit TD to minimize the reconstruction error of a data tensor, they don't fully utilize the interpretable and discriminative information in the factor matrices. Moreover, most methods require domain knowledge to perform feature selection. To solve the above problems, we develop two Sparse Tensor Principal Component Analysis (STPCA) models that utilize the projection directions in the factor matrices to perform UFS. The first model extends Tucker Decomposition to a multiview sparse regression form and is transformed into several alternatively solved convex subproblems. The second model formulates a sparse version of the family of Tensor Singular Value Decomposition (T-SVDs) and is transformed into individual convex subproblems. For both models, we prove the optimal solution of each subproblem falls onto the Hermitian Positive Semidefinite Cone (HPSD). Accordingly, we design two fast algorithms based on HPSD projection and prove their convergence. According to the experimental results on two original synthetic datasets (Orbit and Array Signal) and five real-world datasets, the two proposed methods are suitable for handling different data tensor scenarios and outperform the state-of-the-art UFS methods.


Synthetic Trajectory Generation Through Convolutional Neural Networks

Merhi, Jesse, Buchholz, Erik, Kanhere, Salil S.

arXiv.org Artificial Intelligence

Location trajectories provide valuable insights for applications from urban planning to pandemic control. However, mobility data can also reveal sensitive information about individuals, such as political opinions, religious beliefs, or sexual orientations. Existing privacy-preserving approaches for publishing this data face a significant utility-privacy trade-off. Releasing synthetic trajectory data generated through deep learning offers a promising solution. Due to the trajectories' sequential nature, most existing models are based on recurrent neural networks (RNNs). However, research in generative adversarial networks (GANs) largely employs convolutional neural networks (CNNs) for image generation. This discrepancy raises the question of whether advances in computer vision can be applied to trajectory generation. In this work, we introduce a Reversible Trajectory-to-CNN Transformation (RTCT) that adapts trajectories into a format suitable for CNN-based models. We integrated this transformation with the well-known DCGAN in a proof-of-concept (PoC) and evaluated its performance against an RNN-based trajectory GAN using four metrics across two datasets. The PoC was superior in capturing spatial distributions compared to the RNN model but had difficulty replicating sequential and temporal properties. Although the PoC's utility is not sufficient for practical applications, the results demonstrate the transformation's potential to facilitate the use of CNNs for trajectory generation, opening up avenues for future research. To support continued research, all source code has been made available under an open-source license.


Probabilities of Causation for Continuous and Vector Variables

Kawakami, Yuta, Kuroki, Manabu, Tian, Jin

arXiv.org Artificial Intelligence

Probabilities of causation (PoC) are valuable concepts for explainable artificial intelligence and practical decision-making. PoC are originally defined for scalar binary variables. In this paper, we extend the concept of PoC to continuous treatment and outcome variables, and further generalize PoC to capture causal effects between multiple treatments and multiple outcomes. In addition, we consider PoC for a sub-population and PoC with multi-hypothetical terms to capture more sophisticated counterfactual information useful for decision-making. We provide a nonparametric identification theorem for each type of PoC we introduce. Finally, we illustrate the application of our results on a real-world dataset about education.


Fast Collision Probability Estimation for Automated Driving using Multi-circular Shape Approximations

Tolksdorf, Leon, Birkner, Christian, Tejada, Arturo, van de Wouw, Nathan

arXiv.org Artificial Intelligence

Many state-of-the-art methods for safety assessment and motion planning for automated driving require estimation of the probability of collision (POC). To estimate the POC, a shape approximation of the colliding actors and probability density functions of the associated uncertain kinematic variables are required. Even with such information available, the derivation of the POC is in general, i.e., for any shape and density, only possible with Monte Carlo sampling (MCS). Random sampling of the POC, however, is challenging as computational resources are limited in real-world applications. We present expressions for the POC in the presence of Gaussian uncertainties, based on multi-circular shape approximations. In addition, we show that the proposed approach is computationally more efficient than MCS. Lastly, we provide a method for upper and lower bounding the estimation error for the POC induced by the used shape approximations.


Treatment of Epistemic Uncertainty in Conjunction Analysis with Dempster-Shafer Theory

Sanchez, Luis, Vasile, Massimiliano, Sanvido, Silvia, Mertz, Klaus, Taillan, Christophe

arXiv.org Artificial Intelligence

The paper presents an approach to the modelling of epistemic uncertainty in Conjunction Data Messages (CDM) and the classification of conjunction events according to the confidence in the probability of collision. The approach proposed in this paper is based on Dempster-Shafer Theory (DSt) of evidence and starts from the assumption that the observed CDMs are drawn from a family of unknown distributions. The Dvoretzky-Kiefer-Wolfowitz (DKW) inequality is used to construct robust bounds on such a family of unknown distributions starting from a time series of CDMs. A DSt structure is then derived from the probability boxes constructed with DKW inequality. The DSt structure encapsulates the uncertainty in the CDMs at every point along the time series and allows the computation of the belief and plausibility in the realisation of a given probability of collision. The methodology proposed in this paper is tested on a number of real events and compared against existing practices in the European and French Space Agencies. We will show that the classification system proposed in this paper is more conservative than the approach taken by the European Space Agency but provides an added quantification of uncertainty in the probability of collision.


Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

Ehsani, Kiana, Gupta, Tanmay, Hendrix, Rose, Salvador, Jordi, Weihs, Luca, Zeng, Kuo-Hao, Singh, Kunal Pratap, Kim, Yejin, Han, Winson, Herrasti, Alvaro, Krishna, Ranjay, Schwenk, Dustin, VanderBilt, Eli, Kembhavi, Aniruddha

arXiv.org Artificial Intelligence

Reinforcement learning (RL) with dense rewards and imitation learning (IL) with human-generated trajectories are the most widely used approaches for training modern embodied agents. RL requires extensive reward shaping and auxiliary losses and is often too slow and ineffective for long-horizon tasks. While IL with human supervision is effective, collecting human trajectories at scale is extremely expensive. In this work, we show that imitating shortest-path planners in simulation produces agents that, given a language instruction, can proficiently navigate, explore, and manipulate objects in both simulation and in the real world using only RGB sensors (no depth map or GPS coordinates). This surprising result is enabled by our end-to-end, transformer-based, SPOC architecture, powerful visual encoders paired with extensive image augmentation, and the dramatic scale and diversity of our training data: millions of frames of shortest-path-expert trajectories collected inside approximately 200,000 procedurally generated houses containing 40,000 unique 3D assets. Our models, data, training code, and newly proposed 10-task benchmarking suite CHORES will be open-sourced.