Goto

Collaborating Authors

 socket


Confidential LLM Inference: Performance and Cost Across CPU and GPU TEEs

Chrapek, Marcin, Copik, Marcin, Mettaz, Etienne, Hoefler, Torsten

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly deployed on converged Cloud and High-Performance Computing (HPC) infrastructure. However, as LLMs handle confidential inputs and are fine-tuned on costly, proprietary datasets, their heightened security requirements slow adoption in privacy-sensitive sectors such as healthcare and finance. We investigate methods to address this gap and propose Trusted Execution Environments (TEEs) as a solution for securing end-to-end LLM inference. We validate their practicality by evaluating these compute-intensive workloads entirely within CPU and GPU TEEs. On the CPU side, we conduct an in-depth study running full Llama2 inference pipelines (7B, 13B, 70B) inside Intel's TDX and SGX, accelerated by Advanced Matrix Extensions (AMX). We derive 12 insights, including that across various data types, batch sizes, and input lengths, CPU TEEs impose under 10% throughput and 20% latency overheads, further reduced by AMX. We run LLM inference on NVIDIA H100 Confidential Compute GPUs, contextualizing our CPU findings and observing throughput penalties of 4-8% that diminish as batch and input sizes grow. By comparing performance, cost, and security trade-offs, we show how CPU TEEs can be more cost-effective or secure than their GPU counterparts. To our knowledge, our work is the first to comprehensively demonstrate the performance and practicality of modern TEEs across both CPUs and GPUs for enabling confidential LLMs (cLLMs).


Towards simulation-based optimization of compliant fingers for high-speed connector assembly

Hartisch, Richard Matthias, Rother, Alexander, Krüger, Jörg, Haninger, Kevin

arXiv.org Artificial Intelligence

Mechanical compliance is a key design parameter for dynamic contact-rich manipulation, affecting task success and safety robustness over contact geometry variation. Design of soft robotic structures, such as compliant fingers, requires choosing design parameters which affect geometry and stiffness, and therefore manipulation performance and robustness. Today, these parameters are chosen through either hardware iteration, which takes significant development time, or simplified models (e.g. planar), which can't address complex manipulation task objectives. Improvements in dynamic simulation, especially with contact and friction modeling, present a potential design tool for mechanical compliance. We propose a simulation-based design tool for compliant mechanisms which allows design with respect to task-level objectives, such as success rate. This is applied to optimize design parameters of a structured compliant finger to reduce failure cases inside a tolerance window in insertion tasks. The improvement in robustness is then validated on a real robot using tasks from the benchmark NIST task board. The finger stiffness affects the tolerance window: optimized parameters can increase tolerable ranges by a factor of 2.29, with workpiece variation up to 8.6 mm being compensated. However, the trends remain task-specific. In some tasks, the highest stiffness yields the widest tolerable range, whereas in others the opposite is observed, motivating need for design tools which can consider application-specific geometry and dynamics.


Integrating SystemC TLM into FMI 3.0 Co-Simulations with an Open-Source Approach

Albu, Andrei Mihai, Pollo, Giovanni, Burrello, Alessio, Pagliari, Daniele Jahier, Tesconi, Cristian, Neri, Alessandra, Soldi, Dario, Autieri, Fabio, Vinco, Sara

arXiv.org Artificial Intelligence

The growing complexity of cyber-physical systems, particularly in automotive applications, has increased the demand for efficient modeling and cross-domain co-simulation techniques. While SystemC Transaction-Level Modeling (TLM) enables effective hardware/software co-design, its limited interoperability with models from other engineering domains poses integration challenges. This paper presents a fully open-source methodology for integrating SystemC TLM models into Functional Mock-up Interface (FMI)-based co-simulation workflows. By encapsulating SystemC TLM components as FMI 3.0 Co Simulation Functional Mock-up Units (FMUs), the proposed approach facilitates seamless, standardized integration across heterogeneous simulation environments. We introduce a lightweight open-source toolchain, address key technical challenges such as time synchronization and data exchange, and demonstrate the feasibility and effectiveness of the integration through representative case studies.


Affordable 3D-printed bionic arm uses muscle signals to move

FOX News

Kurt Knutsson discusses how the Hero PRO, a wireless, waterproof bionic arm with fast control and full 360-degree wrist rotation, transforms prosthetics. Bionic arms used to cost more than a new car. Unlimited Tomorrow is making 3D-printed prosthetics available for under 8,000 and doing it without sacrificing quality, comfort or functionality. Easton LaChappelle founded the company in 2014 at the age of 18. His simple goal was to give more people access to advanced prosthetics that actually fit their lives.


He worked with artificial limbs for decades. Then a lorry ripped off his right arm. What happened when the expert became the patient?

The Guardian

When the air ambulance brought Jim Ashworth-Beaumont to King's College hospital in south-east London, nobody thought he had a hope. He had been cycling home when a lorry driver failed to spot him alongside his trailer while turning left after a set of traffic lights. The vehicle's wheels opened his torso like a sardine tin, puncturing his lungs and splitting his liver in two. They also tore off his right arm. Weeks after the accident, in July 2020, Ashworth-Beaumont would see a photo of the severed limb taken by a doctor while it lay beside him in hospital. He had asked to see the picture and says it helped him come to terms with his loss. "My hand didn't look too bad," he says. "It was as if it was waving goodbye to me." Ashworth-Beaumont, a super-fit and sunny former Royal Marine from Edinburgh, would go on to spend six weeks in an induced coma as surgeons raced to repair his crushed body. But as he lay on the road, waiting for the paramedics, his only thoughts were that he was dying.


Evaluating Artificial Intelligence Algorithms for the Standardization of Transtibial Prosthetic Socket Shape Design

Jordaan, C. H. E., van der Stelt, M., Maal, T. J. J., Stirler, V. M. A., Leijendekkers, R., Kachman, T., de Jong, G. A.

arXiv.org Artificial Intelligence

The quality of a transtibial prosthetic socket depends on the prosthetist's skills and expertise, as the fitting is performed manually. This study investigates multiple artificial intelligence (AI) approaches to help standardize transtibial prosthetic socket design. Data from 118 patients were collected by prosthetists working in the Dutch healthcare system. This data consists of a three-dimensional (3D) scan of the residual limb and a corresponding 3D model of the prosthetist-designed socket. Multiple data pre-processing steps are performed for alignment, standardization and optionally compression using Morphable Models and Principal Component Analysis. Afterward, three different algorithms - a 3D neural network, Feedforward neural network, and random forest - are developed to either predict 1) the final socket shape or 2) the adaptations performed by a prosthetist to predict the socket shape based on the 3D scan of the residual limb. Each algorithm's performance was evaluated by comparing the prosthetist-designed socket with the AI-generated socket, using two metrics in combination with the error location. First, we measure the surface-to-surface distance to assess the overall surface error between the AI-generated socket and the prosthetist-designed socket. Second, distance maps between the AI-generated and prosthetist sockets are utilized to analyze the error's location. For all algorithms, estimating the required adaptations outperformed direct prediction of the final socket shape. The random forest model applied to adaptation prediction yields the lowest error with a median surface-to-surface distance of 1.24 millimeters, a first quartile of 1.03 millimeters, and a third quartile of 1.54 millimeters.


Non-collective Calibrating Strategy for Time Series Forecasting

Wang, Bin, Han, Yongqi, Ma, Minbo, Li, Tianrui, Zhang, Junbo, Hong, Feng, Yu, Yanwei

arXiv.org Artificial Intelligence

Deep learning-based approaches have demonstrated significant advancements in time series forecasting. Despite these ongoing developments, the complex dynamics of time series make it challenging to establish the rule of thumb for designing the golden model architecture. In this study, we argue that refining existing advanced models through a universal calibrating strategy can deliver substantial benefits with minimal resource costs, as opposed to elaborating and training a new model from scratch. We first identify a multi-target learning conflict in the calibrating process, which arises when optimizing variables across time steps, leading to the underutilization of the model's learning capabilities. To address this issue, we propose an innovative calibrating strategy called Socket+Plug (SoP). This approach retains an exclusive optimizer and early-stopping monitor for each predicted target within each Plug while keeping the fully trained Socket backbone frozen. The model-agnostic nature of SoP allows it to directly calibrate the performance of any trained deep forecasting models, regardless of their specific architectures. Extensive experiments on various time series benchmarks and a spatio-temporal meteorological ERA5 dataset demonstrate the effectiveness of SoP, achieving up to a 22% improvement even when employing a simple MLP as the Plug (highlighted in Figure 1). Code is available at https://github.com/hanyuki23/SoP.


EasyInsert: A Data-Efficient and Generalizable Insertion Policy

Li, Guanghe, Zhao, Junming, Wang, Shengjie, Gao, Yang

arXiv.org Artificial Intelligence

Insertion task is highly challenging that requires robots to operate with exceptional precision in cluttered environments. Existing methods often have poor generalization capabilities. They typically function in restricted and structured environments, and frequently fail when the plug and socket are far apart, when the scene is densely cluttered, or when handling novel objects. They also rely on strong assumptions such as access to CAD models or a digital twin in simulation. To address this, we propose EasyInsert, a framework which leverages the human intuition that relative pose (delta pose) between plug and socket is sufficient for successful insertion, and employs efficient and automated real-world data collection with minimal human labor to train a generalizable model for relative pose prediction. During execution, EasyInsert follows a coarse-to-fine execution procedure based on predicted delta pose, and successfully performs various insertion tasks. EasyInsert demonstrates strong zero-shot generalization capability for unseen objects in cluttered environments, handling cases with significant initial pose deviations while maintaining high sample efficiency and requiring little human effort. In real-world experiments, with just 5 hours of training data, EasyInsert achieves over 90% success in zero-shot insertion for 13 out of 15 unseen novel objects, including challenging objects like Type-C cables, HDMI cables, and Ethernet cables. Furthermore, with only one human demonstration and 4 minutes of automatically collected data for fine-tuning, it reaches over 90% success rate for all 15 objects.


Integrating Model-based Control and RL for Sim2Real Transfer of Tight Insertion Policies

Marougkas, Isidoros, Ramesh, Dhruv Metha, Doerr, Joe H., Granados, Edgar, Sivaramakrishnan, Aravind, Boularias, Abdeslam, Bekris, Kostas E.

arXiv.org Artificial Intelligence

Object insertion under tight tolerances ($< \hspace{-.02in} 1mm$) is an important but challenging assembly task as even small errors can result in undesirable contacts. Recent efforts focused on Reinforcement Learning (RL), which often depends on careful definition of dense reward functions. This work proposes an effective strategy for such tasks that integrates traditional model-based control with RL to achieve improved insertion accuracy. The policy is trained exclusively in simulation and is zero-shot transferred to the real system. It employs a potential field-based controller to acquire a model-based policy for inserting a plug into a socket given full observability in simulation. This policy is then integrated with residual RL, which is trained in simulation given only a sparse, goal-reaching reward. A curriculum scheme over observation noise and action magnitude is used for training the residual RL policy. Both policy components use as input the SE(3) poses of both the plug and the socket and return the plug's SE(3) pose transform, which is executed by a robotic arm using a controller. The integrated policy is deployed on the real system without further training or fine-tuning, given a visual SE(3) object tracker. The proposed solution and alternatives are evaluated across a variety of objects and conditions in simulation and reality. The proposed approach outperforms recent RL-based methods in this domain and prior efforts with hybrid policies. Ablations highlight the impact of each component of the approach.


MUKCa: Accurate and Affordable Cobot Calibration Without External Measurement Devices

Franzese, Giovanni, Spahn, Max, Kober, Jens, Della Santina, Cosimo

arXiv.org Artificial Intelligence

To increase the reliability of collaborative robots in performing daily tasks, we require them to be accurate and not only repeatable. However, having a calibrated kinematics model is regrettably a luxury, as available calibration tools are usually more expensive than the robots themselves. With this work, we aim to contribute to the democratization of cobots calibration by providing an inexpensive yet highly effective alternative to existing tools. The proposed minimalist calibration routine relies on a 3D-printable tool as the only physical aid to the calibration process. This two-socket spherical-joint tool kinematically constrains the robot at the end effector while collecting the training set. An optimization routine updates the nominal model to ensure a consistent prediction for each socket and the undistorted mean distance between them. We validated the algorithm on three robotic platforms: Franka, Kuka, and Kinova Cobots. The calibrated models reduce the mean absolute error from the order of 10 mm to 0.2 mm for both Franka and Kuka robots. We provide two additional experimental campaigns with the Franka Robot to render the improvements more tangible. First, we implement Cartesian control with and without the calibrated model and use it to perform a standard peg-in-the-hole task with a tolerance of 0.4 mm between the peg and the hole. Second, we perform a repeated drawing task combining Cartesian control with learning from demonstration. Both tasks consistently failed when the model was not calibrated, while they consistently succeeded after calibration.