Goto

Collaborating Authors

 Wang, Chen


A Robust Power Model Training Framework for Cloud Native Runtime Energy Metric Exporter

arXiv.org Artificial Intelligence

Estimating power consumption in modern Cloud environments is essential for carbon quantification toward green computing. Specifically, it is important to properly account for the power consumed by each of the running applications, which are packaged as containers. This paper examines multiple challenges associated with this goal. The first challenge is that multiple customers are sharing the same hardware platform (multi-tenancy), where information on the physical servers is mostly obscured. The second challenge is the overhead in power consumption that the Cloud platform control plane induces. This paper addresses these challenges and introduces a novel pipeline framework for power model training. This allows versatile power consumption approximation of individual containers on the basis of available performance counters and other metrics. The proposed model utilizes machine learning techniques to predict the power consumed by the control plane and associated processes, and uses it for isolating the power consumed by the user containers, from the server power consumption. To determine how well the prediction results in an isolation, we introduce a metric termed isolation goodness. Applying the proposed power model does not require online power measurements, nor does it need information on the physical servers, configuration, or information on other tenants sharing the same machine. The results of cross-workload, cross-platform experiments demonstrated the higher accuracy of the proposed model when predicting power consumption of unseen containers on unknown platforms, including on virtual machines.


Salient Sparse Visual Odometry With Pose-Only Supervision

arXiv.org Artificial Intelligence

Visual Odometry (VO) is vital for the navigation of autonomous systems, providing accurate position and orientation estimates at reasonable costs. While traditional VO methods excel in some conditions, they struggle with challenges like variable lighting and motion blur. Deep learning-based VO, though more adaptable, can face generalization problems in new environments. Addressing these drawbacks, this paper presents a novel hybrid visual odometry (VO) framework that leverages pose-only supervision, offering a balanced solution between robustness and the need for extensive labeling. We propose two cost-effective and innovative designs: a self-supervised homographic pre-training for enhancing optical flow learning from pose-only labels and a random patch-based salient point detection strategy for more accurate optical flow patch extraction. These designs eliminate the need for dense optical flow labels for training and significantly improve the generalization capability of the system in diverse and challenging environments. Our pose-only supervised method achieves competitive performance on standard datasets and greater robustness and generalization ability in extreme and unseen scenarios, even compared to dense optical flow-supervised state-of-the-art methods.


POMDP-Guided Active Force-Based Search for Robotic Insertion

arXiv.org Artificial Intelligence

In robotic insertion tasks where the uncertainty exceeds the allowable tolerance, a good search strategy is essential for successful insertion and significantly influences efficiency. The commonly used blind search method is time-consuming and does not exploit the rich contact information. In this paper, we propose a novel search strategy that actively utilizes the information contained in the contact configuration and shows high efficiency. In particular, we formulate this problem as a Partially Observable Markov Decision Process (POMDP) with carefully designed primitives based on an in-depth analysis of the contact configuration's static stability. From the formulated POMDP, we can derive a novel search strategy. Thanks to its simplicity, this search strategy can be incorporated into a Finite-State-Machine (FSM) controller. The behaviors of the FSM controller are realized through a low-level Cartesian Impedance Controller. Our method is based purely on the robot's proprioceptive sensing and does not need visual or tactile sensors. To evaluate the effectiveness of our proposed strategy and control framework, we conduct extensive comparison experiments in simulation, where we compare our method with the baseline approach. The results demonstrate that our proposed method achieves a higher success rate with a shorter search time and search trajectory length compared to the baseline method. Additionally, we show that our method is robust to various initial displacement errors.


Towards Pareto Optimal Throughput in Small Language Model Serving

arXiv.org Artificial Intelligence

Large language models (LLMs) have revolutionized the state-of-the-art of many different natural language processing tasks. Although serving LLMs is computationally and memory demanding, the rise of Small Language Models (SLMs) offers new opportunities for resource-constrained users, who now are able to serve small models with cutting-edge performance. In this paper, we present a set of experiments designed to benchmark SLM inference at performance and energy levels. Our analysis provides a new perspective in serving, highlighting that the small memory footprint of SLMs allows for reaching the Pareto-optimal throughput within the resource capacity of a single accelerator. In this regard, we present an initial set of findings demonstrating how model replication can effectively improve resource utilization for serving SLMs.


PhysORD: A Neuro-Symbolic Approach for Physics-infused Motion Prediction in Off-road Driving

arXiv.org Artificial Intelligence

Motion prediction is critical for autonomous off-road driving, however, it presents significantly more challenges than on-road driving because of the complex interaction between the vehicle and the terrain. Traditional physics-based approaches encounter difficulties in accurately modeling dynamic systems and external disturbance. In contrast, data-driven neural networks require extensive datasets and struggle with explicitly capturing the fundamental physical laws, which can easily lead to poor generalization. By merging the advantages of both methods, neuro-symbolic approaches present a promising direction. These methods embed physical laws into neural models, potentially significantly improving generalization capabilities. However, no prior works were evaluated in real-world settings for off-road driving. To bridge this gap, we present PhysORD, a neural-symbolic approach integrating the conservation law, i.e., the Euler-Lagrange equation, into data-driven neural models for motion prediction in off-road driving. Our experiments showed that PhysORD can accurately predict vehicle motion and tolerate external disturbance by modeling uncertainties. It outperforms existing methods both in accuracy and efficiency and demonstrates data-efficient learning and generalization ability in long-term prediction.


iA$^*$: Imperative Learning-based A$^*$ Search for Pathfinding

arXiv.org Artificial Intelligence

The pathfinding problem, which aims to identify a collision-free path between two points, is crucial for many applications, such as robot navigation and autonomous driving. Classic methods, such as A$^*$ search, perform well on small-scale maps but face difficulties scaling up. Conversely, data-driven approaches can improve pathfinding efficiency but require extensive data labeling and lack theoretical guarantees, making it challenging for practical applications. To combine the strengths of the two methods, we utilize the imperative learning (IL) strategy and propose a novel self-supervised pathfinding framework, termed imperative learning-based A$^*$ (iA$^*$). Specifically, iA$^*$ is a bilevel optimization process where the lower-level optimization is dedicated to finding the optimal path by a differentiable A$^*$ search module, and the upper-level optimization narrows down the search space to improve efficiency via setting suitable initial values from a data-driven model. Besides, the model within the upper-level optimization is a fully convolutional network, trained by the calculated loss in the lower-level optimization. Thus, the framework avoids extensive data labeling and can be applied in diverse environments. Our comprehensive experiments demonstrate that iA$^*$ surpasses both classical and data-driven methods in pathfinding efficiency and shows superior robustness among different tasks, validated with public datasets and simulation environments.


BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation

arXiv.org Artificial Intelligence

We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an extensive survey on "what do you want robots to do for you?". The first is the definition of 1,000 everyday activities, grounded in 50 scenes (houses, gardens, restaurants, offices, etc.) with more than 9,000 objects annotated with rich physical and semantic properties. The second is OMNIGIBSON, a novel simulation environment that supports these activities via realistic physics simulation and rendering of rigid bodies, deformable bodies, and liquids. Our experiments indicate that the activities in BEHAVIOR-1K are long-horizon and dependent on complex manipulation skills, both of which remain a challenge for even state-of-the-art robot learning solutions. To calibrate the simulation-to-reality gap of BEHAVIOR-1K, we provide an initial study on transferring solutions learned with a mobile manipulator in a simulated apartment to its real-world counterpart. We hope that BEHAVIOR-1K's human-grounded nature, diversity, and realism make it valuable for embodied AI and robot learning research. Project website: https://behavior.stanford.edu.


Review-Incorporated Model-Agnostic Profile Injection Attacks on Recommender Systems

arXiv.org Artificial Intelligence

Recent studies have shown that recommender systems (RSs) are highly vulnerable to data poisoning attacks. Understanding attack tactics helps improve the robustness of RSs. We intend to develop efficient attack methods that use limited resources to generate high-quality fake user profiles to achieve 1) transferability among black-box RSs 2) and imperceptibility among detectors. In order to achieve these goals, we introduce textual reviews of products to enhance the generation quality of the profiles. Specifically, we propose a novel attack framework named R-Trojan, which formulates the attack objectives as an optimization problem and adopts a tailored transformer-based generative adversarial network (GAN) to solve it so that high-quality attack profiles can be produced. Comprehensive experiments on real-world datasets demonstrate that R-Trojan greatly outperforms state-of-the-art attack methods on various victim RSs under black-box settings and show its good imperceptibility.


Prompt Perturbation in Retrieval-Augmented Generation based Large Language Models

arXiv.org Artificial Intelligence

The robustness of large language models (LLMs) becomes increasingly important as their use rapidly grows in a wide range of domains. Retrieval-Augmented Generation (RAG) is considered as a means to improve the trustworthiness of text generation from LLMs. However, how the outputs from RAG-based LLMs are affected by slightly different inputs is not well studied. In this work, we find that the insertion of even a short prefix to the prompt leads to the generation of outputs far away from factually correct answers. We systematically evaluate the effect of such prefixes on RAG by introducing a novel optimization technique called Gradient Guided Prompt Perturbation (GGPP). GGPP achieves a high success rate in steering outputs of RAG-based LLMs to targeted wrong answers. It can also cope with instructions in the prompts requesting to ignore irrelevant context. We also exploit LLMs' neuron activation difference between prompts with and without GGPP perturbations to give a method that improves the robustness of RAG-based LLMs through a highly effective detector trained on neuron activation triggered by GGPP generated prompts. Our evaluation on open-sourced LLMs demonstrates the effectiveness of our methods.


OIL-AD: An Anomaly Detection Framework for Sequential Decision Sequences

arXiv.org Artificial Intelligence

Anomaly detection in decision-making sequences is a challenging problem due to the complexity of normality representation learning and the sequential nature of the task. Most existing methods based on Reinforcement Learning (RL) are difficult to implement in the real world due to unrealistic assumptions, such as having access to environment dynamics, reward signals, and online interactions with the environment. To address these limitations, we propose an unsupervised method named Offline Imitation Learning based Anomaly Detection (OIL-AD), which detects anomalies in decision-making sequences using two extracted behaviour features: action optimality and sequential association. Our offline learning model is an adaptation of behavioural cloning with a transformer policy network, where we modify the training process to learn a Q function and a state value function from normal trajectories. We propose that the Q function and the state value function can provide sufficient information about agents' behavioural data, from which we derive two features for anomaly detection. The intuition behind our method is that the action optimality feature derived from the Q function can differentiate the optimal action from others at each local state, and the sequential association feature derived from the state value function has the potential to maintain the temporal correlations between decisions (state-action pairs). Our experiments show that OIL-AD can achieve outstanding online anomaly detection performance with up to 34.8% improvement in F1 score over comparable baselines.