Zhu, James
Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models
Hsu, Aliyah R., Zhu, James, Wang, Zhichao, Bi, Bin, Mehrotra, Shubham, Pentyala, Shiva K., Tan, Katherine, Mao, Xiang-Bo, Omrani, Roshanak, Chaudhuri, Sougata, Radhakrishnan, Regunathan, Asur, Sitaram, Cheng, Claire Na, Yu, Bin
LLMs have demonstrated impressive proficiency in generating coherent and high-quality text, making them valuable across a range of text-generation tasks. However, rigorous evaluation of this generated content is crucial, as ensuring its quality remains a significant challenge due to persistent issues such as factual inaccuracies and hallucinations. This paper introduces two fine-tuned general-purpose LLM autoevaluators, REC-12B and REC-70B, specifically designed to evaluate generated text across several dimensions: faithfulness, instruction following, coherence, and completeness. These models not only provide ratings for these metrics but also offer detailed explanations and verifiable citations, thereby enhancing trust in the content. Moreover, the models support various citation modes, accommodating different requirements for latency and granularity. Extensive evaluations on diverse benchmarks demonstrate that our general-purpose LLM autoevaluator, REC-70B, outperforms state-of-the-art LLMs, excelling in content evaluation by delivering better quality explanations and citations with minimal bias. It achieves Rank #1 as a generative model on the RewardBench leaderboard (https://huggingface.co/spaces/allenai/reward-bench) under the model name TextEval-Llama3.1-70B. Our REC dataset and models are released at https://github.com/adelaidehsu/REC.
Hybrid Iterative Linear Quadratic Estimation: Optimal Estimation for Hybrid Systems
Payne, J. Joe, Zhu, James, Kong, Nathan J., Johnson, Aaron M.
In this paper we present Hybrid iterative Linear Quadratic Estimation (HiLQE), an optimization-based offline state estimation algorithm for hybrid dynamical systems. We utilize the saltation matrix, a first-order approximation of the variational update through an event-driven hybrid transition, to calculate gradient information through hybrid events in the backward pass of an iterative linear quadratic optimization over state estimates. This enables accurate computation of the value function approximation at each timestep. Additionally, the forward pass in the iterative algorithm is augmented with hybrid dynamics in the rollout. A reference extension method is used to account for varying impact times when comparing states for the feedback gain in noise calculation. The proposed method is demonstrated on an ASLIP hopper system with position measurements. In comparison to the Salted Kalman Filter (SKF), the algorithm presented here achieves up to a 63.55% reduction in estimation error magnitude over all state dimensions near impact events.
Double-Anonymous Review for Robotics
Yim, Justin K., Nadan, Paul, Zhu, James, Stutt, Alexandra, Payne, J. Joe, Pavlov, Catherine, Johnson, Aaron M.
Prior research has investigated the benefits and costs of double-anonymous review (DAR, also known as double-blind review) in comparison to single-anonymous review (SAR) and [...] Several review papers have attempted to compile experimental results in peer review research both broadly and in engineering and computer science specifically [1-4]. This document summarizes prior research in peer review that may inform decisions about the format of peer review in the field of robotics and makes some recommendations for potential next steps for robotics publications.
[...] The presence of gender bias and effect of DAR on such bias is a common concern in research into peer review but the conclusions are varied. Many studies do conclude that gender can disadvantage authors, particularly women [5, 6] and that DAR can reduce this bias [7].
[...] However, even when reviewers self-report as having the highest level of expertise in their field, their guess accuracy is no better than those who are self-reported as less knowledgeable [17]. Increased editor burden in handling conflict of interest, author burden in anonymizing the manuscript, and reviewer burden in navigating prior work by others and by the authors are also cited as costs to DAR.
Despite these challenges, numerous robotics conferences have already made the shift to DAR, including RSS and a [...] Furthermore, top machine learning conferences such as NeurIPS and CoRL have [...] Based on the current literature, we find that the evidence in support of double-anonymous review is not sufficient to conclusively recommend for implementation in robotics conferences [...]
Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing
Bansal, Ayoosh, Zhao, Yang, Zhu, James, Cheng, Sheng, Gu, Yuliang, Yoon, Hyung-Jin, Kim, Hunmin, Hovakimyan, Naira, Sha, Lui
Perception, Planning, and Control form the essential components of autonomy in advanced air mobility. This work advances the holistic integration of these components to enhance the performance and robustness of the complete cyber-physical system. We adapt Perception Simplex, a system for verifiable collision avoidance amidst obstacle detection faults, to the vertical landing maneuver for autonomous air mobility vehicles. We improve upon this system by replacing static assumptions of control capabilities with dynamic confirmation, i.e., real-time confirmation of control limitations of the system, ensuring reliable fulfillment of safety maneuvers and overrides, without dependence on overly pessimistic assumptions. Parameters defining control system capabilities and limitations, e.g., maximum deceleration, are continuously tracked within the system and used to make safety-critical decisions. We apply these techniques to propose a verifiable collision avoidance solution for autonomous aerial mobility vehicles operating in cluttered and potentially unsafe environments.
Convergent iLQR for Safe Trajectory Planning and Control of Legged Robots
Zhu, James, Payne, J. Joe, Johnson, Aaron M.
In order to perform highly dynamic and agile maneuvers, legged robots typically spend time in underactuated domains (e.g. with feet off the ground) where the system has limited command of its acceleration and a constrained amount of time before transitioning to a new domain (e.g. foot touchdown). Meanwhile, these transitions can instantaneously change the system's state, possibly causing perturbations to be mapped arbitrarily far away from the target trajectory. These properties make it difficult for local feedback controllers to effectively recover from disturbances as the system evolves through underactuated domains and hybrid impact events. To address this, we utilize the fundamental solution matrix that characterizes the evolution of perturbations through a hybrid trajectory and its 2-norm, which represents the worst-case growth of perturbations. In this paper, the worst-case perturbation analysis is used to explicitly reason about the tracking performance of a hybrid trajectory and is incorporated in an iLQR framework to optimize a trajectory while taking into account the closed-loop convergence of the trajectory under an LQR tracking controller. The generated convergent trajectories recover more effectively from perturbations, are more robust to large disturbances, and use less feedback control effort than trajectories generated with traditional methods.
Grounding Robot Navigation in Self-Defense Law
Zhu, James, Shrivastava, Anoushka, Johnson, Aaron M.
Robots operating in close proximity to humans rely heavily on human trust to successfully complete their tasks. But what are the real outcomes when this trust is violated? Self-defense law provides a framework for analyzing tangible failure scenarios that can inform the design of robots and their algorithms. Studying self-defense is particularly important for ground robots since they operate within public environments, where they can pose a legitimate threat to the safety of nearby humans. Moreover, even if ground robots can guarantee human safety, the perception of a physical threat is sufficient to justify human self-defense against robots. In this paper, we synthesize works in law, engineering, and social science to present four actionable recommendations for how the robotics community can craft robots to mitigate the likelihood of self-defense situations arising. We establish how current U.S. self-defense law can justify a human protecting themselves against a robot, discuss the current literature on human attitudes toward robots, and analyze methods that have been produced to allow robots to operate close to humans. Finally, we present hypothetical scenarios that underscore how current robot navigation methods can fail to sufficiently consider self-defense concerns and the need for the recommendations to guide improvements in the field.
Saltation Matrices: The Essential Tool for Linearizing Hybrid Dynamical Systems
Kong, Nathan J., Payne, J. Joe, Zhu, James, Johnson, Aaron M.
Figure 1: An example 2-mode hybrid system where the domains are shown in black circles D, the dynamics are shown with gray arrows F, the guard for the current domain is shown in red dashed g, and the reset from the current mode to the next mode is shown in blue R.
The saltation matrix relies on differentiating the guards and resets so they must be differentiable. Excluding Zeno conditions ensures we avoid computing infinite saltation matrices in finite time, which would clearly be unsound for analysis. Transversality ensures that neighboring trajectories impact the same guard unless the impact point lies on any other guard surface, in which case the Bouligand derivative is the appropriate analysis tool [52, 114-117]. Transversality also ensures the denominator in (2) does not approach zero. In some cases, the saltation matrix for a hybrid transition can become an identity transformation.
B. Saltation matrix derivation
In this section, the derivation of the saltation matrix (2) is presented, following the geometric derivation from [10] with the addition of reset maps. There are many alternate ways to derive (2): a derivation using the chain rule is included in Appendix A and a derivation using a double limit can be found in [96].
Suppose the nominal trajectory of interest is x(t) as shown in Figure 1. The trajectory starts in mode I and goes through a hybrid transition to mode J at time t. The saltation matrix is a [...]
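As a rough illustration of the saltation matrix discussed above, the following Python sketch evaluates it for a bouncing ball. Equation (2) is not reproduced in this excerpt, so the sketch assumes the commonly stated form Xi = D_x R + (F_J - D_x R F_I - D_t R) D_x g / (D_t g + D_x g F_I), with every term evaluated at the transition. The function names, the bouncing-ball model (guard g(x) = height, reset that scales the velocity by -e), and the numerical values are illustrative assumptions and are not taken from the paper.

```python
def saltation_matrix(DxR, Dxg, F_pre, F_post, DtR=None, Dtg=0.0):
    """Assumed form: Xi = DxR + (F_post - DxR F_pre - DtR) Dxg / (Dtg + Dxg F_pre).

    DxR    : m x n Jacobian of the reset map (list of rows)
    Dxg    : length-n gradient of the guard function
    F_pre  : length-n vector field of mode I just before the transition
    F_post : length-m vector field of mode J just after the transition
    DtR, Dtg : time derivatives of reset and guard (zero if time-invariant)
    """
    m, n = len(F_post), len(F_pre)
    if DtR is None:
        DtR = [0.0] * m

    # Denominator: rate at which the pre-transition flow crosses the guard.
    denom = Dtg + sum(Dxg[k] * F_pre[k] for k in range(n))
    if abs(denom) < 1e-12:
        # Transversality is violated, so the expression is not defined.
        raise ValueError("flow is (nearly) tangent to the guard surface")

    # Column vector in the numerator: F_post - DxR F_pre - DtR.
    col = [
        F_post[i] - sum(DxR[i][k] * F_pre[k] for k in range(n)) - DtR[i]
        for i in range(m)
    ]

    # Rank-one correction added to the reset Jacobian.
    return [[DxR[i][j] + col[i] * Dxg[j] / denom for j in range(n)]
            for i in range(m)]


def bouncing_ball_saltation(v_pre, e=0.8, grav=9.81):
    """Hypothetical example: state [height, velocity], impact when height = 0."""
    DxR = [[1.0, 0.0], [0.0, -e]]   # reset keeps height, flips and scales velocity
    Dxg = [1.0, 0.0]                # guard g(x) = height
    F_pre = [v_pre, -grav]          # ballistic flight before impact
    F_post = [-e * v_pre, -grav]    # ballistic flight after impact
    return saltation_matrix(DxR, Dxg, F_pre, F_post)


if __name__ == "__main__":
    for row in bouncing_ball_saltation(v_pre=-2.0):
        print(row)
```

Under these assumptions the bouncing-ball result has the closed form [[-e, 0], [-(1 + e) * grav / v_pre, -e]]. Passing an identity reset with identical pre- and post-transition vector fields returns the identity matrix, which matches the remark that the saltation matrix can reduce to an identity transformation. The explicit check on the denominator mirrors the transversality condition described in the text.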