
 rtr


Efficient machine unlearning with minimax optimality

arXiv.org Machine Learning

There is a growing demand for efficient data removal to comply with regulations like the GDPR and to mitigate the influence of biased or corrupted data. This has motivated the field of machine unlearning, which aims to eliminate the influence of specific data subsets without the cost of full retraining. In this work, we propose a statistical framework for machine unlearning with generic loss functions and establish theoretical guarantees. For squared loss in particular, we develop Unlearning Least Squares (ULS) and establish its minimax optimality for estimating the model parameter of the remaining data when only the pre-trained estimator, the forget samples, and a small subsample of the remaining data are available. Our results reveal that the estimation error decomposes into an oracle term and an unlearning cost determined by the forget proportion and the forget model bias. We further establish asymptotically valid inference procedures without requiring full retraining. Numerical experiments and real-data applications demonstrate that the proposed method achieves performance close to retraining while requiring substantially less data access.


Supplementary material: Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees. A. Proofs of lemmas and theorems

Neural Information Processing Systems

We note that the interchange of the integral and infinite summation is justified by Section 3.7 in [5], since the coefficients Z Now, define the action sequence $(a)_n$ such that $a_1 = a$ and $a_n = a_1$ for all $n > 1$. Then we can use subadditivity of measure to bound the maximum difference across all entries of [kZ]. Therefore, the induced infinity-norm error of $\hat{Z}$ is less than $\varepsilon$ if the element-wise error is less than $\varepsilon/k$. Therefore, $\hat{\alpha}^\top F \phi(s)$ is $\rho$-Lipschitz if the absolute value of its derivative is bounded by $\rho$, i.e. Since $\hat{F}$ has all zeros beyond the $k$-th column and row, each infinite matrix $\hat{F}$ can be treated as a $k \times k$ matrix.
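The step from element-wise error to induced infinity-norm error in the excerpt above follows from a standard inequality. As a brief sketch, let $E$ denote the difference between the estimate of $Z$ and $Z$ itself, restricted to its $k \times k$ block (the symbol $E$ is introduced here only for illustration and does not appear in the excerpt):

```latex
\|E\|_\infty
  = \max_{1 \le i \le k} \sum_{j=1}^{k} |E_{ij}|
  \le k \, \max_{i,j} |E_{ij}|
  < k \cdot \frac{\varepsilon}{k}
  = \varepsilon .
```

The first equality is the definition of the induced infinity norm as the maximum absolute row sum, and the strict inequality uses the assumption that every entry of $E$ is smaller than $\varepsilon/k$ in absolute value.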


Reconfigurable Tendon-Driven Robots: Eliminating Inter-segmental Coupling via Independently Lockable Joints

arXiv.org Artificial Intelligence

With a slender redundant body, the tendon-driven robot (TDR) has a large workspace and great maneuverability while working in complex environments. TDR comprises multiple independently controlled robot segments, each with a set of driving tendons. While increasing the number of robot segments enhances dexterity and expands the workspace, this structural expansion also introduces intensified inter-segmental coupling. Therefore, achieving precise TDR control requires more complex models and additional motors. This paper presents a reconfigurable tendon-driven robot (RTR) equipped with innovative lockable joints. Each joint's state (locked/free) can be individually controlled through a pair of antagonistic tendons, and its structure eliminates the need for a continuous power supply to maintain the state. Operators can selectively actuate the targeted robot segments, and this scheme fundamentally eliminates the inter-segmental coupling, thereby avoiding the requirement for complex coordinated control between segments. The workspace of RTR has been simulated and compared with traditional TDRs' workspace, and RTR's advantages are further revealed. The kinematics and statics models of the RTR have been derived and validation experiments have been conducted. Demonstrations have been performed using a seven-joint RTR prototype to show its reconfigurability and moving ability in complex environments with an actuator pack comprising only six motors.


TuneShield: Mitigating Toxicity in Conversational AI while Fine-tuning on Untrusted Data

arXiv.org Artificial Intelligence

Recent advances in foundation models, such as LLMs, have revolutionized conversational AI. Chatbots are increasingly being developed by customizing LLMs on specific conversational datasets. However, mitigating toxicity during this customization, especially when dealing with untrusted training data, remains a significant challenge. To address this, we introduce TuneShield, a defense framework designed to mitigate toxicity during chatbot fine-tuning while preserving conversational quality. TuneShield leverages LLM-based toxicity classification, utilizing the instruction-following capabilities and safety alignment of LLMs to effectively identify toxic samples, outperforming industry API services. TuneShield generates synthetic conversation samples, termed 'healing data', based on the identified toxic samples, using them to mitigate toxicity while reinforcing desirable behavior during fine-tuning. It performs an alignment process to further nudge the chatbot towards producing desired responses. Our findings show that TuneShield effectively mitigates toxicity injection attacks while preserving conversational quality, even when the toxicity classifiers are imperfect or biased. TuneShield proves to be resilient against adaptive adversarial and jailbreak attacks. Additionally, TuneShield demonstrates effectiveness in mitigating adaptive toxicity injection attacks during dialog-based learning (DBL).


Route to Reason: Adaptive Routing for LLM and Reasoning Strategy Selection

arXiv.org Artificial Intelligence

The inherent capabilities of a language model (LM) and the reasoning strategies it employs jointly determine its performance in reasoning tasks. While test-time scaling is regarded as an effective approach to tackling complex reasoning tasks, it incurs substantial computational costs and often leads to "overthinking", where models become trapped in "thought pitfalls". To address this challenge, we propose Route-To-Reason (RTR), a novel unified routing framework that dynamically allocates both LMs and reasoning strategies according to task difficulty under budget constraints. RTR learns compressed representations of both expert models and reasoning strategies, enabling their joint and adaptive selection at inference time. This method is low-cost, highly flexible, and can be seamlessly extended to arbitrary black-box or white-box models and strategies, achieving true plug-and-play functionality. Extensive experiments across seven open source models and four reasoning strategies demonstrate that RTR achieves an optimal trade-off between accuracy and computational efficiency among all baselines, achieving higher accuracy than the best single model while reducing token usage by over 60%.
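The joint selection of a model and a reasoning strategy under a budget can be illustrated with a small sketch. This is not the RTR implementation: the abstract does not specify the scoring rule, so the `Candidate` fields, the `route` function, and the linear accuracy-minus-cost score are all assumptions made for illustration, and the predicted accuracy and token counts are taken as given inputs rather than learned.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical (model, strategy) pair with predicted performance and cost."""
    model: str
    strategy: str
    predicted_accuracy: float  # assumed output of some predictor, in [0, 1]
    predicted_tokens: float    # assumed predicted token usage


def route(candidates: Sequence[Candidate],
          token_budget: float,
          cost_weight: float = 0.0) -> Candidate:
    """Pick the pair with the best accuracy-minus-cost score within the budget.

    If no candidate fits the budget, fall back to the cheapest one.
    """
    if not candidates:
        raise ValueError("candidates must be non-empty")
    feasible = [c for c in candidates if c.predicted_tokens <= token_budget]
    if not feasible:
        return min(candidates, key=lambda c: c.predicted_tokens)
    return max(
        feasible,
        key=lambda c: c.predicted_accuracy - cost_weight * c.predicted_tokens,
    )


if __name__ == "__main__":
    pool = [
        Candidate("small", "direct", 0.60, 100),
        Candidate("small", "cot", 0.70, 400),
        Candidate("large", "cot", 0.90, 1200),
    ]
    choice = route(pool, token_budget=500)
    print(choice.model, choice.strategy)
```

A larger `cost_weight` shifts the choice toward cheaper pairs even when the budget would allow a more expensive one, which is one simple way to express the accuracy versus efficiency trade-off the abstract describes.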


Learning Realistic Traffic Agents in Closed-loop

arXiv.org Artificial Intelligence

Realistic traffic simulation is crucial for developing self-driving software in a safe and scalable manner prior to real-world deployment. Typically, imitation learning (IL) is used to learn human-like traffic agents directly from real-world observations collected offline, but without explicit specification of traffic rules, agents trained from IL alone frequently display unrealistic infractions like collisions and driving off the road. This problem is exacerbated in out-of-distribution and long-tail scenarios. On the other hand, reinforcement learning (RL) can train traffic agents to avoid infractions, but using RL alone results in unhuman-like driving behaviors. We propose Reinforcing Traffic Rules (RTR), a holistic closed-loop learning objective to match expert demonstrations under a traffic compliance constraint, which naturally gives rise to a joint IL + RL approach, obtaining the best of both worlds. Our method learns in closed-loop simulations of both nominal scenarios from real-world datasets as well as procedurally generated long-tail scenarios. Our experiments show that RTR learns more realistic and generalizable traffic simulation policies, achieving significantly better tradeoffs between human-like driving and traffic compliance in both nominal and long-tail scenarios. Moreover, when used as a data generation tool for training prediction models, our learned traffic policy leads to considerably improved downstream prediction metrics compared to baseline traffic agents. For more information, visit the project website: https://waabi.ai/rtr


Machine Learning Engineer

#artificialintelligence

Rent the Runway (RTR) is transforming the way we get dressed by pioneering the world's first Closet in the Cloud. Founded in 2009, RTR has disrupted the $2.4 trillion fashion industry by inspiring women with a more joyful, sustainable and financially-savvy way to feel their best every day. As the ultimate destination for circular fashion, the brand now offers infinite points of access to its shared closet via a fully customizable subscription to fashion, one-time rental or ownership. RTR offers designer apparel, accessories and home decor from 700 brand partners and has built in-house proprietary technology and a one-of-a-kind reverse logistics operation. Under CEO and Co-Founder Jennifer Hyman's leadership, RTR has been named to CNBC's "Disruptor 50" five times in ten years, and has been placed on Fast Company's Most Innovative Companies list multiple times, while Hyman herself has been named to the "TIME 100" most influential people in the world and as one of People magazine's "Women Changing the World."


Senior Data Engineer

#artificialintelligence

Rent the Runway (RTR) is transforming the way we get dressed by pioneering the world's first Closet in the Cloud. Founded in 2009, RTR has disrupted the $2.4 trillion fashion industry by inspiring women with a more joyful, sustainable and financially-savvy way to feel their best every day. As the ultimate destination for circular fashion, the brand now offers infinite points of access to its shared closet via a fully customizable subscription to fashion, one-time rental or ownership. RTR offers designer apparel, accessories and home decor from 700 brand partners and has built in-house proprietary technology and a one-of-a-kind reverse logistics operation. Under CEO and Co-Founder Jennifer Hyman's leadership, RTR has been named to CNBC's "Disruptor 50" five times in ten years, and has been placed on Fast Company's Most Innovative Companies list multiple times, while Hyman herself has been named to the "TIME 100" most influential people in the world and as one of People magazine's "Women Changing the World."


Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization

arXiv.org Machine Learning

Modern measurement technologies provide the means to measure high density spatial and geometric data in three-dimensional (3D) coordinate systems, referred to as point clouds. Point cloud data analysis has broad applications in advanced manufacturing and metrology for measuring dimensional accuracy and shape analysis, in geographic information systems (GIS) for digital elevation modeling and analysis of terrains, in computer graphics for shape reconstruction, and in medical imaging for volumetric measurement to name a few. The role of point cloud data in manufacturing is now more important than ever, particularly in the field of smart and additive manufacturing processes, where products with complex shape and geometry are manufactured with the help of advanced technologies (Gibson et al., 2010). In these processes, the dimensional and geometric accuracy of manufactured parts are measured in the form of point clouds using modern sensing devices, including touch-probe coordinate measuring machines (CMM) and optical systems, such as laser scanners. Modeling the relationship of the dimensional accuracy, encapsulated in point clouds, with process parameters and machine settings is vital for variation reduction and process optimization.


Differentially Private Empirical Risk Minimization

arXiv.org Artificial Intelligence

Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the $\epsilon$-differential privacy definition due to Dwork et al. (2006). First, we apply the output perturbation ideas of Dwork et al. (2006) to ERM classification. Then we propose a new method, objective perturbation, for privacy-preserving machine learning algorithm design. This method entails perturbing the objective function before optimizing over classifiers. If the loss and regularizer satisfy certain convexity and differentiability criteria, we prove theoretical results showing that our algorithms preserve privacy, and provide generalization bounds for linear and nonlinear kernels. We further present a privacy-preserving technique for tuning the parameters in general machine learning algorithms, thereby providing end-to-end privacy guarantees for the training process. We apply these results to produce privacy-preserving analogues of regularized logistic regression and support vector machines. We obtain encouraging results from evaluating their performance on real demographic and benchmark data sets. Our results show that, both theoretically and empirically, objective perturbation is superior to the previous state-of-the-art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.
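Output perturbation, the baseline mentioned above, can be sketched in a few lines: train the regularized ERM classifier as usual, then add noise calibrated to how much the solution can change when one record changes. The sketch below assumes a setting not spelled out in the abstract: feature vectors with norm at most 1, a convex loss whose derivative is bounded by 1, and an L2 regularizer with weight `lam`, so that the L2 sensitivity of the minimizer is taken to be 2 / (n * lam). The function names and the use of NumPy are likewise illustrative choices.

```python
import numpy as np


def sample_noise(d, n, lam, epsilon, rng):
    """Draw b in R^d with density proportional to exp(-beta * ||b||).

    beta = n * lam * epsilon / 2 corresponds to an assumed L2 sensitivity
    of 2 / (n * lam) for the regularized ERM minimizer.
    """
    if min(d, n) < 1 or lam <= 0 or epsilon <= 0:
        raise ValueError("d, n must be >= 1 and lam, epsilon must be > 0")
    beta = n * lam * epsilon / 2.0
    # The norm of b follows a Gamma(d, 1/beta) distribution ...
    norm = rng.gamma(shape=d, scale=1.0 / beta)
    # ... and its direction is uniform on the unit sphere.
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    return norm * direction


def privatize(weights, n, lam, epsilon, rng):
    """Return the trained weight vector plus output-perturbation noise."""
    w = np.asarray(weights, dtype=float)
    return w + sample_noise(w.shape[0], n, lam, epsilon, rng)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    trained = np.array([0.5, -1.2, 0.3])  # stand-in for a trained classifier
    print(privatize(trained, n=1000, lam=0.01, epsilon=1.0, rng=rng))
```

Because the noise scale shrinks as n * lam * epsilon grows, larger data sets or stronger regularization give a private classifier closer to the non-private one. Objective perturbation, the method the paper proposes, instead adds a random term to the objective before optimizing and is not shown here.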