Search
Fast Sparse Classification for Generalized Linear and Additive Models
Liu, Jiachang, Zhong, Chudi, Seltzer, Margo, Rudin, Cynthia
We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.
Minimax Estimation and Identity Testing of Markov Chains
A fundamental problem in statistics is to estimate a probability distribution from independent samples. In order to make the question precise, we choose a notion of distance \(\rho\) between distributions, pick two small numbers \(\delta, \varepsilon 0\) and can for instance say that an estimator is good, when with high probability \(1 - \delta\) over the random choice of the sample, the estimator is \(\varepsilon\) close to the true distribution. The problem of determining \(n_0\) is then typically addressed by providing two distinct answers. On one hand, we construct an estimator that would be good for any distribution given that \(n n_0 {UB}\). Conversely, we set up a hard problem such that no estimator can be good when \(n n_0 {LB}\).
Learning Preconditions of Hybrid Force-Velocity Controllers for Contact-Rich Manipulation
Liang, Jacky, Cheng, Xianyi, Kroemer, Oliver
Robots need to manipulate objects in constrained environments like shelves and cabinets when assisting humans in everyday settings like homes and offices. These constraints make manipulation difficult by reducing grasp accessibility, so robots need to use non-prehensile strategies that leverage object-environment contacts to perform manipulation tasks. To tackle the challenge of planning and controlling contact-rich behaviors in such settings, this work uses Hybrid Force-Velocity Controllers (HFVCs) as the skill representation and plans skill sequences with learned preconditions. While HFVCs naturally enable robust and compliant contact-rich behaviors, solvers that synthesize them have traditionally relied on precise object models and closed-loop feedback on object pose, which are difficult to obtain in constrained environments due to occlusions. We first relax HFVCs' need for precise models and feedback with our HFVC synthesis framework, then learn a point-cloud-based precondition function to classify where HFVC executions will still be successful despite modeling inaccuracies. Finally, we use the learned precondition in a search-based task planner to complete contact-rich manipulation tasks in a shelf domain. Our method achieves a task success rate of $73.2\%$, outperforming the $51.5\%$ achieved by a baseline without the learned precondition. While the precondition function is trained in simulation, it can also transfer to a real-world setup without further fine-tuning. See supplementary materials and videos at https://sites.google.com/view/constrained-manipulation/
Communication-Aware Local Search for Distributed Constraint Optimization
Rachmut, Ben | Zivan, Roie (Ben Gurion University of the Negev) | Yeoh, William (Washington University in St. Louis)
Most studies investigating models and algorithms for distributed constraint optimization problems (DCOPs) assume that messages arrive instantaneously and are never lost. Specifically, distributed local search DCOP algorithms, have been designed as synchronous algorithms (i.e., they perform in synchronous iterations in which each agent exchanges messages with all its neighbors), despite running in asynchronous environments. This is true also for an anytime mechanism that reports the best solution explored during the run of synchronous distributed local search algorithms. Thus, when the assumption of perfect communication is relaxed, the properties that were established for the state-of-the-art local search algorithms and the anytime mechanism may not necessarily apply. In this work, we address this limitation by: (1) Proposing a Communication-Aware DCOP model (CA-DCOP) that can represent scenarios with different communication disturbances; (2) Investigating the performance of existing local search DCOP algorithms, specifically Distributed Stochastic Algorithm (DSA) and Maximum Gain Messages (MGM), in the presence of message latency and message loss; (3) Proposing a latency-aware monotonic distributed local search DCOP algorithm; and (4) Proposing an asynchronous anytime framework for reporting the best solution explored by non-monotonic asynchronous local search DCOP algorithms. Our empirical results demonstrate that imperfect communication has a positive effect on distributed local search algorithms due to increased exploration. Furthermore, the asynchronous anytime framework we proposed allows one to benefit from algorithms with inherent explorative heuristics.
TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack
Cao, Yu, Li, Dianqi, Fang, Meng, Zhou, Tianyi, Gao, Jun, Zhan, Yibing, Tao, Dacheng
We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers. Despite phenomenal progress on general adversarial attacks, few works have investigated the vulnerability and attack specifically for QA models. In this work, we first explore the biases in the existing models and discover that they mainly rely on keyword matching between the question and context, and ignore the relevant contextual relations for answer prediction. Based on two biases above, TASA attacks the target model in two folds: (1) lowering the model's confidence on the gold answer with a perturbed answer sentence; (2) misguiding the model towards a wrong answer with a distracting answer sentence. Equipped with designed beam search and filtering methods, TASA can generate more effective attacks than existing textual attack methods while sustaining the quality of contexts, in extensive experiments on five QA datasets and human evaluations.
Instance-Optimal Differentially Private Estimation
McMillan, Audra, Smith, Adam, Ullman, Jon
While the primary goal of statistical inference is to reveal properties of a population, many statistical estimators also reveal a significant amount of information about their sample, and this becomes a serious problem when the sample contains sensitive private information about individuals. As a response, differential privacy (Dwork et al., 2006) has emerged as a strong formal criterion for a statistical procedure to protect individual privacy. Differentially private algorithms are deployed in a variety of settings, from the public data products for the 2020 US decennial census to Google's keyboard prediction models (McMahan and Thakurta, 2022) and Apple device analytics (Apple Differential Privacy Team, 2017). Differential privacy is a constraint on an estimator that requires the distribution of the estimator's outputs to be insensitive to changing a single individual's data, and it offers a strong semantic guarantee that no attacker can infer much more about any individual than they could have inferred had that individual's data never been collected (Kasiviswanathan and Smith, 2008). This semantic guarantee does not rely on any assumptions about the adversary's background knowledge and capabilities. In contrast, alternative approaches to protecting privacy have often been undermined by underestimating the abilities of the attacker. Although differential privacy is a constraint that significantly limits inference with small sample sizes, most statistical tasks are compatible with differential privacy given a large enough sample. There is now a large body of work on differentially private estimation, which includes minimax optimal differentially private estimators for many estimation tasks (e.g.
Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems
Jiang, Yuchen Eleanor, Cotterell, Ryan, Sachan, Mrinmaya
Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis of the structure of discourse. According to the theory, local coherence of discourse arises from the manner and extent to which successive utterances make reference to the same entities. In this paper, we investigate the connection between centering theory and modern coreference resolution systems. We provide an operationalization of centering and systematically investigate if neural coreference resolvers adhere to the rules of centering theory by defining various discourse metrics and developing a search-based methodology. Our information-theoretic analysis reveals a positive dependence between coreference and centering; but also shows that high-quality neural coreference resolvers may not benefit much from explicitly modeling centering ideas. Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT can only provide little gains to modern neural coreference resolvers which make use of pretrained representations. Finally, we discuss factors that contribute to coreference which are not modeled by CT such as world knowledge and recency bias. We formulate a version of CT that also models recency and show that it captures coreference information better compared to vanilla CT.
Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification
Komiyama, Junpei, Tsuchiya, Taira, Honda, Junya
We consider the fixed-budget best arm identification problem where the goal is to find the arm of the largest mean with a fixed number of samples. It is known that the probability of misidentifying the best arm is exponentially small to the number of rounds. However, limited characterizations have been discussed on the rate (exponent) of this value. In this paper, we characterize the minimax optimal rate as a result of an optimization over all possible parameters. We introduce two rates, $R^{\mathrm{go}}$ and $R^{\mathrm{go}}_{\infty}$, corresponding to lower bounds on the probability of misidentification, each of which is associated with a proposed algorithm. The rate $R^{\mathrm{go}}$ is associated with $R^{\mathrm{go}}$-tracking, which can be efficiently implemented by a neural network and is shown to outperform existing algorithms. However, this rate requires a nontrivial condition to be achievable. To address this issue, we introduce the second rate $R^{\mathrm{go}}_\infty$. We show that this rate is indeed achievable by introducing a conceptual algorithm called delayed optimal tracking (DOT).
A Rapidly-Exploring Random Trees Motion Planning Algorithm for Hybrid Dynamical Systems
Wang, Nan, Sanfelice, Ricardo G.
This paper proposes a rapidly-exploring random trees (RRT) algorithm to solve the motion planning problem for hybrid systems. At each iteration, the proposed algorithm, called HyRRT, randomly picks a state sample and extends the search tree by flow or jump, which is also chosen randomly when both regimes are possible. Through a definition of concatenation of functions defined on hybrid time domains, we show that HyRRT is probabilistically complete, namely, the probability of failing to find a motion plan approaches zero as the number of iterations of the algorithm increases. This property is guaranteed under mild conditions on the data defining the motion plan, which include a relaxation of the usual positive clearance assumption imposed in the literature of classical systems. The motion plan is computed through the solution of two optimization problems, one associated with the flow and the other with the jumps of the system. The proposed algorithm is applied to a walking robot so as to highlight its generality and computational features.
Assembly Planning from Observations under Physical Constraints
Chabal, Thomas, Strudel, Robin, Arlaud, Etienne, Ponce, Jean, Schmid, Cordelia
This paper addresses the problem of copying an unknown assembly of primitives with known shape and appearance using information extracted from a single photograph by an off-the-shelf procedure for object detection and pose estimation. The proposed algorithm uses a simple combination of physical stability constraints, convex optimization and Monte Carlo tree search to plan assemblies as sequences of pick-and-place operations represented by STRIPS operators. It is efficient and, most importantly, robust to the errors in object detection and pose estimation unavoidable in any real robotic system. The proposed approach is demonstrated with thorough experiments on a UR5 manipulator.