Goto

Collaborating Authors

 tan 2



Learning Intersections of Two Margin Halfspaces under Factorizable Distributions

Diakonikolas, Ilias, Ma, Mingchen, Ren, Lisheng, Tzamos, Christos

arXiv.org Artificial Intelligence

Learning intersections of halfspaces is a central problem in Computational Learning Theory. Even for just two halfspaces, it remains a major open question whether learning is possible in polynomial time with respect to the margin $γ$ of the data points and their dimensionality $d$. The best-known algorithms run in quasi-polynomial time $d^{O(\log(1/γ))}$, and it has been shown that this complexity is unavoidable for any algorithm relying solely on correlational statistical queries (CSQ). In this work, we introduce a novel algorithm that provably circumvents the CSQ hardness barrier. Our approach applies to a broad class of distributions satisfying a natural, previously studied, factorizability assumption. Factorizable distributions lie between distribution-specific and distribution-free settings, and significantly extend previously known tractable cases. Under these distributions, we show that CSQ-based methods still require quasipolynomial time even for weakly learning, whereas our algorithm achieves $poly(d,1/γ)$ time by leveraging more general statistical queries (SQ), establishing a strong separation between CSQ and SQ for this simple realizable PAC learning problem. Our result is grounded in a rigorous analysis utilizing a novel duality framework that characterizes the moment tensor structure induced by the marginal distributions. Building on these structural insights, we propose new, efficient learning algorithms. These algorithms combine a refined variant of Jennrich's Algorithm with PCA over random projections of the moment tensor, along with a gradient-descent-based non-convex optimization framework.


Integrator Forwading Design for Unicycles with Constant and Actuated Velocity in Polar Coordinates

Krstic, Miroslav, Todorovski, Velimir, Kim, Kwang Hak, Astolfi, Alessandro

arXiv.org Artificial Intelligence

Abstract-- In a companion paper, we present a modular framework for unicycle stabilization in polar coordinates that provides smooth steering laws through backstepping. Surprisingly, the same problem also allows application of integrator forwarding. In this work, we leverage this feature and construct new smooth steering laws together with control Lyapunov functions (CLFs), expanding the set of CLFs available for inverse optimal control design. In the case of constant forward velocity (Dubins car), backstepping produces finite-time (deadbeat) parking, and we show that integrator forwarding yields the very same class of solutions. This reveals a fundamental connection between backstepping and forwarding in addressing both the unicycle and, the Dubins car parking problems.


Modular Design of Strict Control Lyapunov Functions for Global Stabilization of the Unicycle in Polar Coordinates

Todorovski, Velimir, Kim, Kwang Hak, Krstic, Miroslav

arXiv.org Artificial Intelligence

Abstract-- Since the mid-1990s, it has been known that, unlike in Cartesian form where Brockett's condition rules out static feedback stabilization, the unicycle is globally asymptotically stabilizable by smooth feedback in polar coordinates. In this note, we introduce a modular framework for designing smooth feedback laws that achieve global asymptotic stabilization in polar coordinates. These laws are bidirectional, enabling efficient parking maneuvers, and are paired with families of strict control Lyapunov functions (CLFs) constructed in a modular fashion. The resulting CLFs guarantee global asymptotic stability with explicit convergence rates and include barrier variants that yield "almost global" stabilization, excluding only zero-measure subsets of the rotation manifolds. The strictness of the CLFs is further leveraged in our companion paper, where we develop inverse-optimal redesigns with meaningful cost functions and infinite gain margins.




Self-Directed Linear Classification

Diakonikolas, Ilias, Kontonis, Vasilis, Tzamos, Christos, Zarifis, Nikos

arXiv.org Artificial Intelligence

In online classification, a learner is presented with a sequence of examples and aims to predict their labels in an online fashion so as to minimize the total number of mistakes. In the self-directed variant, the learner knows in advance the pool of examples and can adaptively choose the order in which predictions are made. Here we study the power of choosing the prediction order and establish the first strong separation between worst-order and random-order learning for the fundamental task of linear classification. Prior to our work, such a separation was known only for very restricted concept classes, e.g., one-dimensional thresholds or axis-aligned rectangles. We present two main results. If $X$ is a dataset of $n$ points drawn uniformly at random from the $d$-dimensional unit sphere, we design an efficient self-directed learner that makes $O(d \log \log(n))$ mistakes and classifies the entire dataset. If $X$ is an arbitrary $d$-dimensional dataset of size $n$, we design an efficient self-directed learner that predicts the labels of $99\%$ of the points in $X$ with mistake bound independent of $n$. In contrast, under a worst- or random-ordering, the number of mistakes must be at least $\Omega(d \log n)$, even when the points are drawn uniformly from the unit sphere and the learner only needs to predict the labels for $1\%$ of them.


Localization with Few Distance Measurements

Halperin, Dan, LaValle, Steven M., Ugav, Barak

arXiv.org Artificial Intelligence

Given a polygon $W$, a depth sensor placed at point $p=(x,y)$ inside $W$ and oriented in direction $\theta$ measures the distance $d=h(x,y,\theta)$ between $p$ and the closest point on the boundary of $W$ along a ray emanating from $p$ in direction $\theta$. We study the following problem: Give a polygon $W$, possibly with holes, with $n$ vertices, preprocess it such that given a query real value $d\geq 0$, one can efficiently compute the preimage $h^{-1}(d)$, namely determine all the possible poses (positions and orientations) of a depth sensor placed in $W$ that would yield the reading $d$. We employ a decomposition of $W\times S^1$, which is an extension of the celebrated trapezoidal decomposition, and which we call rotational trapezoidal decomposition and present an efficient data structure, which computes the preimage in an output-sensitive fashion relative to this decomposition: if $k$ cells of the decomposition contribute to the final result, we will report them in $O(k+1)$ time, after $O(n^2\log n)$ preprocessing time and using $O(n^2)$ storage space. We also analyze the shape of the projection of the preimage onto the polygon $W$; this projection describes the portion of $W$ where the sensor could have been placed. Furthermore, we obtain analogous results for the more useful case (narrowing down the set of possible poses), where the sensor performs two depth measurement from the same point $p$, one in direction $\theta$ and the other in direction $\theta+\pi$. While localizations problems in robotics are often carried out by exploring the full visibility polygon of a sensor placed at a fixed point of the environment, the approach that we propose here opens the door to sufficing with only few depth measurements, which is advantageous as it allows for usage of inexpensive sensors and could also lead to savings in storage and communication costs.


Invariance encoding in sliced-Wasserstein space for image classification with limited training data

Rabbi, Mohammad Shifat E, Zhuang, Yan, Li, Shiying, Rubaiyat, Abu Hasnat Mohammad, Yin, Xuwang, Rohde, Gustavo K.

arXiv.org Artificial Intelligence

Deep convolutional neural networks (CNNs) are broadly considered to be state-of-the-art generic end-to-end image classification systems. However, they are known to underperform when training data are limited and thus require data augmentation strategies that render the method computationally expensive and not always effective. Rather than using a data augmentation strategy to encode invariances as typically done in machine learning, here we propose to mathematically augment a nearest subspace classification model in sliced-Wasserstein space by exploiting certain mathematical properties of the Radon Cumulative Distribution Transform (R-CDT), a recently introduced image transform. We demonstrate that for a particular type of learning problem, our mathematical solution has advantages over data augmentation with deep CNNs in terms of classification accuracy and computational complexity, and is particularly effective under a limited training data setting. The method is simple, effective, computationally efficient, non-iterative, and requires no parameters to be tuned. Python code implementing our method is available at https://github.com/rohdelab/mathematical_augmentation. Our method is integrated as a part of the software package PyTransKit, which is available at https://github.com/rohdelab/PyTransKit.


Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

Luo, Yuwei, Gupta, Varun, Kolar, Mladen

arXiv.org Machine Learning

We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q,R$, but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{\mathcal{O}}\left(V_T^{2/5}T^{3/5}\right)$. With piece-wise constant dynamics, our algorithm achieves the optimal regret of $\tilde{\mathcal{O}}(\sqrt{ST})$ where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.