
Going Beyond Heuristics by Imposing Policy Improvement as a Constraint (Chi-Chang Lee)

Neural Information Processing Systems

In many reinforcement learning (RL) applications, incorporating heuristic rewards alongside the task reward is crucial for achieving desirable performance. Heuristics encode prior human knowledge about how a task should be done, providing valuable hints for RL algorithms. However, such hints may not be optimal, limiting the performance of learned policies. The currently established way of using heuristics is to modify the heuristic reward in a manner that ensures that the optimal policy learned with it remains the same as the optimal policy for the task reward (i.e., optimal policy invariance). However, these methods often fail in practical scenarios with limited training data.
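The standard construction behind such optimal policy invariance is potential-based reward shaping (Ng et al., 1999). The abstract does not spell out the construction, so the sketch below is included only for reference; the potential function Φ is an arbitrary function over states, not notation from the paper:

```latex
% Potential-based reward shaping: for any potential function \Phi over states,
% adding the shaping term below leaves the optimal policy of the task reward unchanged.
\tilde{r}(s, a, s') \;=\; r(s, a, s') \;+\; \gamma\,\Phi(s') \;-\; \Phi(s)
```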


Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/\epsilon)$

Neural Information Processing Systems

In this paper, we develop a novel homotopy smoothing (HOPS) algorithm for solving a family of non-smooth problems whose objective is composed of a non-smooth term with an explicit max-structure and a smooth term (or a simple non-smooth term whose proximal mapping is easy to compute). The best known iteration complexity for solving such non-smooth optimization problems is $O(1/\epsilon)$ without any strong convexity assumption.
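To make the "explicit max-structure" concrete, such a term admits a standard Nesterov-style smoothed surrogate, which a homotopy scheme can solve repeatedly while decreasing the smoothing parameter in stages. The sketch below only illustrates that structure; the set U, the function φ, and the prox-function d are generic placeholders rather than notation from the paper:

```latex
% A max-structured non-smooth term and its smoothed surrogate (smoothing parameter \mu > 0,
% strongly convex prox-function d):
f(x) \;=\; \max_{u \in U} \bigl\{ \langle A x,\, u \rangle - \phi(u) \bigr\},
\qquad
f_{\mu}(x) \;=\; \max_{u \in U} \bigl\{ \langle A x,\, u \rangle - \phi(u) - \mu\, d(u) \bigr\}.
```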


Geometric Analysis of Nonlinear Manifold Clustering (Tianjiao Ding)

Neural Information Processing Systems

Manifold clustering is an important problem in motion and video segmentation, natural image clustering, and other applications where high-dimensional data lie on multiple low-dimensional, nonlinear manifolds. While current state-of-the-art methods on large-scale datasets such as CIFAR provide good empirical performance, they do not have any proof of theoretical correctness. In this work, we propose a method that clusters data belonging to a union of nonlinear manifolds.
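The problem setting (points drawn from several nonlinear manifolds) is easy to visualize with a toy example. The snippet below only illustrates that setting with a generic spectral clustering baseline; it is not the method proposed in the paper, and the dataset and parameters are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.cluster import SpectralClustering

# Two nonlinear one-dimensional manifolds (concentric circles) embedded in the plane.
X, true_labels = make_circles(n_samples=400, factor=0.4, noise=0.03, random_state=0)

# A generic nonlinear clustering baseline; the paper's method is not reproduced here.
pred = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                          n_neighbors=10, random_state=0).fit_predict(X)

# Agreement with the ground-truth manifolds, up to label permutation.
acc = max(np.mean(pred == true_labels), np.mean(pred != true_labels))
print(f"clustering agreement: {acc:.2f}")
```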


On the Expressivity and Sample Complexity of Node-Individualized Graph Neural Networks

Neural Information Processing Systems

Graph neural networks (GNNs) employing message passing for graph classification are inherently limited by the expressive power of the Weisfeiler-Leman (WL) test for graph isomorphism. Node individualization schemes, which assign unique identifiers to nodes (e.g., by adding random noise to features), are a common approach for achieving universal expressiveness. However, the ability of GNNs endowed with individualization schemes to generalize beyond the training data is still an open question. To address this question, this paper presents a theoretical analysis of the sample complexity of such GNNs from a statistical learning perspective, employing Vapnik-Chervonenkis (VC) dimension and covering number bounds. We demonstrate that node individualization schemes that are permutation-equivariant result in lower sample complexity, and design novel individualization schemes that exploit these results. As an application of this analysis, we also develop a novel architecture that can perform substructure identification (i.e., subgraph isomorphism) while having a lower VC dimension compared to competing methods. Finally, our theoretical findings are validated experimentally on both synthetic and real-world datasets.
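The individualization scheme the abstract uses as its running example, assigning unique identifiers by adding random features, is easy to state concretely. The sketch below is a generic illustration; the dimensions, distribution, and function name are assumptions, not the paper's constructions:

```python
import numpy as np

def individualize_nodes(node_features, id_dim=8, seed=None):
    """Append random identifier channels to each node's feature vector.

    A simple instance of node individualization: i.i.d. Gaussian identifiers
    make otherwise indistinguishable nodes distinguishable to message passing.
    The dimensionality and distribution are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    n_nodes = node_features.shape[0]
    random_ids = rng.normal(size=(n_nodes, id_dim))
    return np.concatenate([node_features, random_ids], axis=1)

# Example: a graph with 5 nodes and 3 original feature channels each.
x = np.ones((5, 3))
print(individualize_nodes(x, id_dim=4, seed=0).shape)  # (5, 7)
```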


Graph Clustering: Block-models and model free results

Neural Information Processing Systems

Clustering graphs under the Stochastic Block Model (SBM) and its extensions is well studied. Guarantees of correctness exist under the assumption that the data are sampled from a model. In this paper, we propose a framework in which we obtain "correctness" guarantees without assuming the data come from a model. The guarantees we obtain depend instead on statistics of the data that can be checked directly. We also show that this framework ties in with the existing model-based framework, so that we can exploit results in model-based recovery as well as strengthen existing results in that area of research.
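For readers unfamiliar with the model class being relaxed here, a graph can be sampled from a Stochastic Block Model as shown below. This is a standard illustration using NetworkX, not code from the paper; the block sizes and connection probabilities are arbitrary:

```python
import networkx as nx

# A two-block SBM: 50 nodes per block, dense within blocks, sparse across blocks.
sizes = [50, 50]
probs = [[0.30, 0.02],
         [0.02, 0.30]]
G = nx.stochastic_block_model(sizes, probs, seed=0)

# Each node records the block that generated it, so model-based recovery can be checked.
blocks = [G.nodes[v]["block"] for v in G]
print(G.number_of_nodes(), G.number_of_edges(), blocks[:5])
```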


Reinforcement Learning with Convex Constraints

Neural Information Processing Systems

In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward. However, many key aspects of a desired behavior are more naturally expressed as constraints. For instance, the designer may want to limit the use of unsafe actions, increase the diversity of trajectories to enable exploration, or approximate expert trajectories when rewards are sparse. In this paper, we propose an algorithmic scheme that can handle a wide class of constraints in RL tasks, specifically, any constraints that require expected values of some vector measurements (such as the use of an action) to lie in a convex set. This captures previously studied constraints (such as safety and proximity to an expert), but also enables new classes of constraints (such as diversity). Our approach comes with rigorous theoretical guarantees and only relies on the ability to approximately solve standard RL tasks. As a result, it can be easily adapted to work with any model-free or model-based RL algorithm. In our experiments, we show that it matches previous algorithms that enforce safety via constraints, but can also enforce new properties that these algorithms cannot incorporate, such as diversity.
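The constraint class described above can be written compactly as a feasibility problem over policies. In the sketch below, the discounted aggregation of the per-step measurement vector z is one natural choice and is an assumption of this illustration rather than notation taken from the paper:

```latex
% Find a policy whose expected measurement vector lies in a given convex set C:
\text{find } \pi \quad \text{s.t.} \quad
\bar{z}(\pi) \;:=\; \mathbb{E}_{\tau \sim \pi}\Bigl[\, \sum_{t \ge 0} \gamma^{t}\, z(s_t, a_t) \Bigr] \;\in\; \mathcal{C}.
```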


A Algorithms: Algorithm 1 (Training DHRL)

Neural Information Processing Systems

Then, there exists a constant L > 0 such that, for all x and y, max(Dist(x → y), Dist(y → x)) ≤ L||x − y||, where ||·|| is the Euclidean norm, since Dist(x → x) = 0. Then, any ϵ/L-resolution graph w.r.t. the Euclidean norm, whose existence is trivial, is an ϵ-resolution graph w.r.t. Dist(·). (In the figure, the completely failed baselines are occluded by other curves.) As shown in the table above, the wider the initial state distribution, the easier it is for the agent to explore the map. In other words, the 'fixed initial state distribution' condition we experimented with in this paper is a more difficult condition than the 'uniform initial state distribution' that previous graph-guided RL algorithms utilize.


On the cohesion and separability of average-link for hierarchical agglomerative clustering

Neural Information Processing Systems

Average-link is widely recognized as one of the most popular and effective methods for hierarchical agglomerative clustering. The available theoretical analyses show that this method has a much better approximation guarantee than other popular heuristics, such as single-linkage and complete-linkage, with regard to variants of Dasgupta's cost function [STOC 2016]. However, these analyses do not separate average-link from a random hierarchy, and they are not appealing for metric spaces, since every hierarchical clustering achieves a 1/2 approximation with regard to the variant of Dasgupta's function that is employed for dissimilarity measures [Moseley and Yang 2020]. In this paper, we present a comprehensive study of the performance of average-link in metric spaces with respect to several natural criteria that capture separability and cohesion and are more interpretable than Dasgupta's cost function and its variants. We also present experimental results on real datasets that, together with our theoretical analyses, suggest that average-link is a better choice than other related methods when both cohesion and separability are important goals.
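For concreteness, average-link agglomerative clustering is available in SciPy. The minimal sketch below runs it on a toy metric dataset; the data and the two-cluster cut are illustrative assumptions, not an experiment from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Toy metric-space data: two well-separated groups of points in the plane.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Average-link hierarchical agglomerative clustering on pairwise Euclidean distances.
Z = linkage(pdist(points), method="average")

# Cut the dendrogram into two flat clusters to inspect cohesion/separability informally.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 1 2 2 2]
```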


A Additional Related Work

Neural Information Processing Systems

In this section we provide further discussion of the related works. The convergence of FedAvg, also known as Local SGD, has been the subject of intense study in recent years, due to the algorithm's effectiveness combined with the difficulty of analyzing it. In homogeneous data settings, local updates are easier to reconcile with solving the global objective, allowing much progress to be made in understanding convergence rates in this case [2-4, 62-66]. In the heterogeneous case, multiple works have shown that FedAvg with a fixed learning rate may not solve the global objective, because the local updates induce a non-vanishing bias by drifting towards local solutions, even with full gradient steps and strongly convex objectives [5-9, 16, 20, 67, 68]. As a remedy, several papers have analyzed FedAvg with a learning rate that decays over communication rounds and have shown that this approach indeed reaches a stationary point of the global objective, but at sublinear rates [5, 14-17] that can be strictly slower than the convergence rates of D-SGD [5, 18].
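The mechanics under discussion (local gradient steps on each client, server averaging, and a per-round learning-rate schedule) can be sketched in a few lines. The code below is not from any of the cited papers; it uses toy scalar quadratics with heterogeneous curvatures, under which the fixed-step bias and the decaying-step remedy can be observed informally:

```python
import numpy as np

def fedavg(curvatures, optima, rounds=200, local_steps=10, lr_schedule=lambda r: 0.05):
    """Minimal FedAvg / Local SGD sketch on scalar quadratics
    f_i(w) = 0.5 * a_i * (w - c_i)^2 (illustrative toy objectives)."""
    w = 0.0
    for r in range(rounds):
        lr = lr_schedule(r)
        local_models = []
        for a_i, c_i in zip(curvatures, optima):
            w_i = w                        # each client starts from the global model
            for _ in range(local_steps):   # local full-gradient steps on f_i
                w_i -= lr * a_i * (w_i - c_i)
            local_models.append(w_i)
        w = float(np.mean(local_models))   # server averages the local models
    return w

a, c = [1.0, 10.0], [0.0, 10.0]                            # heterogeneous clients
print(sum(ai * ci for ai, ci in zip(a, c)) / sum(a))       # minimizer of the averaged objective
print(fedavg(a, c))                                        # fixed step size: settles at a biased point
print(fedavg(a, c, lr_schedule=lambda r: 0.05 / (r + 1)))  # decaying step size: typically closer to the minimizer
```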


Loss Landscape Characterization of Neural Networks without Over-Parametrization

Neural Information Processing Systems

Optimization methods play a crucial role in modern machine learning, powering the remarkable empirical achievements of deep learning models. These successes are even more remarkable given the complex non-convex nature of the loss landscape of these models. Yet, ensuring the convergence of optimization methods requires specific structural conditions on the objective function that are rarely satisfied in practice. One prominent example is the widely recognized Polyak-Łojasiewicz (PL) inequality, which has gained considerable attention in recent years. However, validating such assumptions for deep neural networks entails substantial and often impractical levels of over-parametrization. In order to address this limitation, we propose a novel class of functions that can characterize the loss landscape of modern deep models without requiring extensive over-parametrization and can also include saddle points. Crucially, we prove that gradient-based optimizers possess theoretical guarantees of convergence under this assumption.
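For reference, the Polyak-Łojasiewicz (PL) inequality mentioned above can be stated in its standard form; the constant μ and the optimal value f* below are the usual quantities in that statement, not notation taken from this paper:

```latex
% Polyak-Łojasiewicz (PL) inequality: for some \mu > 0 and all x,
\tfrac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr),
\qquad f^{*} := \inf_{x} f(x).
```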