computational geometry
Persistence-based topological optimization: a survey
Carriere, Mathieu, Ike, Yuichi, Lacombe, Théo, Nishikawa, Naoki
Computational topology provides a tool, persistent homology, to extract quantitative descriptors from structured objects (images, graphs, point clouds, etc). These descriptors can then be involved in optimization problems, typically as a way to incorporate topological priors or to regularize machine learning models. This is usually achieved by minimizing adequate, topologically-informed losses based on these descriptors, which, in turn, naturally raises theoretical and practical questions about the possibility of optimizing such loss functions using gradient-based algorithms. This has been an active research field in the topological data analysis community over the last decade, and various techniques have been developed to enable optimization of persistence-based loss functions with gradient descent schemes. This survey presents the current state of this field, covering its theoretical foundations, the algorithmic aspects, and showcasing practical uses in several applications. It includes a detailed introduction to persistence theory and, as such, aims at being accessible to mathematicians and data scientists newcomers to the field. It is accompanied by an open-source library which implements the different approaches covered in this survey, providing a convenient playground for researchers to get familiar with the field.
- North America > United States (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe > Germany (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Hardness of High-Dimensional Linear Classification
Munteanu, Alexander, Omlor, Simon, Phillips, Jeff M.
We establish new exponential in dimension lower bounds for the Maximum Halfspace Discrepancy problem, which models linear classification. Both are fundamental problems in computational geometry and machine learning in their exact and approximate forms. However, only $O(n^d)$ and respectively $\tilde O(1/\varepsilon^d)$ upper bounds are known and complemented by polynomial lower bounds that do not support the exponential in dimension dependence. We close this gap up to polylogarithmic terms by reduction from widely-believed hardness conjectures for Affine Degeneracy testing and $k$-Sum problems. Our reductions yield matching lower bounds of $\tildeΩ(n^d)$ and respectively $\tildeΩ(1/\varepsilon^d)$ based on Affine Degeneracy testing, and $\tildeΩ(n^{d/2})$ and respectively $\tildeΩ(1/\varepsilon^{d/2})$ conditioned on $k$-Sum. The first bound also holds unconditionally if the computational model is restricted to make sidedness queries, which corresponds to a widely spread setting implemented and optimized in many contemporary algorithms and computing paradigms.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany (0.04)
- North America > United States > Utah (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France (0.04)
- North America > United States > New York > New York County > New York City (0.36)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > Connecticut > New Haven County > New Haven (0.14)
- (4 more...)
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France (0.04)
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France (0.04)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
- Health & Medicine > Therapeutic Area > Immunology (0.46)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
Multi-Covering a Point Set by $m$ Disks with Minimum Total Area
Guitouni, Mariem, Loi, Chek-Manh, Fekete, Sándor P., Perk, Michael, Becker, Aaron T.
Abstract-- A common robotics sensing problem is to place sensors to robustly monitor a set of assets, where robustness is assured by requiring asset p to be monitored by at least κ (p) sensors. Given n assets that must be observed by m sensors, each with a disk-shaped sensing region, where should the sensors be placed to minimize the total area observed? W e provide and analyze a fast heuristic for this problem. Subsequently, we enforce separation constraints between the sensors by modifying the integer program formulation and by changing the disk candidate set. I. Introduction Coordinating different kinds of robotic assets is a natural challenge when it comes to problems of surveillance and guidance. As shown in Figure 1, this gives rise to scenarios in which a finite set of drones with downward communication links must maintain control of a finite set of ground assets [7], [15], choosing a set of different altitudes that balance safe separation between drones with reliable communication to the ground. The latter requires sufficient signal strength, so communication areas (and thus energy consumption) depend quadratically on the altitude.
- North America > United States > Texas > Harris County > Houston (0.04)
- Europe > Switzerland (0.04)
Enhancing binary classification: A new stacking method via leveraging computational geometry
Wu, Wei, Tang, Liang, Zhao, Zhongjie, Teo, Chung-Piaw
Binary classification is a fundamental task in machine learning and data science, with applications spanning numerous domains, including spam detection, medical diagnostics, image recognition, credit scoring. The goal is to predict a binary outcome--typically labeled as 0 or 1--based on a set of input features. Various machine learning algorithms, such as logistic regression (LR), k-nearest neighbors (kNN), support vector machines (SVM), and neural network (NN), are commonly employed for binary classification tasks. These algorithms can be mainly divided into two categories: those with interpretability, which are convenient for analysis and control (e.g., LR); and those without interpretability but with potentially good classification performance (e.g., NN). Ensemble learning, a powerful technique in predictive modeling, has gained widespread recognition for its ability to improve model performance by combining the strengths of multiple learning algorithms [1]. Among ensemble methods, stacking stands out by integrating the predictions of diverse base models (different learning algorithms) through a meta-model, resulting in enhanced prediction accuracy compared to only using the best base model [2]. Stacking has demonstrated significant applications in classification problems such as network intrusion detection [3, 4], cancer type classification [5], credit lending [6], and protein-protein binding affinity prediction [7].
- Asia > Japan > Honshū > Chūbu > Shizuoka Prefecture > Shizuoka (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
- North America > United States > Wisconsin (0.04)
- (2 more...)
- Information Technology > Security & Privacy (0.34)
- Health & Medicine > Therapeutic Area > Oncology (0.34)
- Banking & Finance > Credit (0.34)
- Health & Medicine > Diagnostic Medicine (0.34)
Online Epsilon Net and Piercing Set for Geometric Concepts
Bhore, Sujoy, Dey, Devdan, Singh, Satyam
VC-dimension and $\varepsilon$-nets are key concepts in Statistical Learning Theory. Intuitively, VC-dimension is a measure of the size of a class of sets. The famous $\varepsilon$-net theorem, a fundamental result in Discrete Geometry, asserts that if the VC-dimension of a set system is bounded, then a small sample exists that intersects all sufficiently large sets. In online learning scenarios where data arrives sequentially, the VC-dimension helps to bound the complexity of the set system, and $\varepsilon$-nets ensure the selection of a small representative set. This sampling framework is crucial in various domains, including spatial data analysis, motion planning in dynamic environments, optimization of sensor networks, and feature extraction in computer vision, among others. Motivated by these applications, we study the online $\varepsilon$-net problem for geometric concepts with bounded VC-dimension. While the offline version of this problem has been extensively studied, surprisingly, there are no known theoretical results for the online version to date. We present the first deterministic online algorithm with an optimal competitive ratio for intervals in $\mathbb{R}$. Next, we give a randomized online algorithm with a near-optimal competitive ratio for axis-aligned boxes in $\mathbb{R}^d$, for $d\le 3$. Furthermore, we introduce a novel technique to analyze similar-sized objects of constant description complexity in $\mathbb{R}^d$, which may be of independent interest. Next, we focus on the continuous version of this problem, where ranges of the set system are geometric concepts in $\mathbb{R}^d$ arriving in an online manner, but the universe is the entire space, and the objective is to choose a small sample that intersects all the ranges.
- Asia > India > Maharashtra > Mumbai (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)