AITopics

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Neural Information Processing SystemsMar-12-2024, 10:15:53 GMT

Minimizing Regret on Reflexive Banach Spaces and Nash Equilibria in Continuous Zero-Sum Games

We study a general adversarial online learning problem, in which we are given a decision set X in a reflexive Banach space X and a sequence of reward vectors in the dual space of X. At each iteration, we choose an action from X, based on the observed sequence of previous rewards. Our goal is to minimize regret. Using results from infinite dimensional convex analysis, we generalize the method of Dual Averaging to our setting and obtain upper bounds on the worst-case regret that generalize many previous results. Under the assumption of uniformly continuous rewards, we obtain explicit regret bounds in a setting where the decision set is the set of probability distributions on a compact metric space S. Importantly, we make no convexity assumptions on either S or the reward functions. We also prove a general lower bound on the worst-case regret for any online algorithm. We then apply these results to the problem of learning in repeated two-player zero-sum games on compact metric spaces. In doing so, we first prove that if both players play a Hannan-consistent strategy, then with probability 1 the empirical distributions of play weakly converge to the set of Nash equilibria of the game. We then show that, under mild assumptions, Dual Averaging on the (infinite-dimensional) space of probability distributions indeed achieves Hannan-consistency.

algorithm, assumption, sequence, (13 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.55)

Neural Information Processing SystemsMar-12-2024, 08:00:29 GMT

Active Nearest-Neighbor Learning in Metric Spaces

We propose a pool-based non-parametric active learning algorithm for general metric spaces, called MArgin Regularized Metric Active Nearest Neighbor (MARMANN), which outputs a nearest-neighbor classifier. We give prediction error guarantees that depend on the noisy-margin properties of the input sample, and are competitive with those obtained by previously proposed passive learners. We prove that the label complexity of MARMANN is significantly lower than that of any passive learner with similar error guarantees. Our algorithm is based on a generalized sample compression scheme and a new label-efficient active model-selection procedure.

compression, marmann, passive learner, (14 more...)

Country:

North America > United States > New York (0.04)
Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.62)

arXiv.org Artificial IntelligenceMar-12-2024

Novelty Detection on Radio Astronomy Data using Signatures

Arrubarrena, Paola, Lemercier, Maud, Nikolic, Bojan, Lyons, Terry, Cass, Thomas

We introduce SigNova, a new semi-supervised framework for detecting anomalies in streamed data. While our initial examples focus on detecting radio-frequency interference (RFI) in digitized signals within the field of radio astronomy, it is important to note that SigNova's applicability extends to any type of streamed data. The framework comprises three primary components. Firstly, we use the signature transform to extract a canonical collection of summary statistics from observational sequences. This allows us to represent variable-length visibility samples as finite-dimensional feature vectors. Secondly, each feature vector is assigned a novelty score, calculated as the Mahalanobis distance to its nearest neighbor in an RFI-free training set. By thresholding these scores we identify observation ranges that deviate from the expected behavior of RFI-free visibility samples without relying on stringent distributional assumptions. Thirdly, we integrate this anomaly detector with Pysegments, a segmentation algorithm, to localize consecutive observations contaminated with RFI, if any. This approach provides a compelling alternative to classical windowing techniques commonly used for RFI detection. Importantly, the complexity of our algorithm depends on the RFI pattern rather than on the size of the observation window. We demonstrate how SigNova improves the detection of various types of RFI (e.g., broadband and narrowband) in time-frequency visibility data. We validate our framework on the Murchison Widefield Array (MWA) telescope and simulated data and the Hydrogen Epoch of Reionization Array (HERA).

frequency, rfi, threshold, (16 more...)

2402.14892

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Oceania > Australia (0.04)
North America > United States > New Jersey (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.54)

arXiv.org Artificial IntelligenceFeb-21-2024

Chasing Convex Functions with Long-term Constraints

Lechowicz, Adam, Christianson, Nicolas, Sun, Bo, Bashir, Noman, Hajiesmaili, Mohammad, Wierman, Adam, Shenoy, Prashant

This paper introduces and studies a novel class of online metric problems with long-term demand constraints motivated by emerging applications in the design of sustainable systems. In convex function chasing with a long-term constraint, an online player aims to satisfy a demand by making decisions in a normed vector space, paying a hitting cost based on time-varying convex cost functions which are revealed online, and switching cost defined by the norm. The player is constrained to ensure that the entire demand is satisfied at or before the time horizon ends, and their objective is to minimize their total cost. The generality of this problem makes it applicable to a wide variety of online resource allocation problems; in this paper, we consider one such special case, discussing its connections to other online settings and suggestions towards broad new areas of inquiry in online optimization with long-term constraints. Our motivation to introduce these problems is rooted in an emerging class of carbon-aware control problems for sustainable systems. A shared objective involves minimizing carbon emissions by shifting flexible workloads temporally and/or spatially to better leverage low-carbon electricity generation (e.g., renewables such as solar and wind).

algorithm, constraint, cost function, (16 more...)

2402.14012

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry: Energy > Renewable (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Neural Information Processing SystemsFeb-16-2024, 11:06:48 GMT

Fast, smooth and adaptive regression in metric spaces

It was recently shown that certain nonparametric regressors can escape the curse of dimensionality in the sense that their convergence rates adapt to the intrinsic dimension of data (\cite{BL:65, SK:77}). We prove some stronger results in more general settings. In particular, we consider a regressor which, by combining aspects of both tree-based regression and kernel regression, operates on a general metric space, yields a smooth function, and evaluates in time O(\log n) . We derive a tight convergence rate of the form n {-2/(2 d)} where d is the Assouad dimension of the input space.

adaptive regression, dimension, metric space, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.69)

Bravo, César, Kozachinskiy, Alexander, Rojas, Cristóbal

On dimensionality of feature vectors in MPNNs

arXiv.org Artificial IntelligenceFeb-6-2024

We revisit the classical result of Morris et al.~(AAAI'19) that message-passing graphs neural networks (MPNNs) are equal in their distinguishing power to the Weisfeiler--Leman (WL) isomorphism test. Morris et al.~show their simulation result with ReLU activation function and $O(n)$-dimensional feature vectors, where $n$ is the number of nodes of the graph. Recently, by introducing randomness into the architecture, Aamand et al.~(NeurIPS'22) were able to improve this bound to $O(\log n)$-dimensional feature vectors, although at the expense of guaranteeing perfect simulation only with high probability. In all these constructions, to guarantee equivalence to the WL test, the dimension of feature vectors in the MPNN has to increase with the size of the graphs. However, architectures used in practice have feature vectors of constant dimension. Thus, there is a gap between the guarantees provided by these results and the actual characteristics of architectures used in practice. In this paper we close this gap by showing that, for \emph{any} non-polynomial analytic (like the sigmoid) activation function, to guarantee that MPNNs are equivalent to the WL test, feature vectors of dimension $d=1$ is all we need, independently of the size of the graphs. Our main technical insight is that for simulating multi-sets in the WL-test, it is enough to use linear independence of feature vectors over rationals instead of reals. Countability of the set of rationals together with nice properties of analytic functions allow us to carry out the simulation invariant over the iterations of the WL test without increasing the dimension of the feature vectors.

feature vector, graph, vector, (15 more...)

2402.03966

Country:

South America > Chile (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Feature Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Pattichis, Rebecca, Pattichis, Marios S.

Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps

arXiv.org Artificial IntelligenceJan-31-2024

There is strong interest in developing mathematical methods that can be used to understand complex neural networks used in image analysis. In this paper, we introduce techniques from Linear Algebra to model neural network layers as maps between signal spaces. First, we demonstrate how signal spaces can be used to visualize weight spaces and convolutional layer kernels. We also demonstrate how residual vector spaces can be used to further visualize information lost at each layer. Second, we introduce the concept of invertible networks and an algorithm for computing input images that yield specific outputs. We demonstrate our approach on two invertible networks and ResNet18.

neural network, signal space, vector, (17 more...)

2402.00261

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.41)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.64)

Yi, Mingyang, Wang, Bohan

Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space

arXiv.org Artificial IntelligenceJan-25-2024

Recently, optimization on the Riemannian manifold has provided new insights to the optimization community. In this regard, the manifold taken as the probability measure metric space equipped with the second-order Wasserstein distance is of particular interest, since optimization on it can be linked to practical sampling processes. In general, the oracle (continuous) optimization method on Wasserstein space is Riemannian gradient flow (i.e., Langevin dynamics when minimizing KL divergence). In this paper, we aim to enrich the continuous optimization methods in the Wasserstein space by extending the gradient flow into the stochastic gradient descent (SGD) flow and stochastic variance reduction gradient (SVRG) flow. The two flows on Euclidean space are standard stochastic optimization methods, while their Riemannian counterparts are not explored yet. By leveraging the structures in Wasserstein space, we construct a stochastic differential equation (SDE) to approximate the discrete dynamics of desired stochastic methods in the corresponded random vector space. Then, the flows of probability measures are naturally obtained by applying Fokker-Planck equation to such SDE. Furthermore, the convergence rates of the proposed Riemannian stochastic flows are proven, and they match the results in Euclidean space.

continuous-time riemannian sgd, convergence rate, riemannian sgd and svrg flow, (11 more...)

2401.1353

Country:

North America > United States > Massachusetts (0.04)
Europe > Denmark (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.54)

Kalayci, Yusuf Hakan, Kempe, David, Kher, Vikram

Proportional Representation in Metric Spaces and Low-Distortion Committee Selection

arXiv.org Artificial IntelligenceJan-23-2024

We introduce a novel definition for a small set R of k points being "representative" of a larger set in a metric space. Given a set V (e.g., documents or voters) to represent, and a set C of possible representatives, our criterion requires that for any subset S comprising a theta fraction of V, the average distance of S to their best theta*k points in R should not be more than a factor gamma compared to their average distance to the best theta*k points among all of C. This definition is a strengthening of proportional fairness and core fairness, but - different from those notions - requires that large cohesive clusters be represented proportionally to their size. Since there are instances for which - unless gamma is polynomially large - no solutions exist, we study this notion in a resource augmentation framework, implicitly stating the constraints for a set R of size k as though its size were only k/alpha, for alpha > 1. Furthermore, motivated by the application to elections, we mostly focus on the "ordinal" model, where the algorithm does not learn the actual distances; instead, it learns only for each point v in V and each candidate pairs c, c' which of c, c' is closer to v. Our main result is that the Expanding Approvals Rule (EAR) of Aziz and Lee is (alpha, gamma) representative with gamma <= 1 + 6.71 * (alpha)/(alpha-1). Our results lead to three notable byproducts. First, we show that the EAR achieves constant proportional fairness in the ordinal model, giving the first positive result on metric proportional fairness with ordinal information. Second, we show that for the core fairness objective, the EAR achieves the same asymptotic tradeoff between resource augmentation and approximation as the recent results of Li et al., which used full knowledge of the metric. Finally, our results imply a very simple single-winner voting rule with metric distortion at most 44.

algorithm, fairness, voter, (16 more...)

2312.10369

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.62)