Goto

Collaborating Authors

 Supervised Learning


Infinity Search: Approximate Vector Search with Projections on q-Metric Spaces

arXiv.org Artificial Intelligence

Despite the ubiquity of vector search applications, prevailing search algorithms overlook the metric structure of vector embeddings, treating it as a constraint rather than exploiting its underlying properties. In this paper, we demonstrate that in $q$-metric spaces, metric trees can leverage a stronger version of the triangle inequality to reduce comparisons for exact search. Notably, as $q$ approaches infinity, the search complexity becomes logarithmic. Therefore, we propose a novel projection method that embeds vector datasets with arbitrary dissimilarity measures into $q$-metric spaces while preserving the nearest neighbor. We propose to learn an approximation of this projection to efficiently transform query points to a space where euclidean distances satisfy the desired properties. Our experimental results with text and image vector embeddings show that learning $q$-metric approximations enables classic metric tree algorithms -- which typically underperform with high-dimensional data -- to achieve competitive performance against state-of-the-art search methods.


Comparing the Effects of Persistence Barcodes Aggregation and Feature Concatenation on Medical Imaging

arXiv.org Artificial Intelligence

In medical image analysis, feature engineering plays an important role in the design and performance of machine learning models. Persistent homology (PH), from the field of topological data analysis (TDA), demonstrates robustness and stability to data perturbations and addresses the limitation from traditional feature extraction approaches where a small change in input results in a large change in feature representation. Using PH, we store persistent topological and geometrical features in the form of the persistence barcode whereby large bars represent global topological features and small bars encapsulate geometrical information of the data. When multiple barcodes are computed from 2D or 3D medical images, two approaches can be used to construct the final topological feature vector in each dimension: aggregating persistence barcodes followed by featurization or concatenating topological feature vectors derived from each barcode. In this study, we conduct a comprehensive analysis across diverse medical imaging datasets to compare the effects of the two aforementioned approaches on the performance of classification models. The results of this analysis indicate that feature concatenation preserves detailed topological information from individual barcodes, yields better classification performance and is therefore a preferred approach when conducting similar experiments.


Learning Fair And Effective Points-Based Rewards Programs

arXiv.org Artificial Intelligence

Points-based rewards programs are a prevalent way to incentivize customer loyalty; in these programs, customers who make repeated purchases from a seller accumulate points, working toward eventual redemption of a free reward. These programs have recently come under scrutiny due to accusations of unfair practices in their implementation. Motivated by these concerns, we study the problem of fairly designing points-based rewards programs, with a focus on two obstacles that put fairness at odds with their effectiveness. First, due to customer heterogeneity, the seller should set different redemption thresholds for different customers to generate high revenue. Second, the relationship between customer behavior and the number of accumulated points is typically unknown; this requires experimentation which may unfairly devalue customers' previously earned points. We first show that an individually fair rewards program that uses the same redemption threshold for all customers suffers a loss in revenue of at most a factor of $1+\ln 2$, compared to the optimal personalized strategy that differentiates between customers. We then tackle the problem of designing temporally fair learning algorithms in the presence of demand uncertainty. Toward this goal, we design a learning algorithm that limits the risk of point devaluation due to experimentation by only changing the redemption threshold $O(\log T)$ times, over a horizon of length $T$. This algorithm achieves the optimal (up to polylogarithmic factors) $\widetilde{O}(\sqrt{T})$ regret in expectation. We then modify this algorithm to only ever decrease redemption thresholds, leading to improved fairness at a cost of only a constant factor in regret. Extensive numerical experiments show the limited value of personalization in average-case settings, in addition to demonstrating the strong practical performance of our proposed learning algorithms.


Multilingual Information Retrieval with a Monolingual Knowledge Base

arXiv.org Artificial Intelligence

Multilingual information retrieval has emerged as powerful tools for expanding knowledge sharing across languages. On the other hand, resources on high quality knowledge base are often scarce and in limited languages, therefore an effective embedding model to transform sentences from different languages into a feature vector space same as the knowledge base language becomes the key ingredient for cross language knowledge sharing, especially to transfer knowledge available in high-resource languages to low-resource ones. In this paper we propose a novel strategy to fine-tune multilingual embedding models with weighted sampling for contrastive learning, enabling multilingual information retrieval with a monolingual knowledge base. We demonstrate that the weighted sampling strategy produces performance gains compared to standard ones by up to 31.03\% in MRR and up to 33.98\% in Recall@3. Additionally, our proposed methodology is language agnostic and applicable for both multilingual and code switching use cases.


Should Decision-Makers Reveal Classifiers in Online Strategic Classification?

arXiv.org Artificial Intelligence

Strategic classification addresses a learning problem where a decision-maker implements a classifier over agents who may manipulate their features in order to receive favorable predictions. In the standard model of online strategic classification, in each round, the decision-maker implements and publicly reveals a classifier, after which agents perfectly best respond based on this knowledge. However, in practice, whether to disclose the classifier is often debated -- some decision-makers believe that hiding the classifier can prevent misclassification errors caused by manipulation. In this paper, we formally examine how limiting the agents' access to the current classifier affects the decision-maker's performance. Specifically, we consider an extended online strategic classification setting where agents lack direct knowledge about the current classifier and instead manipulate based on a weighted average of historically implemented classifiers. Our main result shows that in this setting, the decision-maker incurs $(1-γ)^{-1}$ or $k_{\text{in}}$ times more mistakes compared to the full-knowledge setting, where $k_{\text{in}}$ is the maximum in-degree of the manipulation graph (representing how many distinct feature vectors can be manipulated to appear as a single one), and $γ$ is the discount factor indicating agents' memory of past classifiers. Our results demonstrate how withholding access to the classifier can backfire and degrade the decision-maker's performance in online strategic classification.


Reviews: Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

Neural Information Processing Systems

The authors study the problem of setting (individual) reserve prices in a scenario of repeated contextual second-price auctions. The buyers are assumed strategic, i.e. they optimize a cumulative discounted utility, where their valuations are linear functions of the feature vector of a good. The considered scenario explicitly assumes existence of noise in the market. The seller's goal is to find an algorithm for setting prices that has sub-linear regret. Two algorithms are proposed: - the first one attain O(d log(Td) log(T)) regret bound, when the market noise distribution is known to the seller.


Metric Space Magnitude for Evaluating the Diversity of Latent Representations

Neural Information Processing Systems

The magnitude of a metric space is a novelinvariant that provides a measure of the'effective size' of a space acrossmultiple scales, while also capturing numerous geometrical properties, such as curvature, density, or entropy.We develop a family of magnitude-based measures of the intrinsicdiversity of latent representations, formalising a novel notion ofdissimilarity between magnitude functions of finite metric spaces.Our measures are provably stable under perturbations of the data, can beefficiently calculated, and enable a rigorous multi-scale characterisation and comparison oflatent representations. We show their utility and superior performance across different domains and tasks, includingthe automated estimation of diversity,the detection of mode collapse, andthe evaluation of generative models for text, image, and graph data.


Improved Guarantees for Fully Dynamic k -Center Clustering with Outliers in General Metric Spaces

Neural Information Processing Systems

The metric k -center clustering problem with z outliers, also known as (k,z) -center clustering, involves clustering a given point set P in a metric space (M,d) using at most k balls, minimizing the maximum ball radius while excluding up to z points from the clustering. This problem holds fundamental significance in various domains such as machine learning, data mining, and database systems.This paper addresses the fully dynamic version of the problem, where the point set undergoes continuous updates (insertions and deletions) over time. The objective is to maintain an approximate (k,z) -center clustering with efficient update times. We propose a novel fully dynamic algorithm that maintains a (4 \epsilon) -approximate solution to the (k,z) -center clustering problem that covers all but at most (1 \epsilon)z points at any time in the sequence with probability 1-k/e {\Omega(\log k)} . The algorithm achieves an expected amortized update time of \mathcal{O}(\epsilon {-2} k 6\log(k) \log(\Delta)), and is applicable to general metric spaces.


First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces

Neural Information Processing Systems

From optimal transport to robust dimensionality reduction, many machine learning applicationscan be cast into the min-max optimization problems over Riemannian manifolds. Though manymin-max algorithms have been analyzed in the Euclidean setting, it has been elusive how theseresults translate to the Riemannian case. Zhang et al. (2022) have recently identified that geodesic convexconcave Riemannian problems admit always Sion's saddle point solutions. Immediately, an importantquestion that arises is if a performance gap between the Riemannian and the optimal Euclidean spaceconvex concave algorithms is necessary. Our work is the first to answer the question in the negative:We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate at alinear convergence rate at the geodesically strongly convex concave case, matching the euclidean one.Our results also extend to the stochastic or non-smooth case where RCEG & Riemanian gradientascent descent (RGDA) achieve respectively near-optimal convergence rates up to factors dependingon curvature of the manifold.


RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space

Neural Information Processing Systems

Deep Reinforcement Learning (DRL) algorithms have achieved great success in solving many challenging tasks while their black-box nature hinders interpretability and real-world applicability, making it difficult for human experts to interpret and understand DRL policies. Existing works on interpretable reinforcement learning have shown promise in extracting decision tree (DT) based policies from DRL policies with most focus on the single-agent settings while prior attempts to introduce DT policies in multi-agent scenarios mainly focus on heuristic designs which do not provide any quantitative guarantees on the expected return.In this paper, we establish an upper bound on the return gap between the oracle expert policy and an optimal decision tree policy. This enables us to recast the DT extraction problem into a novel non-euclidean clustering problem over the local observation and action values space of each agent, with action values as cluster labels and the upper bound on the return gap as clustering loss.Both the algorithm and the upper bound are extended to multi-agent decentralized DT extractions by an iteratively-grow-DT procedure guided by an action-value function conditioned on the current DTs of other agents. Further, we propose the Return-Gap-Minimization Decision Tree (RGMDT) algorithm, which is a surprisingly simple design and is integrated with reinforcement learning through the utilization of a novel Regularized Information Maximization loss. Evaluations on tasks like D4RL show that RGMDT significantly outperforms heuristic DT-based baselines and can achieve nearly optimal returns under given DT complexity constraints (e.g., maximum number of DT nodes).