Collaborating Authors

Fotakis, Dimitris


A Query-Driven Approach to Space-Efficient Range Searching

arXiv.org Artificial Intelligence

We initiate a study of a query-driven approach to designing partition trees for range-searching problems. Our model assumes that a data structure is to be built for an unknown query distribution that we can access through a sampling oracle, and must be selected so that it optimizes a meaningful performance parameter in expectation. Our first contribution is to show that a near-linear sample of queries allows the construction of a partition tree with a near-optimal expected number of nodes visited during querying. We enhance this approach by treating node processing as a classification problem, leveraging fast classifiers, such as shallow neural networks, to obtain query times that are efficient in practice. Our second contribution is to develop partition trees using sparse geometric separators. Our preprocessing algorithm, based on a sample of queries, builds a balanced tree whose nodes are associated with separators that minimize query stabs in expectation; this yields both fast processing of each node and a small number of visited nodes, significantly reducing query time.
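
To make the query-driven construction concrete, here is a minimal Python sketch, not the paper's algorithm: a 2-D partition tree is built from a sample of range queries, and at each node the candidate median splits are scored by how many sampled query rectangles stab them, since a stabbed split forces a query to visit both children. The function names and the leaf-size threshold are illustrative assumptions.

```python
import random

def stab_cost(queries, d, cut):
    """Number of sampled query rectangles straddling the cut along
    dimension d (each such query must visit both children)."""
    return sum(1 for lo, hi in queries if lo[d] < cut <= hi[d])

def build(points, queries, leaf_size=8):
    """Split at the median of the axis whose cut is stabbed by the
    fewest sampled queries; recurse on both sides."""
    if len(points) <= leaf_size:
        return ("leaf", points)
    best = None
    for d in (0, 1):
        cut = sorted(p[d] for p in points)[len(points) // 2]
        left = [p for p in points if p[d] < cut]
        right = [p for p in points if p[d] >= cut]
        if not left or not right:
            continue
        cost = stab_cost(queries, d, cut)
        if best is None or cost < best[0]:
            best = (cost, d, cut, left, right)
    if best is None:
        return ("leaf", points)
    _, d, cut, left, right = best
    return ("node", d, cut,
            build(left, queries, leaf_size),
            build(right, queries, leaf_size))

def count_in_range(tree, lo, hi):
    """Range-count query; returns (count, nodes visited)."""
    if tree[0] == "leaf":
        pts = tree[1]
        inside = sum(all(lo[d] <= p[d] <= hi[d] for d in (0, 1)) for p in pts)
        return inside, 1
    _, d, cut, L, R = tree
    count, visited = 0, 1
    if lo[d] < cut:                      # query overlaps the left child
        c, v = count_in_range(L, lo, hi); count += c; visited += v
    if hi[d] >= cut:                     # query overlaps the right child
        c, v = count_in_range(R, lo, hi); count += c; visited += v
    return count, visited

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(2000)]
qs = []
for _ in range(300):
    x, y = rng.random() * 0.9, rng.random() * 0.9
    qs.append(((x, y), (x + 0.1, y + 0.1)))
tree = build(pts, qs)
print(count_in_range(tree, *qs[0]))
```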


Reducing Oversmoothing through Informed Weight Initialization in Graph Neural Networks

arXiv.org Artificial Intelligence

In this work, we generalize the ideas of Kaiming initialization to Graph Neural Networks (GNNs) and propose a new scheme (G-Init) that reduces oversmoothing, leading to strong results in node and graph classification tasks. GNNs are commonly initialized using methods designed for other types of neural networks, overlooking the underlying graph topology. We theoretically analyze the variance of signals flowing forward and of gradients flowing backward in the class of convolutional GNNs. We then simplify our analysis to the case of the GCN and propose a new initialization method. Our results indicate that the new method (G-Init) reduces oversmoothing in deep GNNs, facilitating their effective use. Experimental validation supports our theoretical findings, demonstrating the advantages of deep networks in scenarios with no feature information for unlabeled nodes (i.e., the "cold start" scenario).
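
As a rough illustration of the idea (the precise G-Init formula is derived in the paper), standard Kaiming initialization keeps layer-wise variance constant by scaling with the fan-in; a graph-aware variant must also account for the aggregation step H' = ReLU(Â H W), so the sketch below additionally divides by the mean squared row norm of the normalized adjacency Â. Treat that correction factor as an illustrative assumption, not the paper's exact constant.

```python
import numpy as np

def normalized_adjacency(A):
    """GCN propagation matrix: A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def kaiming_std(fan_in):
    # Standard He initialization for ReLU: Var(W) = 2 / fan_in.
    return np.sqrt(2.0 / fan_in)

def graph_aware_std(fan_in, A_hat):
    # Illustrative graph-aware correction: aggregating with A_hat
    # multiplies signal variance by roughly the mean squared row
    # norm, so we divide it back out to keep variance ~constant.
    agg = float(np.mean((A_hat ** 2).sum(axis=1)))
    return np.sqrt(2.0 / (fan_in * agg))

rng = np.random.default_rng(0)
A = (rng.random((100, 100)) < 0.05).astype(float)
A = np.triu(A, 1); A = A + A.T           # random undirected graph
A_hat = normalized_adjacency(A)

fan_in = 64
print("Kaiming std:    ", kaiming_std(fan_in))
print("graph-aware std:", graph_aware_std(fan_in, A_hat))
```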


Partially Trained Graph Convolutional Networks Resist Oversmoothing

arXiv.org Artificial Intelligence

In this work, we investigate an observation made by Kipf & Welling, who suggested that untrained GCNs can generate meaningful node embeddings. In particular, we investigate the effect of training only a single layer of a GCN while keeping the remaining layers frozen. We propose a basis on which the effect of the untrained layers, and their contribution to the generation of embeddings, can be predicted. Moreover, we show that network width influences the dissimilarity of node embeddings produced after the initial node features pass through the untrained part of the model. Additionally, we establish a connection between partially trained GCNs and oversmoothing, showing that they are capable of reducing it. We verify our theoretical results experimentally and show the benefits of using deep networks that resist oversmoothing in a "cold start" scenario, where feature information for unlabeled nodes is lacking.
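
A minimal PyTorch sketch of the setup studied here (the paper's experiments use standard benchmarks; this toy model only shows the mechanics): all GCN layers except the last are frozen at their random initialization, so only the final layer receives gradients.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out, bias=False)

    def forward(self, A_hat, H):
        # One graph-convolution step: aggregate, transform, ReLU.
        return torch.relu(self.lin(A_hat @ H))

class PartiallyTrainedGCN(nn.Module):
    """Deep GCN in which only the final layer is trainable."""
    def __init__(self, dims):
        super().__init__()
        self.layers = nn.ModuleList(
            GCNLayer(a, b) for a, b in zip(dims, dims[1:]))
        for layer in self.layers[:-1]:      # freeze all but the last layer
            for p in layer.parameters():
                p.requires_grad_(False)

    def forward(self, A_hat, X):
        H = X
        for layer in self.layers:
            H = layer(A_hat, H)
        return H

# Toy usage: 5-layer, width-64 GCN on a random row-stochastic matrix
# standing in for the normalized adjacency.
n, d = 50, 16
A_hat = torch.softmax(torch.rand(n, n), dim=1)
X = torch.randn(n, d)
model = PartiallyTrainedGCN([d, 64, 64, 64, 7])
opt = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2)
loss = model(A_hat, X).square().mean()      # placeholder loss
loss.backward()
opt.step()
print(sum(p.requires_grad for p in model.parameters()), "trainable tensor(s)")
```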


GLANCE: Global Actions in a Nutshell for Counterfactual Explainability

arXiv.org Artificial Intelligence

Counterfactual explanations have emerged as an important tool to understand, debug, and audit complex machine learning models. To offer global counterfactual explainability, state-of-the-art methods construct summaries of local explanations, offering a trade-off among conciseness, counterfactual effectiveness, and counterfactual cost or burden imposed on instances. In this work, we provide a concise formulation of the problem of identifying global counterfactuals and establish principled criteria for comparing solutions, drawing inspiration from Pareto dominance. We introduce innovative algorithms designed to address the challenge of finding global counterfactuals for either the entire input space or specific partitions, employing clustering and decision trees as key components. Additionally, we conduct a comprehensive experimental evaluation, considering various instances of the problem and comparing our proposed algorithms with state-of-the-art methods. The results highlight the consistent capability of our algorithms to generate meaningful and interpretable global counterfactual explanations.
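
A toy sketch of the cluster-then-summarize idea only (GLANCE's actual algorithms, candidate-action generation, and selection criteria are more involved): negatively classified instances are clustered, and for each cluster one global action (a fixed feature shift) is chosen to trade off effectiveness, the fraction of the cluster flipped, against cost, the size of the shift. The candidate actions and the cost weight below are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y)

affected = X[clf.predict(X) == 0]            # instances needing recourse
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(affected)

# Hypothetical candidate actions: fixed feature shifts of two sizes.
candidates = [np.array(a) for a in
              [(0.5, 0.0), (1.5, 0.0), (0.0, 0.5), (0.0, 1.5), (1.0, 1.0)]]
lam = 0.3                                    # cost weight

for c in range(3):
    members = affected[labels == c]
    def score(a):
        eff = np.mean(clf.predict(members + a) == 1)   # fraction flipped
        return eff - lam * np.linalg.norm(a)           # effectiveness - cost
    best = max(candidates, key=score)
    flipped = np.mean(clf.predict(members + best) == 1)
    print(f"cluster {c}: action {best}, flips {flipped:.0%} of members")
```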


Fairness in Ranking: Robustness through Randomization without the Protected Attribute

arXiv.org Artificial Intelligence

There has been great interest in fairness in machine learning, especially in relation to classification problems. In ranking-related problems, such as online advertising, recommender systems, and HR automation, much work on fairness remains to be done. Two complications arise: first, the protected attribute may not be available in many applications; second, there are multiple measures of fairness of rankings, and optimization-based methods that target a single measure may produce rankings that are unfair with respect to other measures. In this work, we propose a randomized method for post-processing rankings that does not require the availability of the protected attribute. In an extensive numerical study, we show the robustness of our method with respect to P-Fairness and its effectiveness with respect to Normalized Discounted Cumulative Gain (NDCG) relative to the baseline ranking, improving on previously proposed methods.
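
The paper's randomization scheme and fairness measures are specific to it; the sketch below only illustrates the generic interface: post-process a baseline ranking with random adjacent swaps, without ever reading a protected attribute, and measure how much NDCG relative to the baseline is retained. The swap probability `p` is an illustrative knob.

```python
import math
import random

def randomized_postprocess(ranking, p, rng):
    """One pass of random adjacent swaps; uses no protected attribute."""
    r = list(ranking)
    for i in range(len(r) - 1):
        if rng.random() < p:
            r[i], r[i + 1] = r[i + 1], r[i]
    return r

def dcg(ranking, rel):
    return sum(rel[item] / math.log2(pos + 2)
               for pos, item in enumerate(ranking))

def ndcg_vs_baseline(ranking, baseline, rel):
    # The baseline is sorted by relevance, so its DCG is the ideal DCG.
    return dcg(ranking, rel) / dcg(baseline, rel)

rng = random.Random(0)
items = list(range(20))
rel = {i: 20 - i for i in items}         # baseline relevance scores
baseline = sorted(items, key=rel.get, reverse=True)

perturbed = randomized_postprocess(baseline, p=0.3, rng=rng)
print("NDCG vs. baseline:", round(ndcg_vs_baseline(perturbed, baseline, rel), 4))
```

With p = 0 the baseline is returned unchanged (NDCG 1); larger p trades retained NDCG for more reshuffling.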


Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods

arXiv.org Machine Learning

Deep neural networks and reinforcement learning methods have empirically shown great promise in tackling challenging combinatorial problems. In these methods, a deep neural network is used as a solution generator, which is then trained by gradient-based methods (e.g., policy gradient) to successively obtain better solution distributions. In this work, we introduce a novel theoretical framework for analyzing the effectiveness of such methods. We ask whether there exist generative models that (i) are expressive enough to generate approximately optimal solutions; (ii) have a tractable number of parameters, i.e., polynomial in the size of the input; and (iii) have a benign optimization landscape, in the sense that it does not contain sub-optimal stationary points. Our main contribution is a positive answer to this question. Our result holds for a broad class of combinatorial problems, including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight Bipartite Matching, and the Traveling Salesman Problem. As a byproduct of our analysis, we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
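
A minimal instance of the setup analyzed here, under simplifying assumptions: the solution sampler is a product distribution over vertex sides (one Bernoulli parameter per vertex), trained by vanilla REINFORCE to maximize the expected cut value of a fixed graph. The paper's generative models and its regularization of gradient descent are more general; this sketch only shows the policy-gradient mechanics for Max-Cut.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12
W = rng.random((n, n))
W = np.triu(W, 1)
W = W + W.T                              # random symmetric weighted graph

def cut_value(x, W):
    """Total weight of edges crossing the partition x in {0,1}^n."""
    crossing = x[:, None] != x[None, :]
    return float((W * crossing).sum() / 2.0)

theta = np.zeros(n)                      # logits of the product sampler
for step in range(2000):
    p = 1.0 / (1.0 + np.exp(-theta))     # P(x_i = 1)
    xs = (rng.random((32, n)) < p).astype(float)      # batch of samples
    rewards = np.array([cut_value(x, W) for x in xs])
    baseline = rewards.mean()            # variance-reduction baseline
    # REINFORCE: grad of log-prob of a Bernoulli(sigmoid(theta)) sample is x - p.
    grad = ((rewards - baseline)[:, None] * (xs - p)).mean(axis=0)
    theta += 0.1 * grad                  # gradient *ascent* on expected cut

samples = (rng.random((200, n)) < p).astype(float)
print("best sampled cut:", round(max(cut_value(x, W) for x in samples), 3))
```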


Fairness Aware Counterfactuals for Subgroups

arXiv.org Artificial Intelligence

We start by revisiting (and generalizing) existing notions of subgroup fairness and introducing new, more refined ones. We aim to (a) formulate different aspects of the difficulty that individuals in certain subgroups face in achieving recourse, i.e., receiving the desired outcome, either at the micro level, considering members of the subgroup individually, or at the macro level, considering the subgroup as a whole, and (b) introduce notions of subgroup fairness that are robust, if not entirely oblivious, to the cost of achieving recourse. We accompany these notions with an efficient, model-agnostic, highly parameterizable, and explainable framework for evaluating subgroup fairness. We demonstrate the advantages, wide applicability, and efficiency of our approach through a thorough experimental evaluation on several benchmark datasets.
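
As a toy illustration of the micro-level idea only (the paper's notions are more refined and model-agnostic): for a linear classifier, the distance of a negatively classified individual to the decision boundary is a crude proxy for their recourse cost, and averaging it per subgroup gives a micro-level difficulty score. The distance proxy and the uniform-cost assumption are simplifications, not the paper's definitions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
group = rng.integers(0, 2, size=n)                   # two subgroups
# Subgroup 1 is placed farther from the favorable region on average.
X = rng.normal(size=(n, 2)) - group[:, None] * np.array([1.0, 0.5])
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
neg = clf.predict(X) == 0                            # individuals needing recourse
dist = np.abs(X @ w + b) / np.linalg.norm(w)         # distance to the boundary

for g in (0, 1):
    mask = neg & (group == g)
    print(f"subgroup {g}: micro-level difficulty "
          f"(mean boundary distance) = {dist[mask].mean():.3f}")
```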


Learning Augmented Online Facility Location

arXiv.org Artificial Intelligence

The field of online algorithms deals with algorithmic problems in which the input is not entirely known in advance but instead arrives sequentially. The algorithm must make irrevocable decisions based only on the input received so far, and incur the corresponding irrevocable cost for each of them. Traditionally, the analysis of online algorithms assumes, rather pessimistically, that an adversary always presents the algorithm with a worst-case input. More precisely, the performance of an online algorithm is evaluated by its competitive ratio [5], the worst-case ratio of the algorithm's total cost to the cost of a computationally unbounded optimal algorithm that knows the entire request sequence in advance. On the other hand, one of the main goals of machine learning is to predict the unknown from data and learn what the world looks like, rather than prepare for the worst that can happen. In an effort to exploit this, a recent line of work uses machine learning predictions to cope with the inherent uncertainty in online algorithms while still providing worst-case performance guarantees. One might expect that directly using machine learning in online problems would enhance performance, since knowing, with some error, what the input looks like should allow almost optimal solutions or approximations. In reality, this turns out not to be true: the learner's error does not necessarily remain constant, and it can propagate through different phases of the algorithm and cause a much larger error. In a recent work, Lykouris and Vassilvitskii [4] formulated a framework that provides formal guarantees for such learning-augmented online algorithms in terms of consistency and robustness.
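
For context on the base problem (learning-augmented algorithms for online facility location build on such classical rules and combine them with predictions), below is a sketch of Meyerson's classic randomized rule: each arriving point opens a new facility with probability proportional to its distance to the nearest open one, capped at 1. This is the well-known baseline, not this paper's learning-augmented algorithm.

```python
import math
import random

def meyerson_online_facility_location(points, f, rng):
    """Meyerson's rule: open a facility at an arriving point with
    probability min(1, d / f), where d is the distance to the nearest
    open facility and f is the (uniform) facility-opening cost."""
    facilities, cost = [], 0.0
    for p in points:
        d = min((math.dist(p, c) for c in facilities), default=math.inf)
        if rng.random() < min(1.0, d / f):
            facilities.append(p)      # open here, pay the opening cost
            cost += f
        else:
            cost += d                 # connect to the nearest open facility
    return facilities, cost

rng = random.Random(0)
pts = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
       for cx, cy in [(0, 0), (5, 5)] for _ in range(50)]
rng.shuffle(pts)
fac, cost = meyerson_online_facility_location(pts, f=2.0, rng=rng)
print(len(fac), "facilities opened, total cost", round(cost, 2))
```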


Perfect Sampling from Pairwise Comparisons

arXiv.org Artificial Intelligence

In this work, we study how to efficiently obtain perfect samples from a discrete distribution $\mathcal{D}$ given access only to pairwise comparisons of elements of its support. Specifically, we assume access to samples $(x, S)$, where $S$ is drawn from a distribution over sets $\mathcal{Q}$ (indicating the elements being compared) and $x$ is drawn from the conditional distribution $\mathcal{D}_S$ (indicating the winner of the comparison), and we aim to output a clean sample $y$ distributed according to $\mathcal{D}$. We mainly focus on the case of pairwise comparisons, where all sets $S$ have size 2. We design a Markov chain whose stationary distribution coincides with $\mathcal{D}$ and give an algorithm to obtain exact samples using the technique of Coupling from the Past. However, the sample complexity of this algorithm depends on the structure of the distribution $\mathcal{D}$ and can even be exponential in the size of the support of $\mathcal{D}$ in many natural scenarios. Our main contribution is an efficient exact sampling algorithm whose complexity does not depend on the structure of $\mathcal{D}$. To this end, we give a parametric Markov chain that mixes significantly faster when given a good approximation to the stationary distribution. We can obtain such an approximation using an efficient algorithm for learning from pairwise comparisons (Shah et al., JMLR 17, 2016). Our technique for speeding up sampling from a Markov chain whose stationary distribution is approximately known is simple, general, and possibly of independent interest.
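
A minimal sketch of vanilla Coupling from the Past for a generic finite chain (the paper's chains are built from comparison data, and its main contribution, the parametric fast-mixing chain, is not reproduced here): all states are run forward from increasingly distant past times with shared randomness, and once every start state coalesces by time 0, the common state is an exact sample from the stationary distribution.

```python
import bisect
import random
from itertools import accumulate

def cftp(P, rng):
    """Exact sample from the stationary distribution of the
    row-stochastic transition matrix P via Coupling from the Past."""
    n = len(P)
    cum = [list(accumulate(row)) for row in P]   # per-row CDFs
    us = []          # us[k]: uniform used at time -(k+1); fixed once drawn
    T = 1
    while True:
        while len(us) < T:
            us.append(rng.random())
        states = list(range(n))          # one coupled copy per start state
        for t in range(T, 0, -1):        # simulate from time -T up to time 0
            u = us[t - 1]                # shared randomness at time -t
            states = [min(bisect.bisect_right(cum[s], u), n - 1)
                      for s in states]
        if len(set(states)) == 1:        # every start state coalesced
            return states[0]             # exact stationary sample
        T *= 2                           # otherwise restart further in the past

P = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.3, 0.4]]
rng = random.Random(42)
draws = [cftp(P, rng) for _ in range(5000)]
print([round(draws.count(i) / len(draws), 3) for i in range(3)])
```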


Metric-Distortion Bounds under Limited Information

Journal of Artificial Intelligence Research

In this work, we study the metric distortion problem in voting theory under a limited amount of ordinal information. Our primary contribution is threefold. First, we consider mechanisms that perform a sequence of pairwise comparisons between candidates. We show that a popular deterministic mechanism employed in many knockout phases yields distortion O(log m) while eliciting only m − 1 out of the Θ(m²) possible pairwise comparisons, where m represents the number of candidates. Our analysis for this mechanism leverages a powerful technical lemma developed by Kempe (AAAI '20). We also provide a matching lower bound on its distortion. In contrast, we prove that any mechanism that performs fewer than m − 1 pairwise comparisons is destined to have unbounded distortion. Moreover, we study the power of deterministic mechanisms under incomplete rankings. Most notably, when agents provide their top-k preferences we show an upper bound of 6m/k + 1 on the distortion, for any k ∈ {1, 2, ..., m}. Thus, we substantially improve over the previous bound of 12m/k established by Kempe (AAAI '20), and we come closer to matching the best-known lower bound. Finally, we are concerned with the sample complexity required to ensure near-optimal distortion with high probability. We show that a random sample of Θ(m/ϵ²) voters suffices to guarantee distortion 3 + ϵ with high probability, for any sufficiently small ϵ > 0. This result is based on analyzing the sensitivity of the deterministic mechanism introduced by Gkatzelis, Halpern, and Shah (FOCS '20). Importantly, all of our sample-complexity bounds are distribution-independent. From an experimental standpoint, we present several empirical findings on real-life voting applications, comparing the scoring systems employed in practice with a mechanism explicitly minimizing (metric) distortion. Interestingly, for our case studies, we find that the winner in the actual competition is typically the candidate who minimizes the distortion.
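
A sketch of the knockout mechanism analyzed here: every comparison eliminates exactly one candidate, so a single-elimination bracket elicits exactly m − 1 of the Θ(m²) possible pairwise comparisons. The majority-vote oracle below is a natural instantiation, and the bracket seeding is arbitrary.

```python
import random

def majority_beats(a, b, rankings):
    """a beats b if a majority of voters rank a above b."""
    wins = sum(r.index(a) < r.index(b) for r in rankings)
    return wins * 2 > len(rankings)

def knockout_winner(candidates, beats):
    """Single-elimination bracket: uses exactly m - 1 comparisons."""
    rnd, comparisons = list(candidates), 0
    while len(rnd) > 1:
        nxt = []
        if len(rnd) % 2 == 1:
            nxt.append(rnd.pop())          # odd candidate out gets a bye
        for a, b in zip(rnd[0::2], rnd[1::2]):
            comparisons += 1
            nxt.append(a if beats(a, b) else b)
        rnd = nxt
    return rnd[0], comparisons

rng = random.Random(0)
m, voters = 9, 25
candidates = list(range(m))
rankings = [rng.sample(candidates, m) for _ in range(voters)]
winner, used = knockout_winner(
    candidates, lambda a, b: majority_beats(a, b, rankings))
print(f"winner: {winner}, comparisons used: {used} (m - 1 = {m - 1})")
```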