Flattening a Hierarchical Clustering through Active Learning

Vitale, Fabio, Rajagopalan, Anand, Gentile, Claudio

Neural Information Processing Systems

We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known importance-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to {\em linear-time} implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.


Reviews: Flattening a Hierarchical Clustering through Active Learning

Neural Information Processing Systems

This paper derives complexity results for active learning queries to hierarchical clustering. The result is a partition or "cut", c, of the cluster tree, where the "flat" clustering is defined by the clusters at the leaves of a subtree of nodes AB(c) that have the same root as the original cluster tree. Learning occurs by making pairwise judgments on items (leaf nodes). All pairwise judgments form a "ground truth" matrix \Sigma. Given consistency conditions, this is an equivalent way to represent a clustering.
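The pairwise-judgment representation the review describes can be made concrete with a small sketch (illustrative code, not from the paper): a flat clustering over n leaves induces an n × n ground-truth matrix Σ with +1 for same-cluster pairs and -1 for cross-cluster pairs.

```python
# Illustrative sketch (not the paper's code): encode a flat clustering
# over n items as the n x n ground-truth matrix Sigma described above.

def pairwise_matrix(labels):
    """+1 where two items share a cluster label, -1 otherwise."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else -1 for j in range(n)]
            for i in range(n)]

labels = [0, 0, 1, 1]              # two clusters: {0, 1} and {2, 3}
sigma = pairwise_matrix(labels)    # symmetric, +1 on the diagonal
```

Given the consistency conditions the review mentions, this matrix carries exactly the same information as the flat clustering itself, which is what makes pairwise queries a sensible learning primitive.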


Reviews: Flattening a Hierarchical Clustering through Active Learning

Neural Information Processing Systems

The reviewers appreciate that the algorithm achieves sharp query complexity guarantees under challenging noisy settings. The only weakness of the paper is its motivation: what is a practical scenario in which we have these two sources of data?


The Flattening of AI

#artificialintelligence

Much in the same way that our eyes refresh our view through continual movement, the means to rise up and upend our thinking are found in this tumbling of relations. Unflattening, we remind ourselves of what it is to open our eyes to the world for the first time. There has been a flattening of AI. But first, let's rewind a few seconds… Artificial Intelligence was originally about Strong AI, starting back in the 1950s and even before that under various other names. Strong AI essentially means human-level intelligence. And this was a massive mountain to tackle.


Forecast Analysis of the COVID-19 Incidence in Lebanon: Prediction of Future Epidemiological Trends to Plan More Effective Control Programs

Falou, Salah El, Trad, Fouad

arXiv.org Artificial Intelligence

Ever since the COVID-19 pandemic started, governments have been trying to limit its effects on their citizens and countries. The pandemic has been harsh, on many levels, for almost all populations worldwide, which is what drove researchers and scientists to build simulations of several kinds to gain better insight into the virus and stop it as early as possible. In this study, we simulate the spread of COVID-19 in Lebanon using an agent-based model in which people are modeled as agents whose characteristics and behaviors are drawn from statistical distributions via a Monte Carlo algorithm. These agents move about the world and interact with each other, and thereby infect each other; this is how the virus spreads. During the simulation, we can introduce different non-pharmaceutical interventions (NPIs) that aim to limit the spread of the virus (wearing a mask, closing locations, etc.). Our simulator was first validated on standard concepts (e.g., flattening the curve and a second-wave scenario) and then applied to the case of Lebanon. We studied the effect of opening schools and universities on the pandemic situation in the country, since the Lebanese Ministry of Education planned to do so progressively starting from 21 April 2021. Based on the results we obtained, we conclude that it would be better to delay the school openings while the vaccination campaign is still slow in the country.
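As a toy illustration of the mechanism the abstract describes (a minimal sketch with made-up parameters, not the authors' simulator), an agent-based SIR-style model with a single NPI knob already reproduces the "flattening the curve" effect:

```python
import random

def simulate(n=500, days=60, p_infect=0.05, contacts=8,
             recovery_days=10, npi_factor=1.0, seed=0):
    """Minimal agent-based SIR sketch: each infected agent meets a few
    random agents per day; NPIs scale transmission by npi_factor."""
    rng = random.Random(seed)
    state = ["S"] * n              # S=susceptible, I=infected, R=recovered
    days_sick = [0] * n
    state[0] = "I"                 # patient zero
    peak = 1
    for _ in range(days):
        infected = [i for i in range(n) if state[i] == "I"]
        for i in infected:
            for _ in range(contacts):
                j = rng.randrange(n)
                if state[j] == "S" and rng.random() < p_infect * npi_factor:
                    state[j] = "I"
        for i in infected:         # recover after a fixed sick period
            days_sick[i] += 1
            if days_sick[i] >= recovery_days:
                state[i] = "R"
        peak = max(peak, state.count("I"))
    return peak                    # height of the infection curve

peak_no_npi   = simulate(npi_factor=1.0)   # unchecked spread
peak_lockdown = simulate(npi_factor=0.0)   # transmission fully suppressed
```

With transmission fully suppressed the curve never rises above the single seeded case, while the unchecked run is free to grow to a much higher peak; intermediate `npi_factor` values interpolate between the two, which is the trade-off the paper's school-reopening analysis explores.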


Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting

Grover, Aditya, Song, Jiaming, Agarwal, Alekh, Tran, Kenneth, Kapoor, Ashish, Horvitz, Eric, Ermon, Stefano

arXiv.org Machine Learning

A learned generative model often produces biased statistics relative to the underlying data distribution. A standard technique to correct this bias is importance sampling, where samples from the model are weighted by the likelihood ratio under model and true distributions. When the likelihood ratio is unknown, it can be estimated by training a probabilistic classifier to distinguish samples from the two distributions. In this paper, we employ this likelihood-free importance weighting framework to correct for the bias in state-of-the-art deep generative models. We find that this technique consistently improves standard goodness-of-fit metrics for evaluating the sample quality of state-of-the-art generative models, suggesting reduced bias. Finally, we demonstrate its utility on representative applications in a) data augmentation for classification using generative adversarial networks, and b) model-based policy evaluation using off-policy data.
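The classifier-based ratio estimate at the core of this abstract can be sketched in a few lines (toy 1-D Gaussians stand in for the data and model distributions; this illustrates the identity w(x) = c(x)/(1 − c(x)), not the paper's implementation):

```python
import math
import random

rng = random.Random(0)
real = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # "data": N(0, 1)
fake = [rng.gauss(0.5, 1.0) for _ in range(5000)]   # biased "model": N(0.5, 1)

# Train a probabilistic classifier c(x) = sigmoid(a*x + b) to separate
# the two sample sets, using plain batch gradient descent.
data = [(x, 1.0) for x in real] + [(x, 0.0) for x in fake]
a = b = 0.0
for _ in range(200):
    ga = gb = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(a * x + b)))
        ga += (p - y) * x
        gb += (p - y)
    a -= 0.1 * ga / len(data)
    b -= 0.1 * gb / len(data)

def weight(x):
    """Likelihood-ratio estimate p_data(x)/p_model(x) = c(x)/(1 - c(x))."""
    c = 1.0 / (1.0 + math.exp(-(a * x + b)))
    return c / (1.0 - c)

samples = [rng.gauss(0.5, 1.0) for _ in range(5000)]   # draws from the model
w = [weight(x) for x in samples]
naive_mean    = sum(samples) / len(samples)            # biased toward 0.5
weighted_mean = sum(wi * x for wi, x in zip(w, samples)) / sum(w)
```

The self-normalized weighted mean moves from the model's biased value (near 0.5) back toward the true data mean (0), which is the bias correction the paper evaluates at scale for deep generative models.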


Inconsistent Node Flattening for Improving Top-down Hierarchical Classification

Naik, Azad, Rangwala, Huzefa

arXiv.org Machine Learning

Large-scale classification of data whose classes are organized in a hierarchy is an important area of research. Top-down approaches that exploit the hierarchy during the learning and prediction phases are efficient for large-scale hierarchical classification. However, the accuracy of top-down approaches suffers from error propagation: prediction errors made at higher levels of the hierarchy cannot be corrected at lower levels. One of the main sources of errors at the higher levels is the presence of inconsistent nodes, introduced by the somewhat arbitrary process through which domain experts create these hierarchies. In this paper, we propose two data-driven approaches (local and global) to hierarchical structure modification that identify and flatten inconsistent nodes within the hierarchy. Our extensive empirical evaluation of the proposed approaches on several image and text datasets, with varying distributions of features, classes, and training instances per class, shows improved classification performance over competing hierarchical modification approaches. Specifically, we see an improvement of up to 7% in Macro-F1 score with our approach over the best top-down baseline. SOURCE CODE: http://www.cs.gmu.edu/~mlbio/InconsistentNodeFlattening
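The flattening operation itself is easy to picture (hypothetical sketch: node names and the flagged set are illustrative, and the paper's actual contribution, the data-driven criterion for choosing which nodes to flag, is not reproduced here): removing an inconsistent internal node promotes its children to that node's parent, shortening top-down prediction paths.

```python
def flatten(tree, inconsistent):
    """tree: dict mapping each internal node to its children (leaves may
    be omitted). Returns a copy with every flagged node removed and its
    children promoted to the flagged node's parent."""
    def expand(node):
        out = []
        for child in tree.get(node, []):
            if child in inconsistent:
                out.extend(expand(child))   # splice in the grandchildren
            else:
                out.append(child)
        return out
    return {n: expand(n) for n in tree if n not in inconsistent}

# Flagging "A" lifts A1 and A2 directly under the root.
tree = {"root": ["A", "B"], "A": ["A1", "A2"]}
flat = flatten(tree, {"A"})
```

After flattening, a top-down classifier no longer routes examples through the flagged node, so any prediction error it would have introduced at that level disappears.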