jegelka
Performance Heterogeneity in Graph Neural Networks: Lessons for Architecture Design and Preprocessing
Graph Neural Networks have emerged as the most popular architecture for graph-level learning, including graph classification and regression tasks, which frequently arise in areas such as biochemistry and drug discovery. Achieving good performance in practice requires careful model design. Due to gaps in our understanding of the relationship between model and data characteristics, this often requires manual architecture and hyperparameter tuning. This is particularly pronounced in graph-level tasks, due to much higher variation in the input data than in node-level tasks. To work towards closing these gaps, we begin with a systematic analysis of individual performance in graph-level tasks. Our results establish significant performance heterogeneity in both message-passing and transformer-based architectures. We then investigate the interplay of model and data characteristics as drivers of the observed heterogeneity. Our results suggest that graph topology alone cannot explain heterogeneity. Using the Tree Mover's Distance, which jointly evaluates topological and feature information, we establish a link between class-distance ratios and performance heterogeneity in graph classification. These insights motivate model and data preprocessing choices that account for heterogeneity between graphs. We propose a selective rewiring approach, which only targets graphs whose individual performance benefits from rewiring. We further show that the optimal network depth depends on the graph's spectrum, which motivates a heuristic for choosing the number of GNN layers. Our experiments demonstrate the utility of both design choices in practice.
- Education (0.54)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.50)
Indirect Query Bayesian Optimization with Integrated Feedback
Zhang, Mengyan, Bouabid, Shahine, Ong, Cheng Soon, Flaxman, Seth, Sejdinovic, Dino
We develop the framework of Indirect Query Bayesian Optimization (IQBO), a new class of Bayesian optimization problems where the integrated feedback is given via a conditional expectation of the unknown function $f$ to be optimized. The underlying conditional distribution can be unknown and learned from data. The goal is to find the global optimum of $f$ by adaptively querying and observing in the space transformed by the conditional distribution. This is motivated by real-world applications where one cannot access direct feedback due to privacy, hardware or computational constraints. We propose the Conditional Max-Value Entropy Search (CMES) acquisition function to address this novel setting, and propose a hierarchical search algorithm to address the multi-resolution setting and improve the computational efficiency. We show regret bounds for our proposed methods and demonstrate the effectiveness of our approaches on simulated optimization tasks.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Massachusetts (0.04)
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
FSW-GNN: A Bi-Lipschitz WL-Equivalent Graph Neural Network
Sverdlov, Yonatan, Davidson, Yair, Dym, Nadav, Amir, Tal
Many of the most popular graph neural networks fall into the category of message-passing neural networks (MPNNs). Famously, MPNNs' ability to distinguish between graphs is limited to graphs separable by the Weisfeiler-Lemann (WL) graph isomorphism test, and the strongest MPNNs, in terms of separation power, are WL-equivalent. Recently, it was shown that the quality of separation provided by standard WL-equivalent MPNN can be very low, resulting in WL-separable graphs being mapped to very similar, hardly distinguishable features. This paper addresses this issue by seeking bi-Lipschitz continuity guarantees for MPNNs. We demonstrate that, in contrast with standard summation-based MPNNs, which lack bi-Lipschitz properties, our proposed model provides a bi-Lipschitz graph embedding with respect to two standard graph metrics. Empirically, we show that our MPNN is competitive with standard MPNNs for several graph learning tasks and is far more accurate in over-squashing long-range tasks.
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- North America > United States > Texas (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
Unpacking the "black box" to build better AI models
When deep learning models are deployed in the real world, perhaps to detect financial fraud from credit card activity or identify cancer in medical images, they are often able to outperform humans. But what exactly are these deep learning models learning? Does a model trained to spot skin cancer in clinical images, for example, actually learn the colors and textures of cancerous tissue, or is it flagging some other features or patterns? These powerful machine-learning models are typically based on artificial neural networks that can have millions of nodes that process data to make predictions. Due to their complexity, researchers often call these models "black boxes" because even the scientists who build them don't understand everything that is going on under the hood.
- Health & Medicine > Therapeutic Area (0.70)
- Education > Educational Setting (0.48)
Distributionally Robust Bayesian Optimization
Kirschner, Johannes, Bogunovic, Ilija, Jegelka, Stefanie, Krause, Andreas
Robustness to distributional shift is one of the key challenges of contemporary machine learning. Attaining such robustness is the goal of distributionally robust optimization, which seeks a solution to an optimization problem that is worst-case robust under a specified distributional shift of an uncontrolled covariate. In this paper, we study such a problem when the distributional shift is measured via the maximum mean discrepancy (MMD). For the setting of zeroth-order, noisy optimization, we present a novel distributionally robust Bayesian optimization algorithm (DRBO). Our algorithm provably obtains sub-linear robust regret in various settings that differ in how the uncertain covariate is observed. We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
- Europe > Switzerland > Zürich > Zürich (0.04)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
- Europe > Italy (0.04)
Data diversity
When data sets get too big, sometimes the only way to do anything useful with them is to extract much smaller subsets and analyze those instead. Those subsets have to preserve certain properties of the full sets, however, and one property that's useful in a wide range of applications is diversity. If, for instance, you're using your data to train a machine-learning system, you want to make sure that the subset you select represents the full range of cases that the system will have to confront. Last week at the Conference on Neural Information Processing Systems, researchers from MIT's Computer Science and Artificial Intelligence Laboratory and its Laboratory for Information and Decision Systems presented a new algorithm that makes the selection of diverse subsets much more practical. Whereas the running times of earlier subset-selection algorithms depended on the number of data points in the complete data set, the running time of the new algorithm depends on the number of data points in the subset.
On Unconstrained Quasi-Submodular Function Optimization
Mei, Jincheng (Shanghai Jiao Tong University) | Zhao, Kang (Shanghai Jiao Tong University) | Lu, Bao-Liang (Shanghai Jiao Tong University)
With the extensive application of submodularity, its generalizations are constantly being proposed. However, most of them are tailored for special problems. In this paper, we focus on quasi-submodularity, a universal generalization, which satisfies weaker properties than submodularity but still enjoys favorable performance in optimization. Similar to the diminishing return property of submodularity, we first define a corresponding property called the single sub-crossing , then we propose two algorithms for unconstrained quasi-submodular function minimization and maximization, respectively. The proposed algorithms return the reduced lattices in O(n) iterations, and guarantee the objective function values are strictly monotonically increased or decreased after each iteration. Moreover, any local and global optima are definitely contained in the reduced lattices. Experimental results verify the effectiveness and efficiency of the proposed algorithms on lattice reduction.
- Asia > China > Shanghai > Shanghai (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Inferring and Learning from Neuronal Correspondences
Kapoor, Ashish, Frady, E. Paxon, Jegelka, Stefanie, Kristan, William B., Horvitz, Eric
We introduce and study methods for inferring and learning from correspondences among neurons. The approach enables alignment of data from distinct multiunit studies of nervous systems. We show that the methods for inferring correspondences combine data effectively from cross-animal studies to make joint inferences about behavioral decision making that are not possible with the data from a single animal. We focus on data collection, machine learning, and prediction in the representative and long-studied invertebrate nervous system of the European medicinal leech. Acknowledging the computational intractability of the general problem of identifying correspondences among neurons, we introduce efficient computational procedures for matching neurons across animals. The methods include techniques that adjust for missing cells or additional cells in the different data sets that may reflect biological or experimental variation.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)