partitioner
BioBlobs: Differentiable Graph Partitioning for Protein Representation Learning
Protein function is driven by coherent substructures which vary in size and topology, yet current protein representation learning (PRL) models distort these signals by relying on rigid substructures such as k-hop and fixed-radius neighbourhoods. We introduce BioBlobs, a plug-and-play, fully differentiable module that represents proteins by dynamically partitioning structures into flexibly-sized, non-overlapping substructures ("blobs"). The resulting blobs are quantized into a shared and interpretable codebook, yielding a discrete vocabulary of function-relevant protein substructures used to compute protein embeddings. We show that BioBlobs representations improve the performance of widely used protein encoders such as GVP-GNN across various PRL tasks. Our approach highlights the value of architectures that directly capture function-relevant protein substructures, enabling both improved predictive performance and mechanistic insight into protein function.
A Definition of a batch normalization layer When applying batch normalization to convolutional layers, the inputs and outputs of normalization layers are 4-dimensional tensors, which we denote by I
For distributed training, the batch statistics are usually estimated locally on a subset of the training minibatch ("ghost batch normalization" [ We now define the three models in full. These inputs first pass through a single fully connected linear layer of width 1000. We then apply a series of residual blocks. LeCun normal initialization [48] to preserve the variance in the absence of non-linearities.
Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations
Sinha, Pranav, Jha, Sumit Kumar, Raj, Sunny
We are in the midst of the noisy intermediate-scale quantum (NISQ) era, where quantum computers are limited by noisy gates, some of which are more error-prone than others and can render the final computation incomprehensible. Quantum circuit compilation algorithms attempt to minimize these noisy gates when mapping quantum algorithms onto quantum hardware but face computational challenges that restrict their application to circuits with no more than 5-6 qubits, so large circuits must be partitioned before noisy quantum gate minimization algorithms are applied. The existing generation of these partitioning algorithms is heuristic in nature and does not account for downstream gate minimization tasks. Large language models (LLMs) have the potential to change this and help improve quantum circuit partitions. This paper investigates the use of LLMs, such as Llama and Mistral, for partitioning quantum circuits by capitalizing on their abilities to understand and generate code, including QASM. Specifically, we teach LLMs to partition circuits using the quick partition approach of the Berkeley Quantum Synthesis Toolkit. Through experimental evaluations, we show that careful fine-tuning of open source LLMs enables us to obtain an accuracy of 53.4% for the partition task, while off-the-shelf LLMs are unable to correctly partition circuits using standard 1-shot and few-shot training approaches. Quantum circuit compilation converts a quantum algorithm written in a high-level language into elementary quantum gates supported by the quantum hardware. This conversion consists of multiple steps, including first converting the high-level language into intermediate languages (QASM) before mapping onto the quantum computing hardware. Different quantum computing hardware supports different gates, and as can be expected, the mapping from QASM, which is written using generalized gate sets, to the quantum hardware is an involved and complicated process.
Predictable and Performant Reactive Synthesis Modulo Theories via Functional Synthesis
Rodríguez, Andoni, Gorostiaga, Felipe, Sánchez, César
Reactive synthesis is the process of generating correct controllers from temporal logic specifications. Classical LTL reactive synthesis handles (propositional) LTL as a specification language. Boolean abstractions allow reducing LTLt specifications (i.e., LTL with propositions replaced by literals from a theory $\mathcal{T}$) into equi-realizable LTL specifications. In this paper we extend these results into a full static synthesis procedure. The synthesized system receives from the environment valuations of variables from a rich theory $\mathcal{T}$ and outputs valuations of system variables from $\mathcal{T}$. We use the abstraction method to synthesize a reactive Boolean controller from the LTL specification, and we combine it with functional synthesis to obtain a static controller for the original LTLt specification. We also show that our method allows responses in the sense that the controller can optimize its outputs in order to, e.g., always provide the smallest safe values. This is the first full static synthesis method for LTLt, which is a deterministic program (hence predictable and efficient).
K-SpecPart: Supervised embedding algorithms and cut overlay for improved hypergraph partitioning
Bustany, Ismail, Kahng, Andrew B., Koutis, Ioannis, Pramanik, Bodhisatta, Wang, Zhiang
State-of-the-art hypergraph partitioners follow the multilevel paradigm that constructs multiple levels of progressively coarser hypergraphs that are used to drive cut refinement on each level of the hierarchy. Multilevel partitioners are subject to two limitations: (i) hypergraph coarsening processes rely on local neighborhood structure without fully considering the global structure of the hypergraph; and (ii) refinement heuristics risk entrapment in local minima. In this paper, we describe K-SpecPart, a supervised spectral framework for multi-way partitioning that directly tackles these two limitations. K-SpecPart relies on the computation of generalized eigenvectors and supervised dimensionality reduction techniques to generate vertex embeddings. These are computational primitives that are fast and capture global structural properties of the hypergraph that are not explicitly considered by existing partitioners. K-SpecPart then converts the vertex embeddings into multiple partitioning solutions. K-SpecPart introduces the idea of "ensembling" multiple solutions via a cut-overlay clustering technique that often enables the use of computationally demanding partitioning methods such as ILP (integer linear programming). Using the output of a standard partitioner as a supervision hint, K-SpecPart effectively combines the strengths of established multilevel partitioning techniques with the benefits of spectral graph theory and other combinatorial algorithms. K-SpecPart significantly extends ideas and algorithms that first appeared in our previous work on the bipartitioner SpecPart. Our experiments demonstrate the effectiveness of K-SpecPart. For bipartitioning, K-SpecPart produces solutions with up to 15% cutsize improvement over SpecPart. For multi-way partitioning, K-SpecPart produces solutions with up to 20% cutsize improvement over leading partitioners hMETIS and KaHyPar.
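The cutsize figures quoted above measure how many hyperedges a partition splits. As a minimal sketch, assuming the common definition in which a hyperedge is cut when its vertices fall into more than one block, the metric can be computed as follows. The function name `cutsize` and the toy hypergraph are illustrative and are not taken from K-SpecPart.

```python
def cutsize(hyperedges, block_of, weights=None):
    """Sum the weights of hyperedges whose vertices span more than one block.

    hyperedges: list of vertex tuples (illustrative representation)
    block_of:   dict mapping each vertex to its block id
    weights:    optional list of hyperedge weights (defaults to 1 each)
    """
    total = 0
    for i, edge in enumerate(hyperedges):
        blocks = {block_of[v] for v in edge}
        if len(blocks) > 1:  # the hyperedge is cut
            total += 1 if weights is None else weights[i]
    return total


# Toy hypergraph with six vertices and four hyperedges (made up for illustration).
hyperedges = [(0, 1, 2), (2, 3), (3, 4, 5), (0, 5)]

# Two candidate bipartitions of the same hypergraph.
partition_a = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
partition_b = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

cut_a = cutsize(hyperedges, partition_a)
cut_b = cutsize(hyperedges, partition_b)
print(cut_a, cut_b)
```

Here `partition_a` keeps two of the four hyperedges intact, while `partition_b` cuts all of them, so a partitioner minimizing cutsize would prefer `partition_a`.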
Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR
Alabed, Sami, Grewe, Dominik, Franco, Juliana, Chrzaszcz, Bart, Natan, Tom, Norman, Tamara, Rink, Norman A., Vytiniotis, Dimitrios, Schaarschmidt, Michael
Large neural network models are commonly trained through a combination of advanced parallelism strategies in a single program, multiple data (SPMD) paradigm. For example, training large transformer models requires combining data, model, and pipeline partitioning with optimizer sharding techniques. However, identifying efficient combinations for many model architectures and accelerator systems requires significant manual analysis. In this work, we present an automatic partitioner that identifies these combinations through a goal-oriented search. Our key finding is that a Monte Carlo Tree Search-based partitioner that incorporates partition-specific compiler analysis directly into the search, together with guided goals, matches expert-level strategies for various models.
Robustness Analysis of Neural Networks via Efficient Partitioning: Theory and Applications in Control Systems
Everett, Michael, Habibi, Golnaz, How, Jonathan P.
Neural networks (NNs) are now routinely implemented on systems that must operate in uncertain environments, but the tools for formally analyzing how this uncertainty propagates to NN outputs are not yet commonplace. Computing tight bounds on NN output sets (given an input set) provides a measure of confidence associated with the NN decisions and is essential to deploy NNs on safety-critical systems. Recent works approximate the propagation of sets through nonlinear activations or partition the uncertainty set to provide a guaranteed outer bound on the set of possible NN outputs. However, the bound looseness causes excessive conservatism and/or the computation is too slow for online analysis. This paper unifies propagation and partition approaches to provide a family of robustness analysis algorithms that give tighter bounds than existing works for the same amount of computation time (or reduced computational effort for a desired accuracy level). Moreover, we provide new partitioning techniques that are aware of their current bound estimates and desired boundary shape (e.g., lower bounds, weighted $\ell_\infty$-ball, convex hull), leading to further improvements in the computation-tightness tradeoff. The paper demonstrates the tighter bounds and reduced conservatism of the proposed robustness analysis framework with examples from model-free RL and forward kinematics learning.
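To illustrate why partitioning the input set tightens output bounds, here is a minimal sketch that uses plain interval bound propagation with a uniform split along one input dimension. It is not the paper's family of algorithms; the helper names (`affine_bounds`, `network_bounds`, `partitioned_bounds`) and the two-layer ReLU network computing |x| are assumptions chosen for illustration.

```python
def affine_bounds(lower, upper, weights, biases):
    """Propagate an axis-aligned box through y = W x + b using interval arithmetic."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, biases):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:  # a negative weight swaps which endpoint gives the extreme
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each endpoint directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def network_bounds(lower, upper, layers):
    """Outer bound on the network output; ReLU follows every layer but the last."""
    for i, (weights, biases) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, biases)
        if i < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


def partitioned_bounds(lower, upper, layers, pieces):
    """Split the first input dimension uniformly and take the union of cell bounds."""
    width = (upper[0] - lower[0]) / pieces
    union_lo, union_hi = None, None
    for k in range(pieces):
        cell_lo = [lower[0] + k * width] + list(lower[1:])
        cell_hi = [lower[0] + (k + 1) * width] + list(upper[1:])
        lo, hi = network_bounds(cell_lo, cell_hi, layers)
        if union_lo is None:
            union_lo, union_hi = lo, hi
        else:
            union_lo = [min(a, b) for a, b in zip(union_lo, lo)]
            union_hi = [max(a, b) for a, b in zip(union_hi, hi)]
    return union_lo, union_hi


# Illustrative network: relu(x) + relu(-x), which equals |x|.
layers = [
    ([[1.0], [-1.0]], [0.0, 0.0]),  # hidden layer
    ([[1.0, 1.0]], [0.0]),          # output layer
]

loose = network_bounds([-1.0], [1.0], layers)
tight = partitioned_bounds([-1.0], [1.0], layers, pieces=2)
print(loose, tight)
```

On the input interval [-1, 1] the true output range is [0, 1]. The unpartitioned bound is [0, 2], while splitting the input into two cells recovers [0, 1], at the cost of propagating two boxes instead of one. This is the computation-tightness tradeoff the abstract refers to.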
Parallel K-Means Clustering With Reducer Function - DZone AI
In functional programming, a fold is a higher-order function, also known as reduce or accumulate, whose purpose is to reduce a given data structure, typically a sequence of elements, into a single value. For example, a reduction could return the average of a series of numbers, or compute a sum, a maximum, or a minimum. The fold function takes an initial value, commonly called the accumulator, which is used as a container for intermediate results. Its second argument is a binary function that acts as the reduction function: it is applied to the accumulator and each element in the sequence in turn, and returns the new value for the accumulator. In general, reduction works as follows.
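A minimal sketch of this pattern in Python (the language is an assumption; the article's own code is not shown here). The hand-written `fold` helper is illustrative and mirrors what the standard library's `functools.reduce` does when given an initial value.

```python
from functools import reduce


def fold(func, initial, sequence):
    """Left fold: thread an accumulator through the sequence, one element at a time."""
    accumulator = initial
    for element in sequence:
        accumulator = func(accumulator, element)
    return accumulator


numbers = [4, 8, 15, 16, 23, 42]

# Summation: the accumulator starts at 0.
total = fold(lambda acc, x: acc + x, 0, numbers)

# Maximum: seed the accumulator with the first element.
maximum = fold(lambda acc, x: x if x > acc else acc, numbers[0], numbers[1:])

# Average: the accumulator is a (running sum, count) pair.
running_sum, count = fold(lambda acc, x: (acc[0] + x, acc[1] + 1), (0, 0), numbers)
average = running_sum / count

# The standard library's reduce gives the same result as the hand-written fold.
assert total == reduce(lambda acc, x: acc + x, numbers, 0)

print(total, maximum, average)
```

Because the accumulator can be any value, including a tuple, the same fold can carry several intermediate results at once, as the average example does.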
Introducing Social Hash Partitioner, a scalable distributed hypergraph partitioner
As a single host has limited storage and compute resources, our storage systems shard data items over multiple hosts and our batch jobs execute over clusters of thousands of workers, to scale and speed up the computation. Our VLDB'17 paper, Social Hash Partitioner: A Scalable Distributed Hypergraph Partitioner, describes a new method for partitioning bipartite graphs while minimizing fan-out. We describe the resulting framework as a Social Hash Partitioner (SHP) because it can be used as the hypergraph partitioning component of the Social Hash framework introduced in our earlier NSDI'16 paper. The fan-out reduction model is applicable to many infrastructure optimization problems at Facebook, like data sharding, query routing and index compression.
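As a rough illustration of the fan-out objective, the sketch below assumes that the fan-out of a query is the number of distinct buckets holding the data items it touches, and averages that count over all queries. The function name `average_fanout` and the toy data are hypothetical and do not come from the SHP implementation.

```python
def average_fanout(queries, bucket_of):
    """Average number of distinct buckets each query must contact.

    queries:   dict mapping a query id to the data items it reads
    bucket_of: dict mapping each data item to its assigned bucket
    """
    if not queries:
        return 0.0
    total = 0
    for items in queries.values():
        total += len({bucket_of[item] for item in items})
    return total / len(queries)


# Toy bipartite graph: queries on one side, data items on the other.
queries = {
    "q1": ["a", "b"],
    "q2": ["b", "c"],
    "q3": ["c", "d"],
    "q4": ["a", "b", "c"],
}

# Two ways to place the four data items into two buckets.
interleaved = {"a": 0, "b": 1, "c": 0, "d": 1}
co_located = {"a": 0, "b": 0, "c": 1, "d": 1}

fanout_interleaved = average_fanout(queries, interleaved)
fanout_co_located = average_fanout(queries, co_located)
print(fanout_interleaved, fanout_co_located)
```

Placing items that are queried together in the same bucket lowers the average fan-out, which is the quantity the partitioner described above tries to minimize.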
Introducing Social Hash Partitioner, a scalable distributed hypergraph partitioner
Facebook serves billions of people each day. To support this scale, we distribute our workloads at many different levels. Our users' traffic is routed to one of several worldwide data centers to improve scalability, fault tolerance and latency. As a single host has limited storage and compute resources, our storage systems shard data items over multiple hosts and our batch jobs execute over clusters of thousands of workers, to scale and speed up the computation. At the heart of these systems is a fundamental decision on how to assign a set of elements (requests, data items, or jobs) to one of several groups (data centers, hosts, or workers).