Permutation Problem


SPECTra: Scalable Multi-Agent Reinforcement Learning with Permutation-Free Networks

Park, Hyunwoo, Seong, Baekryun, Ko, Sang-Ki

arXiv.org Artificial Intelligence

In cooperative multi-agent reinforcement learning (MARL), the permutation problem, in which the state space grows exponentially with the number of agents, reduces sample efficiency. Additionally, many existing architectures struggle with scalability, relying on a fixed structure tied to a specific number of agents, which limits their applicability to environments with a variable number of entities. While approaches such as graph neural networks (GNNs) and self-attention mechanisms have made progress on these challenges, they have significant limitations: dense GNNs and self-attention mechanisms incur high computational costs. To overcome these limitations, we propose a novel agent network and a non-linear mixing network that ensure permutation-equivariance and scalability, allowing them to generalize to environments with varying numbers of agents. Our agent network significantly reduces computational complexity, and our scalable hypernetwork enables efficient weight generation for non-linear mixing. Additionally, we introduce curriculum learning to improve training efficiency. Experiments on SMACv2 and Google Research Football (GRF) demonstrate that our approach achieves superior learning performance compared to existing methods. By addressing both permutation-invariance and scalability in MARL, our work provides a more efficient and adaptable framework for cooperative MARL. Our code is available at https://github.com/funny-rl/SPECTra.
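The permutation-equivariance property the abstract refers to can be illustrated with a minimal sketch: embed each agent's observation and pool over the agent axis, so the output does not depend on agent ordering. This is a generic Deep Sets-style construction for illustration only, not SPECTra's actual architecture; all names and shapes here are assumptions.

```python
import numpy as np

def encode_agents(obs, W):
    """Embed each agent's observation, then mean-pool over agents.

    Mean-pooling makes the summary invariant to any permutation of the
    agents (an illustrative Deep Sets-style construction; the paper's
    networks are more elaborate).
    """
    h = np.tanh(obs @ W)       # per-agent embedding, shape (n_agents, d)
    return h.mean(axis=0)      # order-independent summary, shape (d,)

rng = np.random.default_rng(0)
obs = rng.normal(size=(5, 8))  # 5 agents, 8-dim observations
W = rng.normal(size=(8, 16))

z = encode_agents(obs, W)
z_perm = encode_agents(obs[[3, 1, 4, 0, 2]], W)  # shuffle the agents
assert np.allclose(z, z_perm)  # same summary regardless of agent order
```

Because the pooled summary ignores agent order, the same weights also apply unchanged when the number of agents varies, which is the scalability property the abstract emphasizes.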


On Fitness Landscape Analysis of Permutation Problems: From Distance Metrics to Mutation Operator Selection

Cicirello, Vincent A.

arXiv.org Artificial Intelligence

In this paper, we explore the theory and expand upon the practice of fitness landscape analysis for optimization problems over the space of permutations. Many of the computational and analytical tools for fitness landscape analysis, such as fitness distance correlation, require identifying a distance metric for measuring the similarity of different solutions to the problem. We begin with a survey of the available distance metrics for permutations, and then use principal component analysis to classify these metrics. The result of this analysis aligns with existing classifications of permutation problem types produced through less formal means, including the A-permutation, R-permutation, and P-permutation types, which classify problems by whether the absolute positions of permutation elements, the relative positions of elements, or the general precedence of pairs of elements is the dominant influence on solution fitness. Additionally, the formal analysis identifies subtypes within these problem categories. We see that the classification can assist in identifying appropriate metrics, based on optimization problem features, for use in fitness landscape analysis. Using optimization problems of each class, we also demonstrate how the classification scheme can subsequently inform the choice of mutation operator within an evolutionary algorithm. From this, we present a classification of a variety of mutation operators as a counterpart to that of the metrics. Our implementations of the permutation metrics, permutation mutation operators, and associated evolutionary algorithm are available in a pair of open source Java libraries. All of the code necessary to recreate our analysis and experimental results is also available as open source.
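Two of the permutation metrics surveyed above can be sketched directly: Hamming distance captures absolute position (the A-permutation view), while Kendall tau distance counts inverted pairs and captures precedence (the P-permutation view). A minimal illustration, with function names chosen here for clarity:

```python
from itertools import combinations

def hamming_distance(p, q):
    """Positions where the two permutations disagree (absolute-position metric)."""
    return sum(a != b for a, b in zip(p, q))

def kendall_tau_distance(p, q):
    """Pairs of elements whose relative order differs (precedence metric)."""
    pos = {e: i for i, e in enumerate(q)}
    r = [pos[e] for e in p]  # p expressed in q's coordinates
    return sum(1 for i, j in combinations(range(len(r)), 2) if r[i] > r[j])

p, q = [0, 1, 2, 3], [1, 0, 2, 3]
hamming_distance(p, q)      # 2: elements 0 and 1 are both misplaced
kendall_tau_distance(p, q)  # 1: only the pair (0, 1) is inverted
```

The example shows why the classification matters: a single swap of adjacent elements costs 2 under Hamming but only 1 under Kendall tau, so the "right" metric depends on which structural feature drives fitness.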


Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds

Kinoshita, Keisuke, Delcroix, Marc, Tawara, Naohiro

arXiv.org Machine Learning

Recent diarization technologies can be categorized into two approaches, i.e., clustering and end-to-end neural approaches, which have different pros and cons. The clustering-based approaches assign speaker labels to speech regions by clustering speaker embeddings such as x-vectors. While this can be seen as a current state-of-the-art approach that works for various challenging data with reasonable robustness and accuracy, it has a critical disadvantage: it cannot handle overlapped speech, which is inevitable in natural conversational data. In contrast, end-to-end neural diarization (EEND), which directly predicts diarization labels using a neural network, was devised to handle overlapped speech. While EEND, which can easily incorporate emerging deep-learning technologies, has started outperforming the x-vector clustering approach on some realistic databases, it is difficult to make it work for "long" recordings (e.g., recordings longer than 10 minutes) because of, e.g., its huge memory consumption. Block-wise independent processing is also difficult because it poses an inter-block label permutation problem, i.e., an ambiguity of the speaker label assignments between blocks. In this paper, we propose a simple but effective hybrid diarization framework that works with overlapped speech and for long recordings containing an arbitrary number of speakers. It modifies the conventional EEND framework to simultaneously output global speaker embeddings, so that speaker clustering can be performed across blocks to solve the permutation problem. With experiments based on simulated noisy reverberant 2-speaker meeting-like data, we show that the proposed framework works significantly better than the original EEND, especially when the input data is long.
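The inter-block label permutation problem described above can be sketched concretely: given speaker embeddings from two adjacent blocks, find the one-to-one matching that relabels the current block consistently with the previous one. This is a simplified pairwise-matching sketch (the paper clusters embeddings globally across all blocks); the function name and the cosine cost are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_block(prev_embs, cur_embs):
    """Resolve the inter-block label permutation by optimally matching
    current-block speaker embeddings to previous-block ones.

    Returns perm such that current-block label c should be relabeled
    perm[c] to stay consistent with the previous block.
    """
    a = prev_embs / np.linalg.norm(prev_embs, axis=1, keepdims=True)
    b = cur_embs / np.linalg.norm(cur_embs, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T                      # cosine distance matrix
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one matching
    perm = np.empty(len(cols), dtype=int)
    perm[cols] = rows
    return perm

prev = np.array([[1.0, 0.0], [0.0, 1.0]])    # block-1 speaker embeddings
cur = np.array([[0.1, 1.0], [1.0, 0.1]])     # block 2: labels came out swapped
perm = align_block(prev, cur)                # perm == [1, 0]: undo the swap
```

Chaining this alignment (or, as in the paper, clustering all block-level embeddings jointly) yields globally consistent speaker labels across an arbitrarily long recording.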


Convex Relaxations for Permutation Problems

Fogel, Fajwel, Jenatton, Rodolphe, Bach, Francis, d'Aspremont, Alexandre

Neural Information Processing Systems

Seriation seeks to reconstruct a linear order between variables using unsorted similarity information. It has direct applications in archaeology and shotgun gene sequencing, for example. We prove the equivalence between seriation and the combinatorial 2-sum problem (a quadratic minimization problem over permutations) over a class of similarity matrices. The seriation problem can be solved exactly by a spectral algorithm in the noiseless case, and we produce a convex relaxation for the 2-sum problem to improve the robustness of solutions in a noisy setting. This relaxation also allows us to impose additional structural constraints on the solution, in order to solve semi-supervised seriation problems.
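The exact spectral algorithm mentioned for the noiseless case orders variables by the Fiedler vector (the eigenvector of the second-smallest eigenvalue) of the Laplacian of the similarity matrix. A minimal numpy sketch, with an illustrative similarity matrix whose entries decay with distance along a hidden order:

```python
import numpy as np

def spectral_seriation(S):
    """Order variables by the Fiedler vector of the Laplacian of the
    similarity matrix S; recovers the hidden order (up to reversal)
    when S is a noiseless Robinson-type similarity matrix.
    """
    L = np.diag(S.sum(axis=1)) - S
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = vecs[:, 1]            # eigenvector of 2nd-smallest eigenvalue
    return np.argsort(fiedler)

# Each item i has a hidden position pos[i]; similarity decays with distance.
pos = np.array([2.0, 0.0, 4.0, 1.0, 5.0, 3.0])
S = np.exp(-np.abs(pos[:, None] - pos[None, :]))
order = spectral_seriation(S)
# `order` (or its reverse) lists the items sorted by hidden position
```

The sign of an eigenvector is arbitrary, so the method recovers the linear arrangement only up to reversal, which is inherent to the seriation problem itself.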


Kernels of Mallows Models under the Hamming Distance for solving the Quadratic Assignment Problem

Arza, Etor, Perez, Aritz, Irurozki, Ekhine, Ceberio, Josu

arXiv.org Machine Learning

The Quadratic Assignment Problem (QAP) is a well-known permutation-based combinatorial optimization problem with real applications in industrial and logistics environments. Motivated by the challenge that this NP-hard problem represents, it has captured the attention of the evolutionary computation community for decades. As a result, a large number of algorithms have been proposed to optimize this problem. Among these, exact methods are only able to solve instances of size n ≤ 40, and thus many heuristic and metaheuristic methods have been applied to the QAP. In this work, we follow this direction by approaching the QAP through Estimation of Distribution Algorithms (EDAs). Particularly, a non-parametric distance-based exponential probabilistic model is used. Based on an analysis of the characteristics of the QAP, and on previous work in the area, we introduce Kernels of Mallows Models under the Hamming distance to the context of EDAs. The conducted experiments point out that the performance of the proposed algorithm on the QAP is superior both (i) to classical EDAs adapted to deal with the QAP, and (ii) to the specific EDAs proposed in the literature to deal with permutation problems.
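The distance-based exponential model at the heart of the abstract is the Mallows model: a permutation p gets unnormalized probability exp(-theta * d(p, center)) around a central permutation, with the Hamming distance as d; a kernel model averages such terms over several centers. A minimal sketch of the density and a toy sampler (the paper uses exact samplers within an EDA loop; the Metropolis sampler and all names here are illustrative):

```python
import math
import random

def hamming(p, q):
    """Number of positions where two permutations disagree."""
    return sum(a != b for a, b in zip(p, q))

def mallows_weight(p, center, theta):
    """Unnormalized Mallows probability exp(-theta * d(p, center))."""
    return math.exp(-theta * hamming(p, center))

def sample_near(center, theta, steps=200, rng=random):
    """Toy Metropolis sampler over permutations via random transpositions.
    Illustrative only; concentration around `center` tightens as theta grows."""
    p = list(center)
    for _ in range(steps):
        i, j = rng.randrange(len(p)), rng.randrange(len(p))
        q = list(p)
        q[i], q[j] = q[j], q[i]
        if rng.random() < mallows_weight(q, center, theta) / mallows_weight(p, center, theta):
            p = q
    return p

center = [0, 1, 2, 3, 4]
s = sample_near(center, theta=1.5)  # a permutation biased toward `center`
```

In an EDA, the centers would be the best solutions of the current population, so sampling the kernel model generates new candidate assignments concentrated around them.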


Multi-scenario deep learning for multi-speaker source separation

Zegers, Jeroen, Van hamme, Hugo

arXiv.org Machine Learning

Research in deep learning for multi-speaker source separation has received a boost in the last years. However, most studies are restricted to mixtures of a specific number of speakers, called a specific scenario. While some works have included experiments for different scenarios, research on combining data from different scenarios or creating a single model for multiple scenarios has been very rare. In this work, it is shown that data from a specific scenario is relevant for solving another scenario. Furthermore, it is concluded that a single model, trained on different scenarios, is capable of matching the performance of scenario-specific models.


Monaural Audio Speaker Separation with Source Contrastive Estimation

Stephenson, Cory, Callier, Patrick, Ganesh, Abhinav, Ni, Karl

arXiv.org Machine Learning

We propose an algorithm to separate simultaneously speaking persons from each other, the "cocktail party problem", using a single microphone. Our approach involves a deep recurrent neural network regression to a vector space that is descriptive of independent speakers. Such a vector space can embed empirically determined speaker characteristics and is optimized by distinguishing between speaker masks. We call this technique source-contrastive estimation. The methodology is inspired by negative sampling, which has seen success in natural language processing, where an embedding is learned by correlating and de-correlating a given input vector with output weights. Although the matrix determined by the output weights is dependent on a set of known speakers, we only use the input vectors during inference. Doing so ensures that source separation is explicitly speaker-independent. Our approach is similar to recent deep neural network clustering and permutation-invariant training research; we use weighted spectral features and masks to augment individual speaker frequencies while filtering out other speakers. We avoid, however, the severe computational burden of other approaches with our technique. Furthermore, by training a vector space rather than combinations of different speakers or differences thereof, we avoid the so-called permutation problem during training. Our algorithm offers an intuitive, computationally efficient response to the cocktail party problem, and most importantly boasts better empirical performance than other current techniques.
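The negative-sampling idea the abstract describes, correlating each embedding with its own speaker's vector and de-correlating it with the others, can be sketched as a simple logistic loss. This is a simplified illustration of the general mechanism, not the paper's exact objective; shapes, names, and the loss form are assumptions.

```python
import numpy as np

def source_contrastive_loss(emb, spk_vecs, labels):
    """Negative-sampling-style loss over time-frequency embeddings.

    emb:      (n_bins, d) embedding per time-frequency bin
    spk_vecs: (n_spk, d)  one vector per speaker
    labels:   (n_bins,)   dominant speaker of each bin

    Each bin's embedding is pushed toward its own speaker's vector
    (+1 sign) and away from the others (-1 sign).
    """
    logits = emb @ spk_vecs.T                     # (n_bins, n_spk)
    sign = -np.ones_like(logits)
    sign[np.arange(len(labels)), labels] = 1.0    # +1 own speaker, -1 others
    # logistic loss: mean of -log sigmoid(sign * logit)
    return np.mean(np.log1p(np.exp(-sign * logits)))

rng = np.random.default_rng(1)
emb = rng.normal(size=(10, 4))
spk = rng.normal(size=(2, 4))
labels = rng.integers(0, 2, size=10)
loss = source_contrastive_loss(emb, spk, labels)  # scalar to minimize by SGD
```

Minimizing such a loss shapes the embedding space so that bins from the same speaker cluster together, which is what lets inference use the input-side embeddings alone, independent of the training speakers.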