Calgary
Multichannel-to-Multichannel Target Sound Extraction Using Direction and Timestamp Clues
We propose a multichannel-to-multichannel target sound extraction (M2M-TSE) framework for separating multichannel target signals from a multichannel mixture of sound sources. Target sound extraction (TSE) isolates a specific target signal using user-provided clues, typically focusing on single-channel extraction with class labels or temporal activation maps. However, to preserve and utilize spatial information in multichannel audio signals, it is essential to extract multichannel signals of a target sound source. Moreover, the clue for extraction can also include spatial or temporal cues like direction-of-arrival (DoA) or timestamps of source activation. To address these challenges, we present an M2M framework that extracts a multichannel sound signal based on spatio-temporal clues. We demonstrate that our transformer-based architecture can successively accomplish the M2M-TSE task for multichannel signals synthesized from audio signals of diverse classes in different room environments. Furthermore, we show that the multichannel extraction task introduces sufficient inductive bias in the DNN, allowing it to directly handle DoA clues without utilizing hand-crafted spatial features.
Steinmetz Neural Networks for Complex-Valued Data
Venkatasubramanian, Shyam, Pezeshki, Ali, Tarokh, Vahid
In this work, we introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs. Our proposed class of architectures, referred to as Steinmetz Neural Networks, leverages multi-view learning to construct more interpretable representations within the latent space. Subsequently, we present the Analytic Neural Network, which implements a consistency penalty that encourages analytic signal representations in the Steinmetz neural network's latent space. This penalty enforces a deterministic and orthogonal relationship between the real and imaginary components. Utilizing an information-theoretic construction, we demonstrate that the upper bound on the generalization error posited by the analytic neural network is lower than that of the general class of Steinmetz neural networks. Our numerical experiments demonstrate the improved performance and robustness to additive noise, afforded by our proposed networks on benchmark datasets and synthetic examples.
Towards Multi-view Graph Anomaly Detection with Similarity-Guided Contrastive Clustering
Zheng, Lecheng, Birge, John R., Zhang, Yifang, He, Jingrui
Anomaly detection on graphs plays an important role in many real-world applications. Usually, these data are composed of multiple types (e.g., user information and transaction records for financial data), thus exhibiting view heterogeneity. Therefore, it can be challenging to leverage such multi-view information and learn the graph's contextual information to identify rare anomalies. To tackle this problem, many deep learning-based methods utilize contrastive learning loss as a regularization term to learn good representations. However, many existing contrastive-based methods show that traditional contrastive learning losses fail to consider the semantic information (e.g., class membership information). In addition, we theoretically show that clustering-based contrastive learning also easily leads to a sub-optimal solution. To address these issues, in this paper, we proposed an autoencoder-based clustering framework regularized by a similarity-guided contrastive loss to detect anomalous nodes. Specifically, we build a similarity map to help the model learn robust representations without imposing a hard margin constraint between the positive and negative pairs. Theoretically, we show that the proposed similarity-guided loss is a variant of contrastive learning loss, and how it alleviates the issue of unreliable pseudo-labels with the connection to graph spectral clustering. Experimental results on several datasets demonstrate the effectiveness and efficiency of our proposed framework.
Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption
Lai, Zhizheng, Zhou, Yufei, Zheng, Peijia, Chen, Lin
The recently proposed Kolmogorov-Arnold Networks (KANs) offer enhanced interpretability and greater model expressiveness. However, KANs also present challenges related to privacy leakage during inference. Homomorphic encryption (HE) facilitates privacy-preserving inference for deep learning models, enabling resource-limited users to benefit from deep learning services while ensuring data security. Yet, the complex structure of KANs, incorporating nonlinear elements like the SiLU activation function and B-spline functions, renders existing privacy-preserving inference techniques inadequate. To address this issue, we propose an accurate and efficient privacy-preserving inference scheme tailored for KANs. Our approach introduces a task-specific polynomial approximation for the SiLU activation function, dynamically adjusting the approximation range to ensure high accuracy on real-world datasets. Additionally, we develop an efficient method for computing B-spline functions within the HE domain, leveraging techniques such as repeat packing, lazy combination, and comparison functions. We evaluate the effectiveness of our privacy-preserving KAN inference scheme on both symbolic formula evaluation and image classification. The experimental results show that our model achieves accuracy comparable to plaintext KANs across various datasets and outperforms plaintext MLPs. Additionally, on the CIFAR-10 dataset, our inference latency achieves over 7 times speedup compared to the naive method.
Probabilistic Spatiotemporal Modeling of Day-Ahead Wind Power Generation with Input-Warped Gaussian Processes
Wind power is one of the fastest-growing renewable energy sectors and a key pillar for the transition to a carbon-free economy. In 2023, energy from wind accounted for 10.2% of all U.S. utility-scale electricity generation [54]. Being intrinsically weather-driven, wind power injects uncertainty into the balancing of power demand and generation. On the daily operational time scale, quantifying the asset-specific and area-wide uncertainty of renewable generation for the next day is an essential ingredient of grid management. Specifically, grid operators need probabilistic spatiotemporal forecasting of wind power in order to appropriately set grid reserves, ensure grid stability, and optimize dispatch of grid resources. Our goal is to develop a statistical framework for short-term wind power generation simulations across space and time. This project is motivated by working with a large dataset of wind generation in the Electric Reliability Council of Texas (ERCOT) region and is geared to the concrete practical concerns faced by electricity grid operators. We refer to our team's related publications [8, 7, 52, 38] that employ similar simulations for various downstream risk management tasks; other use cases are discussed, among others, in [27, 33, 35, 58].
Time-Distributed Feature Learning for Internet of Things Network Traffic Classification
Manjunath, Yoga Suhas Kuruba, Zhao, Sihao, Zhang, Xiao-Ping, Zhao, Lian
Deep learning-based network traffic classification (NTC) techniques, including conventional and class-of-service (CoS) classifiers, are a popular tool that aids in the quality of service (QoS) and radio resource management for the Internet of Things (IoT) network. Holistic temporal features consist of inter-, intra-, and pseudo-temporal features within packets, between packets, and among flows, providing the maximum information on network services without depending on defined classes in a problem. Conventional spatio-temporal features in the current solutions extract only space and time information between packets and flows, ignoring the information within packets and flow for IoT traffic. Therefore, we propose a new, efficient, holistic feature extraction method for deep-learning-based NTC using time-distributed feature learning to maximize the accuracy of the NTC. We apply a time-distributed wrapper on deep-learning layers to help extract pseudo-temporal features and spatio-temporal features. Pseudo-temporal features are mathematically complex to explain since, in deep learning, a black box extracts them. However, the features are temporal because of the time-distributed wrapper; therefore, we call them pseudo-temporal features. Since our method is efficient in learning holistic-temporal features, we can extend our method to both conventional and CoS NTC. Our solution proves that pseudo-temporal and spatial-temporal features can significantly improve the robustness and performance of any NTC. We analyze the solution theoretically and experimentally on different real-world datasets. The experimental results show that the holistic-temporal time-distributed feature learning method, on average, is 13.5% more accurate than the state-of-the-art conventional and CoS classifiers.
Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data
Timonen, Juho, Lähdesmäki, Harri
Gaussian process (GP) models that combine both categorical and continuous input variables have found use in longitudinal data analysis of and computer experiments. However, standard inference for these models has the typical cubic scaling, and common scalable approximation schemes for GPs cannot be applied since the covariance function is non-continuous. In this work, we derive a basis function approximation scheme for mixed-domain covariance functions, which scales linearly with respect to the number of observations and total number of basis functions. The proposed approach is naturally applicable to also Bayesian GP regression with discrete observation models. We demonstrate the scalability of the approach and compare model reduction techniques for additive GP models in a longitudinal data context. We confirm that we can approximate the exact GP model accurately in a fraction of the runtime compared to fitting the corresponding exact model. In addition, we demonstrate a scalable model reduction workflow for obtaining smaller and more interpretable models when dealing with a large number of candidate predictors.
Task-Oriented Communication for Graph Data: A Graph Information Bottleneck Approach
Li, Shujing, Wang, Yanhu, Guo, Shuaishuai, Feng, Chenyuan
Graph data, essential in fields like knowledge representation and social networks, often involves large networks with many nodes and edges. Transmitting these graphs can be highly inefficient due to their size and redundancy for specific tasks. This paper introduces a method to extract a smaller, task-focused subgraph that maintains key information while reducing communication overhead. Our approach utilizes graph neural networks (GNNs) and the graph information bottleneck (GIB) principle to create a compact, informative, and robust graph representation suitable for transmission. The challenge lies in the irregular structure of graph data, making GIB optimization complex. We address this by deriving a tractable variational upper bound for the objective function. Additionally, we propose the VQ-GIB mechanism, integrating vector quantization (VQ) to convert subgraph representations into a discrete codebook sequence, compatible with existing digital communication systems. Our experiments show that this GIB-based method significantly lowers communication costs while preserving essential task-related information. The approach demonstrates robust performance across various communication channels, suitable for both continuous and discrete systems.
Fair Railway Network Design
He, Zixu, Botan, Sirin, Lang, Jérôme, Saffidine, Abdallah, Sikora, Florian, Workman, Silas
When designing a public transportation network in a country, one may want to minimise the sum of travel duration of all inhabitants. This corresponds to a purely utilitarian view and does not involve any fairness consideration, as the resulting network will typically benefit the capital city and/or large central cities while leaving some peripheral cities behind. On the other hand, a more egalitarian view will allow some people to travel between peripheral cities without having to go through a central city. We define a model, propose algorithms for computing solution networks, and report on experiments based on real data.
Optimal L-Systems for Stochastic L-system Inference Problems
This paper presents two novel theorems that address two open problems in stochastic Lindenmayer-system (L-system) inference, specifically focusing on the construction of an optimal stochastic L-system capable of generating a given sequence of strings. The first theorem delineates a method for crafting a stochastic L-system that maximizes the likelihood of producing a given sequence of words through a singular derivation. Furthermore, the second theorem determines the stochastic L-systems with the highest probability of producing a given sequence of words with multiple possible derivations. From these, we introduce an algorithm to infer an optimal stochastic L-system from a given sequence. This algorithm incorporates sophisticated optimization techniques, such as interior point methods, ensuring production of a stochastically optimal stochastic L-system suitable for generating the given sequence. This allows for the use of using stochastic L-systems as model for machine learning using only positive data for training.