local expert
TimeExpert: Boosting Long Time Series Forecasting with Temporal Mix of Experts
Ma, Xiaowen, Ge, Shuning, Yang, Fan, Li, Xiangyu, Chen, Yun, Ma, Mengting, Zhang, Wei, Liu, Zhipeng
Transformer-based architectures dominate time series modeling by enabling global attention over all timestamps, yet their rigid 'one-size-fits-all' context aggregation fails to address two critical challenges in real-world data: (1) inherent lag effects, where the relevance of historical timestamps to a query varies dynamically; (2) anomalous segments, which introduce noisy signals that degrade forecasting accuracy. To resolve these problems, we propose the Temporal Mix of Experts (TMOE), a novel attention-level mechanism that reimagines key-value (K-V) pairs as local experts (each specialized in a distinct temporal context) and performs adaptive expert selection for each query via localized filtering of irrelevant timestamps. Complementing this local adaptation, a shared global expert preserves the Transformer's strength in capturing long-range dependencies. We then replace the vanilla attention mechanism in popular time-series Transformer frameworks (i.e., PatchTST and Timer) with TMOE, without extra structural modifications, yielding our specific version TimeExpert and general version TimeExpert-G. Extensive experiments on seven real-world long-term forecasting benchmarks demonstrate that TimeExpert and TimeExpert-G outperform state-of-the-art methods. Code is available at https://github.com/xwmaxwma/TimeExpert.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Data Science (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
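As a rough illustration of the idea in the abstract above (per-query selection among key-value "local experts" plus a shared global expert), the following Python sketch keeps only the highest-scoring keys for each query and always adds one global slot. The top-k selection rule, the mean-pooled global expert, and the names `filtered_attention` and `top_k` are assumptions made for illustration; they are not taken from the TimeExpert paper or repository.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def filtered_attention(query, keys, values, top_k):
    """Attend over the top_k best-matching K-V pairs plus one global slot."""
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    # Local experts: keep only the top_k highest-scoring timestamps (assumed rule).
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Global expert (assumed form): mean of all keys and values, always included.
    dim_k, dim_v = len(keys[0]), len(values[0])
    g_key = [sum(k[d] for k in keys) / len(keys) for d in range(dim_k)]
    g_val = [sum(v[d] for v in values) / len(values) for d in range(dim_v)]
    cand_scores = [scores[i] for i in kept] + [dot(query, g_key) / scale]
    cand_values = [values[i] for i in kept] + [g_val]
    weights = softmax(cand_scores)
    out = [sum(w * v[d] for w, v in zip(weights, cand_values)) for d in range(dim_v)]
    return out, kept
```

With `top_k` equal to the number of keys, the sketch reduces to ordinary softmax attention with one extra pooled slot; smaller values drop low-scoring timestamps for that query only.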
A Statistical Mixture-of-Experts Framework for EMG Artifact Removal in EEG: Empirical Insights and a Proof-of-Concept Application
Choi, Benjamin J., Milsap, Griffin, Scholl, Clara A., Tenore, Francesco, Ogg, Mattson
Effective control of neural interfaces is limited by poor signal quality. While neural network-based electroencephalography (EEG) denoising methods for electromyogenic (EMG) artifacts have improved in recent years, current state-of-the-art (SOTA) models perform suboptimally in settings with high noise. To address the shortcomings of current machine learning (ML)-based denoising algorithms, we present a signal filtration algorithm driven by a new mixture-of-experts (MoE) framework. Our algorithm leverages three new statistical insights into the EEG-EMG denoising problem: (1) EMG artifacts can be partitioned into quantifiable subtypes to aid downstream MoE classification, (2) local experts trained on narrower signal-to-noise ratio (SNR) ranges can achieve performance increases through specialization, and (3) correlation-based objective functions, in conjunction with rescaling algorithms, can enable faster convergence in a neural network-based denoising context. We empirically demonstrate these three insights into EMG artifact removal and use our findings to create a new downstream MoE denoising algorithm consisting of convolutional (CNN) and recurrent (RNN) neural networks. We tested all results on a major benchmark dataset (EEGdenoiseNet) collected from 67 subjects. We found that our MoE denoising model achieved competitive overall performance with SOTA ML denoising algorithms and superior lower bound performance in high noise settings. These preliminary results highlight the promise of our MoE framework for enabling advances in EMG artifact removal for EEG processing, especially in high noise settings. Further research and development will be necessary to assess our MoE framework on a wider range of real-world test cases and explore its downstream potential to unlock more effective neural interfaces.
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Maryland > Prince George's County > Laurel (0.04)
- Health & Medicine > Therapeutic Area (0.34)
- Health & Medicine > Health Care Technology (0.34)
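The third insight in the abstract above refers to correlation-based objective functions used together with rescaling. A minimal Python sketch of one such objective, assuming it takes the form of one minus the Pearson correlation (the abstract does not give the exact loss, and `correlation_loss` is a hypothetical name):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_loss(pred, target):
    """0 when pred is a positively scaled copy of target, 2 when inverted."""
    return 1.0 - pearson(pred, target)
```

Because the Pearson correlation ignores amplitude and offset, a loss of this kind cannot fix the output scale on its own, which is one plausible reason a separate rescaling step would be needed.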
Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models
Leoni, Jessica, Breschi, Valentina, Formentin, Simone, Tanelli, Mara
Over recent decades, advances in mechanics and electronics have led to the development of increasingly sophisticated systems with complex and multi-physics dynamics, exposing limitations in first principle-based representations [17]. Modeling these advanced systems purely based on domain knowledge may inadequately capture the overall system behavior, often necessitating the formulation of complex partial ... These approaches fall into four categories: physic-constrained, serial, parallel, and ensemble strategies. In the physic-constrained category, techniques either integrate physically meaningful features from first principles into ML models or explicitly include physical constraints, such as boundary conditions, into the loss function (see, e.g., the working principle of physics-informed neural networks (PINN)) [7,?].
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.66)
FedMS: Federated Learning with Mixture of Sparsely Activated Foundations Models
Wu, Panlong, Li, Kangshuo, Wang, Ting, Wang, Fangxin
Foundation models (FMs) have shown great success in natural language processing, computer vision, and multimodal tasks. FMs have a large number of model parameters, thus requiring a substantial amount of data to help optimize the model during training. Federated learning has revolutionized machine learning by enabling collaborative learning from decentralized data while still preserving the data privacy of clients. Despite the great benefits foundation models can gain when empowered by federated learning, they face severe computation, communication, and statistical challenges. In this paper, we propose a novel two-stage federated learning algorithm called FedMS. A global expert is trained in the first stage and a local expert is trained in the second stage to provide better personalization. We construct a Mixture of Foundation Models (MoFM) with these two experts and design a gate neural network with an inserted gate adapter that joins the aggregation every communication round in the second stage. To further adapt to edge computing scenarios with limited computational resources, we design a novel Sparsely Activated LoRA (SAL) algorithm that freezes the pre-trained foundation model parameters, inserts low-rank adaptation matrices into transformer blocks, and activates them progressively during training. We conduct extensive experiments to verify the effectiveness of FedMS; the results show that FedMS outperforms other SOTA baselines by up to 55.25% in default settings.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Hong Kong (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
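The low-rank adaptation step mentioned in the abstract above can be sketched as follows: the pre-trained weight stays frozen and a trainable low-rank product is added to its output. This is generic LoRA in plain Python; the `active` flag is only a stand-in for "activates them progressively", since the abstract does not describe the SAL activation schedule, and all names here are hypothetical.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(w_frozen, a, b, x, active=True, scale=1.0):
    """Compute y = W x + scale * B (A x).

    w_frozen: d_out x d_in pre-trained weight, never updated.
    a: r x d_in and b: d_out x r low-rank matrices, the only trained parts.
    active: whether this adapter is switched on (assumed mechanism).
    """
    y = matvec(w_frozen, x)
    if not active:
        return y
    delta = matvec(b, matvec(a, x))
    return [yi + scale * di for yi, di in zip(y, delta)]
```

With rank `r` much smaller than `d_in` and `d_out`, the adapter holds `r * (d_in + d_out)` trainable values instead of `d_in * d_out`, which is what makes it attractive on resource-limited clients.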
Gaussian Graphical Models as an Ensemble Method for Distributed Gaussian Processes
Jalali, Hamed, Kasneci, Gjergji
Distributed Gaussian process (DGP) is a popular approach for scaling GPs to big data: it divides the training data into subsets, performs local inference for each partition, and aggregates the results to obtain a global prediction. To combine the local predictions, the conditional independence assumption is used, which basically means that there is perfect diversity between the subsets. Although it keeps the aggregation tractable, it is often violated in practice and generally yields poor results. In this paper, we propose a novel approach for aggregating the Gaussian experts' predictions by a Gaussian graphical model (GGM), where the target aggregation is defined as an unobserved latent variable and the local predictions are the observed variables. We first estimate the joint distribution of latent and observed variables using the Expectation-Maximization (EM) algorithm. The interaction between experts can be encoded by the precision matrix of the joint distribution, and the aggregated predictions are obtained based on the property of the conditional Gaussian distribution. Using both synthetic and real datasets, our experimental evaluations illustrate that our new method outperforms other state-of-the-art DGP approaches.
- North America > United States (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
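For reference, the "property of conditional Gaussian distribution" that the abstract above relies on is the standard conditioning identity. The notation is an assumption for illustration: $z$ stands for the latent aggregated prediction and $y$ for the vector of observed local predictions, jointly Gaussian with mean $(\mu_z, \mu_y)$, covariance blocks $\Sigma$, and precision blocks $\Lambda = \Sigma^{-1}$.

```latex
\begin{aligned}
z \mid y &\sim \mathcal{N}\!\left(\mu_{z \mid y},\; \Sigma_{z \mid y}\right), \\
\mu_{z \mid y} &= \mu_z + \Sigma_{zy}\Sigma_{yy}^{-1}(y - \mu_y)
               = \mu_z - \Lambda_{zz}^{-1}\Lambda_{zy}(y - \mu_y), \\
\Sigma_{z \mid y} &= \Sigma_{zz} - \Sigma_{zy}\Sigma_{yy}^{-1}\Sigma_{yz}
                  = \Lambda_{zz}^{-1}.
\end{aligned}
```

The precision form makes the role of the precision matrix explicit: the off-diagonal block $\Lambda_{zy}$ determines how much each expert's prediction shifts the aggregate.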
Aggregating Dependent Gaussian Experts in Local Approximation
Jalali, Hamed, Kasneci, Gjergji
Distributed Gaussian processes (DGPs) are prominent local approximation methods to scale Gaussian processes (GPs) to large datasets. Instead of a global estimation, they train local experts by dividing the training set into subsets, thus reducing the time complexity. This strategy is based on the conditional independence assumption, which basically means that there is a perfect diversity between the local experts. In practice, however, this assumption is often violated, and the aggregation of experts leads to sub-optimal and inconsistent solutions. In this paper, we propose a novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence. The dependency between experts is determined by using a Gaussian graphical model, which yields the precision matrix. The precision matrix encodes conditional dependencies between experts and is used to detect strongly dependent experts and construct an improved aggregation. Using both synthetic and real datasets, our experimental evaluations illustrate that our new method outperforms other state-of-the-art (SOTA) DGP approaches while being substantially more time-efficient than SOTA approaches, which build on independent experts.
- North America > United States (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Lebanon (0.04)
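To make the conditional independence assumption in the abstract above concrete, here is a minimal Python sketch of one common baseline aggregation, the product of experts, at a single test point. The abstract does not say which aggregation rule the authors build on, so this choice and the name `poe_aggregate` are assumptions.

```python
def poe_aggregate(means, variances):
    """Combine Gaussian expert predictions, treating experts as independent.

    Precisions (inverse variances) add up, and the aggregated mean is the
    precision-weighted average of the expert means.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    return mean, 1.0 / total_precision
```

Two experts that return the same prediction are counted twice under this rule, so the aggregated variance is halved even though no new information was added; this is the kind of overconfidence that dependent experts cause.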
Ensemble of Sparse Gaussian Process Experts for Implicit Surface Mapping with Streaming Data
Stork, Johannes A., Stoyanov, Todor
Creating maps is an essential task in robotics and provides the basis for effective planning and navigation. In this paper, we learn a compact and continuous implicit surface map of an environment from a stream of range data with known poses. For this, we create and incrementally adjust an ensemble of approximate Gaussian process (GP) experts which are each responsible for a different part of the map. Instead of inserting all arriving data into the GP models, we greedily trade off model complexity against prediction error. Our algorithm therefore uses fewer resources on areas with few geometric features and more where the environment is rich in variety. We evaluate our approach on synthetic and real-world data sets and analyze sensitivity to parameters and measurement noise. The results show that we can learn compact and accurate implicit surface models under different conditions, with a performance comparable to or better than that of exact GP regression with subsampled data.
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- Europe > Sweden (0.04)
- Asia > Japan > Honshū > Chūbu > Aichi Prefecture > Nagoya (0.04)
Using machine learning to accelerate ecological research
Pushmeet Kohli
The Serengeti is one of the last remaining sites in the world that hosts an intact community of large mammals. These animals roam over vast swaths of land, some migrating thousands of miles across multiple countries following seasonal rainfall. As human encroachment around the park becomes more intense, these species are forced to alter their behaviours in order to survive. Increasing agriculture, poaching, and climate abnormalities contribute to changes in animal behaviours and population dynamics, but these changes have occurred at spatial and temporal scales which are difficult to monitor using traditional research methods. There is a great urgency to understand how these animal communities function as human pressures grow, both in order to understand the dynamics of these last pristine ecosystems, and to formulate effective management plans to conserve and protect the integrity of this unique biodiversity hotspot.
Towards Collaborative Conceptual Exploration
In domains with high knowledge distribution, a natural objective is to create principled foundations for collaborative interactive learning environments. We present a first mathematical characterization of a collaborative learning group, a consortium, based on closure systems of attribute sets and the well-known attribute exploration algorithm from formal concept analysis. To this end, we introduce (weak) local experts for subdomains of a given knowledge domain. These entities are able to refute and potentially accept a given (implicational) query for some closure system that is a restriction of the whole domain. On this basis we build up a consortial expert and show first insights about the ability of such an expert to answer queries. Furthermore, we describe techniques for coping with falsely accepted implications and for combining counterexamples. Using notions from combinatorial design theory, we further expand those insights, providing first results on the decidability problem of whether a given consortium is able to explore some target domain. Applications in conceptual knowledge acquisition as well as in collaborative interactive ontology learning are at hand.
- Oceania > Australia (0.04)
- North America > United States > New York > Montgomery County > Amsterdam (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)