specialisation
Association-sensory spatiotemporal hierarchy and functional gradient-regularised recurrent neural network with implications for schizophrenia
Abulikemu, Subati, Radmard, Puria, Mamalakis, Michail, Suckling, John
The human neocortex is functionally organised at its highest level along a continuous sensory-to-association (AS) hierarchy. This study characterises the AS hierarchy of patients with schizophrenia in a comparison with controls. Using a large fMRI dataset (N=355), we extracted individual AS gradients via spectral analysis of brain connectivity, quantified hierarchical specialisation by gradient spread, and related this spread with connectivity geometry. We found that schizophrenia compresses the AS hierarchy indicating reduced functional differentiation. By modelling neural timescale with the Ornstein-Uhlenbeck process, we observed that the most specialised, locally cohesive regions at the gradient extremes exhibit dynamics with a longer time constant, an effect that is attenuated in schizophrenia. To study computation, we used the gradients to regularise subject-specific recurrent neural networks (RNNs) trained on working memory tasks. Networks endowed with greater gradient spread learned more efficiently, plateaued at lower task loss, and maintained stronger alignment to the prescribed AS hierarchical geometry. Fixed point linearisation showed that high-range networks settled into more stable neural states during memory delay, evidenced by lower energy and smaller maximal Jacobian eigenvalues. This gradient-regularised RNN framework therefore links large-scale cortical architecture with fixed point stability, providing a mechanistic account of how gradient de-differentiation could destabilise neural computations in schizophrenia, convergently supported by empirical timescale flattening and model-based evidence of less stable fixed points.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States (0.14)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Adaptive Heterogeneous Mixtures of Normalising Flows for Robust Variational Inference
Wiriyapong, Benjamin, Karakuş, Oktay, Sidorov, Kirill
Normalising-flow variational inference (VI) can approximate complex posteriors, yet single-flow models often behave inconsistently across qualitatively different distributions. We propose Adaptive Mixture Flow Variational Inference (AMF-VI), a heterogeneous mixture of complementary flows (MAF, Re-alNVP, RBIG) trained in two stages: (i) sequential expert training of individual flows, and (ii) adaptive global weight estimation via likelihood-driven updates, without per-sample gating or architectural changes. Evaluated on six canonical posterior families of banana, X-shape, two-moons, rings, a bimodal, and a five-mode mixture, AMF-VI achieves consistently lower negative log-likelihood than each single-flow baseline and delivers stable gains in transport metrics (Wasserstein-2) and maximum mean discrepancy (MDD), indicating improved robustness across shapes and modalities. The procedure is efficient and architecture-agnostic, incurring minimal overhead relative to standard flow training, and demonstrates that adaptive mixtures of diverse flows provide a reliable route to robust VI across diverse posterior families whilst preserving each expert's inductive bias.
- Europe > United Kingdom (0.04)
- Asia > Middle East > Jordan (0.04)
A Theory of Initialisation's Impact on Specialisation
Jarvis, Devon, Lee, Sebastian, Dominé, Clémentine Carla Juliette, Saxe, Andrew M, Mannelli, Stefano Sarao
Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights in the context of continual learning, first showing the emergence of a monotonic relation between task-similarity and forgetting in non-specialised networks. {Finally, we show that specialization by weight imbalance is beneficial on the commonly employed elastic weight consolidation regularisation technique.
A case for specialisation in non-human entities
El-Mhamdi, El-Mahdi, Hoang, Lê-Nguyên, Tighanimine, Mariame
With the rise of large multi-modal AI models, fuelled by recent interest in large language models (LLMs), the notion of artificial general intelligence (AGI) went from being restricted to a fringe community, to dominate mainstream large AI development programs. In contrast, in this paper, we make a \emph{case for specialisation}, by reviewing the pitfalls of generality and stressing the industrial value of specialised systems. Our contribution is threefold. First, we review the most widely accepted arguments \emph{against} specialisation, and discuss how their relevance in the context of human labour is actually an argument \emph{for} specialisation in the case of non human agents, be they algorithms or human organisations. Second, we propose four arguments \emph{in favor of} specialisation, ranging from machine learning robustness, to computer security, social sciences and cultural evolution. Third, we finally make a case for \emph{specification}, discuss how the machine learning approach to AI has so far failed to catch up with good practices from safety-engineering and formal verification of software, and discuss how some emerging good practices in machine learning help reduce this gap. In particular, we justify the need for \emph{specified governance} for hard-to-specify systems.
- North America > Canada > Ontario (0.14)
- Europe > United Kingdom > England (0.14)
- Oceania > Australia (0.14)
- (5 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- (3 more...)
Efficient rule induction by ignoring pointless rules
Cropper, Andrew, Cerna, David M.
The goal of inductive logic programming (ILP) is to find a set of logical rules that generalises training examples and background knowledge. We introduce an ILP approach that identifies pointless rules. A rule is pointless if it contains a redundant literal or cannot discriminate against negative examples. We show that ignoring pointless rules allows an ILP system to soundly prune the hypothesis space. Our experiments on multiple domains, including visual reasoning and game playing, show that our approach can reduce learning times by 99% whilst maintaining predictive accuracies.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- (8 more...)
SEKE: Specialised Experts for Keyword Extraction
Martinc, Matej, Tran, Hanh Thi Hong, Pollak, Senja, Koloski, Boshko
Keyword extraction involves identifying the most descriptive words in a document, allowing automatic categorisation and summarisation of large quantities of diverse textual data. Relying on the insight that real-world keyword detection often requires handling of diverse content, we propose a novel supervised keyword extraction approach based on the mixture of experts (MoE) technique. MoE uses a learnable routing sub-network to direct information to specialised experts, allowing them to specialize in distinct regions of the input space. SEKE, a mixture of Specialised Experts for supervised Keyword Extraction, uses DeBERTa as the backbone model and builds on the MoE framework, where experts attend to each token, by integrating it with a recurrent neural network (RNN), to allow successful extraction even on smaller corpora, where specialisation is harder due to lack of training data. The MoE framework also provides an insight into inner workings of individual experts, enhancing the explainability of the approach. We benchmark SEKE on multiple English datasets, achieving state-of-the-art performance compared to strong supervised and unsupervised baselines. Our analysis reveals that depending on data size and type, experts specialize in distinct syntactic and semantic components, such as punctuation, stopwords, parts-of-speech, or named entities. Code is available at: https://github.com/matejMartinc/SEKE_keyword_extraction
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan (0.04)
- North America > Canada (0.04)
- (14 more...)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
LISTN: Lexicon induction with socio-temporal nuance
In-group language is an important signifier of group dynamics. This paper proposes a novel method for inducing lexicons of in-group language, which incorporates its socio-temporal context. Existing methods for lexicon induction do not capture the evolving nature of in-group language, nor the social structure of the community. Using dynamic word and user embeddings trained on conversations from online anti-women communities, our approach outperforms prior methods for lexicon induction. We develop a test set for the task of lexicon induction and a new lexicon of manosphere language, validated by human experts, which quantifies the relevance of each term to a specific sub-community at a given point in time. Finally, we present novel insights on in-group language which illustrate the utility of this approach.
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
- Oceania > Australia (0.04)
- North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
- Law Enforcement & Public Safety (0.93)
- Health & Medicine (0.68)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
Tessera, Kale-ab Abebe, Rahman, Arrasy, Albrecht, Stefano V.
Balancing individual specialisation and shared behaviours is a critical challenge in multi-agent reinforcement learning (MARL). Existing methods typically focus on encouraging diversity or leveraging shared representations. Full parameter sharing (FuPS) improves sample efficiency but struggles to learn diverse behaviours when required, while no parameter sharing (NoPS) enables diversity but is computationally expensive and sample inefficient. To address these challenges, we introduce HyperMARL, a novel approach using hypernetworks to balance efficiency and specialisation. HyperMARL generates agent-specific actor and critic parameters, enabling agents to adaptively exhibit diverse or homogeneous behaviours as needed, without modifying the learning objective or requiring prior knowledge of the optimal diversity. Furthermore, HyperMARL decouples agent-specific and state-based gradients, which empirically correlates with reduced policy gradient variance, potentially offering insights into its ability to capture diverse behaviours. Across MARL benchmarks requiring homogeneous, heterogeneous, or mixed behaviours, HyperMARL consistently matches or outperforms FuPS, NoPS, and diversity-focused methods, achieving NoPS-level diversity with a shared architecture. These results highlight the potential of hypernetworks as a versatile approach to the trade-off between specialisation and shared behaviours in MARL.
- South America > Brazil > São Paulo (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York (0.04)
- (2 more...)
Learning logic programs by finding minimal unsatisfiable subprograms
Cropper, Andrew, Hocquette, Céline
The goal of inductive logic programming (ILP) is to search for a logic program that generalises training examples and background knowledge. We introduce an ILP approach that identifies minimal unsatisfiable subprograms (MUSPs). We show that finding MUSPs allows us to efficiently and soundly prune the search space. Our experiments on multiple domains, including program synthesis and game playing, show that our approach can reduce learning times by 99%.
Learning big logical rules by joining small rules
Hocquette, Céline, Niskanen, Andreas, Morel, Rolf, Järvisalo, Matti, Cropper, Andrew
A major challenge in inductive logic programming is learning big rules. To address this challenge, we introduce an approach where we join small rules to learn big rules. We implement our approach in a constraint-driven system and use constraint solvers to efficiently join rules. Our experiments on many domains, including game playing and drug design, show that our approach can (i) learn rules with more than 100 literals, and (ii) drastically outperform existing approaches in terms of predictive accuracies.
- Europe > Finland > Uusimaa > Helsinki (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Bulgaria > Plovdiv Province > Plovdiv (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)