 self-organizing map


Novel sparse matrix algorithm expands the feasible size of a self-organizing map of the knowledge indexed by a database of peer-reviewed medical literature

Amos, Andrew, Lee, Joanne, Gupta, Tarun Sen, Malau-Aduli, Bunmi S.

arXiv.org Artificial Intelligence

Past efforts to map the Medline database have been limited to small subsets of the available data because of the exponentially increasing memory and processing demands of existing algorithms. We designed a novel algorithm for sparse matrix multiplication that allowed us to apply a self-organizing map to the entire Medline dataset, yielding a more complete map of existing medical knowledge. The algorithm also makes it more feasible to refine the self-organizing map as the dataset changes over time.


Saturation Self-Organizing Map

Urbanik, Igor, Gajewski, Paweł

arXiv.org Artificial Intelligence

Intelligent agents navigating real-world environments must continuously learn, adapting to new information while retaining prior knowledge [14]. This ability, known as continual or lifelong learning, poses a significant challenge in modern machine learning. Most artificial neural systems struggle with catastrophic forgetting [5], where training on new tasks or data distributions abruptly erases previously learned information. This phenomenon stems from the shared nature of representations in standard neural networks, where updating weights for new data can overwrite information critical for past tasks. Overcoming catastrophic forgetting is crucial for developing robust, adaptable systems that can learn incrementally from data streams, rather than being retrained from scratch. Numerous approaches have been proposed to mitigate catastrophic forgetting, ranging from regularization techniques and memory replay to architectural modifications. However, many state-of-the-art solutions, despite showing promising results, require substantial changes to model structure or training procedures. This often limits their compatibility with widely used and well-understood machine learning frameworks.


torchsom: The Reference PyTorch Library for Self-Organizing Maps

Berthier, Louis, Shokry, Ahmed, Moreaud, Maxime, Ramelet, Guillaume, Moulines, Eric

arXiv.org Machine Learning

This paper introduces torchsom, an open-source Python library that provides a reference implementation of the Self-Organizing Map (SOM) in PyTorch. The package offers three main features: (i) dimensionality reduction, (ii) clustering, and (iii) user-friendly data visualization. It relies on a PyTorch backend, enabling (i) fast and efficient training of SOMs through GPU acceleration, and (ii) easy and scalable integration with the PyTorch ecosystem. Moreover, torchsom follows the scikit-learn API for ease of use and extensibility.


Class Incremental Continual Learning with Self-Organizing Maps and Variational Autoencoders Using Synthetic Replay

Thapa, Pujan, Ororbia, Alexander, Desell, Travis

arXiv.org Artificial Intelligence

This work introduces a novel generative continual learning framework based on self-organizing maps (SOMs) and variational autoencoders (VAEs) to enable memory-efficient replay, eliminating the need to store raw data samples or task labels. For high-dimensional input spaces, such as those of CIFAR-10 and CIFAR-100, we design a scheme where the SOM operates over the latent space learned by a VAE, whereas for lower-dimensional inputs, such as those found in MNIST and FashionMNIST, the SOM operates in a standalone fashion. Our method stores a running mean, variance, and covariance for each SOM unit, from which synthetic samples are then generated during future learning iterations. For the VAE-based method, generated samples are fed through the decoder before being used in subsequent replay. Experimental results on standard class-incremental benchmarks show that our approach performs competitively with state-of-the-art memory-based methods and outperforms memory-free methods, notably improving over the best state-of-the-art single class incremental performance on CIFAR-10 and CIFAR-100 by nearly 10% and 7%, respectively. Our methodology further facilitates easy visualization of the learning process and can also be utilized as a generative model post-training. Results show our method's capability as a scalable, task-label-free, and memory-efficient solution for continual learning.
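The per-unit running statistics mentioned in this abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes Welford's online update, tracks only a per-dimension variance (the full covariance the abstract mentions is omitted for brevity), and draws synthetic samples from an independent Gaussian per dimension. The class name `UnitStats` and its methods are hypothetical.

```python
import random


class UnitStats:
    """Running mean and per-dimension variance for one SOM unit (hypothetical helper)."""

    def __init__(self, dim):
        self.n = 0                 # number of samples assigned to this unit so far
        self.mean = [0.0] * dim    # running mean per dimension
        self.m2 = [0.0] * dim      # running sum of squared deviations per dimension

    def update(self, x):
        """Fold one sample into the statistics (Welford's online algorithm)."""
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def variance(self):
        """Unbiased sample variance per dimension; zero until two samples are seen."""
        if self.n < 2:
            return [0.0] * len(self.mean)
        return [m / (self.n - 1) for m in self.m2]

    def sample(self, rng):
        """Draw one synthetic sample, assuming independent Gaussian dimensions."""
        return [rng.gauss(mu, var ** 0.5)
                for mu, var in zip(self.mean, self.variance())]


# Usage: accumulate the samples mapped to a unit, then replay a synthetic one.
stats = UnitStats(dim=2)
for point in ([1.0, 2.0], [3.0, 6.0], [5.0, 10.0]):
    stats.update(point)
replayed = stats.sample(random.Random(0))
```

Because only the count, mean, and squared-deviation sums are kept, the memory per unit stays constant no matter how many samples are mapped to it, which is the property that makes replay possible without storing raw data.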


A new classification system of beer categories and styles based on large-scale data mining and self-organizing maps of beer recipes

Bonatto, Diego

arXiv.org Artificial Intelligence

A data-driven quantitative approach was used to develop a novel classification system for beer categories and styles. A total of 62,121 beer recipes were mined and analyzed, considering ingredient profiles, fermentation parameters, and recipe vital statistics. Statistical analyses combined with self-organizing maps (SOMs) identified four major superclusters that showed distinctive malt and hop usage patterns, style characteristics, and historical brewing traditions. Cold fermented styles showed a conservative grain and hop composition, whereas hot fermented beers exhibited high heterogeneity, reflecting regional preferences and innovation. This new taxonomy offers a reproducible and objective framework beyond traditional sensory-based classifications, providing brewers, researchers, and educators with a scalable tool for recipe analysis and beer development. The findings in this work provide an understanding of beer diversity and open avenues for linking ingredient usage with fermentation profiles and flavor outcomes.


Enhancing Math Learning in an LMS Using AI-Driven Question Recommendations

Råmunddal, Justus

arXiv.org Artificial Intelligence

This paper presents an AI-driven approach to enhance math learning in a modern Learning Management System (LMS) by recommending similar math questions. Deep embeddings for math questions are generated using Meta's Llama-3.2-11B-Vision-Instruct model, and three recommendation methods (cosine similarity, Self-Organizing Maps (SOM), and Gaussian Mixture Models (GMM)) are applied to identify similar questions. User interaction data, including session durations, response times, and correctness, are used to evaluate the methods. Our findings suggest that while cosine similarity produces nearly identical question matches, SOM yields higher user satisfaction, whereas GMM generally underperforms. This indicates that introducing a moderate degree of variety may enhance engagement, and thereby potential learning outcomes, up to the point where variety is no longer reasonably balanced, as our data on the implementations of all three methods demonstrate.
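As an illustration of the cosine-similarity method named in this abstract, the following minimal sketch ranks questions by the cosine of the angle between their embedding vectors. It is not the paper's code: the function names, the dictionary layout, and the tiny two-dimensional embeddings are hypothetical stand-ins for the model-generated embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_id, embeddings, k=3):
    """Return the ids of the k questions most similar to query_id, best first."""
    query = embeddings[query_id]
    scored = [(cosine_similarity(query, vec), qid)
              for qid, vec in embeddings.items() if qid != query_id]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [qid for _, qid in scored[:k]]


# Usage with toy embeddings (assumed values, for illustration only).
toy_embeddings = {
    "q1": [1.0, 0.0],
    "q2": [0.9, 0.1],
    "q3": [0.0, 1.0],
    "q4": [-1.0, 0.0],
}
similar_to_q1 = recommend("q1", toy_embeddings, k=2)
```

Ranking purely by similarity always surfaces the closest neighbours first, which is consistent with the abstract's observation that this method returns nearly identical questions.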


Simple Self Organizing Map with Visual Transformer

Luo, Alan, Yuan, Kaiwen

arXiv.org Artificial Intelligence

Vision Transformers (ViTs) have demonstrated exceptional performance in various vision tasks. However, they tend to underperform on smaller datasets due to their inherent lack of inductive biases. Current approaches address this limitation implicitly, often by pairing ViTs with pretext tasks or by distilling knowledge from convolutional neural networks (CNNs) to strengthen the prior. In contrast, Self-Organizing Maps (SOMs), a widely adopted self-supervised framework, are inherently structured to preserve topology and spatial organization, making them a promising candidate to directly address the limitations of ViTs on small training datasets. Despite this potential, equipping SOMs with modern deep learning architectures remains largely unexplored. In this study, we conduct a novel exploration of how ViTs and SOMs can empower each other, aiming to bridge this critical research gap. Our findings demonstrate that these architectures can synergistically enhance each other, leading to significantly improved performance in both unsupervised and supervised tasks. Code will be publicly available.


A Survey on Recent Advances in Self-Organizing Maps

Guérin, Axel, Chauvet, Pierre, Saubion, Frédéric

arXiv.org Artificial Intelligence

The Self-Organising Map algorithm is a well-known approach for unsupervised learning, designed to distill a high-dimensional dataset into a more manageable, typically two-dimensional, representation. Imagine a dataset of p measured variables across n observations. A Self-Organising Map organises similar observations into groups and displays them visually on a map. This model, also known as a Kohonen map or Kohonen network, was introduced by Teuvo Kohonen [Koh82, Koh97]. Unlike conventional neural networks, which rely on error correction, SOM training relies on competitive principles. Kohonen drew inspiration from biological paradigms, in particular the neural models [MP69] and Alan Turing's pioneering theories of morphogenesis [Tur52]. In essence, self-organising maps serve as powerful tools for dissecting and visualising complex data landscapes, facilitating a deeper understanding of the intricate structures and relationships that permeate multidimensional datasets. Self-organising maps, like most artificial neural network architectures, operate in two distinct modes: training and mapping.
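The two modes described in this abstract, competitive training and mapping, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the survey: it uses online updates, a Gaussian neighbourhood on a rectangular grid, and linearly decaying learning rate and radius. The function names and default parameters are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Mapping mode: return the grid position whose weight vector is closest to x."""
    return min(weights,
               key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Training mode: fit a rows x cols map with online competitive updates."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    # One weight vector per map unit, initialised uniformly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius shrinks, floored at 0.5
            bmu = best_matching_unit(weights, x)     # competition: the winner is chosen
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # neighbourhood strength
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])   # pull the unit toward the sample
            step += 1
    return weights


# Usage: two well-separated groups of 2-D points mapped onto a 1 x 4 grid.
points = [[0.0, 0.0], [0.05, 0.1], [0.1, 0.0],
          [1.0, 1.0], [0.9, 0.95], [1.0, 0.9]]
som = train_som(points, rows=1, cols=4, epochs=100)
unit_for_origin = best_matching_unit(som, [0.0, 0.0])
```

No error signal or label appears anywhere in the update: each sample only moves the winning unit and its grid neighbours toward itself, which is what distinguishes competitive training from error correction.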


Application of Unsupervised Artificial Neural Network (ANN) Self_Organizing Map (SOM) in Identifying Main Car Sales Factors

Taghavi, Mazyar

arXiv.org Artificial Intelligence

The factors that attract customers and persuade them to buy a new car vary with consumer tastes. Several methods exist for extracting patterns from large volumes of data. In this study, we first asked passenger car marketing experts to rank the most important factors affecting customer decision-making behavior using the fuzzy Delphi technique. We then drew a sample set from questionnaires and applied an artificial neural network method, the self-organizing map (SOM), to find out which factors have the greatest effect on Iranian customers' buying decisions. Fuzzy tools were applied to make the study more realistic. MATLAB was used to develop and train the network. The results indicate that four factors are more important than the others, and they differ noticeably from the marketing experts' rankings. Such results would help manufacturers focus on the most important factors and increase company sales.


Hybrid Machine Learning Approach For Real-Time Malicious Url Detection Using Som-Rmo And Rbfn With Tabu Search Optimization

T, Swetha, M, Seshaiah, KL, Hemalatha, BH, ManjunathaKumar, SVN, Murthy

arXiv.org Artificial Intelligence

The proliferation of malicious URLs has become a significant threat to internet security, encompassing SPAM, phishing, malware, and defacement attacks. Traditional detection methods struggle to keep pace with the evolving nature of these threats. Detecting malicious URLs in real-time requires advanced techniques capable of handling large datasets and identifying novel attack patterns. The challenge lies in developing a robust model that combines efficient feature extraction with accurate classification. We propose a hybrid machine learning approach combining Self-Organizing Map based Radial Movement Optimization (SOM-RMO) for feature extraction and Radial Basis Function Network (RBFN) based Tabu Search for classification. SOM-RMO effectively reduces dimensionality and highlights significant features, while RBFN, optimized with Tabu Search, classifies URLs with high precision. The proposed model demonstrates superior performance in detecting various malicious URL attacks. On a benchmark dataset, our approach achieved an accuracy of 96.5%, precision of 95.2%, recall of 94.8%, and an F1-score of 95.0%, outperforming traditional methods significantly.