granmo
A Methodology for Transparent Logic-Based Classification Using a Multi-Task Convolutional Tsetlin Machine
Shende, Mayur Kishor, Granmo, Ole-Christoffer, Helin, Runar, Zadorozhny, Vladimir I., Shafik, Rishad
Abstract: The Tsetlin Machine (TM) is a novel machine learning paradigm that employs finite-state automata for learning and utilizes propositional logic to represent patterns. Because of this simple approach, TMs are inherently more interpretable than learning algorithms based on Neural Networks. The Convolutional TM has shown comparable performance on various datasets such as MNIST, K-MNIST, F-MNIST and CIFAR-2. In this paper, we explore the applicability of the TM architecture for large-scale multi-channel (RGB) image classification. We propose a methodology to generate both local interpretations and global class representations. The local interpretations can be used to explain the model predictions while the global class representations aggregate important patterns for each class. These interpretations summarize the knowledge captured by the convolutional clauses, which can be visualized as images. We evaluate our methods on the MNIST and CelebA datasets, using models that achieve 98.5% accuracy on MNIST and an 86.56% F1-score on CelebA (compared to 88.07% for ResNet50). We show that the TM performs competitively with this deep learning model while maintaining its interpretability, even in large-scale complex training environments.
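To make the phrase "propositional logic to represent patterns" in the abstract above concrete, here is a minimal sketch of how a conjunctive clause and a signed clause vote can be evaluated on a Boolean input. It is an illustration only, not the paper's convolutional implementation; the function names and the toy clause set are hypothetical.

```python
from typing import List, Sequence, Tuple

# A clause is a conjunction (AND) of literals: some input bits as-is,
# some input bits negated. All names here are illustrative assumptions.
Clause = Tuple[int, List[int], List[int]]  # (polarity, plain literals, negated literals)


def clause_output(x: Sequence[int], plain: List[int], negated: List[int]) -> int:
    """Return 1 if every plain literal is 1 and every negated literal is 0."""
    return int(all(x[i] == 1 for i in plain) and all(x[j] == 0 for j in negated))


def class_sum(x: Sequence[int], clauses: List[Clause]) -> int:
    """Sum the clause outputs, each weighted by its polarity (+1 or -1)."""
    return sum(pol * clause_output(x, plain, neg) for pol, plain, neg in clauses)


# Toy clause set for an XOR-like pattern over two bits.
toy_clauses: List[Clause] = [
    (+1, [0], [1]),    # x0 AND NOT x1
    (+1, [1], [0]),    # x1 AND NOT x0
    (-1, [0, 1], []),  # x0 AND x1 votes against the class
]

for bits in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(bits, class_sum(bits, toy_clauses))
```

Because each clause is a plain AND of literals, the clauses that fire for a given input can be read directly as an explanation of the vote.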
- Europe > Norway (0.05)
- North America > United States (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
Scalable Multi-phase Word Embedding Using Conjunctive Propositional Clauses
Kadhim, Ahmed K., Jiao, Lei, Shafik, Rishad, Granmo, Ole-Christoffer, Bhattarai, Bimal
The Tsetlin Machine (TM) architecture has recently demonstrated effectiveness in Machine Learning (ML), particularly within Natural Language Processing (NLP). It has been utilized to construct word embedding using conjunctive propositional clauses, thereby significantly enhancing our understanding and interpretation of machine-derived decisions. The previous approach performed the word embedding over a sequence of input words to consolidate the information into a cohesive and unified representation. However, that approach encounters scalability challenges as the input size increases. In this study, we introduce a novel approach incorporating two-phase training to discover contextual embeddings of input sequences. Specifically, this method encapsulates the knowledge for each input word within the dataset's vocabulary, subsequently constructing embeddings for a sequence of input words utilizing the extracted knowledge. This technique not only facilitates the design of a scalable model but also preserves interpretability. Our experimental findings revealed that the proposed method yields competitive performance compared to the previous approaches, demonstrating promising results in contrast to human-generated benchmarks. Furthermore, we applied the proposed approach to sentiment analysis on the IMDB dataset, where the TM embedding and the TM classifier, along with other interpretable classifiers, offered a transparent end-to-end solution with competitive performance.
- Research Report > Promising Solution (0.34)
- Research Report > New Finding (0.34)
Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings
Kadhim, Ahmed K., Jiao, Lei, Shafik, Rishad, Granmo, Ole-Christoffer
In recent years, text generation tools utilizing Artificial Intelligence (AI) have occasionally been misused across various domains, such as generating student reports or creative writings. This issue prompts plagiarism detection services to enhance their capabilities in identifying AI-generated content. Adversarial attacks are often used to test the robustness of AI-generated text detectors. This work proposes a novel textual adversarial attack on detection models such as Fast-DetectGPT. The method employs embedding models for data perturbation, aiming at reconstructing the AI-generated texts to reduce the likelihood of detection of the true origin of the texts. Specifically, we employ different embedding techniques, including the Tsetlin Machine (TM), an interpretable approach in machine learning, for this purpose. By combining synonyms and embedding similarity vectors, we demonstrate a state-of-the-art reduction in detection scores against Fast-DetectGPT. In particular, on the XSum dataset, the detection score decreased from 0.4431 to 0.2744 AUROC, and on the SQuAD dataset, it dropped from 0.5068 to 0.3532 AUROC.
- Information Technology > Security & Privacy (1.00)
- Government > Military (0.91)
- Education > Educational Technology > Educational Software (0.34)
The Probabilistic Tsetlin Machine: A Novel Approach to Uncertainty Quantification
Abeyrathna, K. Darshana, Mekkaoui, Sara El, Hafver, Andreas, Agrell, Christian
Tsetlin Machines (TMs) have emerged as a compelling alternative to conventional deep learning methods, offering notable advantages such as smaller memory footprint, faster inference, fault-tolerant properties, and interpretability. Although various adaptations of TMs have expanded their applicability across diverse domains, a fundamental gap remains in understanding how TMs quantify uncertainty in their predictions. In response, this paper introduces the Probabilistic Tsetlin Machine (PTM) framework, aimed at providing a robust, reliable, and interpretable approach for uncertainty quantification. Unlike the original TM, the PTM learns the probability of staying on each state of each Tsetlin Automaton (TA) across all clauses. These probabilities are updated using the feedback tables that are part of the TM framework: Type I and Type II feedback. During inference, TAs decide their actions by sampling states based on learned probability distributions, akin to Bayesian neural networks when generating weight values. In our experimental analysis, we first illustrate the spread of the probabilities across TA states for the noisy-XOR dataset. Then we evaluate the PTM alongside benchmark models using both simulated and real-world datasets. The experiments on the simulated dataset reveal the PTM's effectiveness in uncertainty quantification, particularly in delineating decision boundaries and identifying regions of high uncertainty. Moreover, when applied to multiclass classification tasks using the Iris dataset, the PTM demonstrates competitive performance in terms of predictive entropy and expected calibration error, showcasing its potential as a reliable tool for uncertainty estimation. Our findings underscore the importance of selecting appropriate models for accurate uncertainty quantification in predictive tasks, with the PTM offering a particularly interpretable and effective solution.
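The inference step described in the abstract above, where each Tsetlin Automaton samples a state from a learned distribution and then acts on it, can be sketched as follows. This is an assumption-laden illustration rather than the PTM authors' code: the array layout, the function names, and the use of the standard TM convention (states in the upper half mean "include the literal") are choices made here for the example.

```python
import numpy as np


def sample_actions(state_probs: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Sample one state per automaton and map it to an include/exclude action.

    state_probs has shape (n_automata, n_states); each row sums to 1.
    Returns a boolean array: True means the literal is included.
    """
    n_automata, n_states = state_probs.shape
    cdf = np.cumsum(state_probs, axis=1)
    u = rng.random((n_automata, 1))
    # Inverse-CDF sampling: count how many cumulative values u has passed.
    states = np.minimum((u >= cdf).sum(axis=1), n_states - 1)
    return states >= n_states // 2


def predictive_entropy(class_probs) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    p = np.asarray(class_probs, dtype=float)
    p = p[p > 0]  # 0 * log(0) is treated as 0
    return float(-(p * np.log(p)).sum())


rng = np.random.default_rng(0)
uniform = np.full((3, 4), 0.25)  # three automata, four states each
print(sample_actions(uniform, rng))
print(predictive_entropy([0.5, 0.5]))
```

Repeating the sampling several times and averaging the resulting class predictions gives a class distribution whose entropy can serve as an uncertainty score, in the spirit of the predictive entropy metric the abstract mentions.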
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Norway (0.04)
- Research Report > New Finding (0.68)
- Research Report > Promising Solution (0.50)
Pruning Literals for Highly Efficient Explainability at Word Level
Yadav, Rohan Kumar, Bhattarai, Bimal, Jana, Abhik, Jiao, Lei, Yimam, Seid Muhie
Designing explainable models has become crucial for Natural Language Processing (NLP), since most state-of-the-art machine learning models provide only limited explanations for their predictions. Among explainable models, the Tsetlin Machine (TM) is promising because it can provide word-level explanations using propositional logic. However, the elaborate combination of literals in a clause makes the model difficult for humans to comprehend, despite its transparent learning process. In this paper, we design a post-hoc pruning of clauses that eliminates the randomly placed literals in each clause, thereby making the model more efficiently interpretable than the vanilla TM. Experiments on the publicly available YELP-HAT dataset demonstrate that the proposed pruned TM's attention map aligns more closely with the human attention map than the vanilla TM's attention map does. In addition, the pairwise similarity measure surpasses that of attention-map-based neural network models. In terms of accuracy, the proposed pruning method does not degrade accuracy significantly; rather, it improves performance by 4% to 9% on some test data.
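The abstract above does not spell out how "randomly placed" literals are identified. As one possible reading, the sketch below treats a literal as noise when it occurs in fewer than a chosen number of clauses and drops it after training. The frequency criterion, the threshold `min_count`, and the function name are assumptions made for illustration, not the paper's definition.

```python
from collections import Counter
from typing import List


def prune_clauses(clauses: List[List[str]], min_count: int) -> List[List[str]]:
    """Drop literals that appear in fewer than min_count clauses.

    Each clause is a list of literal names, e.g. "good" or "NOT bad".
    """
    # Count in how many clauses each literal occurs (once per clause).
    counts = Counter(lit for clause in clauses for lit in set(clause))
    return [[lit for lit in clause if counts[lit] >= min_count] for clause in clauses]


trained = [
    ["good", "food", "the"],
    ["good", "NOT bad"],
    ["good", "food"],
]
print(prune_clauses(trained, min_count=2))
```

Shorter clauses are easier to read as word-level explanations, which is the interpretability gain the abstract claims for pruning.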
- Europe > Austria > Vienna (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Norway (0.04)
Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines
Halenka, Vojtech, Kadhim, Ahmed K., Clarke, Paul F. A., Bhattarai, Bimal, Saha, Rupsa, Granmo, Ole-Christoffer, Jiao, Lei, Andersen, Per-Arne
Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new booleanization strategies, optimization of TM inference and learning, as well as new TM applications.
An Optimized Toolbox for Advanced Image Processing with Tsetlin Machine Composites
Grønningsæter, Ylva, Smørvik, Halvor S., Granmo, Ole-Christoffer
The Tsetlin Machine (TM) has achieved competitive results on several image classification benchmarks, including MNIST, K-MNIST, F-MNIST, and CIFAR-2. However, color image classification is arguably still in its infancy for TMs, with CIFAR-10 being a focal point for tracking progress. Over the past few years, TM's CIFAR-10 accuracy has increased from around 61% in 2020 to 75.1% in 2023 with the introduction of Drop Clause. In this paper, we leverage the recently proposed TM Composites architecture and introduce a range of TM Specialists that use various image processing techniques. These include Canny edge detection, Histogram of Oriented Gradients, adaptive mean thresholding, adaptive Gaussian thresholding, Otsu's thresholding, color thermometers, and adaptive color thermometers. In addition, we conduct a rigorous hyperparameter search, where we uncover optimal hyperparameters for several of the TM Specialists. The result is a toolbox that provides new state-of-the-art results on CIFAR-10 for TMs with an accuracy of 82.8%. In conclusion, our toolbox of TM Specialists forms a foundation for new TM applications and a landmark for further research on TM Composites in image analysis.
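Color thermometers, one of the Booleanization techniques listed in the abstract above, turn each pixel intensity into a run of threshold bits. The sketch below assumes evenly spaced thresholds over the 0 to 255 range and a hypothetical function name; the toolbox may choose its thresholds differently (for example, adaptively).

```python
import numpy as np


def color_thermometer(image: np.ndarray, levels: int) -> np.ndarray:
    """Encode a uint8 image of shape (H, W, C) as 0/1 bits of shape (H, W, C * levels).

    Bit k of a channel is 1 when the channel value reaches the k-th threshold,
    so brighter values switch on more bits, like a thermometer filling up.
    """
    # Evenly spaced thresholds (an assumption); levels=3 gives 64, 128, 192.
    thresholds = (np.arange(1, levels + 1) * 256) // (levels + 1)
    bits = image[..., None] >= thresholds
    return bits.reshape(*image.shape[:-1], -1).astype(np.uint8)


pixel = np.array([[[0, 100, 255]]], dtype=np.uint8)  # one RGB pixel
print(color_thermometer(pixel, levels=3))
```

Unlike a single global threshold, this encoding preserves the ordering of intensities, which lets clauses express conditions such as "red is at least moderately bright".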
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Norway (0.04)
- North America > United States > Alaska > Anchorage Municipality > Anchorage (0.04)
- Europe > United Kingdom (0.04)
The Sparse Tsetlin Machine: Sparse Representation with Active Literals
Østby, Sebastian, Brambo, Tobias M., Glimsdal, Sondre
This paper introduces the Sparse Tsetlin Machine (STM), a novel Tsetlin Machine (TM) that processes sparse data efficiently. Traditionally, the TM does not consider data characteristics such as sparsity, commonly seen in NLP applications and other bag-of-words-based representations. Consequently, a TM must initialize, store, and process a significant number of zero values, resulting in excessive memory usage and computational time. Previous attempts at creating a sparse TM have predominantly been unsuccessful, primarily due to their inability to identify which literals are sufficient for TM training. By introducing Active Literals (AL), the STM can focus exclusively on literals that actively contribute to the current data representation, significantly decreasing memory footprint and computational time while demonstrating competitive classification performance.
- Information Technology (0.69)
- Media (0.46)
MATADOR: Automated System-on-Chip Tsetlin Machine Design Generation for Edge Applications
Rahman, Tousif, Mao, Gang, Maheshwari, Sidharth, Shafik, Rishad, Yakovlev, Alex
System-on-Chip Field-Programmable Gate Arrays (SoC-FPGAs) offer significant throughput gains for machine learning (ML) edge inference applications via the design of co-processor accelerator systems. However, the design effort for training and translating ML models into SoC-FPGA solutions can be substantial and requires specialist knowledge of the trade-offs between model performance, power consumption, latency and resource utilization. Unlike other ML algorithms, the Tsetlin Machine (TM) performs classification by forming logic propositions between Boolean actions from the Tsetlin Automata (the learning elements) and Boolean input features. A trained TM model usually exhibits high sparsity and considerable overlap of these logic propositions, both within and among the classes. The model can thus be translated to an RTL design using a minuscule number of AND and NOT gates. This paper presents MATADOR, an automated Boolean-to-silicon tool with a GUI, capable of implementing optimized accelerator designs of the TM model onto SoC-FPGAs for inference at the edge. It automates the full development pipeline: model training, system-level design generation, design verification and deployment. It makes use of the logic sharing that ensues from propositional overlap and creates a compact design by effectively utilizing the TM model's sparsity. MATADOR accelerator designs are shown to be up to 13.4x faster, up to 7x more resource-frugal and up to 2x more power-efficient than state-of-the-art Quantized and Binary Deep Neural Network implementations.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
- Asia > India (0.04)
TMComposites: Plug-and-Play Collaboration Between Specialized Tsetlin Machines
Tsetlin Machines (TMs) provide a fundamental shift from arithmetic-based to logic-based machine learning. Supporting convolution, they deal successfully with image classification datasets like MNIST, Fashion-MNIST, and CIFAR-2. However, the TM struggles with getting state-of-the-art performance on CIFAR-10 and CIFAR-100, representing more complex tasks. This paper introduces plug-and-play collaboration between specialized TMs, referred to as TM Composites. The collaboration relies on a TM's ability to specialize during learning and to assess its competence during inference. When teaming up, the most confident TMs make the decisions, relieving the uncertain ones. In this manner, a TM Composite becomes more competent than its members, benefiting from their specializations. The collaboration is plug-and-play in that members can be combined in any way, at any time, without fine-tuning. We implement three TM specializations in our empirical evaluation: Histogram of Gradients, Adaptive Gaussian Thresholding, and Color Thermometers. The resulting TM Composite increases accuracy on Fashion-MNIST by two percentage points, CIFAR-10 by twelve points, and CIFAR-100 by nine points, yielding new state-of-the-art results for TMs. Overall, we envision that TM Composites will enable an ultra-low energy and transparent alternative to state-of-the-art deep learning on more tasks and datasets.
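The collaboration rule in the abstract above, where the most confident members dominate the decision, can be sketched by rescaling each member's class sums to a common range and adding them. The range normalization, the function name, and the toy scores below are assumptions for illustration and are not taken from the TM Composites code.

```python
import numpy as np


def composite_predict(member_scores) -> np.ndarray:
    """Combine class sums from several members into one prediction per sample.

    member_scores is a list of arrays, each of shape (n_samples, n_classes).
    Each member is divided by its own score range so that no member dominates
    merely because it uses more clauses or larger vote counts.
    """
    total = np.zeros(np.asarray(member_scores[0]).shape, dtype=float)
    for scores in member_scores:
        s = np.asarray(scores, dtype=float)
        span = s.max() - s.min()
        if span > 0:
            total += s / span
    return total.argmax(axis=1)


# Member A is confident on sample 0 and unsure on sample 1; member B is the reverse.
member_a = np.array([[10, -10], [1, 0]])
member_b = np.array([[0, 2], [-50, 50]])
print(composite_predict([member_a, member_b]))
```

Because the combination is a plain sum of normalized scores, members can be added or removed without retraining, which matches the plug-and-play property the abstract describes.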
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Norway (0.04)