Collaborating Authors

 Rayong


Growable and Interpretable Neural Control with Online Continual Learning for Autonomous Lifelong Locomotion Learning Machines

Srisuchinnawong, Arthicha, Manoonpong, Poramate

arXiv.org Artificial Intelligence

Continual locomotion learning faces four challenges: incomprehensibility, sample inefficiency, lack of knowledge exploitation, and catastrophic forgetting. This work therefore introduces Growable Online Locomotion Learning Under Multicondition (GOLLUM), which exploits interpretability to address these challenges. GOLLUM has two dimensions of interpretability: layer-wise interpretability for neural control function encoding and column-wise interpretability for robot skill encoding. With this interpretable control structure, GOLLUM uses neurogenesis to add columns (ring-like networks) without supervision; each column is trained separately to encode and maintain a specific primary robot skill. GOLLUM also transfers parameters to new skills and supplements the learned combination of acquired skills through an additional neural mapping layer (added layer-wise) trained with online supplementary learning. On a physical hexapod robot, GOLLUM successfully acquired multiple locomotion skills (e.g., walking, slope climbing, and bouncing) autonomously and continuously within an hour using a simple reward function. Furthermore, it demonstrated the capability of combining previously learned skills to facilitate the learning of new skills while preventing catastrophic forgetting. Compared to state-of-the-art locomotion learning approaches, GOLLUM is the only approach that addresses all four challenges mentioned above without human intervention. It also emphasizes the potential of exploiting interpretability to achieve autonomous lifelong learning machines.


Distilling Two-Timed Flow Models by Separately Matching Initial and Terminal Velocities

Khungurn, Pramook, Piyawongwisal, Pratch, Sriswasdi, Sira, Suwajanakorn, Supasorn

arXiv.org Artificial Intelligence

A flow matching model learns a time-dependent vector field $v_t(x)$ that generates a probability path $\{ p_t \}_{0 \leq t \leq 1}$ that interpolates between a well-known noise distribution ($p_0$) and the data distribution ($p_1$). It can be distilled into a two-timed flow model (TTFM) $\phi_{s,x}(t)$ that transforms a sample from the distribution at an initial time $s$ into a sample from the distribution at a terminal time $t$ in one function evaluation. We present a new loss function for TTFM distillation called the initial/terminal velocity matching (ITVM) loss. It extends the Lagrangian Flow Map Distillation (LFMD) loss proposed by Boffi et al. by adding redundant terms to match the initial velocities at time $s$, removing the derivative from the terminal velocity term at time $t$, and using a version of the model under training, stabilized by exponential moving averaging (EMA), to compute the target terminal average velocity. Preliminary experiments show that our loss leads to better few-step generation performance than the baselines on multiple types of datasets and model architectures.
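The EMA stabilization mentioned in the abstract above can be illustrated with a minimal sketch. It shows only the generic exponential moving average of model parameters, not the ITVM loss itself; the function name `ema_update`, the decay value, and the toy "gradient step" are assumptions made for illustration and do not come from the paper.

```python
def ema_update(ema_params, params, decay=0.999):
    """Return EMA parameters after blending in the current parameters.

    ema <- decay * ema + (1 - decay) * current
    (hypothetical helper; the decay value is an assumption)
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


# Toy training loop: the "model" is just a list of two scalars.
params = [0.0, 0.0]
ema_params = list(params)  # the EMA copy starts equal to the model

for step in range(3):
    # Stand-in for a real optimizer step on the model under training.
    params = [p + 1.0 for p in params]
    # The slowly moving EMA copy would be used to compute training targets.
    ema_params = ema_update(ema_params, params, decay=0.5)

print(params)      # raw parameters after three steps
print(ema_params)  # smoothed parameters lag behind the raw ones
```

Because the EMA copy changes more slowly than the raw parameters, targets computed from it vary less between steps, which is the usual motivation for this kind of stabilization.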


Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired Robots

Leung, Binggwong, Haomachai, Worasuchad, Pedersen, Joachim Winther, Risi, Sebastian, Manoonpong, Poramate

arXiv.org Artificial Intelligence

Artificial neural networks can be used to solve a variety of robotic tasks. However, they risk failing catastrophically when faced with out-of-distribution (OOD) situations. Several approaches have employed a type of synaptic plasticity known as Hebbian learning that can dynamically adjust weights based on local neural activities. Research has shown that synaptic plasticity can make policies more robust and help them adapt to unforeseen changes in the environment. In this work, we improve the Hebbian network with a weight normalization mechanism for preventing weight divergence, analyze the principal components of the Hebbian's weights, [...]

In the field of machine learning research, deep neural networks (DNNs) have been shown to be useful across a wide range of tasks [1], [2], including robotics [3], [4], [5]. However, policies for agent control based on deep neural networks tend to be brittle [6], meaning that they are at risk of catastrophic failure when faced with out-of-distribution (OOD) situations [7], [8].

[...] The disadvantages of these types of solutions are that they extend the necessary training time or risk resulting in an architecture that is overly specific to the task for which it was designed [11], [12]. Animals, on the other hand, demonstrate remarkable adaptability in adjusting their motor patterns to accomplish various tasks. Synaptic plasticity is thought to play [...]
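As a rough illustration of Hebbian learning combined with weight normalization, the sketch below applies a plain Hebbian update (weight change proportional to presynaptic times postsynaptic activity) and then rescales each neuron's incoming weights to unit length. The excerpt does not specify the paper's plasticity rule or its normalization scheme, so the plain rule, the L2 normalization, the `tanh` activation, and the name `hebbian_step` are all assumptions.

```python
import math


def hebbian_step(weights, pre, lr=0.1):
    """One Hebbian update followed by per-neuron L2 weight normalization.

    weights[i][j] connects presynaptic unit j to postsynaptic unit i.
    Both the plain Hebbian rule and the L2 normalization are assumptions.
    """
    updated = []
    for row in weights:
        # Postsynaptic activity of this neuron given the current weights.
        post = math.tanh(sum(w * x for w, x in zip(row, pre)))
        # Hebbian rule: strengthen weights where pre and post are co-active.
        new_row = [w + lr * post * x for w, x in zip(row, pre)]
        # Normalization keeps the weight vector from growing without bound.
        norm = math.sqrt(sum(w * w for w in new_row))
        if norm > 0.0:
            new_row = [w / norm for w in new_row]
        updated.append(new_row)
    return updated


weights = [[0.6, 0.8], [1.0, 0.0]]
pre = [1.0, 0.5]
for _ in range(100):
    weights = hebbian_step(weights, pre)

row_norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
print(row_norms)  # each stays close to 1 despite repeated Hebbian updates
```

Without the normalization step, repeated updates with a constant input would keep increasing the weights, which is the divergence the abstract refers to.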


An Interpretable Neural Control Network with Adaptable Online Learning for Sample Efficient Robot Locomotion Learning

Srisuchinnawong, Arthicha, Manoonpong, Poramate

arXiv.org Artificial Intelligence

Robot locomotion learning using reinforcement learning suffers from sample inefficiency and a non-understandable, black-box nature. This work therefore presents SME-AGOL to address these problems. First, the Sequential Motion Executor (SME) is a three-layer interpretable neural network, where the first layer produces sequentially propagating hidden states, the second constructs the corresponding triangular bases with minor non-neighbor interference, and the third maps the bases to the motor commands. Second, the Adaptable Gradient-weighting Online Learning (AGOL) algorithm prioritizes updates of the parameters with high relevance scores, allowing learning to focus on the most relevant ones. Together, these two components lead to an analyzable framework, where each sequential hidden state/basis represents a learned key pose/robot configuration. Compared to state-of-the-art methods, SME-AGOL requires 40% fewer samples and achieves 150% higher final reward/locomotion performance on a simulated hexapod robot, while taking merely 10 minutes of learning time from scratch on a physical hexapod robot. Taken together, this work not only proposes SME-AGOL for sample-efficient and understandable locomotion learning but also emphasizes the potential of exploiting interpretability to improve sample efficiency and learning performance.


Nature's All-in-One: Multitasking Robots Inspired by Dung Beetles

Leung, Binggwong, Gorb, Stanislav, Manoonpong, Poramate

arXiv.org Artificial Intelligence

Dung beetles impressively coordinate their six legs simultaneously to effectively roll large dung balls. They are also capable of rolling dung balls of varying weight on different terrains. The mechanisms underlying how their motor commands are adapted to walk and simultaneously roll balls (multitasking behavior) under different conditions remain unknown. Therefore, this study unravels the mechanisms of how dung beetles roll dung balls and adapt their leg movements to stably roll balls over different terrains, with the aim of applying them to multitasking robots. We synthesize a modular neural-based loco-manipulation control inspired by and based on ethological observations of the ball-rolling behavior of dung beetles. The proposed neural-based control contains various neural modules, including a central pattern generator (CPG) module, a pattern formation network (PFN) module, and a robot orientation control (ROC) module. The integrated neural control mechanisms can successfully control a dung beetle-like robot (ALPHA) with biomechanical feet to perform adaptive, robust (multitasking) loco-manipulation (walking and ball-rolling) on various terrains (flat and uneven). It can also deal with different ball weights (2.0 and 4.6 kg) and ball types (soft and rigid). The control mechanisms can serve as guiding principles for solving complex sensory-motor coordination for multitasking robots. Furthermore, this study contributes to biological research by enhancing our scientific understanding of sensory-motor coordination for complex adaptive (multitasking) loco-manipulation behavior in animals.


MixNet: Joining Force of Classical and Modern Approaches Toward the Comprehensive Pipeline in Motor Imagery EEG Classification

Autthasan, Phairot, Chaisaen, Rattanaphon, Phan, Huy, De Vos, Maarten, Wilaiprasitporn, Theerawit

arXiv.org Artificial Intelligence

Recent advances in deep learning (DL) have significantly impacted motor imagery (MI)-based brain-computer interface (BCI) systems, enhancing the decoding of electroencephalography (EEG) signals. However, most studies struggle to identify discriminative patterns across subjects during MI tasks, limiting MI classification performance. In this article, we propose MixNet, a novel classification framework designed to overcome this limitation by utilizing spectral-spatial signals from MI data, along with a multitask learning architecture named MIN2Net, for classification. Here, the spectral-spatial signals are generated using the filter-bank common spatial patterns (FBCSP) method on MI data. Because a multitask learning architecture is used for classification, the tasks may generalize at different rates and may overfit to different degrees. To address this issue, we implement adaptive gradient blending, simultaneously regulating multiple loss weights and adjusting the learning pace for each task based on its generalization/overfitting tendencies. Experimental results on six benchmark data sets of different data sizes demonstrate that MixNet consistently outperforms all state-of-the-art algorithms in subject-dependent and -independent settings. Finally, the low-density EEG MI classification results show that MixNet outperforms all state-of-the-art algorithms, offering promising implications for Internet of Things (IoT) applications, such as lightweight and portable EEG wearable devices based on low-density montages.


PyThaiNLP: Thai Natural Language Processing in Python

Phatthiyaphaibun, Wannaphong, Chaovavanich, Korakot, Polpanumas, Charin, Suriyawongkul, Arthit, Lowphansirikul, Lalita, Chormai, Pattarawat, Limkonchotiwat, Peerat, Suntorntip, Thanathip, Udomcharoenchaikit, Can

arXiv.org Artificial Intelligence

We present PyThaiNLP, a free and open-source natural language processing (NLP) library for the Thai language implemented in Python. It provides a wide range of software, models, and datasets for the Thai language. We first provide a brief historical context of tools for the Thai language prior to the development of PyThaiNLP. We then outline the functionalities it provides as well as its datasets and pre-trained language models. We later summarize its development milestones and discuss our experience during its development. We conclude by demonstrating how industrial and research communities utilize PyThaiNLP in their work. The library is freely available at https://github.com/pythainlp/pythainlp.


Combining EEG and NLP Features for Predicting Students' Lecture Comprehension using Ensemble Classification

Natnithikarat, Phantharach, Wilaiprasitporn, Theerawit, Kongwudhikunakorn, Supavit

arXiv.org Artificial Intelligence

Electroencephalography (EEG) and Natural Language Processing (NLP) can be applied in education to measure students' comprehension of classroom lectures; so far, the two measures have been used separately. In this work, we propose a classification framework for predicting students' lecture comprehension in two tasks: (i) students' confusion after listening to the simulated lecture and (ii) the correctness of students' responses to the post-lecture assessment. The proposed framework includes EEG and NLP feature extraction, processing, and classification. EEG and NLP features are extracted to construct integrated features obtained from recorded EEG signals and sentence-level syntactic analysis, which provide information about specific biomarkers and sentence structures. An ensemble stacking classification method -- a combination of multiple individual models that produces an enhanced predictive model -- is studied to learn from the features and make accurate predictions. Furthermore, we also utilized subjective confusion ratings as another integrated feature to enhance classification performance. Experimental results show that this framework outperforms the baselines, achieving F1 scores of up to 0.65 for predicting confusion and 0.78 for predicting correctness, which indicates that adding this feature helped improve classification performance.
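For readers unfamiliar with the F1 metric reported in the abstract above, the sketch below computes it as the harmonic mean of precision and recall for a binary task. The labels are invented toy data used only to exercise the formula; they are not taken from the paper, and the function name `f1_score` is a local helper rather than any library's API.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up): 1 = "confused", 0 = "not confused".
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(score)
```

Here there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.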


Structure to Property: Chemical Element Embeddings and a Deep Learning Approach for Accurate Prediction of Chemical Properties

Shermukhamedov, Shokirbek, Mamurjonova, Dilorom, Probst, Michael

arXiv.org Artificial Intelligence

The application of machine learning (ML) techniques in computational chemistry has led to significant advances in predicting molecular properties, accelerating drug discovery, and material design. ML models can extract hidden patterns and relationships from complex and large datasets, allowing for the prediction of various chemical properties with high accuracy. The use of such methods has enabled the discovery of molecules and materials that were previously difficult to identify. This paper introduces a new ML model based on deep learning techniques, such as a multilayer encoder and decoder architecture, for classification tasks. We demonstrate the opportunities offered by our approach by applying it to various types of input data, including organic and inorganic compounds. In particular, we developed and tested the model using the Matbench and Moleculenet benchmarks, which include crystal properties and drug design-related benchmarks. We also conduct a comprehensive analysis of vector representations of chemical compounds, shedding light on the underlying patterns in molecular data. The models used in this work exhibit a high degree of predictive power, underscoring the progress that can be made with refined machine learning when applied to molecular and material datasets. For instance, on the Tox21 dataset, we achieved an average accuracy of 96%, surpassing the previous best result by 10%. Our code is publicly available at https://github.com/dmamur/elembert.


PseudoCell: Hard Negative Mining as Pseudo Labeling for Deep Learning-Based Centroblast Cell Detection

Seesawad, Narongrid, Ittichaiwong, Piyalitt, Sudhawiyangkul, Thapanun, Sawangjai, Phattarapong, Thuwajit, Peti, Boonsakan, Paisarn, Sripodok, Supasan, Veerakanjana, Kanyakorn, Luenam, Phoomraphee, Charngkaew, Komgrid, Pongpaibul, Ananya, Angkathunyakul, Napat, Hnoohom, Narit, Yuenyong, Sumeth, Thuwajit, Chanitra, Wilaiprasitporn, Theerawit

arXiv.org Artificial Intelligence

Patch classification models based on deep learning have been applied to whole-slide images (WSI) of H&E-stained tissue samples to assist pathologists in grading follicular lymphoma patients. However, these approaches still require pathologists to manually identify centroblast cells and provide refined labels for optimal performance. To address this, we propose PseudoCell, an object detection framework to automate centroblast detection in WSI (source code is available at https://github.com/IoBT-VISTEC/PseudoCell.git). This framework incorporates centroblast labels from pathologists and combines them with pseudo-negative labels obtained from undersampled false-positive predictions using the cell's morphological features. By employing PseudoCell, pathologists' workload can be reduced, as it accurately narrows down the areas requiring their attention during tissue examination. Depending on the confidence threshold, PseudoCell can eliminate 58.18-99.35% of non-centroblast tissue areas on WSI. This study presents a practical centroblast prescreening method that does not require pathologists' refined labels for improvement. Detailed guidance on the practical implementation of PseudoCell is provided in the discussion section.