 Fuzzy Logic


Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK

arXiv.org Artificial Intelligence

High-order Takagi-Sugeno-Kang (TSK) fuzzy classifiers achieve strong classification performance with few fuzzy rules, but they suffer from exponentially growing training time and poor interpretability because of the high-order polynomial used in the consequent part of each fuzzy rule. Low-order TSK fuzzy classifiers run quickly and are highly interpretable, but they usually require more fuzzy rules and perform relatively poorly. To address this issue, this study proposes HTSK-LLM-DKD, a novel TSK fuzzy classifier embedded with knowledge distillation from deep learning. HTSK-LLM-DKD has the following distinctive characteristics: 1) It takes a high-order TSK fuzzy classifier as the teacher model and a low-order TSK fuzzy classifier as the student model, and leverages the proposed LLM-DKD (Least Learning Machine based Decoupling Knowledge Distillation) to distill fuzzy dark knowledge from the teacher to the student. This endows the low-order TSK fuzzy classifier with performance surpassing, or at least comparable to, that of the high-order TSK classifier, while retaining high interpretability. Specifically: 2) The negative Euclidean distance between the output of the teacher model and each class is used to obtain the teacher logits, and the teacher and student soft labels are then computed by the softmax function with a distillation temperature parameter. 3) By reformulating the Kullback-Leibler divergence, it decouples fuzzy dark knowledge into target-class knowledge and non-target-class knowledge and transfers both to the student model. The advantages of HTSK-LLM-DKD are verified on benchmark UCI datasets and on a real-world dataset, Cleveland heart disease, in terms of classification performance and model interpretability.
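Steps 2) and 3) of the abstract can be illustrated with a minimal sketch. It assumes that "each class" is represented by a one-hot vector and that the decoupling follows the commonly used form KL(teacher || student) = TCKD + (1 - p_target) * NCKD; the abstract confirms neither detail, and the function names below are hypothetical rather than taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax (numerically stabilised)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def neg_euclidean_logits(output, num_classes):
    """Logit for class k = -||output - one_hot(k)|| (one-hot targets are an assumption)."""
    logits = []
    for k in range(num_classes):
        one_hot = [1.0 if j == k else 0.0 for j in range(num_classes)]
        dist = math.sqrt(sum((o - c) ** 2 for o, c in zip(output, one_hot)))
        logits.append(-dist)
    return logits

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def decoupled_kd(p_teacher, p_student, target):
    """Split the KL divergence into target-class and non-target-class terms."""
    pt_t, pt_s = p_teacher[target], p_student[target]
    # Target-class term: binary distribution (target vs. everything else).
    tckd = kl([pt_t, 1 - pt_t], [pt_s, 1 - pt_s])
    # Non-target term: distributions renormalised over the non-target classes.
    nt_teacher = [p / (1 - pt_t) for i, p in enumerate(p_teacher) if i != target]
    nt_student = [p / (1 - pt_s) for i, p in enumerate(p_student) if i != target]
    nckd = kl(nt_teacher, nt_student)
    return tckd, nckd

# Hypothetical teacher output and student logits for a 3-class problem.
p_teacher = softmax(neg_euclidean_logits([0.9, 0.2, 0.1], 3), temperature=2.0)
p_student = softmax([2.0, 1.0, 0.5], temperature=2.0)
tckd, nckd = decoupled_kd(p_teacher, p_student, target=0)
```

In this form the two terms can be weighted independently during training, which is the usual motivation for decoupling; how LLM-DKD weights them is not stated in the abstract.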


Isotopic envelope identification by analysis of the spatial distribution of components in MALDI-MSI data

arXiv.org Artificial Intelligence

One of the significant steps in the process leading to the identification of proteins is mass spectrometry, which provides information about protein structure. Removing isotope peaks from the mass spectrum is vital and is done in a process called deisotoping. Several deisotoping algorithms exist, but each has limitations and each is dedicated to a particular mass spectrometry method. Data from experiments performed with the MALDI-ToF technique are characterized by high dimensionality. This paper presents a method for identifying isotope envelopes in MALDI-ToF molecular imaging data based on the Mamdani-Assilian fuzzy system and on spatial maps of the molecular distribution of the peaks included in the isotopic envelope. Several image texture measures were used to evaluate the spatial molecular distribution maps. The algorithm was tested on eight datasets obtained from MALDI-ToF experiments on samples from the National Institute of Oncology in Gliwice, taken from patients with cancer of the head and neck region. The data were subjected to pre-processing and feature extraction. The results were collected and compared with three existing deisotoping algorithms. The analysis showed that the proposed method, which examines pairs of peaks, enables the detection of overlapping isotopic envelopes. Moreover, the proposed algorithm enables the analysis of large data sets.


A Lifetime Extended Energy Management Strategy for Fuel Cell Hybrid Electric Vehicles via Self-Learning Fuzzy Reinforcement Learning

arXiv.org Artificial Intelligence

Modeling difficulty, time-varying models, and uncertain external inputs are the main challenges for energy management of fuel cell hybrid electric vehicles. In this paper, a fuzzy reinforcement learning-based energy management strategy for fuel cell hybrid electric vehicles is proposed to reduce fuel consumption, maintain the batteries' long-term operation, and extend the lifetime of the fuel cell system. Fuzzy Q-learning is a model-free reinforcement learning method that learns by interacting with the environment, so there is no need to model the fuel cell system. In addition, frequent startups of the fuel cells reduce the remaining useful life of the fuel cell system. The proposed method suppresses frequent fuel cell startups by including a penalty on the number of startups in the reinforcement learning reward. Moreover, applying fuzzy logic to approximate the value function in Q-learning makes it possible to handle continuous state and action spaces. Finally, a Python-based training and testing platform verifies the effectiveness and self-learning improvement of the proposed method under changes in initial state, model, and driving conditions.
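The two ideas in this abstract, a startup penalty in the reward and a fuzzy approximation of the Q-function, can be sketched as follows. This is a simplified illustration and not the paper's controller: the state is reduced to the battery state of charge (SOC), actions are kept discrete, and every name, weight, and membership function below is a hypothetical choice.

```python
def triangular(x, left, center, right):
    """Triangular membership function with peak 1.0 at `center`."""
    if x == center:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

CENTERS = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical fuzzy sets over SOC
WIDTH = 0.25

def firing_strengths(soc):
    """Normalised rule activations for a SOC value clamped to [0, 1]."""
    soc = min(max(soc, 0.0), 1.0)
    raw = [triangular(soc, c - WIDTH, c, c + WIDTH) for c in CENTERS]
    total = sum(raw)
    return [r / total for r in raw]

def q_value(q_table, soc, action):
    """Q(s, a) as the activation-weighted sum of per-rule q-values."""
    return sum(w * row[action] for w, row in zip(firing_strengths(soc), q_table))

def reward(h2_consumed, soc, started_fuel_cell, soc_ref=0.6, w_soc=1.0, w_start=0.5):
    """Negative cost: fuel use, SOC deviation, and a penalty per fuel cell startup."""
    penalty = w_start if started_fuel_cell else 0.0
    return -h2_consumed - w_soc * (soc - soc_ref) ** 2 - penalty

def update(q_table, soc, action, r, next_soc, alpha=0.1, gamma=0.9):
    """One fuzzy Q-learning step: spread the TD error over the active rules."""
    num_actions = len(q_table[0])
    best_next = max(q_value(q_table, next_soc, a) for a in range(num_actions))
    td_error = r + gamma * best_next - q_value(q_table, soc, action)
    for w, row in zip(firing_strengths(soc), q_table):
        row[action] += alpha * w * td_error

# Two hypothetical actions: 0 = fuel cell off, 1 = fuel cell on.
q_table = [[0.0, 0.0] for _ in CENTERS]
r = reward(h2_consumed=0.5, soc=0.6, started_fuel_cell=True)
update(q_table, soc=0.6, action=1, r=r, next_soc=0.6)
```

Because the startup penalty enters the reward directly, a policy that toggles the fuel cell often accumulates lower return than one that keeps it running, which is the mechanism the abstract describes.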


Antifragile Control Systems: The case of mobile robot trajectory tracking in the presence of uncertainty

arXiv.org Artificial Intelligence

Mobile robots are ubiquitous. Such vehicles benefit from well-designed and calibrated control algorithms that ensure task execution under precise uncertainty bounds. Yet, in tasks involving humans in the loop, such as the elderly or people with impaired mobility, the problem takes on a new dimension. In such cases, the system needs not only to compensate for uncertainty and volatility in its operation but also to anticipate and offer responses that go beyond robust. Such robots operate in cluttered, complex environments, akin to human residences, and must cope with sensor and even actuator faults during operation while continuing to function. This is where our thesis comes into the foreground. We propose a new control design framework based on the principles of antifragility. Such a design is meant to offer a high capacity to anticipate uncertainty, given previous exposure to failures and faults, and to exploit this capacity to provide performance beyond robust. In the current instantiation of antifragile control applied to mobile robot trajectory tracking, we provide controller design steps, an analysis of performance under parametrizable uncertainty and faults, and an extended comparative evaluation against state-of-the-art controllers. We believe antifragile control has the potential to achieve closed-loop performance in the face of uncertainty and volatility by using its exposure to uncertainty to increase its capacity to anticipate and compensate for such events.


The Impact of Data Distribution on Q-learning with Function Approximation

arXiv.org Artificial Intelligence

We study the interplay between the data distribution and Q-learning-based algorithms with function approximation. We provide a unified theoretical and empirical analysis of how different properties of the data distribution influence the performance of Q-learning-based algorithms. We connect different lines of research, and we validate and extend previous results. We start by reviewing theoretical bounds on the performance of approximate dynamic programming algorithms. We then introduce a novel four-state MDP specifically tailored to highlight the impact of the data distribution on the performance of Q-learning-based algorithms with function approximation, both online and offline. Finally, we experimentally assess the impact of the data distribution properties on the performance of two offline Q-learning-based algorithms under different environments. According to our results: (i) high-entropy data distributions are well suited for learning in an offline manner; and (ii) a certain degree of data diversity (data coverage) and data quality (closeness to the optimal policy) are jointly desirable for offline learning.
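Finding (i) refers to the entropy of the data distribution. A minimal sketch of how one might measure it for an offline dataset is given below, assuming the dataset is a list of discrete (state, action) pairs; the helper names and the toy datasets are hypothetical and are not the paper's experimental setup.

```python
import math
from collections import Counter

def empirical_entropy(samples):
    """Shannon entropy (in nats) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def normalized_entropy(samples, support_size):
    """Entropy divided by its maximum, log(support_size); 1.0 means uniform."""
    return empirical_entropy(samples) / math.log(support_size)

# Toy datasets over 4 states x 2 actions (8 state-action pairs).
uniform_data = [(s, a) for s in range(4) for a in range(2)] * 5
skewed_data = [(0, 0)] * 39 + [(1, 1)]

uniform_score = normalized_entropy(uniform_data, support_size=8)
skewed_score = normalized_entropy(skewed_data, support_size=8)
```

A score near 1.0 indicates that the dataset covers the state-action space almost uniformly, while a score near 0.0 indicates that it is concentrated on a few pairs.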


Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation

arXiv.org Artificial Intelligence

Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-making using a pre-collected dataset, without further interaction with the environment. Recent theoretical progress has focused on developing sample-efficient offline RL algorithms with various relaxed assumptions on data coverage and function approximators, especially to handle the case with excessively large state-action spaces. Among them, the framework based on the linear-programming (LP) reformulation of Markov decision processes has shown promise: it enables sample-efficient offline RL with function approximation, under only partial data coverage and realizability assumptions on the function classes, with favorable computational tractability. In this work, we revisit the LP framework for offline RL and provide a new reformulation that advances the existing results in several aspects, relaxing certain assumptions and achieving optimal statistical rates in terms of sample size. Our key enabler is to introduce proper constraints in the reformulation, instead of using regularization as in the literature, along with careful choices of the function classes and initial state distributions. We hope our insights bring to light the use of LP formulations, and the induced primal-dual minimax optimization, in offline RL.
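For orientation, the textbook LP formulation of a discounted MDP that this line of work builds on can be written as below. This is the standard form, not the paper's new reformulation; the notation (discount factor $\gamma$, initial state distribution $\mu_0$, reward $r$, transition kernel $P$, value function $V$, occupancy measure $d$) is assumed here and may differ from the paper's.

```latex
% Primal LP (value-function form)
\min_{V}\; (1-\gamma)\sum_{s}\mu_0(s)\,V(s)
\quad\text{s.t.}\quad
V(s)\;\ge\; r(s,a)+\gamma\sum_{s'}P(s'\mid s,a)\,V(s')
\quad\forall (s,a)

% Dual LP (occupancy-measure form)
\max_{d\,\ge\,0}\; \sum_{s,a} d(s,a)\,r(s,a)
\quad\text{s.t.}\quad
\sum_{a} d(s,a) \;=\; (1-\gamma)\,\mu_0(s)+\gamma\sum_{s',a'}P(s\mid s',a')\,d(s',a')
\quad\forall s

% Lagrangian giving the primal-dual minimax problem
\min_{V}\,\max_{d\,\ge\,0}\; (1-\gamma)\sum_{s}\mu_0(s)\,V(s)
+\sum_{s,a} d(s,a)\Big(r(s,a)+\gamma\sum_{s'}P(s'\mid s,a)\,V(s')-V(s)\Big)
```

The minimax form is what the abstract calls the induced primal-dual optimization: $V$ and $d$ are the quantities that would be restricted to function classes when function approximation is used.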


An Expert System to Diagnose Spinal Disorders

arXiv.org Artificial Intelligence

Objective: Until now, traditional invasive approaches have been the only means used to diagnose spinal disorders. Traditional manual diagnostics impose a high workload, and diagnostic errors are likely to occur due to the prolonged work of physicians. In this research, we develop an expert system based on a hybrid inference algorithm and comprehensive integrated knowledge to assist experts in the fast and high-quality diagnosis of spinal disorders. Methods: First, for each spinal anomaly, accurate and integrated knowledge was acquired from related experts and resources. Second, based on probability distributions and dependencies between the symptoms of each anomaly, a unique numerical value known as the certainty effect value was assigned to each symptom. Third, a new hybrid inference algorithm, combining Backward Chaining Inference and the Theory of Uncertainty, was designed to obtain excellent performance. Results: The proposed expert system was evaluated in two different phases: real-world samples and medical records. In the real-world sample analysis, the system achieved excellent accuracy. Applying the system to samples with anomalies revealed the degree of severity of disorders and the risk of developing abnormalities in unhealthy and healthy patients. In the medical records analysis, our expert system showed promising performance, very close to that of experts. Conclusion: Evaluations suggest that the proposed expert system provides promising performance, helping specialists to validate the accuracy and integrity of their diagnoses. It can also serve as intelligent educational software for medical students to gain familiarity with the spinal disorder diagnosis process and related symptoms.


A Fuzzy-set-based Joint Distribution Adaptation Method for Regression and its Application to Online Damage Quantification for Structural Digital Twin

arXiv.org Artificial Intelligence

Online damage quantification suffers from insufficient labeled data, which weakens its accuracy. In this context, it would be beneficial to apply domain adaptation to historical labeled data from similar structures or damages, or to simulated digital twin data, to assist the current diagnosis task. However, most domain adaptation methods are designed for classification and cannot efficiently address damage quantification, a regression problem with continuous real-valued labels. This study first proposes a novel domain adaptation method, the Online Fuzzy-set-based Joint Distribution Adaptation for Regression, to address this challenge. By converting the continuous real-valued labels to fuzzy class labels via fuzzy sets, the marginal and conditional distribution discrepancies are measured simultaneously to achieve domain adaptation for the damage quantification task. Building on the proposed method, a state-of-the-art online damage quantification framework based on domain adaptation is presented. Finally, the framework is comprehensively demonstrated on a damaged helicopter panel, with three types of damage domain adaptation (across different damage locations, across different damage types, and from simulation to experiment), showing that the accuracy of damage quantification can be significantly improved in a realistic environment. The proposed approach is expected to be applicable to fleet-level digital twins that account for individual differences.
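The label conversion described in this abstract, from a continuous label to fuzzy class labels, can be sketched as follows. The sketch assumes evenly placed triangular fuzzy sets over the label range; the abstract does not specify the membership functions, and the centers (damage sizes in arbitrary units) and function names are hypothetical.

```python
def triangular(x, left, center, right):
    """Triangular membership function with peak 1.0 at `center`."""
    if x == center:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

def fuzzy_labels(y, centers):
    """Membership of a continuous label `y` in each fuzzy class.

    Neighbouring centers serve as the feet of each triangle, so the
    memberships of any in-range label sum to 1.
    """
    y = min(max(y, centers[0]), centers[-1])  # clamp to the covered range
    memberships = []
    for i, c in enumerate(centers):
        left = centers[i - 1] if i > 0 else c - 1.0
        right = centers[i + 1] if i < len(centers) - 1 else c + 1.0
        memberships.append(triangular(y, left, c, right))
    return memberships

# Hypothetical fuzzy classes: "small", "medium", "large" damage.
centers = [0.0, 5.0, 10.0]
between = fuzzy_labels(2.5, centers)
at_center = fuzzy_labels(5.0, centers)
```

With soft class memberships of this kind, class-conditional statistics can be computed for regression data by weighting each sample by its membership, which is presumably what lets the conditional distribution discrepancy be measured.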


Offline Learning in Markov Games with General Function Approximation

arXiv.org Artificial Intelligence

Offline RL aims to learn a good policy from a pre-collected historical dataset. It has emerged as an important paradigm for bringing RL to real-life scenarios due to its non-interactive nature, especially in applications where deploying adaptive algorithms in the real system is financially costly and/or ethically problematic [Levine et al., 2020]. While offline RL has been extensively studied in the single-agent setting, many real-world applications involve strategic interactions between multiple agents. This necessitates game-theoretic reasoning, often modeled using Markov games [Shapley, 1953] in the RL theory literature. Markov games can be viewed as the multi-agent extension of Markov Decision Processes (MDPs), where agents share the same state information and the dynamics are determined by the joint action of all agents. While online RL in Markov games has seen significant developments in recent years [Bai and Jin, 2020, Liu et al., 2021, Song et al., 2021, Jin et al., 2021b], offline learning in Markov games has only started to attract attention from the community. Earlier works [Cui and Du, 2022b, Zhong et al., 2022] focus on tabular cases or linear function approximation, which cannot handle complex environments that require advanced function-approximation techniques. Although there is a rich literature on single-agent RL with general function approximation [Jiang et al., 2017, Jin et al., 2021a, Wang et al., 2020, Huang et al., 2021a], whether and how these results can be extended to offline Markov games remains largely unclear.


Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity

arXiv.org Artificial Intelligence

We initiate the study of federated reinforcement learning under environmental heterogeneity by considering a policy evaluation problem. Our setup involves $N$ agents interacting with environments that share the same state and action space but differ in their reward functions and state transition kernels. Assuming agents can communicate via a central server, we ask: Does exchanging information expedite the process of evaluating a common policy? To answer this question, we provide the first comprehensive finite-time analysis of a federated temporal difference (TD) learning algorithm with linear function approximation, while accounting for Markovian sampling, heterogeneity in the agents' environments, and multiple local updates to save communication. Our analysis crucially relies on several novel ingredients: (i) deriving perturbation bounds on TD fixed points as a function of the heterogeneity in the agents' underlying Markov decision processes (MDPs); (ii) introducing a virtual MDP to closely approximate the dynamics of the federated TD algorithm; and (iii) using the virtual MDP to make explicit connections to federated optimization. Putting these pieces together, we rigorously prove that in a low-heterogeneity regime, exchanging model estimates leads to linear convergence speedups in the number of agents.
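The communication pattern described above (local TD updates, then averaging at a server) can be sketched on a toy problem. This is an illustrative assumption-laden example and not the paper's algorithm or analysis: the chain is a deterministic two-state cycle, features are one-hot, and the agents differ only in their rewards. All names and constants are hypothetical.

```python
def features(state):
    """One-hot features for a two-state chain (tabular case of linear FA)."""
    return [1.0, 0.0] if state == 0 else [0.0, 1.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def local_td(theta, rewards, state, steps, alpha, gamma):
    """Run `steps` TD(0) updates on one agent's own environment."""
    theta = list(theta)
    for _ in range(steps):
        next_state = 1 - state  # deterministic cycle 0 -> 1 -> 0
        phi, phi_next = features(state), features(next_state)
        td_error = rewards[state] + gamma * dot(theta, phi_next) - dot(theta, phi)
        theta = [t + alpha * td_error * p for t, p in zip(theta, phi)]
        state = next_state
    return theta, state

def federated_td(agent_rewards, rounds=300, local_steps=10, alpha=0.5, gamma=0.9):
    """Each round: broadcast theta, run local TD updates, average at the server."""
    theta = [0.0, 0.0]
    states = [0] * len(agent_rewards)
    for _ in range(rounds):
        local_thetas = []
        for i, rewards in enumerate(agent_rewards):
            new_theta, states[i] = local_td(
                theta, rewards, states[i], local_steps, alpha, gamma
            )
            local_thetas.append(new_theta)
        theta = [sum(col) / len(local_thetas) for col in zip(*local_thetas)]
    return theta

# Three agents whose rewards (state 0, state 1) differ slightly.
agent_rewards = [(1.0, 0.0), (1.2, -0.1), (0.8, 0.1)]
theta = federated_td(agent_rewards)
```

In this toy setting the averaged estimate settles at the value function of the mean reward, V(0) = (r0 + gamma * r1) / (1 - gamma^2) with mean rewards r0 = 1.0 and r1 = 0.0. Heterogeneous transition kernels and stochastic Markovian sampling, which the paper analyzes, are deliberately left out of the sketch.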