 Fuzzy Logic


Revisiting Data Complexity Metrics Based on Morphology for Overlap and Imbalance: Snapshot, New Overlap Number of Balls Metrics and Singular Problems Prospect

arXiv.org Machine Learning

Data Science and Machine Learning have become fundamental assets for companies and research institutions alike. As one of their fields, supervised classification allows for class prediction of new samples by learning from given training data. However, some properties can make datasets problematic to classify. In order to evaluate a dataset a priori, data complexity metrics have been used extensively. They provide information regarding different intrinsic characteristics of the data, which serves to evaluate classifier compatibility and to choose a course of action that improves performance. However, most complexity metrics focus on just one characteristic of the data, which can be insufficient to properly evaluate the dataset with respect to the classifiers' performance. In fact, class overlap, a very detrimental feature for the classification process (especially when imbalance among class labels is also present), is hard to assess. This research work focuses on revisiting complexity metrics based on data morphology. In accordance with their nature, the premise is that they provide both good estimates of class overlap and strong correlations with classification performance. For that purpose, a novel family of metrics has been developed. Being based on ball coverage by classes, they are named Overlap Number of Balls. Finally, some prospects for adapting this family of metrics to singular (more complex) problems are discussed.


Efficient Planning in Large MDPs with Weak Linear Function Approximation

arXiv.org Machine Learning

Large-scale Markov decision processes (MDPs) require planning algorithms with runtime independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak requirements: low approximation error for the optimal value function, and a small set of "core" states whose features span those of other states. In particular, we make no assumptions about the representability of policies or value functions of non-optimal policies. Our algorithm produces almost-optimal actions for any state using a generative oracle (simulator) for the MDP, while its computation time scales polynomially with the number of features, core states, and actions and the effective horizon.


Fuzzy Integral = Contextual Linear Order Statistic

arXiv.org Artificial Intelligence

The fuzzy integral is a powerful parametric nonlinear function with utility in a wide range of applications, from information fusion to classification, regression, decision making, interpolation, metrics, morphology, and beyond. While the fuzzy integral is in general a nonlinear operator, herein we show that it can be represented by a set of contextual linear order statistics (LOS). These operators can be obtained via sampling the fuzzy measure and clustering is used to produce a partitioning of the underlying space of linear convex sums. Benefits of our approach include scalability, improved integral/measure acquisition, generalizability, and explainable/interpretable models. Our methods are both demonstrated on controlled synthetic experiments, and also analyzed and validated with real-world benchmark data sets.
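The link between a fuzzy integral and linear order statistics can be illustrated with a minimal sketch. It assumes the fuzzy integral in question is the discrete Choquet integral and that the fuzzy measure is stored as a dictionary from index subsets to values; the names `los_weights`, `choquet_integral`, and the example measure are hypothetical and not taken from the paper. For a fixed sort order of the inputs, the integral reduces to a weighted sum of the sorted values.

```python
from typing import Dict, FrozenSet, List, Sequence

Measure = Dict[FrozenSet[int], float]

def los_weights(order: Sequence[int], measure: Measure) -> List[float]:
    """Weights of the linear order statistic induced by one sort order.

    `order` lists input indices from smallest to largest value.
    Weight at position p is mu(A_p) - mu(A_{p+1}), where A_p is the
    set of indices at positions p, p+1, ..., n-1.
    """
    weights = []
    for pos in range(len(order)):
        upper = frozenset(order[pos:])
        next_upper = frozenset(order[pos + 1:])
        weights.append(measure[upper] - measure[next_upper])
    return weights

def choquet_integral(x: Sequence[float], measure: Measure) -> float:
    """Discrete Choquet integral, computed as a weighted sum of sorted inputs."""
    order = sorted(range(len(x)), key=lambda i: x[i])  # ascending by value
    weights = los_weights(order, measure)
    return sum(w * x[i] for w, i in zip(weights, order))

# Hypothetical monotone fuzzy measure on three inputs (illustration only).
example_measure: Measure = {
    frozenset(): 0.0,
    frozenset({0}): 0.2,
    frozenset({1}): 0.3,
    frozenset({2}): 0.4,
    frozenset({0, 1}): 0.5,
    frozenset({0, 2}): 0.6,
    frozenset({1, 2}): 0.7,
    frozenset({0, 1, 2}): 1.0,
}

if __name__ == "__main__":
    print(choquet_integral([0.9, 0.4, 0.6], example_measure))
```

Because the weights depend only on the sort order of the inputs, each of the n! orderings selects one linear operator, which is the sense in which the sort order acts as context in this sketch.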


Diagnosis of Coronary Artery Disease Using Artificial Intelligence Based Decision Support System

arXiv.org Artificial Intelligence

This research is about the development of a fuzzy decision support system for the evidence-based diagnosis of coronary artery disease. The coronary artery disease data sets from the University of California, Irvine (UCI) are used. The knowledge base of the fuzzy decision support system is obtained using a rule extraction method based on Rough Set Theory. The rules are then selected and fuzzified based on information from the discretization of numerical attributes. A fuzzy rule weighting is proposed using the support of the extracted rules. UCI heart disease data sets collected from the U.S., Switzerland, and Hungary, and data from Ipoh Specialist Hospital, Malaysia, are used to verify the proposed system. The results show that the system is able to give the percentage of coronary artery blocking better than cardiologists and angiography. The results of the proposed system were verified and validated by three expert cardiologists and are considered to be more efficient and useful.


Machine Learning with the Sugeno Integral: The Case of Binary Classification

arXiv.org Machine Learning

In this paper, we elaborate on the use of the Sugeno integral in the context of machine learning. More specifically, we propose a method for binary classification, in which the Sugeno integral is used as an aggregation function that combines several local evaluations of an instance, pertaining to different features or measurements, into a single global evaluation. Due to the specific nature of the Sugeno integral, this approach is especially suitable for learning from ordinal data, that is, when measurements are taken from ordinal scales. This is a topic that has not received much attention in machine learning so far. The core of the learning problem itself consists of identifying the capacity underlying the Sugeno integral. To tackle this problem, we develop an algorithm based on linear programming. The algorithm also includes a suitable technique for transforming the original feature values into local evaluations (local utility scores), as well as a method for tuning a threshold on the global evaluation. To control the flexibility of the classifier and mitigate the problem of overfitting the training data, we generalize our approach toward $k$-maxitive capacities, where $k$ plays the role of a hyper-parameter of the learner. We present experimental studies, in which we compare our method with competing approaches on several benchmark data sets.
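As a rough illustration of the aggregation step described above, the following sketch evaluates a discrete Sugeno integral over local evaluations in [0, 1] and thresholds the result. It is not the paper's learning algorithm: the capacity is given by hand rather than identified by linear programming, and the names `sugeno_integral`, `classify`, and the example capacity are assumptions made for this example.

```python
from typing import Dict, FrozenSet, Sequence

Capacity = Dict[FrozenSet[int], float]

def sugeno_integral(x: Sequence[float], capacity: Capacity) -> float:
    """Discrete Sugeno integral: max over i of min(x_(i), mu(A_(i))).

    x_(1) <= ... <= x_(n) are the sorted local evaluations and A_(i)
    is the set of criteria whose evaluation is at least x_(i).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # ascending by value
    best = 0.0
    for pos, idx in enumerate(order):
        upper = frozenset(order[pos:])
        best = max(best, min(x[idx], capacity[upper]))
    return best

def classify(x: Sequence[float], capacity: Capacity, threshold: float) -> int:
    """Binary decision: 1 if the global evaluation reaches the threshold."""
    return 1 if sugeno_integral(x, capacity) >= threshold else 0

# Hypothetical monotone capacity on three criteria (illustration only).
example_capacity: Capacity = {
    frozenset(): 0.0,
    frozenset({0}): 0.2,
    frozenset({1}): 0.3,
    frozenset({2}): 0.4,
    frozenset({0, 1}): 0.5,
    frozenset({0, 2}): 0.6,
    frozenset({1, 2}): 0.7,
    frozenset({0, 1, 2}): 1.0,
}

if __name__ == "__main__":
    score = sugeno_integral([0.9, 0.4, 0.6], example_capacity)
    print(score, classify([0.9, 0.4, 0.6], example_capacity, threshold=0.5))
```

Only `min` and `max` are applied to the inputs, so the result depends on the order of the values rather than on their numeric distances, which is consistent with the abstract's point about ordinal scales.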


Multi-Kernel Fusion for RBF Neural Networks

arXiv.org Machine Learning

A simple yet effective architectural design of radial basis function neural networks (RBFNN) makes them amongst the most popular conventional neural networks. The current generation of radial basis function neural networks is equipped with multiple kernels, which provide significant performance benefits compared to the previous generation using only a single kernel. In existing multi-kernel RBF algorithms, the multi-kernel is formed by a convex combination of the base/primary kernels. In this paper, we propose a novel multi-kernel RBFNN in which every base kernel has its own (local) weight. This novel flexibility in the network provides better performance, such as a faster convergence rate, better local minima, and resilience against getting stuck in poor local minima. These performance gains are achieved at a competitive computational complexity compared to contemporary multi-kernel RBF algorithms. The proposed algorithm is thoroughly analysed for performance gain using mathematical and graphical illustrations and is also evaluated on three different types of problems, namely: (i) pattern classification, (ii) system identification, and (iii) function approximation. Empirical results clearly show the superiority of the proposed algorithm compared to the existing state-of-the-art multi-kernel approaches.
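One way to picture "every base kernel has its own (local) weight" is a forward pass in which each (center, base kernel) pair carries a separate output weight, instead of one global mixing coefficient shared by all centers. The sketch below is an assumption-laden illustration, not the paper's algorithm: the choice of Gaussian and cosine base kernels, the function names, and the weight layout are all hypothetical, and no training rule is shown.

```python
import math
from typing import List, Sequence, Tuple

def gaussian_kernel(x: Sequence[float], c: Sequence[float], sigma: float = 1.0) -> float:
    """Gaussian base kernel exp(-||x - c||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def cosine_kernel(x: Sequence[float], c: Sequence[float], eps: float = 1e-12) -> float:
    """Cosine-similarity base kernel; eps guards against division by zero."""
    dot = sum(a * b for a, b in zip(x, c))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_c = math.sqrt(sum(b * b for b in c))
    return dot / (norm_x * norm_c + eps)

def forward(
    x: Sequence[float],
    centers: List[Sequence[float]],
    weights: List[Tuple[float, float]],
    bias: float = 0.0,
) -> float:
    """Network output with a separate weight per (center, base kernel) pair.

    weights[j] = (w_gauss_j, w_cos_j) are the local weights of center j.
    """
    out = bias
    for c, (w_gauss, w_cos) in zip(centers, weights):
        out += w_gauss * gaussian_kernel(x, c) + w_cos * cosine_kernel(x, c)
    return out

if __name__ == "__main__":
    centers = [[1.0, 0.0], [0.0, 1.0]]
    weights = [(0.5, 0.25), (0.1, -0.3)]
    print(forward([1.0, 0.0], centers, weights))
```

A globally mixed variant would instead use one pair of coefficients for all centers; keeping the pair per center gives the model more free parameters, which is the flexibility this sketch is meant to show.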


Real-Time Monitoring and Driver Feedback to Promote Fuel Efficient Driving

arXiv.org Artificial Intelligence

Improving the fuel efficiency of vehicles is imperative to reduce costs and protect the environment. While efficient engine and vehicle designs, as well as intelligent route planning, are well-known ways to enhance fuel efficiency, research has also demonstrated that the adoption of fuel-efficient driving behaviors could lead to further savings. In this work, we propose a novel framework to promote fuel-efficient driving behaviors through real-time automatic monitoring and driver feedback. In this framework, a random-forest-based classification model developed using historical data identifies fuel-inefficient driving behaviors. The classifier considers driver-dependent parameters such as speed and acceleration/deceleration pattern, as well as environmental parameters such as traffic, road topography, and weather, to evaluate the fuel efficiency of one-minute driving events. When an inefficient driving action is detected, a fuzzy logic inference system is used to determine what the driver should do to maintain fuel-efficient driving behavior. The decided action is then conveyed to the driver via a smartphone in a non-intrusive manner. Using a dataset from a long-distance bus, we demonstrate that the proposed classification model yields an accuracy of 85.2% while increasing the fuel efficiency by up to 16.4%.


Online learning in MDPs with linear function approximation and bandit feedback

arXiv.org Machine Learning

We consider an online learning problem where the learner interacts with a Markov decision process in a sequence of episodes, where the reward function is allowed to change between episodes in an adversarial manner and the learner only gets to observe the rewards associated with its actions. We allow the state space to be arbitrarily large, but we assume that all action-value functions can be represented as linear functions in terms of a known low-dimensional feature map, and that the learner has access to a simulator of the environment that allows generating trajectories from the true MDP dynamics. Our main contribution is a computationally efficient algorithm that we call MDP-LinExp3, together with a proof that its regret is bounded by $\widetilde{\mathcal{O}}\big(H^2 T^{2/3} (dK)^{1/3}\big)$, where $T$ is the number of episodes, $H$ is the number of steps in each episode, $K$ is the number of actions, and $d$ is the dimension of the feature map. We also show that the regret can be improved to $\widetilde{\mathcal{O}}\big(H^2 \sqrt{TdK}\big)$ under much stronger assumptions on the MDP dynamics. To our knowledge, MDP-LinExp3 is the first provably efficient algorithm for this problem setting.


Exponentially Weighted l_2 Regularization Strategy in Constructing Reinforced Second-order Fuzzy Rule-based Model

arXiv.org Machine Learning

In conventional Takagi-Sugeno-Kang (TSK)-type fuzzy models, constant or linear functions are usually utilized as the consequent parts of the fuzzy rules, but they cannot effectively describe the behavior within local regions defined by the antecedent parts. In this article, a theoretical and practical design methodology is developed to address this problem. First, the information granulation (Fuzzy C-Means) method is applied to capture the structure in the data and split the input space into subspaces, as well as to form the antecedent parts. Second, quadratic polynomials (QPs) are employed as the consequent parts. Compared with constant and linear functions, QPs can describe the input-output behavior within the local regions (subspaces) by refining the relationship between input and output variables. However, although QPs can improve the approximation ability of the model, they could lead to the deterioration of its prediction ability (e.g., overfitting). To handle this issue, we introduce an exponential weight approach inspired by the weight function theory encountered in harmonic analysis. More specifically, we adopt exponential functions as the targeted penalty terms, which are equipped with l_2 regularization (i.e., exponentially weighted l_2, ewl_2) to match the proposed reinforced second-order fuzzy rule-based model (RSFRM) properly. The advantage of ewl_2 compared to ordinary l_2 lies in separately identifying and penalizing different types of polynomial terms in the coefficient estimation, and its results not only alleviate overfitting and prevent the deterioration of generalization ability but also effectively release the prediction potential of the model.


End-to-End AI-Based Point-of-Care Diagnosis System for Classifying Respiratory Illnesses and Early Detection of COVID-19

arXiv.org Artificial Intelligence

Respiratory symptoms can be caused by different underlying conditions, and are often caused by viral infections, such as influenza-like illnesses or other emerging viruses like the coronavirus. These respiratory viruses often have common symptoms, including coughing, high temperature, congested nose, and difficulty breathing. However, early diagnosis of the type of virus can be crucial, especially in cases such as the recent COVID-19 pandemic. One of the factors that contributed to the spread of the pandemic was late diagnosis or confusion with regular flu-like symptoms. Science has proved that one of the possible differentiators of the underlying causes of these different respiratory diseases is coughing, which comes in different types and forms. Therefore, a reliable lab-free tool for early and more accurate diagnosis that can differentiate between different respiratory diseases is very much needed. This paper proposes an end-to-end portable system that can record data from patients with symptoms, including coughs (voluntary or involuntary), translate them into health data for diagnosis, and, with the aid of machine learning, classify them into different respiratory illnesses, including COVID-19. With the ongoing efforts to stop the spread of COVID-19 everywhere today, and of similar diseases in the future, our proposed low-cost and user-friendly solution can play an important part in early diagnosis.