 Fuzzy Logic


Neural Networks are Function Approximation Algorithms

#artificialintelligence

Supervised learning in machine learning can be described in terms of function approximation. Given a dataset comprised of inputs and outputs, we assume that there is an unknown underlying function that is consistent in mapping inputs to outputs in the target domain and resulted in the dataset. We then use supervised learning algorithms to approximate this function. Neural networks are an example of a supervised machine learning algorithm that is perhaps best understood in the context of function approximation. This can be demonstrated with examples of neural networks approximating simple one-dimensional functions that aid in developing the intuition for what is being learned by the model.
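
As a minimal sketch of that idea (not code from the article), the snippet below fits a tiny one-hidden-layer network to the one-dimensional function y = x^2 using only the Python standard library. The function names, the layer size, the learning rate, and the target function are all assumptions chosen for illustration.

```python
import math
import random


def predict(params, x):
    """Forward pass: one tanh hidden layer followed by a linear output unit."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2[0], hidden


def mean_squared_error(params, xs, ys):
    return sum((predict(params, x)[0] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, n_hidden=8, epochs=500, lr=0.05, seed=0):
    """Fit the network with plain stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b2 = [0.0]
    params = (w1, b1, w2, b2)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            out, hidden = predict(params, x)
            err = out - y  # derivative of 0.5 * (out - y)^2 with respect to out
            for j in range(n_hidden):
                # Backpropagate through tanh before w2[j] is modified.
                grad_pre = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                w1[j] -= lr * grad_pre * x
                b1[j] -= lr * grad_pre
            b2[0] -= lr * err
    return params


# Hypothetical target: the unknown "underlying function" is y = x^2 on [-1, 1].
xs = [i / 10.0 for i in range(-10, 11)]
ys = [x ** 2 for x in xs]

untrained = train(xs, ys, epochs=0)  # same seed, so these are the initial weights
trained = train(xs, ys)
loss_before = mean_squared_error(untrained, xs, ys)
loss_after = mean_squared_error(trained, xs, ys)
print(f"MSE before training: {loss_before:.4f}, after: {loss_after:.4f}")
```

Comparing the two printed errors gives a rough sense of how closely the trained network tracks the target function on the sampled inputs.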


On the equivalence between graph isomorphism testing and function approximation with GNNs

Neural Information Processing Systems

Graph neural networks (GNNs) have achieved lots of success on graph-structured data. In light of this, there has been increasing interest in studying their representation power. One line of work focuses on the universal approximation of permutation-invariant functions by certain classes of GNNs, and another demonstrates the limitation of GNNs via graph isomorphism tests. Our work connects these two perspectives and proves their equivalence. We further develop a framework of the representation power of GNNs with the language of sigma-algebra, which incorporates both viewpoints.


Learning nonlinear level sets for dimensionality reduction in function approximation

Neural Information Processing Systems

We developed a Nonlinear Level-set Learning (NLL) method for dimensionality reduction in high-dimensional function approximation with small data. This work is motivated by a variety of design tasks in real-world engineering applications, where practitioners would replace their computationally intensive physical models (e.g., high-resolution fluid simulators) with fast-to-evaluate predictive machine learning models, so as to accelerate the engineering design processes. There are two major challenges in constructing such predictive models: (a) high-dimensional inputs (e.g., many independent design parameters) and (b) small training data, generated by running extremely time-consuming simulations. Thus, reducing the input dimension is critical to alleviate the over-fitting issue caused by data insufficiency. Existing methods, including sliced inverse regression and active subspace approaches, reduce the input dimension by learning a linear coordinate transformation; our main contribution is to extend the transformation approach to a nonlinear regime.


Finite-Sample Analysis for SARSA with Linear Function Approximation

Neural Information Processing Systems

SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d. setting, where a single sample trajectory is available. With a Lipschitz continuous policy improvement operator that is smooth enough, SARSA has been shown to converge asymptotically. However, its non-asymptotic analysis is challenging and remains unsolved due to the non-i.i.d. samples. In this paper, we develop a novel technique to explicitly characterize the stochastic bias of a type of stochastic approximation procedures with time-varying Markov transition kernels.
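
For orientation, the textbook SARSA update with linear function approximation can be sketched as follows. This is the standard semi-gradient rule, not the analysis technique of the paper; the function names, the feature vectors, and the step-size values are hypothetical.

```python
def q_value(theta, features):
    """Linear action-value estimate: Q(s, a) = theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, features))


def sarsa_update(theta, phi, reward, phi_next, alpha=0.1, gamma=0.9):
    """One on-policy update from the transition (s, a, r, s', a').

    phi is phi(s, a); phi_next is phi(s', a') for the action the current
    policy actually takes in s'. Returns a new weight vector.
    """
    td_error = reward + gamma * q_value(theta, phi_next) - q_value(theta, phi)
    return [t + alpha * td_error * f for t, f in zip(theta, phi)]


# Toy transition with two features (assumed values, for illustration only).
sarsa_theta = sarsa_update(
    [0.0, 0.0], phi=[1.0, 0.0], reward=1.0, phi_next=[0.0, 1.0], alpha=0.5
)
print(sarsa_theta)  # [0.5, 0.0]
```

Because consecutive transitions come from one trajectory generated by a policy that itself depends on the weights, the samples fed to this update are not independent, which is the difficulty the abstract refers to.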


Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle

Neural Information Processing Systems

Q-learning with function approximation is one of the most popular methods in reinforcement learning. Though the idea of using function approximation was proposed at least 60 years ago, even in the simplest setup, i.e., approximating Q-functions with linear functions, it is still an open problem how to design a provably efficient algorithm that learns a near-optimal policy. The key challenges are how to efficiently explore the state space and how to decide when to stop exploring in conjunction with the function approximation scheme. The current paper presents a provably efficient algorithm for Q-learning with linear function approximation. Under certain regularity assumptions, our algorithm, Difference Maximization Q-learning, combined with linear function approximation, returns a near-optimal policy using a polynomial number of trajectories.
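
The "simplest setup" mentioned above, a Q-function that is linear in a feature vector, can be illustrated with the standard semi-gradient Q-learning update. This sketch is not Difference Maximization Q-learning and includes no exploration or stopping rule; all names and numbers are assumptions.

```python
def linear_q(theta, features):
    """Q(s, a) approximated as the dot product theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, features))


def q_learning_update(theta, phi, reward, next_action_features, alpha=0.1, gamma=0.9):
    """One off-policy update; bootstraps from the best action in the next state.

    next_action_features holds phi(s', a') for every action a' available in s'.
    An empty list is treated as a terminal state with value 0.
    """
    best_next = max((linear_q(theta, f) for f in next_action_features), default=0.0)
    td_error = reward + gamma * best_next - linear_q(theta, phi)
    return [t + alpha * td_error * f for t, f in zip(theta, phi)]


# Toy example with two features and two candidate next actions (assumed values).
q_theta = q_learning_update(
    [1.0, 2.0],
    phi=[1.0, 0.0],
    reward=1.0,
    next_action_features=[[1.0, 0.0], [0.0, 1.0]],
    alpha=0.5,
    gamma=0.5,
)
print(q_theta)  # [1.5, 2.0]
```

The only difference from an on-policy update is the maximum over next actions, which makes the target independent of the action the behavior policy actually takes next.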


Variance Reduced Policy Evaluation with Smooth Function Approximation

Neural Information Processing Systems

Policy evaluation with smooth and nonlinear function approximation has shown great potential for reinforcement learning. Compared to linear function approximation, it allows for using a richer class of approximation functions such as neural networks. Traditional algorithms are based on two-timescale stochastic approximation, whose convergence rate is often slow. This paper focuses on an offline setting where a trajectory of $m$ state-action pairs is observed. We formulate the policy evaluation problem as a non-convex primal-dual, finite-sum optimization problem, whose primal sub-problem is non-convex and dual sub-problem is strongly concave.


Adaptive binarization based on fuzzy integrals

arXiv.org Machine Learning

Adaptive binarization methodologies threshold the intensity of each pixel with respect to adjacent pixels by exploiting integral images. In turn, integral images are generally computed optimally using the summed-area-table (SAT) algorithm. This document presents a new adaptive binarization technique based on fuzzy integral images through an efficient design of a modified SAT for fuzzy integrals. We define this new methodology as FLAT (Fuzzy Local Adaptive Thresholding). The experimental results show that the proposed methodology produces thresholded images whose quality is often better than that of traditional algorithms and saliency neural networks. We propose a new generalization of the Sugeno and CF 1,2 integrals to improve existing results with an efficient integral image computation. Therefore, these new generalized fuzzy integrals can be used as a tool for grayscale processing in real-time and deep-learning applications. Index Terms: Image Thresholding, Image Processing, Fuzzy Integrals, Aggregation Functions
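
To make the classical baseline concrete, here is a rough sketch of a summed-area table and a local-mean threshold built on it. It illustrates only the traditional (non-fuzzy) approach that the abstract starts from, not FLAT itself; the function names, the window radius, and the "brighter than the local mean" rule are assumptions.

```python
def summed_area_table(image):
    """Return a (rows+1) x (cols+1) table where table[r][c] is the sum of
    image[0..r-1][0..c-1]; the extra zero row and column simplify lookups."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (
                image[r][c] + table[r][c + 1] + table[r + 1][c] - table[r][c]
            )
    return table


def window_sum(table, r0, c0, r1, c1):
    """Sum of the pixels in rows r0..r1-1 and columns c0..c1-1, in O(1)."""
    return table[r1][c1] - table[r0][c1] - table[r1][c0] + table[r0][c0]


def local_mean_threshold(image, radius=1):
    """Mark a pixel 1 if it is brighter than the mean of its local window."""
    rows, cols = len(image), len(image[0])
    table = summed_area_table(image)
    result = []
    for r in range(rows):
        r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
        row = []
        for c in range(cols):
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            count = (r1 - r0) * (c1 - c0)
            total = window_sum(table, r0, c0, r1, c1)
            # pixel > total / count, written without division
            row.append(1 if image[r][c] * count > total else 0)
        result.append(row)
    return result


# Hypothetical 3x3 grayscale image with one bright pixel in the middle.
sample = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
binary = local_mean_threshold(sample, radius=1)
print(binary)  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

The table is built once in a single pass, after which every window sum costs four lookups regardless of window size.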


MBGD-RDA Training and Rule Pruning for Concise TSK Fuzzy Regression Models

arXiv.org Machine Learning

To effectively train Takagi-Sugeno-Kang (TSK) fuzzy systems for regression problems, a Mini-Batch Gradient Descent with Regularization, DropRule, and AdaBound (MBGD-RDA) algorithm was recently proposed. It has demonstrated superior performance; however, there are also some limitations, e.g., it does not allow the user to specify the number of rules directly, and only Gaussian MFs can be used. This paper proposes two variants of MBGD-RDA to remedy these limitations, and shows that they outperform the original MBGD-RDA and the classical ANFIS algorithms with the same number of rules. Furthermore, we also propose a rule pruning algorithm for TSK fuzzy systems, which can reduce the number of rules without significantly sacrificing the regression performance. Experiments showed that the rules obtained from pruning are generally better than training them from scratch directly, especially when Gaussian MFs are used.
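
As background, the forward pass of a first-order TSK fuzzy system with Gaussian membership functions might look like the sketch below. It shows only how rules produce a prediction, not MBGD-RDA training or the pruning algorithm; the parameter layout and all names are assumptions.

```python
import math


def tsk_predict(x, centers, sigmas, consequents):
    """Output of a first-order TSK system for one input vector x.

    centers[r][d], sigmas[r][d]: Gaussian MF parameters of rule r, input d.
    consequents[r]: [bias, coefficient for x[0], coefficient for x[1], ...].
    """
    firing = []
    for rule_centers, rule_sigmas in zip(centers, sigmas):
        # The product of Gaussian memberships equals exp of the summed exponents.
        exponent = sum(
            (xd - cd) ** 2 / (2.0 * sd ** 2)
            for xd, cd, sd in zip(x, rule_centers, rule_sigmas)
        )
        firing.append(math.exp(-exponent))
    rule_outputs = [
        b[0] + sum(bd * xd for bd, xd in zip(b[1:], x)) for b in consequents
    ]
    total = sum(firing)  # may underflow to 0 for inputs far from every rule
    return sum(f * y for f, y in zip(firing, rule_outputs)) / total


# Two hypothetical rules on one input: "x near 0 -> 0" and "x near 1 -> 1".
tsk_output = tsk_predict(
    [0.5],
    centers=[[0.0], [1.0]],
    sigmas=[[1.0], [1.0]],
    consequents=[[0.0, 0.0], [1.0, 0.0]],
)
print(tsk_output)  # 0.5, since both rules fire equally at the midpoint
```

In this layout each rule contributes one row to `centers`, `sigmas`, and `consequents`, so pruning a rule would amount to deleting the same row from all three lists.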


Novel Meta-Heuristic Model for Discrimination between Iron Deficiency Anemia and B-Thalassemia with CBC Indices Based on Dynamic Harmony Search

arXiv.org Machine Learning

In recent decades, attention has been directed at anemia classification for various medical purposes, such as thalassemia screening and predicting iron deficiency anemia (IDA). In this study, a new method has been successfully tested for discrimination between IDA and β-thalassemia trait (β-TT). The method is based on a Dynamic Harmony Search (DHS). Complete blood count (CBC), a fast and inexpensive laboratory test, is used as the input of the system. Other models, such as a genetic programming method called structured representation on genetic algorithm in non-linear function fitting (STROGANOFF), an artificial neural network (ANN), an adaptive neuro-fuzzy inference system (ANFIS), a support vector machine (SVM), k-nearest neighbor (KNN), and certain traditional methods, are compared with the proposed method.


Pattern Similarity-based Machine Learning Methods for Mid-term Load Forecasting: A Comparative Study

arXiv.org Machine Learning

Pattern similarity-based methods are widely used in classification and regression problems. Repeated, similar-shaped cycles observed in seasonal time series encourage the application of these methods to forecasting. In this paper we use pattern similarity-based methods for forecasting monthly electricity demand expressing annual seasonality. An integral part of the models is the time series representation using patterns of time series sequences. Pattern representation ensures the input and output data unification through trend filtering and variance equalization. Consequently, pattern representation simplifies the forecasting problem and allows us to use models based on pattern similarity. We consider four such models: nearest neighbor model, fuzzy neighborhood model, kernel regression model and general regression neural network. A regression function is constructed by aggregating output patterns with weights dependent on the similarity between input patterns. The advantages of the proposed models are: clear principle of operation, small number of parameters to adjust, fast optimization procedure, good generalization ability, working on the newest data without retraining, robustness to missing input variables, and generating a vector as an output. In the experimental part of the work the proposed models were used to forecast the monthly demand for 35 European countries. The model performances were compared with the performances of classical models such as ARIMA and exponential smoothing as well as state-of-the-art models such as multilayer perceptron, neuro-fuzzy system and long short-term memory model. The results show high performance of the proposed models, which outperform the comparative models in accuracy, simplicity and ease of optimization.
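
The aggregation step described above, output patterns weighted by the similarity of their input patterns, can be sketched with a Gaussian-kernel (Nadaraya-Watson style) estimator. This is a generic illustration rather than the paper's exact models; the pattern encoding, the bandwidth, and all names are assumptions.

```python
import math


def kernel_regression_forecast(query, input_patterns, output_patterns, bandwidth=1.0):
    """Forecast an output vector as a similarity-weighted average.

    Each historical input pattern gets a Gaussian weight based on its squared
    Euclidean distance to the query; the forecast is the weighted mean of the
    paired output patterns.
    """
    weights = []
    for pattern in input_patterns:
        sq_dist = sum((q - p) ** 2 for q, p in zip(query, pattern))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)  # can underflow to 0 if the bandwidth is far too small
    horizon = len(output_patterns[0])
    return [
        sum(w * y[k] for w, y in zip(weights, output_patterns)) / total
        for k in range(horizon)
    ]


# Two hypothetical historical pattern pairs; the query lies midway between inputs.
forecast = kernel_regression_forecast(
    query=[1.0, 0.0],
    input_patterns=[[0.0, 0.0], [2.0, 0.0]],
    output_patterns=[[1.0, 2.0], [3.0, 4.0]],
    bandwidth=1.0,
)
print(forecast)  # approximately [2.0, 3.0]
```

Because the result is a weighted average of stored output patterns, the model returns a whole vector at once and can use newly appended patterns without retraining, in line with the advantages the abstract lists.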