Goto

Collaborating Authors

 Fuzzy Logic


Amortized SHAP values via sparse Fourier function approximation

arXiv.org Artificial Intelligence

SHAP values are a popular local feature-attribution method widely used in interpretable and explainable AI. We tackle the problem of efficiently computing these values. We cover both the model-agnostic (black-box) setting, where one only has query access to the model and also the case of (ensembles of) trees where one has access to the structure of the tree. For both the black-box and the tree setting we propose a two-stage approach for estimating SHAP values. Our algorithm's first step harnesses recent results showing that many real-world predictors have a spectral bias that allows us to either exactly represent (in the case of ensembles of decision trees), or efficiently approximate them (in the case of neural networks) using a compact Fourier representation. In the second step of the algorithm, we use the Fourier representation to exactly compute SHAP values. The second step is computationally very cheap because firstly, the representation is compact and secondly, we prove that there exists a closed-form expression for SHAP values for the Fourier basis functions. Furthermore, the expression we derive effectively linearizes the computation into a simple summation and is amenable to parallelization on multiple cores or a GPU. Since the function approximation (first step) is only done once, it allows us to produce Shapley values in an amortized way. We show speedups compared to relevant baseline methods equal levels of accuracy for both the tree and black-box settings. Moreover, this approach introduces a reliable and fine-grained continuous trade-off between computation and accuracy through the sparsity of the Fourier approximation, a feature previously unavailable in all black-box methods.


Reducing fuzzy relation equations via concept lattices

arXiv.org Artificial Intelligence

This paper has taken into advantage the relationship between Fuzzy Relation Equations (FRE) and Concept Lattices in order to introduce a procedure to reduce a FRE, without losing information. Specifically, attribute reduction theory in property-oriented and object-oriented concept lattices has been considered in order to present a mechanism for detecting redundant equations. As a first consequence, the computation of the whole solution set of a solvable FRE is reduced. Moreover, we will also introduce a novel method for computing approximate solutions of unsolvable FRE related to a (real) dataset with uncertainty/imprecision data.


Covering Numbers for Deep ReLU Networks with Applications to Function Approximation and Nonparametric Regression

arXiv.org Machine Learning

Covering numbers of families of (deep) ReLU networks have been used to characterize their approximation-theoretic performance, upper-bound the prediction error they incur in nonparametric regression, and quantify their classification capacity. These results are based on covering number upper bounds obtained through the explicit construction of coverings. Lower bounds on covering numbers do not seem to be available in the literature. The present paper fills this gap by deriving tight (up to a multiplicative constant) lower and upper bounds on the covering numbers of fully-connected networks with bounded weights, sparse networks with bounded weights, and fully-connected networks with quantized weights. Thanks to the tightness of the bounds, a fundamental understanding of the impact of sparsity, quantization, bounded vs. unbounded weights, and network output truncation can be developed. Furthermore, the bounds allow to characterize the fundamental limits of neural network transformation, including network compression, and lead to sharp upper bounds on the prediction error in nonparametric regression through deep networks. Specifically, we can remove a $\log^6(n)$-factor in the best-known sample complexity rate in the estimation of Lipschitz functions through deep networks thereby establishing optimality. Finally, we identify a systematic relation between optimal nonparametric regression and optimal approximation through deep networks, unifying numerous results in the literature and uncovering general underlying principles.


Reviews: Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces

Neural Information Processing Systems

Strengths 1. Considering dynamic programming problems in continuous time such that the methodologies and tools of dynamical systems and stochastic di_x000b_eren- tial equations is interesting, and the authors do a good job of motivating the generalities of the problem context. The parameterizations considered of the value functions at the end of the day belong to discrete time, due to the need to discretize the SDEs and sample the state-action-reward triples. Given this discrete implementa- tion, and the fact that experimentally the authors run into the conven- tional di_x000e_culties of discrete time algorithms with continuous state-action function approximation, I am a little bewildered as to what the actual bene_x000c_t is of this problem formulation, especially since it requires a re- de_x000c_nition of the value function as one that is compatible with SDEs (eqn. That is, the intrinsic theoretical bene_x000c_ts of this perspective are not clear, especially since the main theorem is expressed in terms of RKHS only. However, these methods are fundamentally limited by their sample complexity bottleneck, i.e., the quadratic complexity in the sample size.



Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach

Neural Information Processing Systems

We propose an efficient method to estimate the accuracy of classifiers using only unlabeled data. We consider a setting with multiple classification problems where the target classes may be tied together through logical constraints. For example, a set of classes may be mutually exclusive, meaning that a data instance can belong to at most one of them. The proposed method is based on the intuition that: (i) when classifiers agree, they are more likely to be correct, and (ii) when the classifiers make a prediction that violates the constraints, at least one classifier must be making an error. Experiments on four real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions in both estimating accuracies, and combining multiple classifier outputs. The results emphasize the utility of logical constraints in estimating accuracy, thus validating our intuition.


Improving Fuzzy Rule Classifier with Brain Storm Optimization and Rule Modification

arXiv.org Artificial Intelligence

The expanding complexity and dimensionality in the search space can adversely affect inductive learning in fuzzy rule classifiers, thus impacting the scalability and accuracy of fuzzy systems. This research specifically addresses the challenge of diabetic classification by employing the Brain Storm Optimization (BSO) algorithm to propose a novel fuzzy system that redefines rule generation for this context. An exponential model is integrated into the standard BSO algorithm to enhance rule derivation, tailored specifically for diabetes-related data. The innovative fuzzy system is then applied to classification tasks involving diabetic datasets, demonstrating a substantial improvement in classification accuracy, as evidenced by our experiments.


A Fuzzy-based Approach to Predict Human Interaction by Functional Near-Infrared Spectroscopy

arXiv.org Artificial Intelligence

The paper introduces a Fuzzy-based Attention (Fuzzy Attention Layer) mechanism, a novel computational approach to enhance the interpretability and efficacy of neural models in psychological research. The proposed Fuzzy Attention Layer mechanism is integrated as a neural network layer within the Transformer Encoder model to facilitate the analysis of complex psychological phenomena through neural signals, such as those captured by functional Near-Infrared Spectroscopy (fNIRS). By leveraging fuzzy logic, the Fuzzy Attention Layer is capable of learning and identifying interpretable patterns of neural activity. This capability addresses a significant challenge when using Transformer: the lack of transparency in determining which specific brain activities most contribute to particular predictions. Our experimental results demonstrated on fNIRS data from subjects engaged in social interactions involving handholding reveal that the Fuzzy Attention Layer not only learns interpretable patterns of neural activity but also enhances model performance. Additionally, the learned patterns provide deeper insights into the neural correlates of interpersonal touch and emotional exchange. The application of our model shows promising potential in deciphering the subtle complexities of human social behaviors, thereby contributing significantly to the fields of social neuroscience and psychological AI.


GB-RVFL: Fusion of Randomized Neural Network and Granular Ball Computing

arXiv.org Artificial Intelligence

The random vector functional link (RVFL) network is a prominent classification model with strong generalization ability. However, RVFL treats all samples uniformly, ignoring whether they are pure or noisy, and its scalability is limited due to the need for inverting the entire training matrix. To address these issues, we propose granular ball RVFL (GB-RVFL) model, which uses granular balls (GBs) as inputs instead of training samples. This approach enhances scalability by requiring only the inverse of the GB center matrix and improves robustness against noise and outliers through the coarse granularity of GBs. Furthermore, RVFL overlooks the dataset's geometric structure. To address this, we propose graph embedding GB-RVFL (GE-GB-RVFL) model, which fuses granular computing and graph embedding (GE) to preserve the topological structure of GBs. The proposed GB-RVFL and GE-GB-RVFL models are evaluated on KEEL, UCI, NDC and biomedical datasets, demonstrating superior performance compared to baseline models.


Bipolar fuzzy relation equations systems based on the product t-norm

arXiv.org Artificial Intelligence

Bipolar fuzzy relation equations arise as a generalization of fuzzy relation equations considering unknown variables together with their logical connective negations. The occurrence of a variable and the occurrence of its negation simultaneously can give very useful information for certain frameworks where the human reasoning plays a key role. Hence, the resolution of bipolar fuzzy relation equations systems is a research topic of great interest. This paper focuses on the study of bipolar fuzzy relation equations systems based on the max-product t-norm composition. Specifically, the solvability and the algebraic structure of the set of solutions of these bipolar equations systems will be studied, including the case in which such systems are composed of equations whose independent term be equal to zero. As a consequence, this paper complements the contribution carried out by the authors on the solvability of bipolar max-product fuzzy relation equations.