Fuzzy Logic
A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
Zhao, Heyang, He, Jiafan, Gu, Quanquan
The exploration-exploitation dilemma has been a central challenge in reinforcement learning (RL) with complex model classes. In this paper, we propose a new algorithm, Monotonic Q-Learning with Upper Confidence Bound (MQL-UCB) for RL with general function approximation. Our key algorithmic design includes (1) a general deterministic policy-switching strategy that achieves low switching cost, (2) a monotonic value function structure with carefully controlled function class complexity, and (3) a variance-weighted regression scheme that exploits historical trajectories with high data efficiency. MQL-UCB achieves minimax optimal regret of $\tilde{O}(d\sqrt{HK})$ when $K$ is sufficiently large and near-optimal policy switching cost of $\tilde{O}(dH)$, with $d$ being the eluder dimension of the function class, $H$ being the planning horizon, and $K$ being the number of episodes. Our work sheds light on designing provably sample-efficient and deployment-efficient Q-learning with nonlinear function approximation.
A New Approach to Intuitionistic Fuzzy Decision Making Based on Projection Technology and Cosine Similarity Measure
For a multi-attribute decision making (MADM) problem, the information of alternatives under different attributes is given in the form of intuitionistic fuzzy numbers (IFNs). Intuitionistic fuzzy sets (IFSs) play an important role in dealing with uncertain and incomplete information, and similarity measures for IFSs have long been a research hotspot. This paper first proposes a new similarity measure of IFSs based on the projection technology and cosine similarity measure, which considers the direction and length of IFSs at the same time. The objective of the paper is to develop a MADM method and a medical diagnosis method under IFSs using the projection technology and cosine similarity measure. Some examples are used to compare the proposed algorithm with some existing methods. The comparison results show that the proposed algorithm is effective and can identify the optimal scheme accurately. In the medical diagnosis area, it can be used to quickly diagnose disease. The proposed method enriches the existing similarity measure methods, and it can be applied not only to IFSs but also to interval-valued intuitionistic fuzzy sets (IVIFSs).
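As an illustrative sketch only (not the measure proposed in the paper), the two ingredients named in the abstract can be computed for IFSs represented as lists of (membership, non-membership) pairs. The function names and this representation are assumptions made here. The cosine follows the commonly used per-element form, and the projection treats each IFS as one flattened vector.

```python
import math


def cosine_similarity(a, b):
    """Mean per-element cosine between two IFSs given as lists of (mu, nu) pairs."""
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        dot = mu_a * mu_b + nu_a * nu_b
        norm = math.hypot(mu_a, nu_a) * math.hypot(mu_b, nu_b)
        # Convention assumed here: an all-zero element contributes 0.
        total += dot / norm if norm > 0 else 0.0
    return total / len(a)


def projection(a, b):
    """Length of the projection of a onto b, both viewed as flattened vectors."""
    dot = sum(mu_a * mu_b + nu_a * nu_b for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b))
    norm_b = math.sqrt(sum(mu_b ** 2 + nu_b ** 2 for mu_b, nu_b in b))
    return dot / norm_b


# Toy data (hypothetical): B is A scaled by 2, so both point in the same direction.
A = [(0.2, 0.1)]
B = [(0.4, 0.2)]
print(cosine_similarity(A, B))            # direction only
print(projection(A, B), projection(B, B))  # also sensitive to length
```

In this toy case the cosine cannot distinguish A from B, while the projection of A onto B is shorter than that of B onto itself. A measure combining direction and length, as the abstract describes, aims to close this kind of gap.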
Corruption-Robust Offline Reinforcement Learning with General Function Approximation
Ye, Chenlu, Yang, Rui, Gu, Quanquan, Zhang, Tong
We investigate the problem of corruption robustness in offline reinforcement learning (RL) with general function approximation, where an adversary can corrupt each sample in the offline dataset, and the corruption level $\zeta\geq0$ quantifies the cumulative corruption amount over $n$ episodes and $H$ steps. Our goal is to find a policy that is robust to such corruption and minimizes the suboptimality gap with respect to the optimal policy for the uncorrupted Markov decision processes (MDPs). Drawing inspiration from the uncertainty-weighting technique from the robust online RL setting \citep{he2022nearly,ye2022corruptionrobust}, we design a new uncertainty weight iteration procedure to efficiently compute on batched samples and propose a corruption-robust algorithm for offline RL. Notably, under the assumption of single policy coverage and the knowledge of $\zeta$, our proposed algorithm achieves a suboptimality bound that is worsened by an additive factor of $\mathcal O(\zeta \cdot (\text{CC}(\lambda,\hat{\mathcal F},\mathcal Z_n^H))^{1/2} (C(\hat{\mathcal F},\mu))^{-1/2} n^{-1})$ due to the corruption. Here $\text{CC}(\lambda,\hat{\mathcal F},\mathcal Z_n^H)$ is the coverage coefficient that depends on the regularization parameter $\lambda$, the confidence set $\hat{\mathcal F}$, and the dataset $\mathcal Z_n^H$, and $C(\hat{\mathcal F},\mu)$ is a coefficient that depends on $\hat{\mathcal F}$ and the underlying data distribution $\mu$. When specialized to linear MDPs, the corruption-dependent error term reduces to $\mathcal O(\zeta d n^{-1})$ with $d$ being the dimension of the feature map, which matches the existing lower bound for corrupted linear MDPs. This suggests that our analysis is tight in terms of the corruption-dependent term.
GBG++: A Fast and Stable Granular Ball Generation Method for Classification
Xie, Qin, Zhang, Qinghua, Xia, Shuyin, Zhao, Fan, Wu, Chengying, Wang, Guoyin, Ding, Weiping
Granular ball computing (GBC), as an efficient, robust, and scalable learning method, has become a popular research topic of granular computing. GBC includes two stages: granular ball generation (GBG) and multi-granularity learning based on the granular ball (GB). However, the stability and efficiency of existing GBG methods need to be further improved due to their strong dependence on $k$-means or $k$-division. In addition, GB-based classifiers only unilaterally consider the GB's geometric characteristics to construct classification rules, but the GB's quality is ignored. Therefore, in this paper, based on the attention mechanism, a fast and stable GBG (GBG++) method is proposed first. Specifically, the proposed GBG++ method only needs to calculate the distances from the data-driven center to the undivided samples when splitting each GB instead of randomly selecting the center and calculating the distances between it and all samples. Moreover, an outlier detection method is introduced to identify local outliers. Consequently, the GBG++ method can significantly improve effectiveness, robustness, and efficiency while being absolutely stable. Second, considering the influence of the sample size within the GB on the GB's quality, based on the GBG++ method, an improved GB-based $k$-nearest neighbors algorithm (GB$k$NN++) is presented, which can reduce misclassification at the class boundary. Finally, the experimental results indicate that the proposed method outperforms several existing GB-based classifiers and classical machine learning classifiers on $24$ public benchmark datasets.
State-of-the-Art Review and Synthesis: A Requirement-based Roadmap for Standardized Predictive Maintenance Automation Using Digital Twin Technologies
Ma, Sizhe, Flanigan, Katherine A., Bergés, Mario
Recent digital advances have popularized predictive maintenance (PMx), offering enhanced efficiency, automation, accuracy, cost savings, and independence in maintenance. Yet, it continues to face numerous limitations such as poor explainability, sample inefficiency of data-driven methods, complexity of physics-based methods, and limited generalizability and scalability of knowledge-based methods. This paper proposes leveraging Digital Twins (DTs) to address these challenges and enable automated PMx adoption at larger scales. While we argue that DTs have this transformative potential, they have not yet reached the level of maturity needed to bridge these gaps in a standardized way. Without a standard definition for such evolution, this transformation lacks a solid foundation upon which to base its development. This paper provides a requirement-based roadmap supporting standardized PMx automation using DT technologies. A systematic approach comprising two primary stages is presented. First, we methodically identify the Informational Requirements (IRs) and Functional Requirements (FRs) for PMx, which serve as a foundation from which any unified framework must emerge. Our approach to defining and using IRs and FRs to form the backbone of any PMx DT is supported by the track record of IRs and FRs being successfully used as blueprints in other areas, such as for product development within the software industry. Second, we conduct a thorough literature review spanning fields to determine the ways in which these IRs and FRs are currently being used within DTs, enabling us to point to the specific areas where further research is warranted to support the progress and maturation of requirement-based PMx DTs.
MP and MT properties of fuzzy inference with aggregation function
As the two basic fuzzy inference models, fuzzy modus ponens (FMP) and fuzzy modus tollens (FMT) have important applications in artificial intelligence. To solve FMP and FMT problems, Zadeh proposed the compositional rule of inference (CRI) method. This paper mainly investigates the validity of the A-compositional rule of inference (ACRI) method, a generalized CRI method based on aggregation functions, from a logical view and an interpolative view, respectively. Specifically, the modus ponens (MP) and modus tollens (MT) properties of the ACRI method are discussed in detail. It is shown that the aggregation functions used to implement FMP and FMT problems provide more generality than t-norms, uninorms and overlap functions, whose corresponding laws are well known as T-conditionality, U-conditionality and O-conditionality, respectively. Moreover, two examples are given to illustrate the theoretical results. In particular, Example 6.2 shows that, with the proposed inference method, the output B' in the FMP (FMT) problem is close to B (DC) when the fuzzy input is near the antecedent of the fuzzy rule (when the fuzzy input is near the negation of the succedent of the fuzzy rule).
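A minimal sketch of the classical CRI on a finite universe may help make the MP property concrete. It computes B'(y) = max over x of T(A'(x), I(A(x), B(y))), with the minimum t-norm and the Gödel implication chosen here as assumptions. The ACRI method in the paper replaces T with a general aggregation function, which this sketch does not attempt to reproduce. All names are hypothetical.

```python
def godel_implication(a, b):
    """Goedel implication: 1 if a <= b, otherwise b."""
    return 1.0 if a <= b else b


def cri(a_prime, a, b, combine=min, implication=godel_implication):
    """Compositional rule of inference on finite universes.

    a_prime, a: membership degrees of the input A' and the antecedent A over X.
    b: membership degrees of the consequent B over Y.
    Returns the membership degrees of the inferred B' over Y.
    """
    return [
        max(combine(ap, implication(ax, by)) for ap, ax in zip(a_prime, a))
        for by in b
    ]


# Toy fuzzy sets (hypothetical); A is normal, i.e. it reaches degree 1.
A = [0.0, 0.5, 1.0, 0.5]
B = [0.2, 1.0, 0.6]

# MP property for this particular choice: feeding A itself returns B.
print(cri(A, A, B))
```

The `combine` parameter is where a different binary operator could be plugged in. Whether the MP property still holds then depends on the operator and the implication, which is the question the abstract addresses.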
Parameterized Convex Minorant for Objective Function Approximation in Amortized Optimization
A parameterized convex minorant (PCM) method is proposed for the approximation of the objective function in amortized optimization. In the proposed method, the objective function approximator is expressed as the sum of a PCM and a nonnegative gap function, so that the approximator is bounded from below by the PCM, which is convex in the optimization variable. The proposed objective function approximator is a universal approximator for continuous functions, and the global minimizer of the PCM attains the global minimum of the objective function approximator. Therefore, the global minimizer of the objective function approximator can be obtained by a single convex optimization. As a realization of the proposed method, an extended parameterized log-sum-exp network is proposed, which uses a parameterized log-sum-exp network as the PCM. Numerical simulations are performed for parameterized non-convex objective function approximation and for learning-based nonlinear model predictive control to demonstrate the performance and characteristics of the proposed method. The simulation results support that the proposed method can be used to learn objective functions and to find a global minimizer reliably and quickly by using convex optimization algorithms.
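The building block behind a log-sum-exp network is that a log-sum-exp of functions affine in the optimization variable is convex in that variable, so its minimizer can be found by a convex routine. The sketch below illustrates only this fact in one dimension with fixed, hypothetical coefficients. It is not the paper's network, gap function, or training procedure, and the ternary search is a stand-in for any convex solver.

```python
import math


def lse_affine(u, slopes, intercepts, temperature=1.0):
    """temperature * log(sum_i exp((a_i * u + b_i) / temperature)), convex in u."""
    z = [(a * u + b) / temperature for a, b in zip(slopes, intercepts)]
    m = max(z)  # subtract the max for numerical stability
    return temperature * (m + math.log(sum(math.exp(v - m) for v in z)))


def ternary_min(f, lo, hi, iters=200):
    """Minimize a one-dimensional convex function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# Hypothetical coefficients: f(u) = log(exp(-u) + exp(u)), minimized at u = 0.
f = lambda u: lse_affine(u, slopes=[-1.0, 1.0], intercepts=[0.0, 0.0])
u_star = ternary_min(f, -5.0, 5.0)
print(u_star, f(u_star))
```

In a parameterized setting, the slopes and intercepts would depend on the parameter, and the same convex minimization would be repeated for each parameter value.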
The Energy Prediction Smart-Meter Dataset: Analysis of Previous Competitions and Beyond
Pekaslan, Direnc, Alonso-Moral, Jose Maria, Bandara, Kasun, Bergmeir, Christoph, Bernabe-Moreno, Juan, Eigenmann, Robert, Einecke, Nils, Ergen, Selvi, Godahewa, Rakshitha, Hewamalage, Hansika, Lago, Jesus, Limmer, Steffen, Rebhan, Sven, Rabinovich, Boris, Rajapasksha, Dilini, Song, Heda, Wagner, Christian, Wu, Wenlong, Magdalena, Luis, Triguero, Isaac
This paper presents a real-world smart-meter dataset and offers an analysis of solutions derived from the Energy Prediction Technical Challenges, focusing primarily on two key competitions: the IEEE Computational Intelligence Society (IEEE-CIS) Technical Challenge on Energy Prediction from Smart Meter data in 2020 (named EP) and its follow-up challenge at the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) in 2021 (named XEP). These competitions focus on accurate energy consumption forecasting and the importance of interpretability in understanding the underlying factors. The challenge aims to predict monthly and yearly estimated consumption for households, addressing the accurate billing problem with limited historical smart meter data. The dataset comprises 3,248 smart meters, with data availability ranging from a minimum of one month to a year. This paper examines the challenges and solutions, analyses issues related to the provided real-world smart meter data, discusses the development of accurate predictions at the household level, and introduces evaluation criteria for assessing interpretability. Additionally, this paper discusses aspects beyond the competitions: opportunities for energy disaggregation and pattern detection applications at the household level, the significance of communicating energy-driven factors for optimised billing, and the importance of responsible AI and data privacy considerations. These aspects provide insights into the broader implications and potential advancements in energy consumption prediction. Overall, these competitions provide a dataset for residential energy research and serve as a catalyst for exploring accurate forecasting, enhancing interpretability, and advancing the discussion of aspects such as energy disaggregation, demand response programs and behavioural interventions.
Innovation and Word Usage Patterns in Machine Learning
Borges, Vítor Bandeira, Cajueiro, Daniel Oliveira
In this study, we delve into the dynamic landscape of machine learning research evolution. Initially, through the utilization of Latent Dirichlet Allocation, we discern pivotal themes and fundamental concepts that have emerged within the realm of machine learning. Subsequently, we undertake a comprehensive analysis to track the evolutionary trajectories of these identified themes. To quantify the novelty and divergence of research contributions, we employ the Kullback-Leibler Divergence metric. This statistical measure serves as a proxy for "surprise", indicating the extent of differentiation between the content of academic papers and the subsequent developments in research. By amalgamating these insights, we gain the ability to ascertain the pivotal roles played by prominent researchers and the significance of specific academic venues (periodicals and conferences) within the machine learning domain.
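For reference, the Kullback-Leibler divergence between two discrete distributions p and q is the sum over i of p_i log(p_i / q_i). The sketch below computes it in bits for two hypothetical topic mixtures. How the study builds these distributions from papers and later research is not specified in the abstract and is not modelled here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total


# Hypothetical topic mixtures for one paper and for a later body of work.
paper = [0.5, 0.5]
later = [0.25, 0.75]
print(kl_divergence(paper, later), kl_divergence(later, paper))
```

The two printed values differ because the divergence is not symmetric, so the direction of comparison matters when it is read as a measure of surprise.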
Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning
Wu, Bibo, Fang, Fang, Wang, Xianbin, Cai, Donghong, Fu, Shu, Ding, Zhiguo
Hierarchical federated learning (HFL) shows great advantages over conventional two-layer federated learning (FL) in reducing network overhead and interaction latency while still retaining the data privacy of distributed FL clients. However, the communication and energy overhead still pose a bottleneck for HFL performance, especially as the number of clients rises dramatically. To tackle this issue, we propose a non-orthogonal multiple access (NOMA) enabled HFL system under semi-synchronous cloud model aggregation, aiming to minimize the total cost of time and energy at each HFL global round. Specifically, we first propose a novel fuzzy logic based client orchestration policy that considers client heterogeneity in multiple aspects, including channel quality, data quantity and model staleness. Subsequently, given the fuzzy based client-edge association, a joint edge server scheduling and resource allocation problem is formulated. Utilizing problem decomposition, we first derive the closed-form solution for the edge server scheduling subproblem via the penalty dual decomposition (PDD) method. Next, a deep deterministic policy gradient (DDPG) based algorithm is proposed to tackle the resource allocation subproblem in time-varying environments. Finally, extensive simulations demonstrate that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.