AITopics

2509.06694

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.84)

Neural Information Processing SystemsOct-9-2025, 22:04:25 GMT

Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm

Reinforcement learning utilizing kernel ridge regression to predict the expected value function represents a powerful method with great representational capacity. This setting is a highly versatile framework amenable to analytical results. We consider kernel-based function approximation for RL in the infinite horizon average reward setting, also referred to as the undiscounted setting.

algorithm, assumption, confidence interval, (16 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)

A. Rupam Mahmood, Hado P. van Hasselt, Richard S. Sutton

Weighted importance sampling for off-policy learning with linear function approximation

Neural Information Processing SystemsOct-9-2025, 14:09:03 GMT

Importance sampling is an essential component of off-policy model-free reinforcement learning algorithms. However, its most effective variant, weighted importance sampling, does not carry over easily to function approximation and, because of this, it is not utilized in existing off-policy learning algorithms. In this paper, we take two steps toward bridging this gap. First, we show that weighted importance sampling can be viewed as a special case of weighting the error of individual training samples, and that this weighting has theoretical and empirical benefits similar to those of weighted importance sampling. Second, we show that these benefits extend to a new weighted-importance-sampling version of off-policy LSTD(). We show empirically that our new WIS-LSTD() algorithm can result in much more rapid and reliable convergence than conventional off-policy LSTD() (Y u 2010, Bertsekas & Y u 2009).

algorithm, function approximation, wis-lstd, (13 more...)

Country:

North America > United States > Massachusetts (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.63)

Neural Information Processing SystemsOct-9-2025, 13:48:49 GMT

Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle

Simon S. Du, Yuping Luo, Ruosong Wang, Hanrui Zhang

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Europe > United Kingdom > England (0.46)
North America > United States (0.28)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.44)

Neural Information Processing SystemsOct-8-2025, 17:18:26 GMT

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation

This paper investigates posterior sampling algorithms for competitive reinforcement learning (RL) in the context of general function approximations.

machine learning, pomg, reinforcement learning, (16 more...)

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.54)

Industry: Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Yesmin, Farjana, Shirmin, Nusrat

A Fuzzy Logic-Based Framework for Explainable Machine Learning in Big Data Analytics

arXiv.org Artificial IntelligenceOct-8-2025

The growing complexity of machine learning (ML) models in big data analytics, especially in domains such as environmental monitoring, highlights the critical need for interpretability and explainability to promote trust, ethical considerations, and regulatory adherence (e.g., GDPR). Traditional "black-box" models obstruct transparency, whereas post-hoc explainable AI (XAI) techniques like LIME and SHAP frequently compromise accuracy or fail to deliver inherent insights. This paper presents a novel framework that combines type-2 fuzzy sets, granular computing, and clustering to boost explainability and fairness in big data environments. When applied to the UCI Air Quality dataset, the framework effectively manages uncertainty in noisy sensor data, produces linguistic rules, and assesses fairness using silhouette scores and entropy. Key contributions encompass: (1) A type-2 fuzzy clustering approach that enhances cohesion by about 4% compared to type-1 methods (silhouette 0.365 vs. 0.349) and improves fairness (entropy 0.918); (2) Incorporation of fairness measures to mitigate biases in unsupervised scenarios; (3) A rule-based component for intrinsic XAI, achieving an average coverage of 0.65; (4) Scalable assessments showing linear runtime (roughly 0.005 seconds for sampled big data sizes). Experimental outcomes reveal superior performance relative to baselines such as DBSCAN and Agglomerative Clustering in terms of interpretability, fairness, and efficiency. Notably, the proposed method achieves a 4% improvement in silhouette score over type-1 fuzzy clustering and outperforms baselines in fairness (entropy reduction by up to 1%) and efficiency.

artificial intelligence, data mining, machine learning, (17 more...)

2510.0512

Country: North America > United States (0.15)

Genre: Research Report (1.00)

Industry:

Energy (0.69)
Law (0.49)
Information Technology (0.49)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial IntelligenceOct-6-2025

A Concept of Possibility for Real-World Events

Schwartz, Daniel G.

This paper offers a new concept of {\it possibility} as an alternative to the now-a-days standard concept originally introduced by L.A. Zadeh in 1978. This new version was inspired by the original but, formally, has nothing in common with it other than that they both adopt the Łukasiewicz multivalent interpretation of the logical connectives. Moreover, rather than seeking to provide a general notion of possibility, this focuses specifically on the possibility of a real-world event. An event is viewed as having prerequisites that enable its occurrence and constraints that may impede its occurrence, and the possibility of the event is computed as a function of the probabilities that the prerequisites hold and the constraints do not. This version of possibility might appropriately be applied to problems of planning. When there are multiple plans available for achieving a goal, this theory can be used to determine which plan is most possible, i.e., easiest or most feasible to complete. It is speculated that this model of reasoning correctly captures normal human reasoning about plans. The theory is elaborated and an illustrative example for vehicle route planning is provided. There is also a suggestion of potential future applications.

artificial intelligence, fuzzy logic, logic & formal reasoning, (18 more...)

2510.02655

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Industry: Transportation (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.46)

Neural Information Processing SystemsOct-3-2025, 08:27:49 GMT

Finite-Sample Analysis for SARSA with Linear Function Approximation

Shaofeng Zou, Tengyu Xu, Yingbin Liang

SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d.

algorithm, behavior policy, sarsa algorithm, (13 more...)

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Mandal, Saptarshi, Murthy, Yashaswini, Srikant, R.

Finite-Time Bounds for Distributionally Robust TD Learning with Linear Function Approximation

arXiv.org Artificial IntelligenceOct-3-2025

Distributionally robust reinforcement learning (DRRL) focuses on designing policies that achieve good performance under model uncertainties. In particular, we are interested in maximizing the worst-case long-term discounted reward, where the data for RL comes from a nominal model while the deployed environment can deviate from the nominal model within a prescribed uncertainty set. Existing convergence guarantees for robust temporal-difference (TD) learning for policy evaluation are limited to tabular MDPs or are dependent on restrictive discount-factor assumptions when function approximation is used. We present the first robust TD learning with linear function approximation, where robustness is measured with respect to the total-variation distance and Wasserstein-l distance uncertainty set. Additionally, our algorithm is both model-free and does not require generative access to the MDP. Our algorithm combines a two-time-scale stochastic-approximation update with an outer-loop target-network update. We establish an $\tilde{O}(1/ε^2)$ sample complexity to obtain an $ε$-accurate value estimate. Our results close a key gap between the empirical success of robust RL algorithms and the non-asymptotic guarantees enjoyed by their non-robust counterparts. The key ideas in the paper also extend in a relatively straightforward fashion to robust Q-learning with function approximation.

artificial intelligence, machine learning, reinforcement learning, (16 more...)