RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models

Yang, Yang, Xu, Hua, Hu, Zhangyi, Yue, Yutao

arXiv.org Artificial Intelligence

Nowadays, Large Language Models (LLMs) are able to propose rules in natural language, overcoming the constraints of a predefined predicate space inherent in traditional rule learning. However, existing methods using LLMs often overlook the combination effects of rules, and the potential of coupling LLMs with probabilistic rule learning to ensure robust inference is not fully explored. To address this gap, we introduce RLIE, a unified framework that integrates LLMs with probabilistic modeling to learn a set of probabilistic rules. The RLIE framework comprises four stages: (1) Rule generation, where an LLM proposes and filters candidate rules; (2) Logistic regression, which learns probabilistic weights for the rules for global selection and calibration; (3) Iterative refinement, which continuously optimizes the rule set based on prediction errors; and (4) Evaluation, which compares the performance of the weighted rule set as a direct classifier against various methods of injecting the rules into an LLM. The generated rules are then evaluated with different inference strategies on multiple real-world datasets. While applying the rules directly with their learned weights yields superior performance, prompting LLMs with the rules, weights, and classification results of the logistic model surprisingly degrades performance. This result aligns with the observation that LLMs excel at semantic generation and interpretation but are less reliable at fine-grained, controlled probabilistic integration. Our work investigates the potential and limitations of using LLMs for inductive reasoning, proposing a unified framework that integrates LLMs with classic probabilistic rule combination methods and paving the way for more reliable neuro-symbolic reasoning systems.
In data-driven applications and scientific discovery, the goal is not merely to predict outcomes, but to construct a set of verifiable, reusable, and composable theories (Zhou et al., 2024; Yang et al., 2024a; Minh et al., 2022). These theories can enable explainable, auditable decisions while driving the discovery of new knowledge and underlying structures (Yang et al., 2023; 2024b). They can be expressed as formal, structured statements (Cohen et al., 1995; Cropper & Morel, 2021) or natural language hypotheses (Zhou et al., 2024), and they share a common characteristic: they are declarative, testable, and self-contained discriminative patterns that yield predictions verifiable by external evidence. In this paper, we do not distinguish between the terms "rule" and "hypothesis", and will use "rule" throughout the text for consistency.
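The logistic-regression weighting stage described in the abstract can be sketched with ordinary logistic regression over binary rule-firing features. This is a minimal illustration, not the paper's code: the rule matrix, labels, and the assumption that rule firings are available as 0/1 features are all invented here (RLIE's actual rules are natural-language statements proposed by an LLM).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: a binary matrix recording which of four
# candidate rules fire on each of 200 examples.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4)).astype(float)
# Suppose the label is driven mainly by rules 0 and 2.
y = ((X[:, 0] + X[:, 2]) >= 1).astype(int)

# Stage (2): logistic regression assigns each rule a weight,
# jointly selecting and calibrating the rule set.
clf = LogisticRegression().fit(X, y)
weights = clf.coef_[0]
print(weights)  # informative rules should receive larger weights
```

The learned weights then let the rule set act as a direct classifier, which is the setting the abstract reports as outperforming rule-injection prompting.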



Compression with Bayesian Implicit Neural Representations

Neural Information Processing Systems

Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image. Based on this view, data can be compressed by overfitting a compact neural network to its functional representation and then encoding the network weights. However, most current solutions for this are inefficient, as quantization to low-bit precision substantially degrades the reconstruction quality. To address this issue, we propose overfitting variational Bayesian neural networks to the data and compressing an approximate posterior weight sample using relative entropy coding instead of quantizing and entropy coding it. This strategy enables direct optimization of the rate-distortion performance by minimizing the β-ELBO, and targeting different rate-distortion trade-offs for a given network architecture by adjusting β. Moreover, we introduce an iterative algorithm for learning prior weight distributions and employ a progressive refinement process for the variational posterior that significantly enhances performance. Experiments show that our method achieves strong performance on image and audio compression while retaining simplicity.
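The β-ELBO objective mentioned above can be illustrated as a distortion term plus a β-weighted KL penalty between the variational weight posterior and the prior: β trades reconstruction quality against the (relative-entropy-coded) rate. A minimal numerical sketch, assuming a diagonal Gaussian posterior and a standard-normal prior; the function and variable names are ours, not the paper's:

```python
import numpy as np

def neg_beta_elbo(recon, target, q_mean, q_logvar, beta):
    """Negative beta-ELBO sketch for a variational network:
    distortion + beta * KL(q || N(0, I))."""
    # Distortion term: mean squared reconstruction error.
    distortion = np.mean((recon - target) ** 2)
    # Closed-form KL between diagonal Gaussian q and a standard normal.
    kl = 0.5 * np.sum(np.exp(q_logvar) + q_mean ** 2 - 1.0 - q_logvar)
    return distortion + beta * kl

# When the posterior matches the prior and the reconstruction is
# exact, both terms vanish.
loss = neg_beta_elbo(np.zeros(3), np.zeros(3), np.zeros(5), np.zeros(5), beta=0.1)
print(loss)  # 0.0
```

Raising β makes the KL (rate) term dominate, which is how the method targets different rate-distortion trade-offs with a fixed architecture.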


Structured Basis Function Networks: Loss-Centric Multi-Hypothesis Ensembles with Controllable Diversity

Dominguez, Alejandro Rodriguez, Shahzad, Muhammad, Hong, Xia

arXiv.org Artificial Intelligence

Existing approaches to predictive uncertainty rely either on multi-hypothesis prediction, which promotes diversity but lacks principled aggregation, or on ensemble learning, which improves accuracy but rarely captures structured ambiguity. A unified framework consistent with the loss geometry has therefore been absent. The Structured Basis Function Network addresses this gap by linking multi-hypothesis prediction and ensembling through centroidal aggregation induced by Bregman divergences. The formulation applies across regression and classification by aligning predictions with the geometry of the loss, and supports both a closed-form least-squares estimator and a gradient-based procedure for general objectives. A tunable diversity mechanism provides parametric control of the bias-variance-diversity trade-off, connecting multi-hypothesis generalisation with loss-aware ensemble aggregation. Experiments validate this relation and use the mechanism to study the complexity-capacity-diversity trade-off across datasets of increasing difficulty with deep-learning predictors.
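The link between loss geometry and aggregation can be illustrated for squared error, the Bregman divergence generated by x²: there the centroidal aggregate is simply the arithmetic mean of the hypotheses, and the classic ambiguity decomposition makes the bias-variance-diversity trade-off explicit. A toy sketch with invented numbers, not the paper's formulation:

```python
import numpy as np

# Three hypothetical hypothesis predictions for one regression target.
preds = np.array([2.0, 3.0, 4.0])
target = 3.5

# For squared error, the Bregman centroid is the arithmetic mean.
centroid = preds.mean()

# Ambiguity decomposition: the average member error equals the
# centroid's error plus the diversity (spread around the centroid),
# so the aggregate beats the average member by exactly the diversity.
member_err = np.mean((preds - target) ** 2)
centroid_err = (centroid - target) ** 2
diversity = np.mean((preds - centroid) ** 2)
print(member_err, centroid_err, diversity)
```

For other losses the same construction holds with the corresponding Bregman divergence and its centroid, which is the sense in which aggregation stays consistent with the loss geometry.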



Uncertainty estimation in satellite precipitation spatial prediction by combining distributional regression algorithms

Papacharalampous, Georgia, Tyralis, Hristos, Doulamis, Nikolaos, Doulamis, Anastasios

arXiv.org Machine Learning

To facilitate effective decision-making, gridded satellite precipitation products should include uncertainty estimates. Machine learning has been proposed for issuing such estimates. However, most existing algorithms for this purpose rely on quantile regression. Distributional regression offers distinct advantages over quantile regression, including the ability to model intermittency as well as a stronger ability to extrapolate beyond the training data, which is critical for predicting extreme precipitation. In this work, we introduce the concept of distributional regression for the engineering task of creating precipitation datasets through data merging. Building upon this concept, we propose new ensemble learning methods that can be valuable not only for spatial prediction but also for prediction problems in general. These methods exploit conditional zero-adjusted probability distributions estimated with generalized additive models for location, scale, and shape (GAMLSS), spline-based GAMLSS and distributional regression forests as well as their ensembles (stacking based on quantile regression, and equal-weight averaging). To identify the most effective methods for our specific problem, we compared them to benchmarks using a large, multi-source precipitation dataset. Stacking emerged as the most successful strategy. Three specific stacking methods achieved the best performance based on the quantile scoring rule, although the ranking of these methods varied across quantile levels. This suggests that a task-specific combination of multiple algorithms could yield significant benefits.
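The quantile scoring rule used for the comparison above can be made concrete with the pinball loss, the standard scoring function for evaluating a predicted quantile. A minimal sketch (the array values are illustrative):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) score at level tau; lower is better."""
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

# At tau = 0.9, under-predicting the quantile is penalized nine
# times more heavily than over-predicting by the same amount.
under = pinball_loss(np.array([1.0]), np.array([0.0]), 0.9)  # 0.9
over = pinball_loss(np.array([0.0]), np.array([1.0]), 0.9)   # 0.1
print(under, over)
```

Averaging this score over many observations and quantile levels is what allows the methods in the study to be ranked per quantile level.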


Uncertainty estimation in spatial interpolation of satellite precipitation with ensemble learning

Papacharalampous, Georgia, Tyralis, Hristos, Doulamis, Nikolaos, Doulamis, Anastasios

arXiv.org Artificial Intelligence

Predictions in the form of probability distributions are crucial for decision-making. Quantile regression enables this within spatial interpolation settings for merging remote sensing and gauge precipitation data. However, ensemble learning of quantile regression algorithms remains unexplored in this context. Here, we address this gap by introducing nine quantile-based ensemble learners and applying them to large precipitation datasets. We employed a novel feature engineering strategy, reducing predictors to distance-weighted satellite precipitation at relevant locations, combined with location elevation. Our ensemble learners include six stacking and three simple methods (mean, median, best combiner), combining six individual algorithms: quantile regression (QR), quantile regression forests (QRF), generalized random forests (GRF), gradient boosting machines (GBM), light gradient boosting machines (LightGBM), and quantile regression neural networks (QRNN). These algorithms serve as both base learners and combiners within different stacking methods. We evaluated performance against QR using quantile scoring functions in a large dataset comprising 15 years of monthly gauge-measured and satellite precipitation in the contiguous US (CONUS). Stacking with QR and QRNN yielded the best results across quantile levels of interest (0.025, 0.050, 0.075, 0.100, 0.200, 0.300, 0.400, 0.500, 0.600, 0.700, 0.800, 0.900, 0.925, 0.950, 0.975), surpassing the reference method by 3.91% to 8.95%. This demonstrates the potential of stacking to improve probabilistic predictions in spatial interpolation and beyond.