AITopics

Das, Abhinav, Schlüter, Stephan, Schneider, Lorenz

Electricity Price Prediction Using Multi-Kernel Gaussian Process Regression Combined with Kernel-Based Support Vector Regression

arXiv.org Artificial IntelligenceJan-14-2025

This paper presents a new hybrid model for predicting German electricity prices. The algorithm is based on combining Gaussian Process Regression (GPR) and Support Vector Regression (SVR). While GPR is a competent model for learning the stochastic pattern within the data and interpolation, its performance for out-of-sample data is not very promising. By choosing a suitable data-dependent covariance function, we can enhance the performance of GPR for the tested German hourly power prices. However, since the out-of-sample prediction depends on the training data, the prediction is vulnerable to noise and outliers. To overcome this issue, a separate prediction is made using SVR, which applies margin-based optimization, having an advantage in dealing with non-linear processes and outliers, since only certain necessary points (support vectors) in the training data are responsible for regression. Both individual predictions are later combined using the performance-based weight assignment method. A test on historic German power prices shows that this approach outperforms its chosen benchmarks such as the autoregressive exogenous model, the naive approach, as well as the long short-term memory approach of prediction.

covariance function, kernel, prediction, (14 more...)

2412.00123

Country:

Europe > Germany (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report > New Finding (0.67)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Jumaah, Mahmood A., Ali, Yossra H., Rashid, Tarik A.

Artificial Liver Classifier: A New Alternative to Conventional Machine Learning Models

arXiv.org Artificial IntelligenceJan-14-2025

Supervised machine learning classifiers often encounter challenges related to performance, accuracy, and overfitting. This paper introduces the Artificial Liver Classifier (ALC), a novel supervised learning classifier inspired by the human liver's detoxification function. The ALC is characterized by its simplicity, speed, hyperparameters-free, ability to reduce overfitting, and effectiveness in addressing multi-classification problems through straightforward mathematical operations. To optimize the ALC's parameters, an improved FOX optimization algorithm (IFOX) is employed as the training method. The proposed ALC was evaluated on five benchmark machine learning datasets: Iris Flower, Breast Cancer Wisconsin, Wine, Voice Gender, and MNIST. The results demonstrated competitive performance, with the ALC achieving 100% accuracy on the Iris dataset, surpassing logistic regression, multilayer perceptron, and support vector machine. Similarly, on the Breast Cancer dataset, it achieved 99.12% accuracy, outperforming XGBoost and logistic regression. Across all datasets, the ALC consistently exhibited lower overfitting gaps and loss compared to conventional classifiers. These findings highlight the potential of leveraging biological process simulations to develop efficient machine learning models and open new avenues for innovation in the field.

artificial intelligence, dataset, machine learning, (15 more...)

2501.08074

Country: Asia > Middle East > Iraq (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.87)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.76)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

arXiv.org Artificial IntelligenceJan-14-2025

Gandalf the Red: Adaptive Security for LLMs

Pfister, Niklas, Volhejn, Václav, Knott, Manuel, Arias, Santiago, Bazińska, Julia, Bichurin, Mykhailo, Commike, Alan, Darling, Janet, Dienes, Peter, Fiedler, Matthew, Haber, David, Kraft, Matthias, Lancini, Marco, Mathys, Max, Pascual-Ortiz, Damián, Podolak, Jakub, Romero-López, Adrià, Shiarlis, Kyriacos, Signer, Andreas, Terek, Zsolt, Theocharis, Athanasios, Timbrell, Daniel, Trautwein, Samuel, Watts, Samuel, Wu, Natalie, Rojas-Carulla, Mateo

Current evaluations of defenses against prompt attacks in large language model (LLM) applications often overlook two critical factors: the dynamic nature of adversarial behavior and the usability penalties imposed on legitimate users by restrictive defenses. We propose D-SEC (Dynamic Security Utility Threat Model), which explicitly separates attackers from legitimate users, models multi-step interactions, and rigorously expresses the security-utility in an optimizable form. We further address the shortcomings in existing evaluations by introducing Gandalf, a crowd-sourced, gamified red-teaming platform designed to generate realistic, adaptive attack datasets. Using Gandalf, we collect and release a dataset of 279k prompt attacks. Complemented by benign user data, our analysis reveals the interplay between security and utility, showing that defenses integrated in the LLM (e.g., system prompts) can degrade usability even without blocking requests. We demonstrate that restricted application domains, defense-in-depth, and adaptive defenses are effective strategies for building secure and useful LLM applications. Code is available at \href{https://github.com/lakeraai/dsec-gandalf}{\texttt{https://github.com/lakeraai/dsec-gandalf}}.

large language model, machine learning, natural language, (21 more...)

2501.07927

Country:

Asia (0.46)
Europe (0.28)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.93)
Media > Film (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Neural Information Processing SystemsJan-13-2025, 20:18:33 GMT

LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-resolution and Beyond

Last few years have witnessed impressive progress propelled by deep learning methods. However, one critical challenge faced by existing methods is to strike a sweet spot of deep model complexity and resulting SISR quality. This paper addresses this pain point by proposing a linearly-assembled pixel-adaptive regression network (LAPAR), which casts the direct LR to HR mapping learning into a linear coefficient regression task over a dictionary of multiple predefined filter bases. Moreover, based on the same idea, LAPAR is extended to tackle other restoration tasks, e.g., image denoising and JPEG image deblocking, and again, yields strong performance.

lapar, linearly-assembled pixel-adaptive regression network, single image super-resolution

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)

Ferber, Frédérick Fabre, Gay, Dominique, Soulié, Jean-Christophe, Diatta, Jean, Maillard, Odalric-Ambrym

Kriging and Gaussian Process Interpolation for Georeferenced Data Augmentation

Data augmentation is a crucial step in the development of robust supervised learning models, especially when dealing with limited datasets. This study explores interpolation techniques for the augmentation of geo-referenced data, with the aim of predicting the presence of Commelina benghalensis L. in sugarcane plots in La R{\'e}union. Given the spatial nature of the data and the high cost of data collection, we evaluated two interpolation approaches: Gaussian processes (GPs) with different kernels and kriging with various variograms. The objectives of this work are threefold: (i) to identify which interpolation methods offer the best predictive performance for various regression algorithms, (ii) to analyze the evolution of performance as a function of the number of observations added, and (iii) to assess the spatial consistency of augmented datasets. The results show that GP-based methods, in particular with combined kernels (GP-COMB), significantly improve the performance of regression algorithms while requiring less additional data. Although kriging shows slightly lower performance, it is distinguished by a more homogeneous spatial coverage, a potential advantage in certain contexts.

artificial intelligence, interpolation method, machine learning, (14 more...)

2501.07183

Country: Africa > La Réunion (0.15)

Genre: Research Report > New Finding (0.35)

Industry: Energy > Oil & Gas > Upstream (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Biyyala, Varun, Kathuria, Bharat Chanderprakash, Li, Jialu, Zhang, Youshan

SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing

Video editing models have advanced significantly, but evaluating their performance remains challenging. Traditional metrics, such as CLIP text and image scores, often fall short: text scores are limited by inadequate training data and hierarchical dependencies, while image scores fail to assess temporal consistency. We present SST-EM (Semantic, Spatial, and Temporal Evaluation Metric), a novel evaluation framework that leverages modern Vision-Language Models (VLMs), Object Detection, and Temporal Consistency checks. SST-EM comprises four components: (1) semantic extraction from frames using a VLM, (2) primary object tracking with Object Detection, (3) focused object refinement via an LLM agent, and (4) temporal consistency assessment using a Vision Transformer (ViT). These components are integrated into a unified metric with weights derived from human evaluations and regression analysis. The name SST-EM reflects its focus on Semantic, Spatial, and Temporal aspects of video evaluation. SST-EM provides a comprehensive evaluation of semantic fidelity and temporal smoothness in video editing. The source code is available in the \textbf{\href{https://github.com/custommetrics-sst/SST_CustomEvaluationMetrics.git}{GitHub Repository}}.

consistency, evaluation, temporal consistency, (14 more...)

2501.07554

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Guo, Wenxuan, Lee, JungHo, Toulis, Panos

ML-assisted Randomization Tests for Detecting Treatment Effects in A/B Experiments

arXiv.org Machine LearningJan-13-2025

Experimentation is widely utilized for causal inference and data-driven decision-making across disciplines. In an A/B experiment, for example, an online business randomizes two different treatments (e.g., website designs) to their customers and then aims to infer which treatment is better. In this paper, we construct randomization tests for complex treatment effects, including heterogeneity and interference. A key feature of our approach is the use of flexible machine learning (ML) models, where the test statistic is defined as the difference between the cross-validation errors from two ML models, one including the treatment variable and the other without it. This approach combines the predictive power of modern ML tools with the finite-sample validity of randomization procedures, enabling a robust and efficient way to detect complex treatment effects in experimental settings. We demonstrate this combined benefit both theoretically and empirically through applied examples.

procedure 1, randomization test, treatment effect, (14 more...)

arXiv.org Machine Learning

2501.07722

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Guedes-Ayala, Monse, Schewe, Lars, Suvak, Zeynep, Anjos, Miguel

Generating Poisoning Attacks against Ridge Regression Models with Categorical Features

Machine Learning (ML) models have become a very powerful tool to extract information from large datasets and use it to make accurate predictions and automated decisions. However, ML models can be vulnerable to external attacks, causing them to underperform or deviate from their expected tasks. One way to attack ML models is by injecting malicious data to mislead the algorithm during the training phase, which is referred to as a poisoning attack. We can prepare for such situations by designing anticipated attacks, which are later used for creating and testing defence strategies. In this paper, we propose an algorithm to generate strong poisoning attacks for a ridge regression model containing both numerical and categorical features that explicitly models and poisons categorical features. We model categorical features as SOS-1 sets and formulate the problem of designing poisoning attacks as a bilevel optimization problem that is nonconvex mixed-integer in the upper-level and unconstrained convex quadratic in the lower-level. We present the mathematical formulation of the problem, introduce a single-level reformulation based on the Karush-Kuhn-Tucker (KKT) conditions of the lower level, find bounds for the lower-level variables to accelerate solver performance, and propose a new algorithm to poison categorical features. Numerical experiments show that our method improves the mean squared error of all datasets compared to the previous benchmark in the literature.

artificial intelligence, categorical feature, machine learning, (17 more...)

2501.07275

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

AdaPRL: Adaptive Pairwise Regression Learning with Uncertainty Estimation for Universal Regression Tasks

Liang, Fuhang, Xu, Rucong, Lin, Deng

Current deep regression models usually learn in point-wise way that treat each sample as an independent input, neglecting the relative ordering among different data. Consequently, the regression model could neglect the data 's interrelationships, potentially resulting in suboptimal performance. Moreover, the existence of aleatoric uncertainty in the training data may drive the model to capture non-generalizable patterns, contributing to increased overfitting. To address these issues, we propose a novel adaptive pairwise learning framework (AdaPRL) for regression tasks which leverages the relative differences between data points and integrates with deep probabilistic models to quantify the uncertainty associated with the predictions. Additionally, we adapt AdaPRL for applications in multi-task learning and multivariate time series forecasting. Extensive experiments with several real-world regression datasets including recommendation systems, age estimation, time series forecasting, natural language understanding, finance, and industry datasets show that AdaPRL is compatible with different backbone networks in various tasks and achieves state-of-the-art performance on the vast majority of tasks, highlighting its notable potential including enhancing prediction accuracy and ranking ability, increasing generalization capability, improving robustness to noisy data, improving resilience to reduced data, and enhancing interpretability, etc.

artificial intelligence, machine learning, natural language, (11 more...)

2501.05809

Country:

North America > United States (0.46)
Asia (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Energy (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)