Goto

Collaborating Authors

 barcelona


Adaptive Reinforcement Learning for Dynamic Configuration Allocation in Pre-Production Testing

Zhu, Yu

arXiv.org Machine Learning

Ensuring reliability in modern software systems requires rigorous pre-production testing across highly heterogeneous and evolving environments. Because exhaustive evaluation is infeasible, practitioners must decide how to allocate limited testing resources across configurations where failure probabilities may drift over time. Existing combinatorial optimization approaches are static, ad hoc, and poorly suited to such non-stationary settings. We introduce a novel reinforcement learning (RL) framework that recasts configuration allocation as a sequential decision-making problem. Our method is the first to integrate Q-learning with a hybrid reward design that fuses simulated outcomes and real-time feedback, enabling both sample efficiency and robustness. In addition, we develop an adaptive online-offline training scheme that allows the agent to quickly track abrupt probability shifts while maintaining long-run stability. Extensive simulation studies demonstrate that our approach consistently outperforms static and optimization-based baselines, approaching oracle performance. This work establishes RL as a powerful new paradigm for adaptive configuration allocation, advancing beyond traditional methods and offering broad applicability to dynamic testing and resource scheduling domains.


Sutton's predictions v The Coral, Starsailor, Picture Parlour & AI

BBC News

Forget Liverpool against Everton or Arsenal against Manchester City, the big fixture this weekend is Chris Sutton versus AI chatbot Copilot. Sutton claimed on this week's Monday Night Club that he is more intelligent than AI and the stats do support him, at least when it comes to football scores anyway. He is top of the BBC predictions league table (see bottom of page) so far, with three wins in the first four weeks - and he has beaten AI every time. I am 4-0 up on AI - I don't know if it feels pressure, but it should, Sutton said. Also, why is it not going to the best source? AI, however, does not agree that Sutton has superior intelligence, and seems to suggest that it is a bit early for him to be celebrating. When asked'are you worried that Chris Sutton is cleverer than you?' Copilot replied not in the slightest.


HiGraph: A Large-Scale Hierarchical Graph Dataset for Malware Analysis

Chen, Han, Wang, Hanchen, Chen, Hongmei, Zhang, Ying, Qin, Lu, Zhang, Wenjie

arXiv.org Artificial Intelligence

The advancement of graph-based malware analysis is critically limited by the absence of large-scale datasets that capture the inherent hierarchical structure of software. Existing methods often oversimplify programs into single level graphs, failing to model the crucial semantic relationship between high-level functional interactions and low-level instruction logic. To bridge this gap, we introduce \dataset, the largest public hierarchical graph dataset for malware analysis, comprising over \textbf{200M} Control Flow Graphs (CFGs) nested within \textbf{595K} Function Call Graphs (FCGs). This two-level representation preserves structural semantics essential for building robust detectors resilient to code obfuscation and malware evolution. We demonstrate HiGraph's utility through a large-scale analysis that reveals distinct structural properties of benign and malicious software, establishing it as a foundational benchmark for the community. The dataset and tools are publicly available at https://higraph.org.


LOGICPO: Efficient Translation of NL-based Logical Problems to FOL using LLMs and Preference Optimization

Viswanadha, Koushik, Ghosal, Deepanway, Aditya, Somak

arXiv.org Artificial Intelligence

Logical reasoning is a key task for artificial intelligence due to it's role in major downstream tasks such as Question Answering, Summarization. Recent methods in improving the reasoning ability of LLMs fall short in correctly converting a natural language reasoning problem to an equivalent logical formulation, which hinders the framework's overall ability to reason. Towards this, we propose to use finetuning on a preference optimization dataset to learn to parse and represent a natural language problem as a whole to a consistent logical program by 1) introducing a new supervised and preference optimization dataset LogicPO, and 2) adopting popular techniques such as Direct Preference Optimization (DPO), Kahneman-Tversky optimization (KTO) to finetune open-source LLMs. Our best model with Phi-3.5 consistently outperforms GPT-3.5-turbo's (8-shot) by producing 10% more logically correct and with 14% less syntax errors. Through the framework and our improved evaluation metrics, we offer a promising direction in improving the logical reasoning of LLMs by better representing them in their logical formulations.


Deep Learning in Renewable Energy Forecasting: A Cross-Dataset Evaluation of Temporal and Spatial Models

Sua, Lutfu, Wang, Haibo, Huang, Jun

arXiv.org Artificial Intelligence

Unpredictability of renewable energy sources coupled with the complexity of those methods used for various purposes in this area calls for the development of robust methods such as DL models within the renewable energy domain. Given the nonlinear relationships among variables in renewable energy datasets, DL models are preferred over traditional machine learning (ML) models because they can effectively capture and model complex interactions between variables. This research aims to identify the factors responsible for the accuracy of DL techniques, such as sampling, stationarity, linearity, and hyperparameter optimization for different algorithms. The proposed DL framework compares various methods and alternative training/test ratios. Seven ML methods, such as Long-Short Term Memory (LSTM), Stacked LSTM, Convolutional Neural Network (CNN), CNN-LSTM, Deep Neural Network (DNN), Multilayer Perceptron (MLP), and Encoder-Decoder (ED), were evaluated on two different datasets. The first dataset contains the weather and power generation data. It encompasses two distinct datasets, hourly energy demand data and hourly weather data in Spain, while the second dataset includes power output generated by the photovoltaic panels at 12 locations. This study deploys regularization approaches, including early stopping, neuron dropping, and L2 regularization, to reduce the overfitting problem associated with DL models. The LSTM and MLP models show superior performance. Their validation data exhibit exceptionally low root mean square error values.


More diverse more adaptive: Comprehensive Multi-task Learning for Improved LLM Domain Adaptation in E-commerce

Piao, Tong, Tang, Pei, Zhang, Zhipeng, Li, Jiaqi, Liu, Qiao, Wu, Zufeng

arXiv.org Artificial Intelligence

In recent years, Large Language Models (LLMs) have been widely applied across various domains due to their powerful domain adaptation capabilities. Previous studies have suggested that diverse, multi-modal data can enhance LLMs' domain adaptation performance. However, this hypothesis remains insufficiently validated in the e-commerce sector. To address this gap, we propose a comprehensive e-commerce multi-task framework and design empirical experiments to examine the impact of diverse data and tasks on LLMs from two perspectives: "capability comprehensiveness" and "task comprehensiveness." Specifically, we observe significant improvements in LLM performance by progressively introducing tasks related to new major capability areas and by continuously adding subtasks within different major capability domains. Furthermore, we observe that increasing model capacity amplifies the benefits of diversity, suggesting a synergistic relationship between model capacity and data diversity. Finally, we validate the best-performing model from our empirical experiments in the KDD Cup 2024, achieving a rank 5 in Task 1. This outcome demonstrates the significance of our research for advancing LLMs in the e-commerce domain.


Safeguarding Autonomy: a Focus on Machine Learning Decision Systems

Subías-Beltrán, Paula, Pujol, Oriol, de Lecuona, Itziar

arXiv.org Artificial Intelligence

As global discourse on AI regulation gains momentum, this paper focuses on delineating the impact of ML on autonomy and fostering awareness. Respect for autonomy is a basic principle in bioethics that establishes persons as decision-makers. While the concept of autonomy in the context of ML appears in several European normative publications, it remains a theoretical concept that has yet to be widely accepted in ML practice. Our contribution is to bridge the theoretical and practical gap by encouraging the practical application of autonomy in decision-making within ML practice by identifying the conditioning factors that currently prevent it. Consequently, we focus on the different stages of the ML pipeline to identify the potential effects on ML end-users' autonomy. To improve its practical utility, we propose a related question for each detected impact, offering guidance for identifying possible focus points to respect ML end-users autonomy in decision-making.


Agent-based Modeling meets the Capability Approach for Human Development: Simulating Homelessness Policy-making

Aguilera, Alba, Osman, Nardine, Curto, Georgina

arXiv.org Artificial Intelligence

The global rise in homelessness calls for urgent and alternative policy solutions. Non-profits and governmental organizations alert about the many challenges faced by people experiencing homelessness (PEH), which include not only the lack of shelter but also the lack of opportunities for personal development. In this context, the capability approach (CA), which underpins the United Nations Sustainable Development Goals (SDGs), provides a comprehensive framework to assess inequity in terms of real opportunities. This paper explores how the CA can be combined with agent-based modelling and reinforcement learning. The goals are: (1) implementing the CA as a Markov Decision Process (MDP), (2) building on such MDP to develop a rich decision-making model that accounts for more complex motivators of behaviour, such as values and needs, and (3) developing an agent-based simulation framework that allows to assess alternative policies aiming to expand or restore people's capabilities. The framework is developed in a real case study of health inequity and homelessness, working in collaboration with stakeholders, non-profits and domain experts. The ultimate goal of the project is to develop a novel agent-based simulation framework, rooted in the CA, which can be replicated in a diversity of social contexts to assess policies in a non-invasive way.


Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning

Lee, Sunwoo

arXiv.org Artificial Intelligence

Sharpness-aware minimization (SAM) is known to improve the generalization performance of neural networks. However, it is not widely used in real-world applications yet due to its expensive model perturbation cost. A few variants of SAM have been proposed to tackle such an issue, but they commonly do not alleviate the cost noticeably. In this paper, we propose a lightweight layer-wise gradient norm penalizing method that tackles the expensive computational cost of SAM while maintaining its superior generalization performance. Our study empirically proves that the gradient norm of the whole model can be effectively suppressed by penalizing the gradient norm of only a few critical layers. We also theoretically show that such a partial model perturbation does not harm the convergence rate of SAM, allowing them to be safely adapted in real-world applications. To demonstrate the efficacy of the proposed method, we perform extensive experiments comparing the proposed method to mini-batch SGD and the conventional SAM using representative computer vision and language modeling benchmarks.


World's most advanced humanoid robot gives chilling response when asked if it's going to take our jobs

Daily Mail - Science & tech

As robots get more and more advanced, it's natural to worry that we'll all soon be replaced by machines in the workplace. But the world's most advanced humanoid robot has hardly allayed our fears. At Mobile World Congress (MWC) in Barcelona this week, MailOnline spoke with Ameca the bot, made by British firm Engineered Arts. MailOnline asked the sophisticated machine: 'Will robots take all our jobs?' Somewhat concerningly, the bot replies: 'I don't know, how good are you at your job?' She continued: 'It depends how good you are at it I suppose.'