 South America


Efficient Strategy for Improving Large Language Model (LLM) Capabilities

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have become a milestone in the field of artificial intelligence and natural language processing. However, their large-scale deployment remains constrained by the need for significant computational resources. This work proposes starting from a base model to explore and combine data processing and careful data selection techniques, training strategies, and architectural adjustments to improve the efficiency of LLMs in resource-constrained environments and within a delimited knowledge base. The methodological approach included defining criteria for building reliable datasets, conducting controlled experiments with different configurations, and systematically evaluating the resulting variants in terms of capability, versatility, response time, and safety. Finally, comparative tests were conducted to measure the performance of the developed variants and to validate the effectiveness of the proposed strategies. This work is based on the master's thesis in Systems and Computer Engineering titled Efficient Strategy for Improving the Capabilities of Large Language Models (LLMs) [1].


Evaluating Generative AI Tools for Personalized Offline Recommendations: A Comparative Study

arXiv.org Artificial Intelligence

Background: Generative AI tools have become increasingly relevant in supporting personalized recommendations across various domains. However, their effectiveness in health-related behavioral interventions, especially those aiming to reduce the use of technology, remains underexplored. Aims: This study evaluates the performance and user satisfaction of the five most widely used generative AI tools when recommending non-digital activities tailored to individuals at risk of repetitive strain injury. Method: Following the Goal/Question/Metric (GQM) paradigm, this proposed experiment involves generative AI tools that suggest offline activities based on predefined user profiles and intervention scenarios. The evaluation is focused on quantitative performance (precision, recall, F1-score and MCC-score) and qualitative aspects (user satisfaction and perceived recommendation relevance). Two research questions were defined: RQ1 assessed which tool delivers the most accurate recommendations, and RQ2 evaluated how tool choice influences user satisfaction.
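The quantitative metrics named in this abstract (precision, recall, F1-score and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming each recommendation is labelled simply as relevant or not relevant. The function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and MCC from binary confusion-matrix counts.

    tp/fp/fn/tn are hypothetical counts of true positives, false positives,
    false negatives and true negatives; undefined ratios fall back to 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient: uses all four cells of the matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


if __name__ == "__main__":
    # Illustrative counts only, not results from the paper.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Unlike F1, MCC also accounts for true negatives, which may be why the authors report both.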


Apple snails can regrow their eyeballs

Popular Science

If you step on a snail, you'll know it. Despite their slow speed and simple bodies, apple snails (Pomacea canaliculata) have eyes that are anatomically similar to human eyes. Both species have complex camera-like eyes with a lens, cornea, and retina that visually capture the world around them. Unlike humans, apple snails can regrow their peepers if they are injured or amputated.


FairLangProc: A Python package for fairness in NLP

arXiv.org Machine Learning

The astonishing results of the transformer architecture on Natural Language Processing (NLP) tasks (Devlin et al. 2019; Radford et al. 2019), its scaling properties (Vaswani et al. 2017) and the massive amount of text data available (Wang et al. 2019; Foundation Accessed 27/05/2025) have led to the development of Large Language Models (LLMs) whose performance towers above that of traditional Language Models (LMs) (Zhang et al. 2021; BigScience et al. 2022). Furthermore, LLMs have been widely adopted for custom downstream tasks by leveraging the flexibility provided by fine-tuning (Chung et al. 2024) and their few-shot learning capabilities (Brown et al. 2020), establishing a new zeitgeist in the NLP community. These factors have led to their widespread adoption across major areas of society such as academia (Naveed et al. 2023; Meyer et al. 2023); industry, including sectors such as finance (Li et al. 2023), healthcare (Goyal et al. 2024) or law (Lai et al. 2024); and personal use, for example as a personal assistant or search engine (Xiong et al. 2024; Microsoft Accessed 27/05/2025). Furthermore, the recent surge in their reasoning ability (Wei et al. 2022) and the development of cost-efficient models (Liu et al. 2024) suggest that there are still new avenues for improvement.


Comprehensive Attribute Encoding and Dynamic LSTM HyperModels for Outcome Oriented Predictive Business Process Monitoring

arXiv.org Artificial Intelligence

Predictive Business Process Monitoring (PBPM) aims to forecast future outcomes of ongoing business processes. However, existing methods often lack flexibility to handle real-world challenges such as simultaneous events, class imbalance, and multi-level attributes. While prior work has explored static encoding schemes and fixed LSTM architectures, they struggle to support adaptive representations and generalize across heterogeneous datasets. To address these limitations, we propose a suite of dynamic LSTM HyperModels that integrate two-level hierarchical encoding for event and sequence attributes, character-based decomposition of event labels, and novel pseudo-embedding techniques for durations and attribute correlations. We further introduce specialized LSTM variants for simultaneous event modeling, leveraging multidimensional embeddings and time-difference flag augmentation. Experimental validation on four public and real-world datasets demonstrates up to 100% accuracy on balanced datasets and F1 scores exceeding 86% on imbalanced ones. Our approach advances PBPM by offering modular and interpretable models better suited for deployment in complex settings. Beyond PBPM, it contributes to the broader AI community by improving temporal outcome prediction, supporting data heterogeneity, and promoting explainable process intelligence frameworks. Impact Statement: Business processes underpin daily operations across healthcare, finance, public services, and logistics. Predicting the outcome of ongoing processes, such as whether a loan will be approved or a shipment delayed, can save time, reduce costs, and improve service. Our work introduces adaptive, interpretable models that overcome these hurdles, making accurate predictions in more realistic settings.


Quantum Neural Network applications to Protein Binding Affinity Predictions

arXiv.org Artificial Intelligence

Binding energy is a fundamental thermodynamic property that governs molecular interactions, playing a crucial role in fields such as healthcare and the natural sciences. It is particularly relevant in drug development, vaccine design, and other biomedical applications. Over the years, various methods have been developed to estimate protein binding energy, ranging from experimental techniques to computational approaches, with machine learning making significant contributions to this field. Although classical computing has demonstrated strong results in constructing predictive models, quantum computing for machine learning has emerged as a promising alternative. Quantum neural networks (QNNs) have gained traction as a research focus, raising the question of their potential advantages in predicting binding energies. To investigate this potential, this study explored the feasibility of QNNs for this task by proposing thirty variations of multilayer perceptron-based quantum neural networks. These variations span three distinct architectures, each incorporating ten different quantum circuits to configure their quantum layers. The performance of these quantum models was compared with that of a state-of-the-art classical multilayer perceptron-based artificial neural network, evaluating both accuracy and training time. A primary dataset was used for training, while two additional datasets containing entirely unseen samples were employed for testing. Results indicate that the quantum models achieved approximately 20% higher accuracy on one unseen dataset, although their accuracy was lower on the other datasets. Notably, quantum models exhibited training times several orders of magnitude shorter than their classical counterparts, highlighting their potential for efficient protein binding energy prediction.


Evaluation of Deep Learning Models for LBBB Classification in ECG Signals

arXiv.org Artificial Intelligence

This study explores different neural network architectures to evaluate their ability to extract spatial and temporal patterns from electrocardiographic (ECG) signals and classify them into three groups: healthy subjects, Left Bundle Branch Block (LBBB), and Strict Left Bundle Branch Block (sLBBB). Clinical Relevance: Innovative technologies enable the selection of candidates for Cardiac Resynchronization Therapy (CRT) by optimizing the classification of subjects with Left Bundle Branch Block (LBBB).


He'd need some LARGE SquarePants: Footage of a sea star with a 'big bottom' sparks hilarity as it's compared to SpongeBob's Patrick

Daily Mail - Science & tech

The sea floor is home to all sorts of weird and wonderful creatures. But one in particular has become an online sensation, thanks to its impressive 'buttocks'. A big-bottomed sea star has been spotted more than 1,000 metres (3,280ft) below the waves. And it appears to have a backside that will make even the most avid gymgoer jealous. This has led many baffled viewers to compare the creature to Patrick from the animated series SpongeBob SquarePants.


Multiple Time Series Fusion Based on LSTM: An Application to CAP A Phase Classification Using EEG

arXiv.org Artificial Intelligence

Biomedical decision making involves processing multiple signals, either from different sensors or from different channels. In both cases, information fusion plays a significant role. In this work, a deep learning based feature-level fusion of electroencephalogram channels is carried out to classify the A phase of the electroencephalogram cyclic alternating pattern (CAP). Channel selection, fusion, and classification procedures were optimized by two optimization algorithms, namely, Genetic Algorithm and Particle Swarm Optimization. The developed methodologies were evaluated by fusing the information from multiple electroencephalogram channels for patients with nocturnal frontal lobe epilepsy and patients without any neurological disorder, which was significantly more challenging when compared to other state-of-the-art works. Results showed that both optimization algorithms selected a comparable structure with similar feature-level fusion, consisting of three electroencephalogram channels, which is in line with the CAP protocol to ensure multiple channels' arousals for CAP detection. Moreover, the two optimized models reached an area under the receiver operating characteristic curve of 0.82, with average accuracy ranging from 77% to 79%, a result which is in the upper range of the specialist agreement. The proposed approach is still in the upper range of the best state-of-the-art works despite a difficult dataset, and has the advantage of providing a fully automatic analysis without requiring any manual procedure. Ultimately, the models proved to be noise resistant and resilient to the loss of multiple channels.


Quenched large deviations for Monte Carlo integration with Coulomb gases

arXiv.org Machine Learning

Gibbs measures, such as Coulomb gases, are popular in modelling systems of interacting particles. Recently, we proposed to use Gibbs measures as randomized numerical integration algorithms with respect to a target measure $\pi$ on $\mathbb{R}^d$, following the heuristics that repulsiveness between particles should help reduce integration errors. A major issue in this approach is to tune the interaction kernel and confining potential of the Gibbs measure, so that the equilibrium measure of the system is the target distribution $\pi$. Doing so usually requires another Monte Carlo approximation of the potential, i.e. the integral of the interaction kernel with respect to $\pi$. Using the methodology of large deviations from Garcia-Zelada (2019), we show that a random approximation of the potential preserves the fast large deviation principle that guarantees the proposed integration algorithm to outperform independent or Markov quadratures. For non-singular interaction kernels, we make minimal assumptions on this random approximation, which can be the result of a computationally cheap Monte Carlo preprocessing. For the Coulomb interaction kernel, we need the approximation to be based on another Gibbs measure, and we prove in passing a control on the uniform convergence of the approximation of the potential.
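The potential described in this abstract, and a plain Monte Carlo approximation of it, can be written out as follows. This is a sketch under assumed notation: the symbols $K$ (interaction kernel), $U^{\pi}$ (potential), $\widehat{U}_M$ (its approximation) and the sample points $y_1, \dots, y_M$ do not appear in the abstract and are introduced here only for illustration.

```latex
U^{\pi}(x) = \int K(x, y)\, \mathrm{d}\pi(y),
\qquad
\widehat{U}_M(x) = \frac{1}{M} \sum_{j=1}^{M} K(x, y_j),
\quad y_1, \dots, y_M \sim \pi .
```

Drawing the $y_j$ independently from $\pi$ is only one possible choice, shown for the non-singular case. For the Coulomb kernel, the abstract states that the approximation needs to be based on another Gibbs measure instead.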