Selective Use of Yannakakis' Algorithm to Improve Query Performance: Machine Learning to the Rescue

arXiv.org Artificial Intelligence

Query optimization has played a central role in database research for decades. However, more often than not, the proposed optimization techniques lead to a performance improvement in some, but not in all, situations. Therefore, we urgently need a methodology for designing a decision procedure that decides for a given query whether the optimization technique should be applied or not. In this work, we propose such a methodology with a focus on Yannakakis-style query evaluation as our optimization technique of interest. More specifically, we formulate this decision problem as an algorithm selection problem and we present a Machine Learning based approach for its solution. Empirical results with several benchmarks on a variety of database systems show that our approach indeed leads to a statistically significant performance improvement.
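As a rough illustration of what Yannakakis-style evaluation means, the sketch below runs a full semi-join reduction on a small acyclic path query R(a,b) ⋈ S(b,c) ⋈ T(c,d) before joining. The relations, the column names, and the helper names `semijoin` and `yannakakis_path_join` are hypothetical, chosen for illustration only. This is not the paper's implementation, and it does not include the learned decision procedure.

```python
from typing import Dict, List

Row = Dict[str, int]


def semijoin(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Keep only the rows of `left` whose `key` value also appears in `right`."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] in right_keys]


def yannakakis_path_join(r: List[Row], s: List[Row], t: List[Row]) -> List[Row]:
    """Evaluate R(a,b) JOIN S(b,c) JOIN T(c,d) after a full semi-join reduction."""
    # Bottom-up pass over the join tree R - S - T (root R): T reduces S, S reduces R.
    s = semijoin(s, t, "c")
    r = semijoin(r, s, "b")
    # Top-down pass: R reduces S, then S reduces T.
    s = semijoin(s, r, "b")
    t = semijoin(t, s, "c")
    # Every remaining row now takes part in at least one output row,
    # so the joins below create no dangling intermediate tuples.
    out: List[Row] = []
    for r_row in r:
        for s_row in s:
            if s_row["b"] != r_row["b"]:
                continue
            for t_row in t:
                if t_row["c"] == s_row["c"]:
                    out.append({**r_row, **s_row, **t_row})
    return out


# Hypothetical toy relations: only one chain (a=1) survives all three tables.
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
S = [{"b": 10, "c": 100}, {"b": 20, "c": 200}, {"b": 40, "c": 400}]
T = [{"c": 100, "d": 7}, {"c": 300, "d": 8}]
result = yannakakis_path_join(R, S, T)
print(result)
```

The two semi-join passes are extra work, which pays off only when they remove many rows. That trade-off is why deciding per query whether to apply the technique is a meaningful selection problem.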


Climate And Resource Awareness is Imperative to Achieving Sustainable AI (and Preventing a Global AI Arms Race)

arXiv.org Artificial Intelligence

Sustainability encompasses three key facets: economic, environmental, and social. However, the nascent discourse that is emerging on sustainable artificial intelligence (AI) has predominantly focused on the environmental sustainability of AI, often neglecting the economic and social aspects. Achieving truly sustainable AI necessitates addressing the tension between its climate awareness and its social sustainability, which hinges on equitable access to AI development resources. The concept of resource awareness advocates for broader access to the infrastructure required to develop AI, fostering equity in AI innovation. Yet, this push for improving accessibility often overlooks the environmental costs of expanding such resource usage. In this position paper, we argue that reconciling climate and resource awareness is essential to realizing the full potential of sustainable AI. We use the framework of base-superstructure to analyze how the material conditions are influencing the current AI discourse. We also introduce the Climate and Resource Aware Machine Learning (CARAML) framework to address this conflict and propose actionable recommendations spanning individual, community, industry, government, and global levels to achieve sustainable AI.


Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models

arXiv.org Artificial Intelligence

In this paper, we explore machine unlearning from a novel dimension, by studying how to safeguard model unlearning in large language models (LLMs). Our goal is to prevent unlearned models from recalling any related memory of the targeted knowledge. We begin by uncovering a surprisingly simple yet overlooked fact: existing methods typically erase only the exact expressions of the targeted knowledge, leaving paraphrased or related information intact. To rigorously measure such oversights, we introduce UGBench, the first benchmark tailored for evaluating the generalisation performance across 13 state-of-the-art methods. UGBench reveals that unlearned models can still recall paraphrased answers and retain target facts in intermediate layers. To address this, we propose PERMU, a perturbation-based method that significantly enhances the generalisation capabilities for safeguarding LLM unlearning. Experiments demonstrate that PERMU delivers up to a 50.13% improvement in unlearning while maintaining a 43.53% boost in robust generalisation. Our code can be found at https://github.com/MaybeLizzy/UGBench.


Physics-Informed Neural Networks for Optimal Vaccination Plan in SIR Epidemic Models

arXiv.org Artificial Intelligence

The eradication time is defined as the earliest time the infectious population drops below a given threshold and remains below it. For time-homogeneous models, the eradication time is well-defined due to the predictable dynamics of the infectious population, and optimal control strategies can be systematically studied. We utilize Physics-Informed Neural Networks (PINNs) to solve the partial differential equation (PDE) governing the eradication time and derive the corresponding optimal vaccination control. The PINN framework enables a mesh-free solution to the PDE by embedding the dynamics directly into the loss function of a deep neural network. We use a variable scaling method to ensure stable training of PINN and mathematically analyze that this method is effective in our setting. This approach provides an efficient computational alternative to traditional numerical methods, allowing for an approximation of the eradication time and the optimal control strategy. Through numerical experiments, we validate the effectiveness of the proposed method in computing the minimum eradication time and achieving optimal control. This work offers a novel application of PINNs to epidemic modeling, bridging mathematical theory and computational practice for time-homogeneous SIR models.
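To make the definition of the eradication time concrete, the sketch below simulates a controlled SIR model with forward Euler steps and reports the earliest time after which the infectious fraction stays below the threshold for the rest of the simulated horizon. This is a plain numerical simulation, not the PINN approach of the paper. The model form (S' = -βSI - vS, I' = βSI - γI with a constant vaccination rate v), the parameter values, and the function name `eradication_time` are assumptions made for illustration.

```python
from typing import Optional


def eradication_time(s0: float, i0: float, beta: float, gamma: float,
                     vaccination_rate: float, threshold: float,
                     dt: float = 0.01, t_max: float = 200.0) -> Optional[float]:
    """Earliest time after which I(t) stays below `threshold` up to `t_max`.

    Assumed dynamics (illustrative only):
        S' = -beta * S * I - vaccination_rate * S
        I' =  beta * S * I - gamma * I
    Returns None if I is still at or above the threshold at the horizon.
    """
    s, i = s0, i0
    steps = int(round(t_max / dt))
    last_above = -1  # index of the last step at which I >= threshold
    for k in range(steps + 1):
        if i >= threshold:
            last_above = k
        ds = -beta * s * i - vaccination_rate * s
        di = beta * s * i - gamma * i
        s, i = s + dt * ds, i + dt * di
    if last_above == steps:
        return None
    return (last_above + 1) * dt


# Hypothetical parameters: fractions of the population, rates per unit time.
t_no_vaccination = eradication_time(0.99, 0.01, beta=0.5, gamma=0.2,
                                    vaccination_rate=0.0, threshold=0.001)
t_with_vaccination = eradication_time(0.99, 0.01, beta=0.5, gamma=0.2,
                                      vaccination_rate=0.1, threshold=0.001)
print(t_no_vaccination, t_with_vaccination)
```

Tracking the last step above the threshold, rather than the first step below it, matters when the infectious fraction starts below the threshold and then grows: the definition requires that it remain below afterwards.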


ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model

arXiv.org Artificial Intelligence

Goal-oriented de novo molecule design, namely generating molecules with specific property or substructure constraints, is a crucial yet challenging task in drug discovery. Existing methods, such as Bayesian optimization and reinforcement learning, often require training multiple property predictors and struggle to incorporate substructure constraints. Inspired by the success of Large Language Models (LLMs) in text generation, we propose ChatMol, a novel approach that leverages LLMs for molecule design across diverse constraint settings. Initially, we crafted a molecule representation compatible with LLMs and validated its efficacy across multiple online LLMs. Afterwards, we developed specific prompts geared towards diverse constrained molecule generation tasks to further fine-tune current LLMs while integrating feedback learning derived from property prediction. Finally, to address the limitations of LLMs in numerical recognition, we referred to the position encoding method and incorporated additional encoding for numerical values within the prompt. Experimental results across single-property, substructure-property, and multi-property constrained tasks demonstrate that ChatMol consistently outperforms state-of-the-art baselines, including VAE and RL-based methods. Notably, in the multi-objective binding affinity maximization task, ChatMol achieves a significantly lower KD value of 0.25 for the protein target ESR1, while maintaining the highest overall performance, surpassing previous methods by 4.76%. Meanwhile, with numerical enhancement, the Pearson correlation coefficient between the instructed property values and those of the generated molecules increased by up to 0.49. These findings highlight the potential of LLMs as a versatile framework for molecule generation, offering a promising alternative to traditional latent space and RL-based approaches.


Advancements in Natural Language Processing for Automatic Text Summarization

arXiv.org Artificial Intelligence

The substantial growth of textual content in diverse domains and platforms has led to a considerable need for Automatic Text Summarization (ATS) techniques that aid in the process of text analysis. The effectiveness of text summarization models has been significantly enhanced in a variety of technical domains because of advancements in Natural Language Processing (NLP) and Deep Learning (DL). Despite this, the process of summarizing textual information continues to be significantly constrained by the intricate writing styles of a variety of texts, which involve a range of technical complexities. Text summarization techniques can be broadly categorized into two main types: abstractive summarization and extractive summarization. Extractive summarization involves directly extracting sentences, phrases, or segments of text from the content without making any changes. On the other hand, abstractive summarization is achieved by reconstructing the sentences, phrases, or segments from the original text using linguistic analysis. This study addresses linguistically diverse categorizations of text summarization approaches in a constructive manner. In this paper, the authors explored existing hybrid techniques that have employed both extractive and abstractive methodologies. In addition, the pros and cons of various approaches discussed in the literature are also investigated. Furthermore, the authors conducted a comparative analysis of different techniques and metrics to evaluate the generated summaries using language generation models. This survey endeavors to provide a comprehensive overview of ATS by presenting the progression of language processing regarding this task through a breakdown of diverse systems and architectures accompanied by technical and mathematical explanations of their operations.


A Survey of Graph Transformers: Architectures, Theories and Applications

arXiv.org Artificial Intelligence

Graph Transformers (GTs) have demonstrated a strong capability in modeling graph structures by addressing the intrinsic limitations of graph neural networks (GNNs), such as over-smoothing and over-squashing. Recent studies have proposed diverse architectures, enhanced explainability, and practical applications for Graph Transformers. In light of these rapid developments, we conduct a comprehensive review of Graph Transformers, covering their architectures, theoretical foundations, and applications. We categorize the architecture of Graph Transformers according to their strategies for processing structural information, including graph tokenization, positional encoding, structure-aware attention and model ensemble. From the theoretical perspective, we examine the expressivity of Graph Transformers in the various architectures discussed and contrast them with other advanced graph learning algorithms to discover the connections. In addition, we provide a summary of the practical applications where Graph Transformers have been utilized, such as molecule, protein, language, vision, traffic, brain and material data. At the end of this survey, we discuss the current challenges and prospective directions in Graph Transformers for potential future research.


UK universities warned to 'stress-test' assessments as 92% of students use AI

The Guardian

British universities have been warned to "stress-test" all assessments after new research revealed "almost all" undergraduates are using generative artificial intelligence (genAI) in their studies. A survey of 1,000 students – both domestic and international – found there had been an "explosive increase" in the use of genAI in the past 12 months. Almost nine out of 10 (88%) in the 2025 poll said they used tools such as ChatGPT for their assessments, up from 53% last year. The proportion using any AI tool surged from 66% in 2024 to 92% in 2025, meaning just 8% of students are not using AI, according to a report published by the Higher Education Policy Institute and Kortext, a digital etextbook provider. Josh Freeman, the report's author, said such dramatic changes in behaviour in just 12 months were almost unheard of, and warned: "Universities should take heed: generative AI is here to stay." "There are urgent lessons here for institutions," Freeman said. "Every assessment must be reviewed in case it can be completed easily using AI."


Sparklen: A Statistical Learning Toolkit for High-Dimensional Hawkes Processes in Python

arXiv.org Machine Learning

This paper introduces the Python package Sparklen (see Lacoste (2025)), which implements a complete set of statistical learning methods for exponential Hawkes processes with an emphasis on the high-dimensional setting. Hawkes processes, introduced in Hawkes (1971), form a specific but rather versatile class of point processes. Such processes model time series in which the occurrence of one event temporarily increases the probability of other events occurring. This intrinsic ability to take into account self-exciting effects makes them particularly interesting for real data modeling. Historically applied in seismology (see Ogata (1988)), they have since been used in a wide variety of other fields, including neuroscience in Reynaud-Bouret, Rivoirard, and Tuleau-Malot (2013), finance in Bacry, Mastromatteo, and Muzy (2015), and ecology in Denis, Dion-Blanc, Lacoste, Sansonnet, and Bas (2024). The multidimensional version, known as the Multivariate Hawkes Process (MHP), additionally captures interactions among the univariate processes within a network. This generalization enables the modeling of more intricate dynamics, significantly expanding the range of potential applications. For example, MHP has been applied to model action potentials within neural networks in Bonnet, Dion-Blanc, Gindraud, and Lemler (2022), or for trend detection in social networks in Pinto, Chahed, and Altman (2015).
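The self-exciting behaviour described above can be seen in the conditional intensity of a univariate exponential Hawkes process. The sketch below evaluates it with one common parameterization, λ(t) = μ + Σ_{t_i < t} αβ·exp(-β(t - t_i)). It uses only the standard library and does not call the Sparklen API. The function name, the parameter values, and the choice of parameterization are assumptions for illustration.

```python
import math
from typing import Sequence


def exponential_hawkes_intensity(t: float, event_times: Sequence[float],
                                 baseline: float, alpha: float, beta: float) -> float:
    """Conditional intensity of a univariate exponential Hawkes process at time t.

    Assumed parameterization (one of several in use):
        lambda(t) = baseline + sum over past events t_i < t of
                    alpha * beta * exp(-beta * (t - t_i))
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t  # only strictly earlier events excite the process
    )
    return baseline + excitation


# Hypothetical event history and parameters.
events = [1.0, 2.5, 2.7]
before_any_event = exponential_hawkes_intensity(0.5, events, baseline=0.5, alpha=0.8, beta=2.0)
just_after_burst = exponential_hawkes_intensity(2.8, events, baseline=0.5, alpha=0.8, beta=2.0)
long_after_burst = exponential_hawkes_intensity(10.0, events, baseline=0.5, alpha=0.8, beta=2.0)
print(before_any_event, just_after_burst, long_after_burst)
```

Each past event adds a jump of size αβ that decays at rate β, so the intensity is elevated shortly after a burst of events and relaxes back toward the baseline μ.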


District Vitality Index Using Machine Learning Methods for Urban Planners

arXiv.org Artificial Intelligence

City leaders face critical decisions regarding budget allocation and investment priorities. How can they identify which city districts require revitalization? To address this challenge, a Current Vitality Index and a Long-Term Vitality Index are proposed. These indexes are based on a carefully curated set of indicators. Missing data is handled using K-Nearest Neighbors imputation, while Random Forest is employed to identify the most reliable and significant features. Additionally, k-means clustering is utilized to generate meaningful data groupings for enhanced monitoring of Long-Term Vitality. Current vitality is visualized through an interactive map, while Long-Term Vitality is tracked over 15 years with predictions made using Multilayer Perceptron or Linear Regression. The results, approved by urban planners, are already promising and helpful, with the potential for further improvement as more data becomes available. This paper proposes leveraging machine learning methods to optimize urban planning and enhance citizens' quality of life.
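As an illustration of the K-Nearest Neighbors imputation step mentioned above, the sketch below fills each missing value with the mean of that indicator over the k most similar districts, where similarity is measured on the indicators both districts have observed. It is a simplified pure-Python variant, not the paper's pipeline. The function name `knn_impute` and the indicator values are hypothetical.

```python
import math
from typing import List, Optional

Table = List[List[Optional[float]]]


def knn_impute(rows: Table, k: int = 2) -> Table:
    """Replace None entries with the mean of that column over the k nearest rows."""

    def distance(a: List[Optional[float]], b: List[Optional[float]]) -> float:
        # Root mean squared difference over the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed: Table = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            # Leave the gap in place if no donor exists for this column.
            new_row[j] = sum(nearest) / len(nearest) if nearest else None
        completed.append(new_row)
    return completed


# Hypothetical indicators per district: [shops per 1,000 residents, footfall index].
districts: Table = [
    [1.0, 10.0],
    [1.1, 12.0],
    [5.0, 50.0],
    [1.05, None],  # footfall missing for this district
]
imputed = knn_impute(districts, k=2)
print(imputed[3])
```

In practice the indicators would usually be scaled to comparable ranges before computing distances, so that no single indicator dominates the choice of neighbours.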