A Survey of Large Language Models in Discipline-specific Research: Challenges, Methods and Opportunities

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated their transformative potential across numerous disciplinary studies, reshaping the existing research methodologies and fostering interdisciplinary collaboration. However, a systematic understanding of their integration into diverse disciplines remains underexplored. This survey paper provides a comprehensive overview of the application of LLMs in interdisciplinary studies, categorising research efforts from both a technical perspective and with regard to their applicability. From a technical standpoint, key methodologies such as supervised fine-tuning, retrieval-augmented generation, agent-based approaches, and tool-use integration are examined, which enhance the adaptability and effectiveness of LLMs in discipline-specific contexts. From the perspective of their applicability, this paper explores how LLMs are contributing to various disciplines including mathematics, physics, chemistry, biology, and the humanities and social sciences, demonstrating their role in discipline-specific tasks. The prevailing challenges are critically examined and the promising research directions are highlighted alongside the recent advances in LLMs. By providing a comprehensive overview of the technical developments and applications in this field, this survey aims to serve as an invaluable resource for the researchers who are navigating the complex landscape of LLMs in the context of interdisciplinary studies.


CAS Condensed and Accelerated Silhouette: An Efficient Method for Determining the Optimal K in K-Means Clustering

arXiv.org Artificial Intelligence

Clustering is a critical component of decision-making in today's data-driven environments. Clustering has been widely used in a variety of fields, such as bioinformatics, social network analysis, and image processing. However, clustering accuracy remains a major challenge in large datasets. This paper presents a comprehensive overview of strategies for selecting the optimal K in clustering, with a focus on achieving a balance between clustering precision and computational efficiency in complex data environments. In addition, this paper introduces improvements to clustering techniques relating to text and image data to provide insights into better computational performance and cluster validity. The proposed approach is based on the Condensed Silhouette method, combined with statistical methods such as Local Structures, Gap Statistics, and the Class-Consistency Ratio and Cluster Overlap Index (CCR-COI), to calculate the best value of K for K-Means clustering. The results of comparative experiments show that the proposed approach achieves up to 99% faster execution times on high-dimensional datasets while retaining both precision and scalability, making it highly suitable for real-time clustering needs or scenarios demanding efficient clustering with minimal resource utilization. Clustering is a critical component of unsupervised machine learning, with the K-means algorithm being particularly favored due to its simplicity, speed, and interpretability. Nonetheless, a major difficulty lies in accurately identifying the best number of clusters, K, especially with large, high-dimensional datasets where it is crucial to strike an effective balance between computational efficiency and accuracy.
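As a point of reference for the abstract above, the following minimal sketch selects K with the standard (full) silhouette score, which is the baseline that condensed or accelerated variants aim to speed up. It is not the paper's CAS method. The function names, the farthest-first seeding, and the toy data are assumptions made for illustration only.

```python
import math

def farthest_first_centers(points, k):
    """Deterministic seeding (an assumption for this sketch): start at
    points[0], then repeatedly add the point farthest from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iters=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (O(n^2) distance work)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0  # silhouette is undefined for a single cluster
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)   # cohesion
        b = min(sum(math.dist(p, q) for q in other) / len(other)  # separation
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, k_values):
    """Return the K with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores

# Toy data: three well-separated 3x3 grids of points.
offsets = (-0.5, 0.0, 0.5)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx in offsets for dy in offsets]
best, scores = best_k(points, range(2, 6))
print(best, scores)
```

The quadratic cost of the pairwise distances in `silhouette` is what makes the full score expensive on large datasets, which is the kind of cost the abstract's efficiency claims are concerned with.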


Abductive Computational Systems: Creative Abduction and Future Directions

arXiv.org Artificial Intelligence

Abductive reasoning, reasoning for inferring explanations for observations, is often mentioned in scientific, design-related and artistic contexts, but its understanding varies across these domains. This paper reviews how abductive reasoning is discussed in epistemology, science and design, and then analyses how various computational systems use abductive reasoning. Our analysis shows that neither theoretical accounts nor computational implementations of abductive reasoning adequately address generating creative hypotheses. Theoretical frameworks do not provide a straightforward model for generating creative abductive hypotheses, and computational systems largely implement syllogistic forms of abductive reasoning. We break down abductive computational systems into components and conclude by identifying specific directions for future research that could advance the state of creative abductive reasoning in computational systems.


Assessing the Capabilities and Limitations of FinGPT Model in Financial NLP Applications

arXiv.org Artificial Intelligence

The financial industry has long been a pioneer in adopting cutting-edge technologies to enhance operational efficiency, accuracy, and strategic decision-making [2]. With the exponential growth of structured and unstructured data, particularly from news feeds, earnings reports, disclosures, and social media, there is an increasing demand for intelligent systems capable of processing human language at scale [11]. Initially, the industry relied on rule-based approaches and traditional statistical techniques such as bag-of-words and TF-IDF [28], which offered limited semantic understanding. As noted by Abubakar et al. [1], these limitations triggered a shift toward machine learning and deep learning models that, while better at capturing patterns, still required substantial domain-specific feature engineering. This landscape was significantly transformed with the introduction of transformer-based architectures, most notably the Generative Pre-trained Transformer (GPT) family [5]. These models demonstrated the power of large-scale pretraining followed by task-specific fine-tuning, enabling generalization across diverse NLP tasks. Models such as GPT-3, GPT-4, BERT, and T5 have delivered state-of-the-art results in sentiment analysis, summarization, question answering, and named entity recognition [13]. Beyond LLMs, the broader field of Generative AI (GAI), including GANs, VAEs, and diffusion models, has found increasing relevance in finance, facilitating applications such as synthetic data generation, automated reporting, and scenario simulation [32, 31]. LLMs have emerged as essential tools in processing unstructured financial text, especially models fine-tuned on finance-specific corpora like FinBERT, BloombergGPT, and FinGPT [4, 39].


Human Creativity and AI

arXiv.org Artificial Intelligence

With the advancement of science and technology, the philosophy of creativity has undergone significant reinterpretation. This paper investigates contemporary research in the fields of psychology, cognitive neuroscience, and the philosophy of creativity, particularly in the context of the development of artificial intelligence (AI) techniques. It aims to address the central question: Can AI exhibit creativity? The paper reviews the historical perspectives on the philosophy of creativity and explores the influence of psychological advancements on the study of creativity. Furthermore, it analyzes various definitions of creativity and examines the responses of naturalism and cognitive neuroscience to the concept of creativity.


Learnable quantum spectral filters for hybrid graph neural networks

arXiv.org Artificial Intelligence

In this paper, we describe a parameterized quantum circuit that can be considered as convolutional and pooling layers for graph neural networks. The circuit incorporates the parameterized quantum Fourier circuit, where the qubit connections for the controlled gates are derived from the Laplacian operator. Specifically, we show that the eigenspace of the Laplacian operator of a graph can be approximated by using a QFT-based circuit whose connections are determined from the adjacency matrix. For an $N\times N$ Laplacian, this approach yields an approximate polynomial-depth circuit requiring only $n=\log(N)$ qubits. These types of circuits can eliminate the expensive classical computations for approximating the learnable functions of the Laplacian through Chebyshev polynomial or Taylor expansions. Using this circuit as a convolutional layer provides an $n$-dimensional probability vector that can be considered as the filtered and compressed graph signal. Therefore, the circuit along with the measurement can be considered a very efficient convolution plus pooling layer that transforms an $N$-dimensional signal input into an $n$-dimensional signal with an exponential compression. We then apply a classical neural network prediction head to the output of the circuit to construct a complete graph neural network. Since the circuit incorporates geometric structure through its graph connection-based approach, we present graph classification results for the benchmark datasets listed in the TUDataset library. Using only 1-100 learnable parameters for the quantum circuit and minimal classical layers (1000-5000 parameters) in a generic setting, the obtained results are comparable to and in some cases better than many of the baseline results, particularly for the cases when geometric structure plays a significant role.


Task Assignment and Exploration Optimization for Low Altitude UAV Rescue via Generative AI Enhanced Multi-agent Reinforcement Learning

arXiv.org Artificial Intelligence

The integration of emerging uncrewed aerial vehicles (UAVs) with artificial intelligence (AI) and ground-embedded robots (GERs) has transformed emergency rescue operations in unknown environments. However, the high computational demands often exceed a single UAV's capacity, making it difficult to continuously provide stable high-level services. To address this, this paper proposes a cooperation framework involving UAVs, GERs, and airships. The framework enables resource pooling through UAV-to-GER (U2G) and UAV-to-airship (U2A) links, offering computing services for offloaded tasks. Specifically, we formulate the multi-objective problem of task assignment and exploration as a dynamic long-term optimization problem aiming to minimize task completion time and energy use while ensuring stability. Using Lyapunov optimization, we transform it into a per-slot deterministic problem and propose HG-MADDPG, which combines the Hungarian algorithm with a GDM-based multi-agent deep deterministic policy gradient. Simulations demonstrate significant improvements in offloading efficiency, latency, and system stability over baselines.
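The abstract above names the Hungarian algorithm as one component of HG-MADDPG. The sketch below shows only that assignment step in isolation, using the standard potentials-based O(n^3) formulation; it does not reproduce the paper's Lyapunov transformation or the GDM-based policy. The function name `hungarian` and the cost values (imagined here as estimated completion times for offloading each task to each compute node) are hypothetical.

```python
from itertools import permutations

def hungarian(cost):
    """Minimum-cost assignment of n rows (tasks) to m >= n columns (nodes).

    Returns (assignment, total) where assignment[i] is the column given
    to row i. Standard potentials-based Hungarian algorithm, 1-indexed
    internally with column 0 used as a virtual start column.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    p = [0] * (m + 1)        # p[j] = row currently matched to column j
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached a free column: augment
                break
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total

# Hypothetical cost matrix: rows are tasks, columns are compute nodes.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = hungarian(cost)
print(assignment, total)
```

For small instances the result can be cross-checked by brute force over all permutations, which is what makes the polynomial-time method attractive once the number of tasks grows.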


Toward Holistic Evaluation of Recommender Systems Powered by Generative Models

arXiv.org Artificial Intelligence

Recommender systems powered by generative models (Gen-RecSys) extend beyond classical item ranking by producing open-ended content, which simultaneously unlocks richer user experiences and introduces new risks. On one hand, these systems can enhance personalization and appeal through dynamic explanations and multi-turn dialogues. On the other hand, they might venture into unknown territory: hallucinating nonexistent items, amplifying bias, or leaking private information. Traditional accuracy metrics cannot fully capture these challenges, as they fail to measure factual correctness, content safety, or alignment with user intent. This paper makes two main contributions. First, we categorize the evaluation challenges of Gen-RecSys into two groups: (i) existing concerns that are exacerbated by generative outputs (e.g., bias, privacy) and (ii) entirely new risks (e.g., item hallucinations, contradictory explanations). Second, we propose a holistic evaluation approach that includes scenario-based assessments and multi-metric checks incorporating relevance, factual grounding, bias detection, and policy compliance. Our goal is to provide a guiding framework so researchers and practitioners can thoroughly assess Gen-RecSys, ensuring effective personalization and responsible deployment.


UnIT: Scalable Unstructured Inference-Time Pruning for MAC-efficient Neural Inference on MCUs

arXiv.org Artificial Intelligence

Existing pruning methods are typically applied during training or compile time and often rely on structured sparsity. While compatible with low-power microcontrollers (MCUs), structured pruning underutilizes the opportunity for fine-grained efficiency on devices without SIMD support or parallel compute. To address these limitations, we introduce UnIT (Unstructured Inference-Time pruning), a lightweight method that dynamically identifies and skips unnecessary multiply-accumulate (MAC) operations during inference, guided by input-specific activation patterns. Unlike structured pruning, UnIT embraces irregular sparsity and does not require retraining or hardware specialization. It transforms pruning decisions into lightweight comparisons, replacing multiplications with threshold checks and approximated divisions. UnIT further optimizes compute by reusing threshold computations across multiple connections and applying layer- and group-specific pruning sensitivity. We present three fast, hardware-friendly division approximations tailored to the capabilities of common embedded platforms. Demonstrated on the MSP430 microcontroller, UnIT achieves 11.02% to 82.03% MAC reduction, 27.30% to 84.19% faster inference, and 27.33% to 84.38% lower energy consumption compared to training-time pruned models, while maintaining accuracy within 0.48-7%. Under domain shift, UnIT matches or exceeds the accuracy of retrained models while requiring significantly fewer MACs. These results establish unstructured inference-time pruning as a viable and practical solution for efficient, retraining-free deployment of deep neural networks on MCUs.
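One way to read the mechanism described above is that a product w * x is skipped whenever |w * x| falls below a threshold tau, which is equivalent to checking |w| against tau / |x|. The division is then done once per input activation and reused for every connection leaving that input. The sketch below illustrates that reading for a single dense layer. It is an interpretation of the abstract, not the authors' implementation: the function name `pruned_dense`, the parameter `tau`, and the use of exact division (rather than the paper's approximations, which the abstract does not specify) are all assumptions.

```python
def pruned_dense(weights, x, tau):
    """Dense layer with input-dependent skipping of small products.

    weights[j][i] connects input i to output j. A product is skipped when
    |weights[j][i] * x[i]| < tau, tested as |weights[j][i]| < tau / |x[i]|
    so that each skipped connection costs a comparison, not a multiply.
    Returns (outputs, number_of_macs_performed).
    """
    outputs = [0.0] * len(weights)
    macs = 0
    for i, xi in enumerate(x):
        if xi == 0:
            continue  # every product with this input is zero
        # One division per input, shared by all outgoing connections.
        # (Exact division here; an approximation could be swapped in.)
        limit = tau / abs(xi)
        for j, row in enumerate(weights):
            w = row[i]
            if abs(w) >= limit:
                outputs[j] += w * xi
                macs += 1
    return outputs, macs

# Hypothetical 2x3 layer and input vector.
weights = [
    [0.5, 0.01, -0.3],
    [0.02, 0.4, 0.001],
]
x = [1.0, 2.0, 0.1]

pruned_out, pruned_macs = pruned_dense(weights, x, tau=0.05)
dense_out, dense_macs = pruned_dense(weights, x, tau=0.0)  # tau = 0 keeps every MAC
print(pruned_out, pruned_macs, dense_out, dense_macs)
```

In this toy case four of the six MACs are skipped, and each output differs from the dense result by less than tau times the number of skipped terms, which bounds the error introduced by this style of pruning.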


Predicting and generating antibiotics against future pathogens with ApexOracle

arXiv.org Artificial Intelligence

Antimicrobial resistance (AMR) is escalating and outpacing current antibiotic development. Thus, discovering antibiotics effective against emerging pathogens is becoming increasingly critical. However, existing approaches cannot rapidly identify effective molecules against novel pathogens or emerging drug-resistant strains. Here, we introduce ApexOracle, an artificial intelligence (AI) model that both predicts the antibacterial potency of existing compounds and designs de novo molecules active against strains it has never encountered. Departing from models that rely solely on molecular features, ApexOracle incorporates pathogen-specific context through the integration of molecular features captured via a foundational discrete diffusion language model and a dual-embedding framework that combines genomic- and literature-derived strain representations. Across diverse bacterial species and chemical modalities, ApexOracle consistently outperformed state-of-the-art approaches in activity prediction and demonstrated reliable transferability to novel pathogens with little or no antimicrobial data. Its unified representation-generation architecture further enables the in silico creation of "new-to-nature" molecules with high predicted efficacy against priority threats. By pairing rapid activity prediction with targeted molecular generation, ApexOracle offers a scalable strategy for countering AMR and preparing for future infectious-disease outbreaks.