Goto

Collaborating Authors

 Overview


EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python

arXiv.org Machine Learning

Iterative learning procedures are ubiquitous in machine learning and modern statistics. Regularision is typically required to prevent inflating the expected loss of a procedure in later iterations via the propagation of noise inherent in the data. Significant emphasis has been placed on achieving this regularisation implicitly by stopping procedures early. The EarlyStopping-package provides a toolbox of (in-sample) sequential early stopping rules for several well-known iterative estimation procedures, such as truncated SVD, Landweber (gradient descent), conjugate gradient descent, L2-boosting and regression trees. One of the central features of the package is that the algorithms allow the specification of the true data-generating process and keep track of relevant theoretical quantities. In this paper, we detail the principles governing the implementation of the EarlyStopping-package and provide a survey of recent foundational advances in the theoretical literature. We demonstrate how to use the EarlyStopping-package to explore core features of implicit regularisation and replicate results from the literature.


EmoGRACE: Aspect-based emotion analysis for social media data

arXiv.org Artificial Intelligence

While sentiment analysis has advanced from sentence to aspect-level, i.e., the identification of concrete terms related to a sentiment, the equivalent field of Aspect-based Emotion Analysis (ABEA) is faced with dataset bottlenecks and the increased complexity of emotion classes in contrast to binary sentiments. This paper addresses these gaps, by generating a first ABEA training dataset, consisting of 2,621 English Tweets, and fine-tuning a BERT-based model for the ABEA sub-tasks of Aspect Term Extraction (ATE) and Aspect Emotion Classification (AEC). The dataset annotation process was based on the hierarchical emotion theory by Shaver et al. [1] and made use of group annotation and majority voting strategies to facilitate label consistency. The resulting dataset contained aspect-level emotion labels for Anger, Sadness, Happiness, Fear, and a None class. Using the new ABEA training dataset, the state-of-the-art ABSA model GRACE by Luo et al. [2] was fine-tuned for ABEA. The results reflected a performance plateau at an F1-score of 70.1% for ATE and 46.9% for joint ATE and AEC extraction. The limiting factors for model performance were broadly identified as the small training dataset size coupled with the increased task complexity, causing model overfitting and limited abilities to generalize well on new data.


A Review on Large Language Models for Visual Analytics

arXiv.org Artificial Intelligence

This paper provides a comprehensive review of the integration of Large Language Models (LLMs) with visual analytics, addressing their foundational concepts, capabilities, and wide-ranging applications. It begins by outlining the theoretical underpinnings of visual analytics and the transformative potential of LLMs, specifically focusing on their roles in natural language understanding, natural language generation, dialogue systems, and text-to-media transformations. The review further investigates how the synergy between LLMs and visual analytics enhances data interpretation, visualization techniques, and interactive exploration capabilities. Key tools and platforms including LIDA, Chat2VIS, Julius AI, and Zoho Analytics, along with specialized multimodal models such as ChartLlama and CharXIV, are critically evaluated. The paper discusses their functionalities, strengths, and limitations in supporting data exploration, visualization enhancement, automated reporting, and insight extraction. The taxonomy of LLM tasks, ranging from natural language understanding (NLU), natural language generation (NLG), to dialogue systems and text-to-media transformations, is systematically explored. This review provides a SWOT analysis of integrating Large Language Models (LLMs) with visual analytics, highlighting strengths like accessibility and flexibility, weaknesses such as computational demands and biases, opportunities in multimodal integration and user collaboration, and threats including privacy concerns and skill degradation. It emphasizes addressing ethical considerations and methodological improvements for effective integration.


Challenges and Trends in Egocentric Vision: A Survey

arXiv.org Artificial Intelligence

With the rapid development of artificial intelligence technologies and wearable devices, egocentric vision understanding has emerged as a new and challenging research direction, gradually attracting widespread attention from both academia and industry. Egocentric vision captures visual and multimodal data through cameras or sensors worn on the human body, offering a unique perspective that simulates human visual experiences. This paper provides a comprehensive survey of the research on egocentric vision understanding, systematically analyzing the components of egocentric scenes and categorizing the tasks into four main areas: subject understanding, object understanding, environment understanding, and hybrid understanding. We explore in detail the sub-tasks within each category. We also summarize the main challenges and trends currently existing in the field. Furthermore, this paper presents an overview of high-quality egocentric vision datasets, offering valuable resources for future research. By summarizing the latest advancements, we anticipate the broad applications of egocentric vision technologies in fields such as augmented reality, virtual reality, and embodied intelligence, and propose future research directions based on the latest developments in the field.


Persistent Homology-induced Graph Ensembles for Time Series Regressions

arXiv.org Artificial Intelligence

The effectiveness of Spatio-temporal Graph Neural Networks (STGNNs) in time-series applications is often limited by their dependence on fixed, hand-crafted input graph structures. Motivated by insights from the Topological Data Analysis (TDA) paradigm, of which real-world data exhibits multi-scale patterns, we construct several graphs using Persistent Homology Filtration -- a mathematical framework describing the multiscale structural properties of data points. Then, we use the constructed graphs as an input to create an ensemble of Graph Neural Networks. The ensemble aggregates the signals from the individual learners via an attention-based routing mechanism, thus systematically encoding the inherent multiscale structures of data. Four different real-world experiments on seismic activity prediction and traffic forecasting (PEMS-BAY, METR-LA) demonstrate that our approach consistently outperforms single-graph baselines while providing interpretable insights.


HQNN-FSP: A Hybrid Classical-Quantum Neural Network for Regression-Based Financial Stock Market Prediction

arXiv.org Artificial Intelligence

Financial time-series forecasting remains a challenging task due to complex temporal dependencies and market fluctuations. This study explores the potential of hybrid quantum-classical approaches to assist in financial trend prediction by leveraging quantum resources for improved feature representation and learning. A custom Quantum Neural Network (QNN) regressor is introduced, designed with a novel ansatz tailored for financial applications. Two hybrid optimization strategies are proposed: (1) a sequential approach where classical recurrent models (RNN/LSTM) extract temporal dependencies before quantum processing, and (2) a joint learning framework that optimizes classical and quantum parameters simultaneously. Systematic evaluation using TimeSeriesSplit, k-fold cross-validation, and predictive error analysis highlights the ability of these hybrid models to integrate quantum computing into financial forecasting workflows. The findings demonstrate how quantum-assisted learning can contribute to financial modeling, offering insights into the practical role of quantum resources in time-series analysis.


Automated Processing of eXplainable Artificial Intelligence Outputs in Deep Learning Models for Fault Diagnostics of Large Infrastructures

arXiv.org Artificial Intelligence

Deep Learning (DL) models processing images to recognize the health state of large infrastructure components can exhibit biases and rely on non-causal shortcuts. eXplainable Artificial Intelligence (XAI) can address these issues but manually analyzing explanations generated by XAI techniques is time-consuming and prone to errors. This work proposes a novel framework that combines post-hoc explanations with semi-supervised learning to automatically identify anomalous explanations that deviate from those of correctly classified images and may therefore indicate model abnormal behaviors. This significantly reduces the workload for maintenance decision-makers, who only need to manually reclassify images flagged as having anomalous explanations. The proposed framework is applied to drone-collected images of insulator shells for power grid infrastructure monitoring, considering two different Convolutional Neural Networks (CNNs), GradCAM explanations and Deep Semi-Supervised Anomaly Detection. The average classification accuracy on two faulty classes is improved by 8% and maintenance operators are required to manually reclassify only 15% of the images. We compare the proposed framework with a state-of-the-art approach based on the faithfulness metric: the experimental results obtained demonstrate that the proposed framework consistently achieves F_1 scores larger than those of the faithfulness-based approach. Additionally, the proposed framework successfully identifies correct classifications that result from non-causal shortcuts, such as the presence of ID tags printed on insulator shells.


Multivariate Gaussian Topic Modelling: A novel approach to discover topics with greater semantic coherence

arXiv.org Artificial Intelligence

An important aspect of text mining involves information retrieval in form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) elegantly model topics as probability distributions and are useful in identifying latent topics from large document corpora with minimal supervision, they suffer from difficulty in topic interpretability and reduced performance in shorter texts. Here we propose a novel Multivariate Gaussian Topic modelling (MGD) approach. In this approach topics are presented as Multivariate Gaussian Distributions and documents as Gaussian Mixture Models. Using EM algorithm, the various constituent Multivariate Gaussian Distributions and their corresponding parameters are identified. Analysis of the parameters helps identify the keywords having the highest variance and mean contributions to the topic, and from these key-words topic annotations are carried out. This approach is first applied on a synthetic dataset to demonstrate the interpretability benefits vis-\`a-vis LDA. A real-world application of this topic model is demonstrated in analysis of risks and hazards at a petrochemical plant by applying the model on safety incident reports to identify the major latent hazards plaguing the plant. This model achieves a higher mean topic coherence of 0.436 vis-\`a-vis 0.294 for LDA.


Applications of Large Language Model Reasoning in Feature Generation

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized natural language processing through their state of art reasoning capabilities. This paper explores the convergence of LLM reasoning techniques and feature generation for machine learning tasks. We examine four key reasoning approaches: Chain of Thought, Tree of Thoughts, Retrieval-Augmented Generation, and Thought Space Exploration. Our analysis reveals how these approaches can be used to identify effective feature generation rules without having to manually specify search spaces. The paper categorizes LLM-based feature generation methods across various domains including finance, healthcare, and text analytics. LLMs can extract key information from clinical notes and radiology reports in healthcare, by enabling more efficient data utilization. In finance, LLMs facilitate text generation, summarization, and entity extraction from complex documents. We analyze evaluation methodologies for assessing feature quality and downstream performance, with particular attention to OCTree's decision tree reasoning approach that provides language-based feedback for iterative improvements. Current challenges include hallucination, computational efficiency, and domain adaptation. As of March 2025, emerging approaches include inference-time compute scaling, reinforcement learning, and supervised fine-tuning with model distillation. Future directions point toward multimodal feature generation, self-improving systems, and neuro-symbolic approaches. This paper provides a detailed overview of an emerging field that promises to automate and enhance feature engineering through language model reasoning.


CoE: Chain-of-Explanation via Automatic Visual Concept Circuit Description and Polysemanticity Quantification

arXiv.org Artificial Intelligence

Explainability is a critical factor influencing the wide deployment of deep vision models (DVMs). Concept-based post-hoc explanation methods can provide both global and local insights into model decisions. However, current methods in this field face challenges in that they are inflexible to automatically construct accurate and sufficient linguistic explanations for global concepts and local circuits. Particularly, the intrinsic polysemanticity in semantic Visual Concepts (VCs) impedes the interpretability of concepts and DVMs, which is underestimated severely. In this paper, we propose a Chain-of-Explanation (CoE) approach to address these issues. Specifically, CoE automates the decoding and description of VCs to construct global concept explanation datasets. Further, to alleviate the effect of polysemanticity on model explainability, we design a concept polysemanticity disentanglement and filtering mechanism to distinguish the most contextually relevant concept atoms. Besides, a Concept Polysemanticity Entropy (CPE), as a measure of model interpretability, is formulated to quantify the degree of concept uncertainty. The modeling of deterministic concepts is upgraded to uncertain concept atom distributions. Finally, CoE automatically enables linguistic local explanations of the decision-making process of DVMs by tracing the concept circuit. GPT-4o and human-based experiments demonstrate the effectiveness of CPE and the superiority of CoE, achieving an average absolute improvement of 36% in terms of explainability scores.