Goto

Collaborating Authors

 stackoverflow





High-Performance Variance-Covariance Matrix Construction Using an Uncentered Gram Formulation

arXiv.org Artificial Intelligence

Reichel (2025) defined the bariance as a pairwise-difference measure that can be rewritten in linear time using only scalar sums. We extend this idea to the covariance matrix by showing that the standard matrix expression involving the uncentered Gram matrix and a correction term is algebraically identical to the pairwise-difference definition while avoiding explicit centering. The computation then reduces to one outer product of dimension p-by-p and a single subtraction. Benchmarks in Python show clear runtime gains, especially when BLAS optimizations are absent. Optionally faster Gram-matrix routines such as RXTX (Rybin et al., 2025) further reduce overall cost.



Unified Flow Matching for Long Horizon Event Forecasting

arXiv.org Artificial Intelligence

Modeling long horizon marked event sequences is a fundamental challenge in many real-world applications, including healthcare, finance, and user behavior modeling. Existing neural temporal point process models are typically au-toregressive, predicting the next event one step at a time, which limits their efficiency and leads to error accumulation in long-range forecasting. In this work, we propose a unified flow matching framework for marked temporal point processes that enables non-autoregressive, joint modeling of inter-event times and event types, via continuous and discrete flow matching. By learning continuous-time flows for both components, our method generates coherent long horizon event trajectories without sequential decoding. We evaluate our model on six real-world benchmarks and demonstrate significant improvements over autoregressive and diffusion-based baselines in both accuracy and generation efficiency.


An Empirical Study of OpenAI API Discussions on Stack Overflow

arXiv.org Artificial Intelligence

The rapid advancement of large language models (LLMs), represented by OpenAI's GPT series, has significantly impacted various domains such as natural language processing, software development, education, healthcare, finance, and scientific research. However, OpenAI APIs introduce unique challenges that differ from traditional APIs, such as the complexities of prompt engineering, token-based cost management, non-deterministic outputs, and operation as black boxes. To the best of our knowledge, the challenges developers encounter when using OpenAI APIs have not been explored in previous empirical studies. To fill this gap, we conduct the first comprehensive empirical study by analyzing 2,874 OpenAI API-related discussions from the popular Q&A forum Stack Overflow. We first examine the popularity and difficulty of these posts. After manually categorizing them into nine OpenAI API-related categories, we identify specific challenges associated with each category through topic modeling analysis. Based on our empirical findings, we finally propose actionable implications for developers, LLM vendors, and researchers.


From Questions to Insights: Exploring XAI Challenges Reported on Stack Overflow Questions

arXiv.org Artificial Intelligence

The lack of interpretability is a major barrier that limits the practical usage of AI models. Several eXplainable AI (XAI) techniques (e.g., SHAP, LIME) have been employed to interpret these models' performance. However, users often face challenges when leveraging these techniques in real-world scenarios and thus submit questions in technical Q&A forums like Stack Overflow (SO) to resolve these challenges. We conducted an exploratory study to expose these challenges, their severity, and features that can make XAI techniques more accessible and easier to use. Our contributions to this study are fourfold. First, we manually analyzed 663 SO questions that discussed challenges related to XAI techniques. Our careful investigation produced a catalog of seven challenges (e.g., disagreement issues). We then analyzed their prevalence and found that model integration and disagreement issues emerged as the most prevalent challenges. Second, we attempt to estimate the severity of each XAI challenge by determining the correlation between challenge types and answer metadata (e.g., the presence of accepted answers). Our analysis suggests that model integration issues is the most severe challenge. Third, we attempt to perceive the severity of these challenges based on practitioners' ability to use XAI techniques effectively in their work. Practitioners' responses suggest that disagreement issues most severely affect the use of XAI techniques. Fourth, we seek agreement from practitioners on improvements or features that could make XAI techniques more accessible and user-friendly. The majority of them suggest consistency in explanations and simplified integration. Our study findings might (a) help to enhance the accessibility and usability of XAI and (b) act as the initial benchmark that can inspire future research.


An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have recently advanced many applications on software engineering tasks, particularly the potential for code generation. Among contemporary challenges, code generated by LLMs often suffers from inaccuracies and hallucinations, requiring external inputs to correct. One recent strategy to fix these issues is to refine the code generated from LLMs using the input from the model itself (self-augmented). In this work, we proposed a novel method, namely CoT-SelfEvolve. CoT-SelfEvolve iteratively and automatically refines code through a self-correcting process, guided by a chain of thought constructed from real-world programming problem feedback. Focusing on data science code, including Python libraries such as NumPy and Pandas, our evaluations on the DS-1000 dataset demonstrate that CoT-SelfEvolve significantly outperforms existing models in solving complex problems. The framework shows substantial improvements in both initial code generation and subsequent iterations, with the model's accuracy increasing significantly with each additional iteration. This highlights the effectiveness of using chain-of-thought prompting to address complexities revealed by program executor traceback error messages. We also discuss how CoT-SelfEvolve can be integrated into continuous software engineering environments, providing a practical solution for improving LLM-based code generation.


Semi-Variance Reduction for Fair Federated Learning

arXiv.org Artificial Intelligence

Ensuring fairness in a Federated Learning (FL) system, i.e., a satisfactory performance for all of the participating diverse clients, is an important and challenging problem. There are multiple fair FL algorithms in the literature, which have been relatively successful in providing fairness. However, these algorithms mostly emphasize on the loss functions of worst-off clients to improve their performance, which often results in the suppression of well-performing ones. As a consequence, they usually sacrifice the system's overall average performance for achieving fairness. Motivated by this and inspired by two well-known risk modeling methods in Finance, Mean-Variance and Mean-Semi-Variance, we propose and study two new fair FL algorithms, Variance Reduction (VRed) and Semi-Variance Reduction (SemiVRed). VRed encourages equality between clients' loss functions by penalizing their variance. In contrast, SemiVRed penalizes the discrepancy of only the worst-off clients' loss functions from the average loss. Through extensive experiments on multiple vision and language datasets, we show that, SemiVRed achieves SoTA performance in scenarios with heterogeneous data distributions and improves both fairness and system overall average performance.