Goto

Collaborating Authors

 Instructional Material


Multi-modal Time Series Analysis: A Tutorial and Survey

arXiv.org Artificial Intelligence

Multi-modal time series analysis has recently emerged as a prominent research area in data mining, driven by the increasing availability of diverse data modalities, such as text, images, and structured tabular data from real-world sources. However, effective analysis of multi-modal time series is hindered by data heterogeneity, modality gap, misalignment, and inherent noise. Recent advancements in multi-modal time series methods have exploited the multi-modal context via cross-modal interactions based on deep learning methods, significantly enhancing various downstream tasks. In this tutorial and survey, we present a systematic and up-to-date overview of multi-modal time series datasets and methods. We first state the existing challenges of multi-modal time series analysis and our motivations, with a brief introduction of preliminaries. Then, we summarize the general pipeline and categorize existing methods through a unified cross-modal interaction framework encompassing fusion, alignment, and transference at different levels (\textit{i.e.}, input, intermediate, output), where key concepts and ideas are highlighted. We also discuss the real-world applications of multi-modal analysis for both standard and spatial time series, tailored to general and specific domains. Finally, we discuss future research directions to help practitioners explore and exploit multi-modal time series. The up-to-date resources are provided in the GitHub repository: https://github.com/UConn-DSIS/Multi-modal-Time-Series-Analysis


Rendering Transparency to Ranking in Educational Assessment via Bayesian Comparative Judgement

arXiv.org Artificial Intelligence

Ensuring transparency in educational assessment is increasingly critical, particularly post-pandemic, as demand grows for fairer and more reliable evaluation methods. Comparative Judgement (CJ) offers a promising alternative to traditional assessments, yet concerns remain about its perceived opacity. This paper examines how Bayesian Comparative Judgement (BCJ) enhances transparency by integrating prior information into the judgement process, providing a structured, data-driven approach that improves interpretability and accountability. BCJ assigns probabilities to judgement outcomes, offering quantifiable measures of uncertainty and deeper insights into decision confidence. By systematically tracking how prior data and successive judgements inform final rankings, BCJ clarifies the assessment process and helps identify assessor disagreements. Multi-criteria BCJ extends this by evaluating multiple learning outcomes (LOs) independently, preserving the richness of CJ while producing transparent, granular rankings aligned with specific assessment goals. It also enables a holistic ranking derived from individual LOs, ensuring comprehensive evaluations without compromising detailed feedback. Using a real higher education dataset with professional markers in the UK, we demonstrate BCJ's quantitative rigour and ability to clarify ranking rationales. Through qualitative analysis and discussions with experienced CJ practitioners, we explore its effectiveness in contexts where transparency is crucial, such as high-stakes national assessments. We highlight the benefits and limitations of BCJ, offering insights into its real-world application across various educational settings.


AI Agents: Evolution, Architecture, and Real-World Applications

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) has evolved dramatically over the past decade, transitioning from specialized systems designed for narrow tasks to increasingly sophisticated architectures capable of autonomous operation across diverse domains. Among these advancements, AI agents represent a particularly significant development, embodying a paradigm shift in how intelligent systems interact with their environments, make decisions, and achieve complex goals. Unlike traditional AI systems that execute predefined algorithms within constraints, AI agents possess the capacity to autonomously perceive, reason, and act, often adapting their behavior based on environmental feedback and accumulated experience. The concept of an AI agent refers to a system or program that is capable of autonomously performing tasks on behalf of a user or another system by designing its workflow and utilizing available tools. These agents can encompass a wide range of functionalities beyond natural language processing, including decision making, problem solving, interacting with external environments, and executing actions. As Kapoor et al. (2024) note in their analysis of agent benchmarks, the development of AI agents represents an exciting new research direction with significant implications for real-world applications across numerous industries. The evolution of AI agents has been accelerated by recent breakthroughs in large language models (LLMs), which have provided a foundation for more sophisticated reasoning capabilities. Modern AI agents leverage these advanced language models as core components, augmenting them with specialized modules for memory, planning, tool use, and environmental interaction. This integration enables agents to perform complex tasks that would be challenging or impossible for traditional AI systems, from reconciling financial statements to providing step-by-step instructions for field technicians based on contextual understanding of product information.


Augmented Adversarial Trigger Learning

arXiv.org Artificial Intelligence

Gradient optimization-based adversarial attack methods automate the learning of adversarial triggers to generate jailbreak prompts or leak system prompts. In this work, we take a closer look at the optimization objective of adversarial trigger learning and propose ATLA: Adversarial Trigger Learning with Augmented objectives. ATLA improves the negative log-likelihood loss used by previous studies into a weighted loss formulation that encourages the learned adversarial triggers to optimize more towards response format tokens. This enables ATLA to learn an adversarial trigger from just one query-response pair and the learned trigger generalizes well to other similar queries. We further design a variation to augment trigger optimization with an auxiliary loss that suppresses evasive responses. We showcase how to use ATLA to learn adversarial suffixes jailbreaking LLMs and to extract hidden system prompts. Empirically we demonstrate that ATLA consistently outperforms current state-of-the-art techniques, achieving nearly 100% success in attacking while requiring 80% fewer queries. ATLA learned jailbreak suffixes demonstrate high generalization to unseen queries and transfer well to new LLMs.


Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints

arXiv.org Artificial Intelligence

In the realm of high-frequency data streams, achieving real-time learning within varying memory constraints is paramount. This paper presents Ferret, a comprehensive framework designed to enhance online accuracy of Online Continual Learning (OCL) algorithms while dynamically adapting to varying memory budgets. Ferret employs a fine-grained pipeline parallelism strategy combined with an iterative gradient compensation algorithm, ensuring seamless handling of high-frequency data with minimal latency, and effectively counteracting the challenge of stale gradients in parallel training. To adapt to varying memory budgets, its automated model partitioning and pipeline planning optimizes performance regardless of memory limitations. Extensive experiments across 20 benchmarks and 5 integrated OCL algorithms show Ferret's remarkable efficiency, achieving up to 3.7$\times$ lower memory overhead to reach the same online accuracy compared to competing methods. Furthermore, Ferret consistently outperforms these methods across diverse memory budgets, underscoring its superior adaptability. These findings position Ferret as a premier solution for efficient and adaptive OCL framework in real-time environments.


General Scales Unlock AI Evaluation with Explanatory and Predictive Power

arXiv.org Artificial Intelligence

Ensuring safe and effective use of AI requires understanding and anticipating its performance on novel tasks, from advanced scientific challenges to transformed workplace activities. So far, benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems, given the low transferability across diverse tasks. In this paper, we introduce general scales for AI evaluation that can explain what common AI benchmarks really measure, extract ability profiles of AI systems, and predict their performance for new task instances, in- and out-of-distribution. Our fully-automated methodology builds on 18 newly-crafted rubrics that place instance demands on general scales that do not saturate. Illustrated for 15 large language models and 63 tasks, high explanatory power is unleashed from inspecting the demand and ability profiles, bringing insights on the sensitivity and specificity exhibited by different benchmarks, and how knowledge, metacognition and reasoning are affected by model size, chain-of-thought and distillation. Surprisingly, high predictive power at the instance level becomes possible using these demand levels, providing superior estimates over black-box baseline predictors based on embeddings or finetuning, especially in out-of-distribution settings (new tasks and new benchmarks). The scales, rubrics, battery, techniques and results presented here represent a major step for AI evaluation, underpinning the reliable deployment of AI in the years ahead. (Collaborative platform: https://kinds-of-intelligence-cfi.github.io/ADELE.)


The Status Quo and Future of AI-TPACK for Mathematics Teacher Education Students: A Case Study in Chinese Universities

arXiv.org Artificial Intelligence

As artificial intelligence (AI) technology becomes increasingly prevalent in the filed of education, there is a growing need for mathematics teacher education students (MTES) to demonstrate proficiency in the integration of AI with the technological pedagogical content knowledge (AI-TPACK). To study the issue, we firstly devised an systematic AI-TPACK scale and test on 412 MTES from seven universities. Through descriptive statistical analyses, we found that the current status of AI-TPACK for MTES in China is at a basic, preliminary stage. Secondly, we compared MTES between three different grades on the six variables and found that there is no discernible difference, which suggested that graduate studies were observed to have no promotion in the development of AI-TPACK competencies. Thirdly, we proposed a new AI-TPACK structural equation model (AI-TPACK-SEM) to explore the impact of self-efficacy and teaching beliefs on AI-TPACK. Our findings indicate a positive correlation between self-efficacy and AI-TPACK. We also come to a conclusion that may be contrary to common perception, excessive teaching beliefs may impede the advancement of AI-TPACK. Overall, this paper revealed the current status of AI-TPACK for MTES in China for the first time, designed a dedicated SEM to study the effect of specific factors on AI-TPACK, and proposed some suggestions on future developments.


LLM Agents for Education: Advances and Applications

arXiv.org Artificial Intelligence

Large Language Model (LLM) agents have demonstrated remarkable capabilities in automating tasks and driving innovation across diverse educational applications. In this survey, we provide a systematic review of state-of-the-art research on LLM agents in education, categorizing them into two broad classes: (1) \emph{Pedagogical Agents}, which focus on automating complex pedagogical tasks to support both teachers and students; and (2) \emph{Domain-Specific Educational Agents}, which are tailored for specialized fields such as science education, language learning, and professional development. We comprehensively examine the technological advancements underlying these LLM agents, including key datasets, benchmarks, and algorithmic frameworks that drive their effectiveness. Furthermore, we discuss critical challenges such as privacy, bias and fairness concerns, hallucination mitigation, and integration with existing educational ecosystems. This survey aims to provide a comprehensive technological overview of LLM agents for education, fostering further research and collaboration to enhance their impact for the greater good of learners and educators alike.


Data Driven Decision Making with Time Series and Spatio-temporal Data

arXiv.org Artificial Intelligence

Time series data captures properties that change over time. Such data occurs widely, ranging from the scientific and medical domains to the industrial and environmental domains. When the properties in time series exhibit spatial variations, we often call the data spatio-temporal. As part of the continued digitalization of processes throughout society, increasingly large volumes of time series and spatio-temporal data are available. In this tutorial, we focus on data-driven decision making with such data, e.g., enabling greener and more efficient transportation based on traffic time series forecasting. The tutorial adopts the holistic paradigm of "data-governance-analytics-decision." We first introduce the data foundation of time series and spatio-temporal data, which is often heterogeneous. Next, we discuss data governance methods that aim to improve data quality. We then cover data analytics, focusing on five desired characteristics: automation, robustness, generality, explainability, and resource efficiency. We finally cover data-driven decision making strategies and briefly discuss promising research directions. We hope that the tutorial will serve as a primary resource for researchers and practitioners who are interested in value creation from time series and spatio-temporal data.


Conversational AI as a Coding Assistant: Understanding Programmers' Interactions with and Expectations from Large Language Models for Coding

arXiv.org Artificial Intelligence

Conversational AI interfaces powered by large language models (LLMs) are increasingly used as coding assistants. However, questions remain about how programmers interact with LLM-based conversational agents, the challenges they encounter, and the factors influencing adoption. This study investigates programmers' usage patterns, perceptions, and interaction strategies when engaging with LLM-driven coding assistants. Through a survey, participants reported both the benefits, such as efficiency and clarity of explanations, and the limitations, including inaccuracies, lack of contextual awareness, and concerns about over-reliance. Notably, some programmers actively avoid LLMs due to a preference for independent learning, distrust in AI-generated code, and ethical considerations. Based on our findings, we propose design guidelines for improving conversational coding assistants, emphasizing context retention, transparency, multimodal support, and adaptability to user preferences. These insights contribute to the broader understanding of how LLM-based conversational agents can be effectively integrated into software development workflows while addressing adoption barriers and enhancing usability.