Goto

Collaborating Authors

 Temporal Reasoning


Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding

arXiv.org Artificial Intelligence

Understanding long video content is a complex endeavor that often relies on densely sampled frame captions or end-to-end feature selectors, yet these techniques commonly overlook the logical relationships between textual queries and visual elements. In practice, computational constraints necessitate coarse frame subsampling, a challenge analogous to ``finding a needle in a haystack.'' To address this issue, we introduce a semantics-driven search framework that reformulates keyframe selection under the paradigm of Visual Semantic-Logical Search. Specifically, we systematically define four fundamental logical dependencies: 1) spatial co-occurrence, 2) temporal proximity, 3) attribute dependency, and 4) causal order. These relations dynamically update frame sampling distributions through an iterative refinement process, enabling context-aware identification of semantically critical frames tailored to specific query requirements. Our method establishes new SOTA performance on the manually annotated benchmark in key-frame selection metrics. Furthermore, when applied to downstream video question-answering tasks, the proposed approach demonstrates the best performance gains over existing methods on LongVideoBench and Video-MME, validating its effectiveness in bridging the logical gap between textual queries and visual-temporal reasoning. The code will be publicly available.


Temporal Analysis of NetFlow Datasets for Network Intrusion Detection Systems

arXiv.org Artificial Intelligence

This paper investigates the temporal analysis of NetFlow datasets for machine learning (ML)-based network intrusion detection systems (NIDS). Although many previous studies have highlighted the critical role of temporal features, such as inter-packet arrival time and flow length/duration, in NIDS, the currently available NetFlow datasets for NIDS lack these temporal features. This study addresses this gap by creating and making publicly available a set of NetFlow datasets that incorporate these temporal features [1]. With these temporal features, we provide a comprehensive temporal analysis of NetFlow datasets by examining the distribution of various features over time and presenting time-series representations of NetFlow features. This temporal analysis has not been previously provided in the existing literature. We also borrowed an idea from signal processing, time frequency analysis, and tested it to see how different the time frequency signal presentations (TFSPs) are for various attacks. The results indicate that many attacks have unique patterns, which could help ML models to identify them more easily.


Modeling, Simulation, and Application of Spatio-Temporal Characteristics Detection in Incipient Slip

arXiv.org Artificial Intelligence

--Incipient slip detection provides critical feedback for robotic grasping and manipulation tasks. However, maintaining its adaptability under diverse object properties and complex working conditions remains challenging. This article highlights the importance of completely representing spatiotemporal features of slip, and proposes a novel approach for incipient slip modeling and detection. Based on the analysis of localized displacement phenomenon, we establish the relationship between the characteristic strain rate extreme events and the local slip state. This approach enables the detection of both the spatial distribution and temporal dynamics of stick -slip regions. Also, the proposed method can be applied to strain distribution sensing devices, such as vis ion-based tactile sensors. Simulations and prototype experiments validated the effectiveness of this approach under varying contact conditions, including different contact geometries, friction coefficients, and combined loads. Experiments demonstrated that this method not only accurately and reliably delineates incipient slip, but also facilitates friction parameter estimation and adaptive grasping control. INTRODUCTION ACTILE perception plays a crucial role in stable grasping and dexterous manipulation in humans [1]. Neuroscientific studies show that humans can identify the frictional parameters of objects they touch with over 90% accuracy [2], and quickly adjust the grasp force within about 200 milliseconds to prevent slipping [3]. This ability enables humans to adapt to changes in friction levels based on tactile feedback and apply proper force to ensure s tability while maintaining gentle grasping [4]. The perception of incipient slip is an effective means for friction parameter recognition and grasp force control [5],[6]. Incipient slip is an intermediate state between complete sticking and full slipping of the contact surface, as shown in Figure 1. When a tangential load is applied to the contact surface, slip first occurs at the contact edge. It gradually spreads inward, eventually covering the entire stick region [7]. This work was supported by the National Natural Science Foundation of China under Grant 52375017. We refer to these two characteristics of incipient slip as spatial and temporal characteristics: spatial characteristics refer to the distribution of the stick -slip reg ion at a given moment, while temporal characteristics describe the time evolution of local slip. These characteristics are widely present in human tactile perception. According to existing research, Human sensory information is encoded by neural populations to capture spatial distribution, rather than being transmitted by individual neurons. Besides, skin deformation can be influenced by the loading history [9].


Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties

arXiv.org Artificial Intelligence

Temporal reasoning is fundamental to human cognition and is crucial for various real-world applications. While recent advances in Large Language Models have demonstrated promising capabilities in temporal reasoning, existing benchmarks primarily rely on rule-based construction, lack contextual depth, and involve a limited range of temporal entities. To address these limitations, we introduce Chinese Time Reasoning (CTM), a benchmark designed to evaluate LLMs on temporal reasoning within the extensive scope of Chinese dynastic chronology. CTM emphasizes cross-entity relationships, pairwise temporal alignment, and contextualized and culturally-grounded reasoning, providing a comprehensive evaluation. Extensive experimental results reveal the challenges posed by CTM and highlight potential avenues for improvement.


Fast Multivariate Spatio-temporal Analysis via Low Rank Tensor Learning

Neural Information Processing Systems

Accurate and efficient analysis of multivariate spatio-temporal data is critical in climatology, geology, and sociology applications. Existing models usually assume simple inter-dependence among variables, space, and time, and are computationally expensive. We propose a unified low rank tensor learning framework for multivariate spatio-temporal analysis, which can conveniently incorporate different properties in spatio-temporal data, such as spatial clustering and shared structure among variables. We demonstrate how the general framework can be applied to cokriging and forecasting tasks, and develop an efficient greedy algorithm to solve the resulting optimization problem with convergence guarantee. We conduct experiments on both synthetic datasets and real application datasets to demonstrate that our method is not only significantly faster than existing methods but also achieves lower estimation error.


TReMu: Towards Neuro-Symbolic Temporal Reasoning for LLM-Agents with Memory in Multi-Session Dialogues

arXiv.org Artificial Intelligence

Temporal reasoning in multi-session dialogues presents a significant challenge which has been under-studied in previous temporal reasoning benchmarks. To bridge this gap, we propose a new evaluation task for temporal reasoning in multi-session dialogues and introduce an approach to construct a new benchmark by augmenting dialogues from LoCoMo and creating multi-choice QAs. Furthermore, we present TReMu, a new framework aimed at enhancing the temporal reasoning capabilities of LLM-agents in this context. Specifically, the framework employs \textit{time-aware memorization} through timeline summarization, generating retrievable memory by summarizing events in each dialogue session with their inferred dates. Additionally, we integrate \textit{neuro-symbolic temporal reasoning}, where LLMs generate Python code to perform temporal calculations and select answers. Experimental evaluations on popular LLMs demonstrate that our benchmark is challenging, and the proposed framework significantly improves temporal reasoning performance compared to baseline methods, raising from 29.83 on GPT-4o via standard prompting to 77.67 via our approach and highlighting its effectiveness in addressing temporal reasoning in multi-session dialogues.


Learning to Fuse Temporal Proximity Networks: A Case Study in Chimpanzee Social Interactions

arXiv.org Machine Learning

How can we identify groups of primate individuals which could be conjectured to drive social structure? To address this question, one of us has collected a time series of data for social interactions between chimpanzees. Here we use a network representation, leading to the task of combining these data into a time series of a single weighted network per time stamp, where different proximities should be given different weights reflecting their relative importance. We optimize these proximity-type weights in a principled way, using an innovative loss function which rewards structural consistency across time. The approach is empirically validated by carefully designed synthetic data. Using statistical tests, we provide a way of identifying groups of individuals that stay related for a significant length of time. Applying the approach to the chimpanzee data set, we detect cliques in the animal social network time series, which can be validated by real-world intuition from prior research and qualitative observations by chimpanzee experts.


TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph

Neural Information Processing Systems

Multi-hop logical reasoning over knowledge graph plays a fundamental role in many artificial intelligence tasks. Recent complex query embedding methods for reasoning focus on static KGs, while temporal knowledge graphs have not been fully explored. Reasoning over TKGs has two challenges: 1. The query should answer entities or timestamps; 2. The operators should consider both set logic on entity set and temporal logic on timestamp set.To bridge this gap, we introduce the multi-hop logical reasoning problem on TKGs and then propose the first temporal complex query embedding named Temporal Feature-Logic Embedding framework (TFLEX) to answer the temporal complex queries. Specifically, we utilize fuzzy logic to compute the logic part of the Temporal Feature-Logic embedding, thus naturally modeling all first-order logic operations on the entity set.


TimelineKGQA: A Comprehensive Question-Answer Pair Generator for Temporal Knowledge Graphs

arXiv.org Artificial Intelligence

Question answering over temporal knowledge graphs (TKGs) is crucial for understanding evolving facts and relationships, yet its development is hindered by limited datasets and difficulties in generating custom QA pairs. We propose a novel categorization framework based on timeline-context relationships, along with \textbf{TimelineKGQA}, a universal temporal QA generator applicable to any TKGs. The code is available at: \url{https://github.com/PascalSun/TimelineKGQA} as an open source Python package.


Temporal reasoning for timeline summarisation in social media

arXiv.org Artificial Intelligence

This paper explores whether enhancing temporal reasoning capabilities in Large Language Models (LLMs) can improve the quality of timeline summarization, the task of summarising long texts containing sequences of events, particularly social media threads . We introduce \textit{NarrativeReason}, a novel dataset focused on temporal relationships among sequential events within narratives, distinguishing it from existing temporal reasoning datasets that primarily address pair-wise event relationships. Our approach then combines temporal reasoning with timeline summarization through a knowledge distillation framework, where we first fine-tune a teacher model on temporal reasoning tasks and then distill this knowledge into a student model while simultaneously training it for the task of timeline summarization. Experimental results demonstrate that our model achieves superior performance on mental health-related timeline summarization tasks, which involve long social media threads with repetitions of events and a mix of emotions, highlighting the importance of leveraging temporal reasoning to improve timeline summarisation.