Zhu, Jiawen
Adapting Large Language Models for Parameter-Efficient Log Anomaly Detection
Lim, Ying Fu, Zhu, Jiawen, Pang, Guansong
Log Anomaly Detection (LAD) seeks to identify atypical patterns in log data that are crucial to assessing the security and condition of systems. Although Large Language Models (LLMs) have shown tremendous success in various fields, their use for detecting log anomalies remains largely unexplored. This work aims to fill this gap. Given the prohibitive cost of fully fine-tuning LLMs, we explore parameter-efficient fine-tuning techniques (PEFTs) for adapting LLMs to LAD. To examine the potential of LLM-driven LAD in depth, we present a comprehensive investigation of leveraging two of the most popular PEFTs -- Low-Rank Adaptation (LoRA) and Representation Fine-tuning (ReFT) -- to tap into three prominent LLMs of varying size, namely RoBERTa, GPT-2, and Llama-3, for parameter-efficient LAD. Comprehensive experiments on four public log datasets reveal important insights into effective LLM-driven LAD from several key perspectives, including the efficacy of these PEFT-based LLM-driven LAD methods, their stability, sample efficiency, robustness w.r.t. unstable logs, and cross-dataset generalization. Code is available at https://github.com/mala-lab/LogADReft.
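As a rough illustration of the parameter-efficient adaptation setting studied here, the sketch below applies LoRA to a RoBERTa log-sequence classifier via the HuggingFace peft library; the rank, target modules, and example log text are illustrative assumptions, not the paper's reported configuration.

```python
# Hedged sketch: LoRA adaptation of RoBERTa for log anomaly classification.
# Hyperparameters and the example log line are illustrative, not the paper's settings.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_CLS,          # binary: normal vs. anomalous log sequence
    r=8, lora_alpha=16, lora_dropout=0.1,
    target_modules=["query", "value"],   # RoBERTa self-attention projections
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()       # only a small fraction of weights is trainable

# Score a (parsed) log sequence by classifying its concatenated log messages.
inputs = tokenizer("Receiving block blk_123 src /10.0.0.1 dest /10.0.0.2",
                   return_tensors="pt", truncation=True)
anomaly_logits = model(**inputs).logits
```

ReFT follows the same frozen-backbone pattern but intervenes on hidden representations instead of adding low-rank weight updates.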
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
Xiong, Haomiao, Yang, Zongxin, Yu, Jiazuo, Zhuge, Yunzhi, Zhang, Lu, Zhu, Jiawen, Lu, Huchuan
Recent advances in Large Language Models (LLMs) have enabled the development of Video-LLMs, advancing multimodal learning by bridging video data with language tasks. However, current video understanding models struggle with processing long video sequences, supporting multi-turn dialogues, and adapting to real-world dynamic scenarios. To address these issues, we propose StreamChat, a training-free framework for streaming video reasoning and conversational interaction. StreamChat leverages a novel hierarchical memory system to efficiently process and compress video features over extended sequences, enabling real-time, multi-turn dialogue. Our framework incorporates a parallel system scheduling strategy that enhances processing speed and reduces latency, ensuring robust performance in real-world applications. Furthermore, we introduce StreamBench, a versatile benchmark that evaluates streaming video understanding across diverse media types and interactive scenarios, including multi-turn interactions and complex reasoning tasks. Extensive evaluations on StreamBench and other public benchmarks demonstrate that StreamChat significantly outperforms existing state-of-the-art models in accuracy and response time, confirming its effectiveness for streaming video understanding. Code is available at https://github.com/hmxiong/StreamChat.
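As a very rough sketch of the general idea behind a streaming hierarchical memory (not StreamChat's actual memory design or compression scheme), one could keep recent frame features at full resolution and pool older ones into compact summary tokens:

```python
# Minimal two-level streaming memory for frame features (illustrative only).
import torch

class HierarchicalMemory:
    def __init__(self, short_capacity=64, pool_size=8):
        self.short_capacity = short_capacity  # recent frames kept at full resolution
        self.pool_size = pool_size            # older frames compressed in groups
        self.short_term = []                  # list of [D] feature tensors
        self.long_term = []                   # pooled summaries of older frames

    def add(self, frame_feature: torch.Tensor):
        self.short_term.append(frame_feature)
        if len(self.short_term) > self.short_capacity:
            # Compress the oldest pool_size frames into one summary token.
            old = torch.stack(self.short_term[:self.pool_size])
            self.long_term.append(old.mean(dim=0))
            self.short_term = self.short_term[self.pool_size:]

    def retrieve(self) -> torch.Tensor:
        # Context handed to the Video-LLM: compressed history + recent frames
        # (assumes at least one frame has been added).
        return torch.stack(self.long_term + self.short_term)
```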
SUTrack: Towards Simple and Unified Single Object Tracking
Chen, Xin, Kang, Ben, Geng, Wanting, Zhu, Jiawen, Liu, Yi, Wang, Dong, Lu, Huchuan
In this paper, we propose a simple yet unified single object tracking (SOT) framework, dubbed SUTrack. It consolidates five SOT tasks (RGB-based, RGB-Depth, RGB-Thermal, RGB-Event, and RGB-Language tracking) into a unified model trained in a single session. Due to the distinct nature of the data, current methods typically design individual architectures and train separate models for each task. This fragmentation results in redundant training processes, repetitive technological innovations, and limited cross-modal knowledge sharing. In contrast, SUTrack demonstrates that a single model with a unified input representation can effectively handle various common SOT tasks, eliminating the need for task-specific designs and separate training sessions. Additionally, we introduce a task-recognition auxiliary training strategy and a soft token type embedding to further enhance SUTrack's performance with minimal overhead. Experiments show that SUTrack outperforms previous task-specific counterparts across 11 datasets spanning five SOT tasks. Moreover, we provide a range of models catering to edge devices as well as high-performance GPUs, striking a good trade-off between speed and accuracy. We hope SUTrack can serve as a strong foundation for further compelling research into unified tracking models. Code and models are available at github.com/chenxin-dlut/SUTrack.
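To give a flavor of what a unified input representation can look like (a generic pattern, not SUTrack's actual architecture or its soft token type embedding), the sketch below embeds an RGB frame and one auxiliary modality into a single token sequence with token-type and task embeddings:

```python
# Illustrative unified multi-modal input for tracking (generic pattern, not SUTrack).
import torch
import torch.nn as nn

class UnifiedTrackerInput(nn.Module):
    def __init__(self, dim=256, patch=16, num_tasks=5):
        super().__init__()
        self.rgb_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        # Auxiliary modality (depth/thermal/event) rendered as a 3-channel map.
        self.aux_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.token_type = nn.Embedding(2, dim)          # RGB vs. auxiliary tokens
        self.task_embed = nn.Embedding(num_tasks, dim)  # which of the five SOT tasks

    def forward(self, rgb, aux, task_id):
        # rgb, aux: [B, 3, H, W]; task_id: [B] LongTensor
        rgb_tok = self.rgb_embed(rgb).flatten(2).transpose(1, 2)  # [B, N, dim]
        aux_tok = self.aux_embed(aux).flatten(2).transpose(1, 2)
        rgb_tok = rgb_tok + self.token_type.weight[0]
        aux_tok = aux_tok + self.token_type.weight[1]
        task_tok = self.task_embed(task_id).unsqueeze(1)  # used for task recognition
        # One sequence for a shared transformer backbone.
        return torch.cat([task_tok, rgb_tok, aux_tok], dim=1)
```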
EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches
An, Pengcheng, Zhu, Jiawen, Zhang, Zibo, Yin, Yifei, Ma, Qingyuan, Yan, Che, Du, Linghao, Zhao, Jian
Voice messages, by nature, prevent users from gauging the emotional tone without fully diving into the audio content. This hinders the shared emotional experience at the pre-retrieval stage. Research has scarcely explored "Emotional Teasers": pre-retrieval cues offering a glimpse into an awaiting message's emotional tone without disclosing its content. We introduce EmoWear, a smartwatch voice messaging system enabling users to apply 30 animation teasers on message bubbles to reflect emotions. EmoWear eases senders' choice by prioritizing emotions based on semantic and acoustic processing. EmoWear was evaluated in comparison with a mirroring system that used color-coded message bubbles as emotional cues (N=24). Results showed that EmoWear significantly enhanced the emotional communication experience in both receiving and sending messages. The animated teasers were considered intuitive and valued for their diverse expressions. Desirable interaction qualities and practical implications are distilled for future design. We thereby contribute both a novel system and empirical knowledge concerning emotional teasers for voice messaging.
DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs
Zhu, Jiawen, Tang, Huayi, Cheng, Zhi-Qi, He, Jun-Yan, Luo, Bin, Qiu, Shihao, Li, Shengming, Lu, Huchuan
Existing nighttime unmanned aerial vehicle (UAV) trackers follow an "Enhance-then-Track" architecture: first a light enhancer brightens the nighttime video, then a daytime tracker locates the object. This separation of enhancement and tracking fails to build an end-to-end trainable vision system. To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts. Without a separate enhancer, DCPT directly encodes anti-dark capabilities into prompts using a darkness clue prompter (DCP). Specifically, DCP iteratively learns emphasizing and undermining projections for darkness clues. It then injects these learned visual prompts into a daytime tracker with fixed parameters across transformer layers. Moreover, a gated feature aggregation mechanism enables adaptive fusion between prompts, and between prompts and the base model. Extensive experiments show state-of-the-art performance for DCPT on multiple dark scenario benchmarks. The unified end-to-end learning of enhancement and tracking in DCPT enables a more trainable system, and the darkness clue prompting efficiently injects anti-dark knowledge without extra modules. Code is available at https://github.com/bearyi26/DCPT.
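The general pattern of injecting learned visual prompts into a frozen transformer tracker can be sketched as follows; the prompter and gate here are generic placeholders, not DCPT's actual darkness clue prompter or gated feature aggregation:

```python
# Illustrative prompt injection into a frozen tracker layer (generic pattern, not DCPT).
import torch
import torch.nn as nn

class DarknessCluePrompter(nn.Module):
    """Generates gated prompt features from the layer's input tokens."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.down = nn.Linear(dim, hidden)
        self.up = nn.Linear(hidden, dim)
        self.gate = nn.Parameter(torch.zeros(1))  # learnable gate, starts closed

    def forward(self, tokens):
        prompt = self.up(torch.relu(self.down(tokens)))
        return torch.sigmoid(self.gate) * prompt  # gated fusion with frozen features

class PromptedTrackerLayer(nn.Module):
    def __init__(self, frozen_layer, dim):
        super().__init__()
        self.layer = frozen_layer
        for p in self.layer.parameters():
            p.requires_grad = False               # keep the daytime tracker fixed
        self.prompter = DarknessCluePrompter(dim)

    def forward(self, tokens):
        # Only the small prompter is trained; the backbone layer stays frozen.
        return self.layer(tokens + self.prompter(tokens))
```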
A Causal Inference Framework for Leveraging External Controls in Hybrid Trials
Valancius, Michael, Pang, Herb, Zhu, Jiawen, Cole, Stephen R, Funk, Michele Jonsson, Kosorok, Michael R
We consider the challenges associated with causal inference in settings where data from a randomized trial are augmented with control data from an external source to improve efficiency in estimating the average treatment effect (ATE). Through the development of a formal causal inference framework, we outline sufficient causal assumptions about the exchangeability between the internal and external controls to identify the ATE and establish the connection to a novel graphical criterion. We propose estimators, review efficiency bounds, develop an approach for efficient doubly-robust estimation even when unknown nuisance models are estimated with flexible machine learning methods, and demonstrate finite-sample performance through a simulation study. To illustrate the ideas and methods, we apply the framework to a trial investigating the effect of risdiplam on motor function in patients with spinal muscular atrophy, for which there exists an external set of control patients from a previous trial.
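For context, a standard doubly-robust (augmented inverse-probability-weighted) estimator of the ATE has the familiar form below; the paper's estimators additionally exploit the external controls under the stated exchangeability assumptions, so this is only the baseline form, not the proposed estimator:

$$\hat{\tau}_{\mathrm{DR}} = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{\mu}_1(X_i)-\hat{\mu}_0(X_i) + \frac{A_i\{Y_i-\hat{\mu}_1(X_i)\}}{\hat{e}(X_i)} - \frac{(1-A_i)\{Y_i-\hat{\mu}_0(X_i)\}}{1-\hat{e}(X_i)}\right],$$

where $\hat{e}(X)$ is the estimated propensity score and $\hat{\mu}_a(X)$ the outcome regression under treatment $a$. This estimator remains consistent if either nuisance model is consistently estimated, which is what permits the flexible machine learning estimation of the nuisances mentioned above.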
AutoChart: A Dataset for Chart-to-Text Generation Task
Zhu, Jiawen, Ran, Jinye, Lee, Roy Ka-wei, Choo, Kenny, Li, Zhi
The analytical description of charts is an exciting and important research area with many applications in academia and industry. Yet, this challenging task has received limited attention from the computational linguistics research community. This paper proposes AutoChart, a large dataset for the analytical description of charts, which aims to encourage more research into this important area. Specifically, we offer a novel framework that generates the charts and their analytical descriptions automatically. We conduct extensive human and machine evaluations on the generated charts and descriptions and demonstrate that the generated texts are informative, coherent, and relevant to the corresponding charts.
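As a toy illustration of automatic chart-plus-description generation of the kind AutoChart provides at scale (the chart type and description template here are hypothetical, not the dataset's actual generation pipeline):

```python
# Toy chart + templated description generator (illustrative only, not AutoChart's framework).
import random
import matplotlib.pyplot as plt

def generate_bar_chart_with_description(categories, out_path="chart.png"):
    values = [round(random.uniform(10, 100), 1) for _ in categories]
    plt.figure()
    plt.bar(categories, values)
    plt.ylabel("Value")
    plt.savefig(out_path)
    plt.close()

    top = max(zip(categories, values), key=lambda cv: cv[1])
    low = min(zip(categories, values), key=lambda cv: cv[1])
    description = (
        f"The bar chart compares {len(categories)} categories. "
        f"{top[0]} records the highest value ({top[1]}), while "
        f"{low[0]} records the lowest ({low[1]})."
    )
    return values, description

values, text = generate_bar_chart_with_description(["A", "B", "C", "D"])
print(text)
```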