
Collaborating Authors: Du, Hang


Exploring What, Why and How: A Multifaceted Benchmark for Causation Understanding of Video Anomaly

arXiv.org Artificial Intelligence

Recent advancements in video anomaly understanding (VAU) have opened the door to groundbreaking applications in various fields, such as traffic monitoring and industrial automation. However, current benchmarks in VAU predominantly emphasize the detection and localization of anomalies. Here, we delve deeper into the practical aspects of VAU by addressing the essential questions: "what anomaly occurred?", "why did it happen?", and "how severe is this abnormal event?". In pursuit of these answers, we introduce a comprehensive benchmark for Exploring the Causation of Video Anomalies (ECVA). Our benchmark is meticulously designed, with each video accompanied by detailed human annotations. Specifically, each instance of ECVA involves three sets of human annotations indicating the "what", "why" and "how" of an anomaly: 1) the anomaly type, start and end times, and an event description; 2) a natural language explanation for the cause of the anomaly; and 3) free text reflecting the effect of the abnormality. Building upon this foundation, we propose a novel prompt-based methodology that serves as a baseline for tackling the intricate challenges posed by ECVA. We utilize a "hard prompt" to guide the model to focus on the critical parts related to video anomaly segments, and a "soft prompt" to establish temporal and spatial relationships within these anomaly segments. Furthermore, we propose AnomEval, a specialized evaluation metric crafted to align closely with human judgment criteria for ECVA. This metric leverages the unique features of the ECVA dataset to provide a more comprehensive and reliable assessment of various video large language models. We demonstrate the efficacy of our approach through rigorous experimental analysis and delineate possible avenues for further investigation into the comprehension of video anomaly causation.
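Although the abstract only names the two prompt types, a minimal sketch can make the mechanism concrete. The snippet below is my illustration rather than the paper's implementation (the class name `VideoAnomalyPrompter`, the dimensions, and the token layout are all assumptions); it shows how an embedded "hard prompt" instruction and a learnable "soft prompt" could be prepended to video features before they reach a video large language model.

```python
# Minimal sketch (not the ECVA release): combining a textual "hard prompt"
# with learnable "soft prompt" tokens in front of video features.
import torch
import torch.nn as nn

class VideoAnomalyPrompter(nn.Module):  # hypothetical name
    def __init__(self, d_model: int = 768, n_soft_tokens: int = 16):
        super().__init__()
        # Learnable soft-prompt tokens meant to model temporal/spatial
        # relationships within the anomaly segment.
        self.soft_prompt = nn.Parameter(torch.randn(n_soft_tokens, d_model) * 0.02)

    def forward(self, video_tokens: torch.Tensor, hard_prompt_emb: torch.Tensor) -> torch.Tensor:
        # video_tokens:    (B, T, d_model) frame/segment features
        # hard_prompt_emb: (B, L, d_model) embedded instruction, e.g. "Focus on
        #   the frames where the anomaly occurs and explain its cause."
        b = video_tokens.size(0)
        soft = self.soft_prompt.unsqueeze(0).expand(b, -1, -1)
        # Input prefix for the LLM: [hard prompt | soft prompt | video tokens].
        return torch.cat([hard_prompt_emb, soft, video_tokens], dim=1)
```

In such a setup one would typically keep the video LLM frozen and train only the soft prompt (plus any projection layers) on the what/why/how targets.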


Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

arXiv.org Artificial Intelligence

Video anomaly understanding (VAU) aims to automatically comprehend unusual occurrences in videos, thereby enabling various applications such as traffic surveillance and industrial manufacturing. While existing VAU benchmarks primarily concentrate on anomaly detection and localization, we focus on practicality, prompting us to raise the following crucial questions: "what anomaly occurred?", "why did it happen?", and "how severe is this abnormal event?". In pursuit of these answers, we present a comprehensive benchmark for Causation Understanding of Video Anomaly (CUVA). Specifically, each instance of the proposed benchmark involves three sets of human annotations indicating the "what", "why" and "how" of an anomaly: 1) the anomaly type, start and end times, and an event description; 2) a natural language explanation for the cause of the anomaly; and 3) free text reflecting the effect of the abnormality. In addition, we introduce MMEval, a novel evaluation metric designed to better align with human preferences for CUVA, facilitating the measurement of how well existing LLMs comprehend the underlying cause and corresponding effect of video anomalies. Finally, we propose a novel prompt-based method that can serve as a baseline approach for the challenging CUVA. We conduct extensive experiments to show the superiority of our evaluation metric and the prompt-based approach. Our code and dataset are available at https://github.com/fesvhtr/CUVA.
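For concreteness, the three annotation sets can be pictured as a single record per video. The dataclass below is only an illustration of that structure; the field names and the example values are assumptions, not the released CUVA schema.

```python
# Illustrative sketch of one what/why/how annotation instance
# (field names and values are assumptions, not the released schema).
from dataclasses import dataclass

@dataclass
class AnomalyAnnotation:
    video_id: str
    anomaly_type: str   # "what": category of the abnormal event
    start_time: float   # "what": anomaly start, in seconds
    end_time: float     # "what": anomaly end, in seconds
    description: str    # "what": free-text event description
    cause: str          # "why": natural-language explanation of the cause
    effect: str         # "how": free text reflecting the effect/severity

example = AnomalyAnnotation(
    video_id="cuva_000123",
    anomaly_type="traffic accident",
    start_time=12.4,
    end_time=21.0,
    description="A car runs a red light and collides with a truck.",
    cause="The driver ignored the signal while speeding through the intersection.",
    effect="Both vehicles are damaged and traffic is blocked in all directions.",
)
```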


DocMSU: A Comprehensive Benchmark for Document-level Multimodal Sarcasm Understanding

arXiv.org Artificial Intelligence

Multimodal Sarcasm Understanding (MSU) has a wide range of applications in the news domain, such as public opinion analysis and forgery detection. However, existing MSU benchmarks and approaches usually focus on sentence-level MSU. In document-level news, sarcasm clues are sparse or small and are often concealed in long text. Moreover, compared to sentence-level comments like tweets, which mainly focus on a few trends or hot topics (e.g., sports events), content in the news is considerably more diverse. Models created for sentence-level MSU may therefore fail to capture sarcasm clues in document-level news. To fill this gap, we present a comprehensive benchmark for Document-level Multimodal Sarcasm Understanding (DocMSU). Our dataset contains 102,588 pieces of news with text-image pairs, covering 9 diverse topics such as health and business. The proposed large-scale and diverse DocMSU significantly facilitates research on document-level MSU in real-world scenarios. To take on the new challenges posed by DocMSU, we introduce a fine-grained sarcasm comprehension method that properly aligns pixel-level image features with word-level textual features in documents. Experiments demonstrate the effectiveness of our method, showing that it can serve as a baseline approach to the challenging DocMSU. Our code and dataset are available at https://github.com/Dulpy/DocMSU.
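The core technical step named above is aligning pixel-level image features with word-level textual features. A minimal cross-attention sketch of that idea is given below; the module name `PixelWordAligner`, the feature dimensions, and the residual-plus-norm layout are my assumptions rather than the DocMSU implementation.

```python
# Minimal sketch (assumed shapes and names, not the DocMSU release):
# word-level text features attend to pixel/patch-level image features.
import torch
import torch.nn as nn

class PixelWordAligner(nn.Module):  # hypothetical name
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, word_feats: torch.Tensor, pixel_feats: torch.Tensor) -> torch.Tensor:
        # word_feats:  (B, n_words, d)  token-level features from a text encoder
        # pixel_feats: (B, H*W, d)      flattened patch features from a vision backbone
        attended, _ = self.cross_attn(query=word_feats, key=pixel_feats, value=pixel_feats)
        # Residual + norm keeps the word semantics while injecting visual evidence,
        # so sparse sarcasm clues in long documents can pick up contradicting regions.
        return self.norm(word_feats + attended)
```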


A Hybrid Complex-valued Neural Network Framework with Applications to Electroencephalogram (EEG)

arXiv.org Artificial Intelligence

In this article, we present a new EEG signal classification framework that integrates complex-valued and real-valued Convolutional Neural Networks (CNNs) with the discrete Fourier transform (DFT). The proposed neural network architecture consists of one complex-valued convolutional layer, two real-valued convolutional layers, and three fully connected layers. Our method can efficiently utilize the phase information contained in the DFT. We validate our approach on two simulated EEG signals and a benchmark data set, comparing it with two widely used frameworks. Our method drastically reduces the number of parameters and improves accuracy over existing methods on the benchmark data set, and significantly improves performance in classifying the simulated EEG signals.
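The abstract pins down the layer counts (one complex-valued convolution, two real-valued convolutions, three fully connected layers) but not their sizes, so the sketch below fills those in with assumed values; the magnitude "bridge" from the complex branch to the real branch and all kernel/channel choices are my assumptions, not the paper's exact configuration.

```python
# Illustrative sketch of a DFT -> complex conv -> real convs -> FC pipeline
# (layer sizes and the magnitude bridge are assumptions).
import torch
import torch.nn as nn

class ComplexConv1d(nn.Module):
    """Complex convolution built from two real convolutions:
    (a+ib)*(x+iy) = (ax - by) + i(ay + bx)."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int):
        super().__init__()
        self.re = nn.Conv1d(in_ch, out_ch, kernel_size)
        self.im = nn.Conv1d(in_ch, out_ch, kernel_size)

    def forward(self, z: torch.Tensor) -> torch.Tensor:  # z: complex, (B, C, F)
        x, y = z.real, z.imag
        return torch.complex(self.re(x) - self.im(y), self.re(y) + self.im(x))

class HybridEEGNet(nn.Module):  # hypothetical name
    def __init__(self, n_channels: int = 22, n_classes: int = 4):
        super().__init__()
        self.cconv = ComplexConv1d(n_channels, 16, kernel_size=7)
        self.real_convs = nn.Sequential(
            nn.Conv1d(16, 32, kernel_size=5), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.fc = nn.Sequential(
            nn.Linear(32 * 8, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, n_channels, T) raw EEG; the DFT keeps magnitude *and* phase.
        spec = torch.fft.rfft(x, dim=-1)      # complex spectrum, (B, C, T//2 + 1)
        z = self.cconv(spec)                  # complex-valued convolution
        feat = self.real_convs(torch.abs(z))  # magnitude bridges to real-valued layers
        return self.fc(feat.flatten(1))
```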


Matching recovery threshold for correlated random graphs

arXiv.org Machine Learning

For two correlated graphs which are independently sub-sampled from a common Erdős–Rényi graph $\mathbf{G}(n, p)$, we wish to recover their \emph{latent} vertex matching from the observation of these two graphs \emph{without labels}. When $p = n^{-\alpha+o(1)}$ for $\alpha\in (0, 1]$, we establish a sharp information-theoretic threshold for whether it is possible to correctly match a positive fraction of vertices. Our result sharpens a constant factor in a recent work by Wu, Xu and Yu.
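For readers unfamiliar with the setting, the sub-sampling model can be written out explicitly. The LaTeX note below restates it in the form that is standard in this literature; the sub-sampling probability $s$, the relabeling permutation $\pi^*$, and the overlap criterion $\delta$ are conventional notation that the abstract itself does not spell out.

```latex
% Correlated Erdős–Rényi graphs via independent edge sub-sampling
% (standard formulation; the notation $s$, $\pi^*$, $\delta$ is assumed here,
% since the abstract does not spell it out).
\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
Let $\mathbf{G}_0 \sim \mathbf{G}(n, p)$ with $p = n^{-\alpha + o(1)}$ and $\alpha \in (0, 1]$.
Graphs $G_1$ and $G_2$ are obtained by independently retaining each edge of
$\mathbf{G}_0$ with probability $s$, and the vertices of $G_2$ are then relabeled
by a uniformly random latent permutation $\pi^*$. Given only the unlabeled pair
$(G_1, G_2)$, \emph{partial matching recovery} asks for an estimator $\hat{\pi}$
satisfying
\[
  \frac{1}{n}\,\bigl|\{\, v : \hat{\pi}(v) = \pi^*(v) \,\}\bigr| \;\ge\; \delta
\]
for some constant $\delta > 0$; the paper establishes the sharp
information-theoretic threshold at which this becomes possible.
\end{document}
```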