AITopics | event recognition

Collaborating Authors

event recognition

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Critical appraisal of artificial intelligence for rare-event recognition: principles and pharmacovigilance case studies

Noren, G. Niklas, Meldau, Eva-Lisa, Ellenius, Johan

arXiv.org Artificial IntelligenceOct-7-2025

Many high-stakes AI applications target low-prevalence events, where apparent accuracy can conceal limited real-world value. Relevant AI models range from expert-defined rules and traditional machine learning to generative LLMs constrained for classification. We outline key considerations for critical appraisal of AI in rare-event recognition, including problem framing and test set design, prevalence-aware statistical evaluation, robustness assessment, and integration into human workflows. In addition, we propose an approach to structured case-level examination (SCLE), to complement statistical performance evaluation, and a comprehensive checklist to guide procurement or development of AI models for rare-event recognition. We instantiate the framework in pharmacovigilance, drawing on three studies: rule-based retrieval of pregnancy-related reports; duplicate detection combining machine learning with probabilistic record linkage; and automated redaction of person names using an LLM. We highlight pitfalls specific to the rare-event setting including optimism from unrealistic class balance and lack of difficult positive controls in test sets - and show how cost-sensitive targets align model performance with operational value. While grounded in pharmacovigilance practice, the principles generalize to domains where positives are scarce and error costs may be asymmetric.

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2510.04341

Country:

Europe > Sweden > Uppsala County > Uppsala (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.93)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)
(4 more...)

Add feedback

A Framework for Real-Time Volcano-Seismic Event Recognition Based on Multi-Station Seismograms and Semantic Segmentation Models

Espinosa-Curilem, Camilo, Curilem, Millaray, Basualto, Daniel

arXiv.org Artificial IntelligenceNov-5-2024

In volcano monitoring, effective recognition of seismic events is essential for understanding volcanic activity and raising timely warning alerts. Traditional methods rely on manual analysis, which can be subjective and labor-intensive. Furthermore, current automatic approaches often tackle detection and classification separately, mostly rely on single station information and generally require tailored preprocessing and representations to perform predictions. These limitations often hinder their application to real-time monitoring and utilization across different volcano conditions. This study introduces a novel approach that utilizes Semantic Segmentation models to automate seismic event recognition by applying a straight forward transformation of multi-channel 1D signals into 2D representations, enabling their use as images. Our framework employs a data-driven, end-to-end design that integrates multi-station seismic data with minimal preprocessing, performing both detection and classification simultaneously for five seismic event classes. We evaluated four state-of-the-art segmentation models (UNet, UNet++, DeepLabV3+ and SwinUNet) on approximately 25.000 seismic events recorded at four different Chilean volcanoes: Nevados del Chill\'an Volcanic Complex, Laguna del Maule, Villarrica and Puyehue-Cord\'on Caulle. Among these models, the UNet architecture was identified as the most effective model, achieving mean F1 and Intersection over Union (IoU) scores of up to 0.91 and 0.88, respectively, and demonstrating superior noise robustness and model flexibility to unseen volcano datasets.

machine learning, real time system, volcano, (21 more...)

arXiv.org Artificial Intelligence

2410.20595

Country:

North America > United States > West Virginia (0.41)
South America > Chile (0.29)
Europe (0.14)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Architecture > Real Time Systems (0.90)

Add feedback

Hawk: An Efficient NALM System for Accurate Low-Power Appliance Recognition

Wang, Zijian, Zhang, Xingzhou, Wang, Yifan, Peng, Xiaohui, Xu, Zhiwei

arXiv.org Artificial IntelligenceOct-6-2024

Non-intrusive Appliance Load Monitoring (NALM) aims to recognize individual appliance usage from the main meter without indoor sensors. However, existing systems struggle to balance dataset construction efficiency and event/state recognition accuracy, especially for low-power appliance recognition. This paper introduces Hawk, an efficient and accurate NALM system that operates in two stages: dataset construction and event recognition. In the data construction stage, we efficiently collect a balanced and diverse dataset, HawkDATA, based on balanced Gray code and enable automatic data annotations via a sampling synchronization strategy called shared perceptible time. During the event recognition stage, our algorithm integrates steady-state differential pre-processing and voting-based post-processing for accurate event recognition from the aggregate current. Experimental results show that HawkDATA takes only 1/71.5 of the collection time to collect 6.34x more appliance state combinations than the baseline. In HawkDATA and a widely used dataset, Hawk achieves an average F1 score of 93.94% for state recognition and 97.07% for event recognition, which is a 47. 98% and 11. 57% increase over SOTA algorithms. Furthermore, selected appliance subsets and the model trained from HawkDATA are deployed in two real-world scenarios with many unknown background appliances. The average F1 scores of event recognition are 96.02% and 94.76%. Hawk's source code and HawkDATA are accessible at https://github.com/WZiJ/SenSys24-Hawk.

appliance, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3666025.3699359

2410.16293

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Power Industry (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Communications > Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

EmotionQueen: A Benchmark for Evaluating Empathy of Large Language Models

Chen, Yuyan, Wang, Hao, Yan, Songzhou, Liu, Sijia, Li, Yueze, Zhao, Yi, Xiao, Yanghua

arXiv.org Artificial IntelligenceSep-20-2024

Emotional intelligence in large language models (LLMs) is of great importance in Natural Language Processing. However, the previous research mainly focus on basic sentiment analysis tasks, such as emotion recognition, which is not enough to evaluate LLMs' overall emotional intelligence. Therefore, this paper presents a novel framework named EmotionQueen for evaluating the emotional intelligence of LLMs. The framework includes four distinctive tasks: Key Event Recognition, Mixed Event Recognition, Implicit Emotional Recognition, and Intention Recognition. LLMs are requested to recognize important event or implicit emotions and generate empathetic response. We also design two metrics to evaluate LLMs' capabilities in recognition and response for emotion-related statements. Experiments yield significant conclusions about LLMs' capabilities and limitations in emotion intelligence.

different llm, llm, recognition, (16 more...)

arXiv.org Artificial Intelligence

2409.13359

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > Pennsylvania (0.04)
(3 more...)

Genre:

Research Report (1.00)
Personal > Interview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Add feedback

Multi-Frame Vision-Language Model for Long-form Reasoning in Driver Behavior Analysis

Takato, Hiroshi, Tsutsui, Hiroshi, Soda, Komei, Kamigaito, Hidetaka

arXiv.org Artificial IntelligenceAug-3-2024

Identifying risky driving behavior in real-world situations is essential for the safety of both drivers and pedestrians. However, integrating natural language models in this field remains relatively untapped. To address this, we created a novel multi-modal instruction tuning dataset and driver coaching inference system. Our primary use case is dashcam-based coaching for commercial drivers. The North American Dashcam Market is expected to register a CAGR of 15.4 percent from 2022 to 2027. Our dataset enables language models to learn visual instructions across various risky driving scenarios, emphasizing detailed reasoning crucial for effective driver coaching and managerial comprehension. Our model is trained on roadfacing and driver-facing RGB camera footage, capturing the comprehensive scope of driving Figure 1: Overview of our targeting coaching task.

dataset, instruction, language model, (16 more...)

arXiv.org Artificial Intelligence

2408.01682

Country:

Oceania > Australia > Queensland (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry:

Automobiles & Trucks (0.51)
Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

AGS: An Dataset and Taxonomy for Domestic Scene Sound Event Recognition

Che, Nan, Liu, Chenrui, Yu, Fei

arXiv.org Artificial IntelligenceAug-29-2023

Environmental sound scene and sound event recognition is important for the recognition of suspicious events in indoor and outdoor environments (such as nurseries, smart homes, nursing homes, etc.) and is a fundamental task involved in many audio surveillance applications. In particular, there is no public common data set for the research field of sound event recognition for the data set of the indoor environmental sound scene. Therefore, this paper proposes a data set (called as AGS) for the home environment sound. This data set considers various types of overlapping audio in the scene, background noise. Moreover, based on the proposed data set, this paper compares and analyzes the advanced methods for sound event recognition, and then illustrates the reliability of the data set proposed in this paper, and studies the challenges raised by the new data set. Our proposed AGS and the source code of the corresponding baselines at https://github.com/taolunzu11/AGS .

dataset and taxonomy, event recognition

arXiv.org Artificial Intelligence

2308.15726

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)

Add feedback

Analyzing sports commentary in order to automatically recognize events and extract insights

Miraoui, Yanis

arXiv.org Artificial IntelligenceJul-18-2023

In this paper, we carefully investigate how we can use multiple different Natural Language Processing techniques and methods in order to automatically recognize the main actions in sports events. We aim to extract insights by analyzing live sport commentaries from different sources and by classifying these major actions into different categories. We also study if sentiment analysis could help detect these main actions.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2307.10303

Country:

North America > United States (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Spain (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Add feedback

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Roder, Mateus, Almeida, Jurandy, de Rosa, Gustavo H., Passos, Leandro A., Rossi, André L. D., Papa, João P.

arXiv.org Artificial IntelligenceNov-30-2022

In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks regarding the learning process as training complex models over large datasets are expensive and time-consuming. Such a problem is even more evident when dealing with video analysis. Some works have considered transfer learning or domain adaptation, i.e., approaches that map the knowledge from one domain to another, to ease the training burden, yet most of them operate over individual or small blocks of frames. This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model, denoted as Spectral Deep Belief Network. Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process. The experimental results conducted over two public video dataset, the HMDB-51 and the UCF-101, depict the effectiveness of the proposed model and its reduced computational burden when compared to traditional energy-based models, such as Restricted Boltzmann Machines and Deep Belief Networks.

artificial intelligence, machine learning, recognition, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/SSCI50451.2021.9660128

2211.17045

Country:

South America > Brazil > São Paulo (0.04)
Oceania > New Zealand (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)

Add feedback

Explainable Event Recognition

Khan, Imran, Ahmad, Kashif, Gul, Namra, Khan, Talhat, Ahmad, Nasir, Al-Fuqaha, Ala

arXiv.org Artificial IntelligenceOct-10-2021

The literature shows outstanding capabilities for CNNs in event recognition in images. However, fewer attempts are made to analyze the potential causes behind the decisions of the models and exploring whether the predictions are based on event-salient objects or regions? To explore this important aspect of event recognition, in this work, we propose an explainable event recognition framework relying on Grad-CAM and an Xception architecture-based CNN model. Experiments are conducted on three large-scale datasets covering a diversified set of natural disasters, social, and sports events. Overall, the model showed outstanding generalization capabilities obtaining overall F1-scores of 0.91, 0.94, and 0.97 on natural disasters, social, and sports events, respectively. Moreover, for subjective analysis of activation maps generated through Grad-CAM for the predicted samples of the model, a crowdsourcing study is conducted to analyze whether the model's predictions are based on event-related objects/regions or not? The results of the study indicate that 78%, 84%, and 78% of the model decisions on natural disasters, sports, and social events datasets, respectively, are based onevent-related objects or regions.

dataset, event recognition, recognition, (16 more...)

arXiv.org Artificial Intelligence

2110.00755

Country:

Asia > Pakistan > Khyber Pakhtunkhwa > Peshawar Division > Peshawar District > Peshawar (0.05)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.05)
Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

Where and When: Space-Time Attention for Audio-Visual Explanations

Chen, Yanbei, Hummel, Thomas, Koepke, A. Sophia, Akata, Zeynep

arXiv.org Artificial IntelligenceMay-4-2021

Explaining the decision of a multi-modal decision-maker requires to determine the evidence from both modalities. Recent advances in XAI provide explanations for models trained on still images. However, when it comes to modeling multiple sensory modalities in a dynamic world, it remains underexplored how to demystify the mysterious dynamics of a complex multi-modal model. In this work, we take a crucial step forward and explore learnable explanations for audio-visual recognition. Specifically, we propose a novel space-time attention network that uncovers the synergistic dynamics of audio and visual data over both space and time. Our model is capable of predicting the audio-visual video events, while justifying its decision by localizing where the relevant visual cues appear, and when the predicted sounds occur in videos. We benchmark our model on three audio-visual video event datasets, comparing extensively to multiple recent multi-modal representation learners and intrinsic explanation models. Experimental results demonstrate the clear superior performance of our model over the existing methods on audio-visual video event recognition. Moreover, we conduct an in-depth study to analyze the explainability of our model based on robustness analysis via perturbation tests and pointing games using human annotations.

explanation, recognition, stan, (15 more...)

arXiv.org Artificial Intelligence

2105.01517

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback