AITopics | behavior recognition

behavior recognition

PriVi: Towards A General-Purpose Video Model For Primate Behavior In The Wild

Mueller, Felix B., Meier, Jan F., Lueddecke, Timo, Vogg, Richard, Freixanet, Roger L., Hassler, Valentin, Bosshard, Tiffany, Karakoc, Elif, O'Hearn, William J., Pereira, Sofia M., Sehner, Sandro, Wierucka, Kaja, Burkart, Judith, Fichtel, Claudia, Fischer, Julia, Gail, Alexander, Hobaiter, Catherine, Ostner, Julia, Samuni, Liran, Schülke, Oliver, Shahidi, Neda, Wessling, Erin G., Ecker, Alexander S.

arXiv.org Artificial Intelligence, Nov-18-2025

Non-human primates are our closest living relatives, and analyzing their behavior is central to research in cognition, evolution, and conservation. Computer vision could greatly aid this research, but existing methods often rely on human-centric pretrained models and focus on single datasets, which limits generalization. We address this limitation by shifting from a model-centric to a data-centric approach and introduce PriVi, a large-scale primate-centric video pretraining dataset. PriVi contains 424 hours of curated video, combining 174 hours from behavioral research across 11 settings with 250 hours of diverse web-sourced footage, assembled through a scalable data curation pipeline. We continue pretraining V-JEPA, a large-scale video model, on PriVi to learn primate-specific representations and evaluate it using a lightweight frozen classifier. Across four benchmark datasets (ChimpACT, PanAf500, BaboonLand, and ChimpBehave), our approach consistently outperforms prior work, including fully fine-tuned baselines, and scales favorably with fewer labels. These results demonstrate that primate-centric pretraining substantially improves data efficiency and generalization, making it a promising approach for low-label applications. Code, models, and the majority of the dataset will be made available.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.09675

Country:

North America > United States (1.00)
Asia (0.93)
Africa (0.67)
Europe > Germany > Lower Saxony > Gottingen (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

A Framework Combining 3D CNN and Transformer for Video-Based Behavior Recognition

Zhang, Xiuliang, Nyamasvisva, Tadiwa Elisha, Liu, Chuntao

arXiv.org Artificial Intelligence, Aug-12-2025

Video-based behavior recognition is essential in fields such as public safety, intelligent surveillance, and human-computer interaction. Traditional 3D Convolutional Neural Networks (3D CNNs) effectively capture local spatiotemporal features but struggle with modeling long-range dependencies. Conversely, Transformers excel at learning global contextual information but face challenges with high computational costs. To address these limitations, we propose a hybrid framework combining 3D CNN and Transformer architectures. The 3D CNN module extracts low-level spatiotemporal features, while the Transformer module captures long-range temporal dependencies, with a fusion mechanism integrating both representations. Evaluated on benchmark datasets, the proposed model outperforms traditional 3D CNNs and standalone Transformers, achieving higher recognition accuracy with manageable complexity. Ablation studies further validate the complementary strengths of the two modules. This hybrid framework offers an effective and scalable solution for video-based behavior recognition.
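
The division of labor described in this abstract (local spatiotemporal features from a 3D convolution, long-range temporal context from self-attention, then fusion) can be illustrated with a framework-free toy forward pass. This is only a sketch under stated assumptions, not the authors' model: the function names, the single fixed kernel, the scalar per-frame tokens, the identity attention projections, and fusion by concatenation are all hypothetical choices made for illustration.

```python
import math

def conv3d_valid(video, kernel):
    """Single-channel 3D cross-correlation (no padding, stride 1) followed by ReLU."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(max(acc, 0.0))  # ReLU non-linearity
            plane.append(row)
        out.append(plane)
    return out

def spatial_mean(volume):
    """Pool each time step's feature map into one scalar token."""
    return [sum(sum(row) for row in plane) / (len(plane) * len(plane[0]))
            for plane in volume]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention over scalar tokens.

    Assumption: query, key, and value projections are the identity, and the
    token dimension is 1, so the usual 1/sqrt(d) scale equals 1.
    """
    out = []
    for q in tokens:
        scores = [q * k for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w / z * v for w, v in zip(weights, tokens)))
    return out

def hybrid_features(video, kernel):
    """Return [local, global] clip features fused by concatenation (hypothetical fusion)."""
    local_tokens = spatial_mean(conv3d_valid(video, kernel))   # 3D CNN branch
    global_tokens = self_attention(local_tokens)               # Transformer branch
    return [sum(local_tokens) / len(local_tokens),
            sum(global_tokens) / len(global_tokens)]

# Usage: a 5-frame, 4x4 clip in which one bright pixel moves along the diagonal.
clip = [[[1.0 if (y == t % 4 and x == t % 4) else 0.0 for x in range(4)]
         for y in range(4)] for t in range(5)]
averaging_kernel = [[[0.125] * 2 for _ in range(2)] for _ in range(2)]
print(hybrid_features(clip, averaging_kernel))
```

A real system would use learned multi-channel kernels and learned attention projections; the toy only shows where each branch sits in the data flow.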

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.06528

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Artificial Behavior Intelligence: Technology, Challenges, and Future Directions

Jo, Kanghyun, Choi, Jehwan, Kim, Kwanho, Kim, Seongmin, Nguyen, Duy-Linh, Vo, Xuan-Thuy, Priadana, Adri, Tran, Tien-Dat

arXiv.org Artificial Intelligence, May-7-2025

Understanding and predicting human behavior has emerged as a core capability in various AI application domains such as autonomous driving, smart healthcare, surveillance systems, and social robotics. This paper defines the technical framework of Artificial Behavior Intelligence (ABI), which comprehensively analyzes and interprets human posture, facial expressions, emotions, behavioral sequences, and contextual cues. It details the essential components of ABI, including pose estimation, face and emotion recognition, sequential behavior analysis, and context-aware modeling. Furthermore, we highlight the transformative potential of recent advances in large-scale pretrained models, such as large language models (LLMs), vision foundation models, and multimodal integration models, in significantly improving the accuracy and interpretability of behavior recognition. Our research team has a strong interest in the ABI domain and is actively conducting research, particularly focusing on the development of intelligent lightweight models capable of efficiently inferring complex human behaviors. This paper identifies several technical challenges that must be addressed to deploy ABI in real-world applications, including learning behavioral intelligence from limited data, quantifying uncertainty in complex behavior prediction, and optimizing model structures for low-power, real-time inference. To tackle these challenges, our team is exploring various optimization strategies including lightweight transformers, graph-based recognition architectures, energy-aware loss functions, and multimodal knowledge distillation, while validating their applicability in real-time environments. The philosopher Aristotle once described human beings as "social animals." This statement implies that humans do not exist as isolated entities, but rather live in constant interaction and communication with others.
Humans intuitively perceive others' emotions, states, and intentions through their tone of voice, facial expressions, gestures, and behavioral patterns. These abilities are fundamental to mutual understanding and empathetic social interaction.

large language model, machine learning, recognition, (18 more...)

arXiv.org Artificial Intelligence

2505.03315

Country:

Asia > South Korea > Ulsan > Ulsan (0.05)
Asia > Singapore (0.04)
North America > United States > Virginia (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Overview (1.00)
Research Report (0.65)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Driving behavior recognition via self-discovery learning

Wang, Yilin

arXiv.org Artificial Intelligence, Mar-18-2025

Autonomous driving systems require a deep understanding of human driving behaviors to achieve higher intelligence and safety. Despite advancements in deep learning, challenges such as long-tail distribution due to scarce samples and confusion from similar behaviors hinder effective driving behavior detection. Existing methods often fail to address sample confusion adequately, as datasets frequently contain ambiguous samples that obscure unique semantic information.

artificial intelligence, machine learning, self-discovery learning, (1 more...)

arXiv.org Artificial Intelligence

2503.14194

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

Pig behavior dataset and Spatial-temporal perception and enhancement networks based on the attention mechanism for pig behavior recognition

Qi, Fangzheng, Hou, Zhenjie, Lin, En, Li, Xing, Liang, iuzhen, Zhou, Xinwen

arXiv.org Artificial Intelligence, Mar-12-2025

The recognition of pig behavior plays a crucial role in smart farming and welfare assurance for pigs. Currently, in the field of pig behavior recognition, the lack of publicly available behavioral datasets not only limits the development of innovative algorithms but also hampers model robustness and algorithm optimization. This paper proposes a dataset containing 13 pig behaviors that significantly impact welfare. Based on this dataset, this paper proposes a spatial-temporal perception and enhancement network based on the attention mechanism to model the spatiotemporal features of pig behaviors and their associated interaction areas in video data. The network is composed of a spatiotemporal perception network and a spatiotemporal feature enhancement network. The spatiotemporal perception network is responsible for establishing connections between the pigs and the key regions of their behaviors in the video data. The spatiotemporal feature enhancement network further strengthens the important spatial features of individual pigs and captures the long-term dependencies of the spatiotemporal features of individual behaviors by remodeling these connections, thereby enhancing the model's perception of spatiotemporal changes in pig behaviors. Experimental results demonstrate that on the dataset established in this paper, our proposed model achieves a mAP score of 75.92%, which is an 8.17% improvement over the best-performing traditional model. This study not only improves the accuracy and generalizability of individual pig behavior recognition but also provides new technological tools for modern smart farming. The dataset and related code will be made publicly available alongside this paper.
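
The abstract does not say whether the 8.17% improvement is an absolute gain in percentage points or a relative gain over the baseline. The short calculation below works out the baseline score each reading would imply; the function names are hypothetical, and neither result is a figure reported in the abstract.

```python
def baseline_if_absolute(proposed_map: float, improvement: float) -> float:
    """Baseline mAP if the improvement is measured in percentage points."""
    return proposed_map - improvement

def baseline_if_relative(proposed_map: float, improvement: float) -> float:
    """Baseline mAP if the improvement is a relative gain over the baseline."""
    return proposed_map / (1 + improvement / 100)

proposed = 75.92   # mAP (%) reported for the proposed model
gain = 8.17        # improvement (%) reported over the best traditional model

print(f"{baseline_if_absolute(proposed, gain):.2f}")  # prints 67.75
print(f"{baseline_if_relative(proposed, gain):.2f}")  # prints 70.19
```

The two readings differ by about 2.4 points, so the full paper's results table is the place to check which one applies.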

pig behavior, recognition, spatial feature, (16 more...)

arXiv.org Artificial Intelligence

2503.09378

Country:

Asia > China > Jiangsu Province > Changzhou (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (0.49)

Industry:

Food & Agriculture > Agriculture (1.00)
Health & Medicine > Consumer Health (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition

Deng, Shijian, Kosloski, Erin E., Patel, Siddhi, Barnett, Zeke A., Nan, Yiyang, Kaplan, Alexander, Aarukapalli, Sisira, Doan, William T., Wang, Matthew, Singh, Harsh, Rollins, Pamela R., Tian, Yapeng

arXiv.org Artificial Intelligence, Mar-22-2024

In this article, we introduce a novel problem of audio-visual autism behavior recognition, which includes social behavior recognition, an essential aspect previously omitted in AI-assisted autism screening research. We define audio-visual autism behavior recognition as the task of using audio and visual cues, including any speech present in the audio, to recognize autism-related behaviors. To facilitate this new research direction, we collected an audio-visual autism spectrum dataset (AV-ASD), currently the largest video dataset for autism screening using a behavioral approach. It covers an extensive range of autism-associated behaviors, including those related to social communication and interaction. To pave the way for further research on this new problem, we intensively explored leveraging foundation models and multimodal large language models across different modalities. Our experiments on the AV-ASD dataset demonstrate that integrating audio, visual, and speech modalities significantly enhances the performance in autism behavior recognition. Additionally, we explored the use of a post-hoc to ad-hoc pipeline in a multimodal large language model to investigate its potential to augment the model's explanatory capability during autism behavior recognition. We will release our dataset, code, and pre-trained models.

dataset, non-responsiveness, recognition, (13 more...)

arXiv.org Artificial Intelligence

2406.02554

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > India (0.04)
Asia > China (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)
Health & Medicine > Therapeutic Area > Genetic Disease (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Deep Neural Networks in Video Human Action Recognition: A Review

Wang, Zihan, Yang, Yang, Liu, Zhi, Zheng, Yifan

arXiv.org Artificial Intelligence, May-24-2023

Video behavior recognition is currently one of the most foundational tasks in computer vision. The 2D neural networks of deep learning are built to recognize pixel-level information in images with RGB, RGB-D, or optical flow formats. With the increasingly wide usage of surveillance video and the growing number of tasks related to human action recognition, more tasks require temporal information for frame-dependency analysis. Researchers have therefore widely studied video-based recognition, rather than image-based (pixel-based) recognition only, to extract more informative elements from geometry tasks. Our review addresses multiple newly proposed research works and compares the advantages and disadvantages of the derived deep learning frameworks rather than machine learning frameworks. The comparison covers existing frameworks and datasets that contain video-format data only. Due to the specific properties of human actions and the increasingly wide usage of deep neural networks, we collected all research works from the last three years, 2020 to 2022. In our article, the performance of deep neural networks surpassed most of the techniques in feature learning and extraction tasks, especially video action recognition.

artificial intelligence, machine learning, recognition, (17 more...)

arXiv.org Artificial Intelligence

2305.15692

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Oceania > Australia > Western Australia > Perth (0.04)
(2 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Industry:

Information Technology (0.67)
Health & Medicine (0.67)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adaptive Multi-Agent Continuous Learning System

Qian, Xingyu, Yuemaier, Aximu, Liang, Longfei, Yang, Wen-Chi, Chen, Xiaogang, Li, Shunfen, Dai, Weibang, Song, Zhitang

arXiv.org Artificial Intelligence, Apr-4-2023

We propose an adaptive multi-agent clustering recognition system that can be driven by self-supervision and is based on an adaptive continuous learning mechanism for temporal sequences. The system uses several agents with different functions to build a connection structure that improves adaptability to diverse environmental demands. Each agent is driven by predicting its own input, which lets it cluster and recognize sequences using traditional algorithmic approaches. Finally, experiments on video behavior clustering demonstrate the feasibility of the system in coping with dynamic situations. Our work is available at https://github.com/qian-git/MAMMALS.

agent, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2212.07646

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Continuing Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Machine Learning and Airport Security See Eye to Eye

#artificialintelligence, Jan-4-2017, 16:25:27 GMT

The prospect of standing for hours on end has become all too common at airports around the world. But soon, airports may be piloting security programs based on behavior recognition and machine learning, instead of asking passengers to practice patience. As we know, patience is becoming a lost art, but predictive analytics based on sensor, device, and video data is a technology art form that airlines and airports are exploring. The 9/11 attacks and the 2001 Shoe Bomber's attempt are among the most well-known security threats, and they upended how we travel. To protect passengers and crews, airports have made finding dangerous items their primary objective.

airport, artificial intelligence, data mining, (18 more...)

#artificialintelligence

Country:

North America > United States (0.05)
Europe > France (0.05)

Industry:

Information Technology > Security & Privacy (1.00)
Commercial Services & Supplies > Security & Alarm Services (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.96)
Transportation > Infrastructure & Services > Airport (0.72)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence (1.00)
Information Technology > Information Management (0.97)
Information Technology > Data Science > Data Mining > Big Data (0.32)

Improvement of Multi-AUV Cooperation through Teammate Verification

Novitzky, Michael (The Georgia Institute of Technology)

AAAI Conferences, Aug-8-2011

Current methods for multi-AUV cooperation suffer in low-communication environments. State-of-the-art methods employ auctioneering or planning to determine a single AUV's task. These systems require communication to update models of teammates and tasks for efficient task selection. Most strategies assume a teammate is inoperable if a communication timeout is reached, which reduces overall team efficiency. Including teammate prediction has been shown to mitigate efficiency degeneration due to low communication. However, there is no verification of a predicted teammate's task other than through eventual communication. A possible verification tool is behavior recognition. Current behavior recognition utilizes either overhead sensors or post-mission analysis to track robot trajectories in order to infer their internal state. A system in which an AUV is capable of sensing a teammate, for example through a forward-looking sonar, and deducing its behavior along with contextual information, such as location, will enable an AUV to determine that teammate's current task in the overall mission. This will allow for an accurate update of that teammate's model, allowing the AUV to determine its own next task more efficiently rather than relying only on communication. This position paper posits that multi-AUV cooperation efficiency will improve in low-communication environments with the combination of robust teammate prediction and verification using behavior recognition.

artificial intelligence, behavior recognition, prediction, (14 more...)

AAAI Conferences

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: North America > United States > Georgia > Fulton County > Atlanta (0.05)

Industry: Government (0.30)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
