Aerial Violence Recognition Based on Spatial-Temporal Graph Convolutional Networks and Attention Model

Bibliographic details
Title:	Aerial Violence Recognition Based on Spatial-Temporal Graph Convolutional Networks and Attention Model
Authors:	SHAO Yan-hua, LI Wen-feng, ZHANG Xiao-qiang, CHU Hong-yu, RAO Yun-bo, CHEN Lu
Source:	Jisuanji kexue, Vol 49, Iss 6, Pp 254-261 (2022)
Publication information:	Editorial office of Computer Science, 2022.
Publication year:	2022
Collection:	LCC:Computer software LCC:Technology (General)
Subject terms:	violence recognition, human pose estimation, aerial photography, spatial-temporal graph convolutional, cascade network, attention mechanism, Computer software, QA76.75-76.765, Technology (General), T1-995
Description:	Violence occurs frequently in public areas, and video surveillance is of great significance for maintaining public safety. Compared with fixed cameras, unmanned aerial vehicles (UAVs) offer mobile surveillance. However, in aerial images, the rapid movement of UAVs and changes in posture and height cause motion blur and large variations in target scale. To address this problem, an attention spatial-temporal graph convolutional network (AST-GCN) is designed to recognize violent behavior in aerial video. The proposed method consists of two steps: a key frame detection network performs the initial localization, and the AST-GCN performs behavior recognition from sequence features. First, for violence localization in video, a cascaded key frame detection network based on human pose estimation is designed to detect violence key frames and give a preliminary estimate of when the violence occurs. Second, skeleton information from multiple frames around the key frames is extracted from the video sequence, and the skeleton data is preprocessed (normalization, screening, and completion) to improve robustness to different scenes and to partially missing key nodes. A skeleton spatial-temporal representation matrix is then constructed from the extracted skeleton information. Finally, the AST-GCN analyzes the multi-frame human skeleton information, integrating an attention module to improve feature representation, and completes the recognition of violent behavior. The method is validated on a self-built aerial violence dataset; experimental results show that AST-GCN can recognize violence in aerial scenes with a recognition accuracy of 86.6%. The proposed method has important engineering value and scientific significance for aerial video surveillance and human pose understanding applications.
Document type:	article
File description:	electronic resource
Language:	Chinese
ISSN:	1002-137X
Relation:	https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-6-254.pdf; https://doaj.org/toc/1002-137X
DOI:	10.11896/jsjkx.210400272
Access URL:	https://doaj.org/article/50f7055af189488d984f6077c4eaddd8
Accession number:	edsdoj.50f7055af189488d984f6077c4eaddd8
Database:	Directory of Open Access Journals
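
The description names three skeleton preprocessing steps (normalization, screening, and completion) and a skeleton spatial-temporal representation matrix, but this record does not say how any of them is computed. The following is a minimal illustrative sketch only, not the authors' implementation. Everything specific in it is an assumption: the function names, the confidence threshold for screening, linear interpolation in time for completion, bounding-box scaling for normalization, and the channels x time x joints layout (a layout commonly used by ST-GCN-style models).

```python
from typing import List, Optional, Tuple

RawJoint = Tuple[float, float, float]      # (x, y, detector confidence)
Joint = Optional[Tuple[float, float]]      # None marks a missing joint


def screen(frames: List[List[RawJoint]], conf_threshold: float = 0.3) -> List[List[Joint]]:
    """Screening (assumed rule): discard joints with low detector confidence."""
    return [[(x, y) if c >= conf_threshold else None for (x, y, c) in frame]
            for frame in frames]


def complete(frames: List[List[Joint]]) -> List[List[Tuple[float, float]]]:
    """Completion (assumed rule): fill a missing joint by interpolating in time
    between the nearest frames where it was observed; copy the nearest
    observation at the ends of the sequence."""
    n_frames, n_joints = len(frames), len(frames[0])
    out = [list(frame) for frame in frames]
    for j in range(n_joints):
        known = [t for t in range(n_frames) if frames[t][j] is not None]
        for t in range(n_frames):
            if frames[t][j] is not None:
                continue
            if not known:
                out[t][j] = (0.0, 0.0)     # never observed: placeholder value
                continue
            prev = max((k for k in known if k < t), default=None)
            nxt = min((k for k in known if k > t), default=None)
            if prev is None:
                out[t][j] = frames[nxt][j]
            elif nxt is None:
                out[t][j] = frames[prev][j]
            else:
                w = (t - prev) / (nxt - prev)
                (x0, y0), (x1, y1) = frames[prev][j], frames[nxt][j]
                out[t][j] = (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return out


def normalize(frames: List[List[Tuple[float, float]]]) -> List[List[Tuple[float, float]]]:
    """Normalization (assumed rule): per frame, shift to the skeleton's
    bounding-box corner and divide by the longer box side, so coordinates
    fall in [0, 1] whatever the apparent size of the person."""
    out = []
    for frame in frames:
        xs = [p[0] for p in frame]
        ys = [p[1] for p in frame]
        x0, y0 = min(xs), min(ys)
        scale = max(max(xs) - x0, max(ys) - y0) or 1.0   # avoid dividing by zero
        out.append([((x - x0) / scale, (y - y0) / scale) for (x, y) in frame])
    return out


def to_matrix(frames: List[List[Tuple[float, float]]]) -> List[List[List[float]]]:
    """Arrange the skeleton sequence as nested lists indexed [channel][time][joint]."""
    return [[[joint[c] for joint in frame] for frame in frames] for c in range(2)]
```

With a hypothetical three-frame, two-joint sequence `raw`, the steps would be chained as `to_matrix(normalize(complete(screen(raw))))`. Dividing by the bounding-box size is one simple way to reduce sensitivity to the large changes in target scale that the description attributes to aerial footage.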
