Mobile Surveillance System using Unmanned Aerial Vehicle for Aerial Imagery

Authors

Muhamad Amirul Haq, Universitas Muhammadiyah Surabaya

DOI:

https://doi.org/10.18196/eist.v5i2.24837

Keywords:

Deep learning, unmanned aerial vehicle, surveillance system, crowd counting

Abstract

Crowd counting plays a vital role in public safety, particularly during riot scenarios, where understanding crowd dynamics is crucial for effective decision-making and risk mitigation. Accurate crowd estimation enables authorities to monitor a situation in real time, allocate resources efficiently, and prevent escalation. However, counting individuals in a riot presents unique challenges: the scene is chaotic, crowd densities vary widely, and movement and environmental factors cause occlusions. Traditional methods struggle to provide reliable results under these conditions, necessitating more advanced solutions. This study explores the implementation of CSRNet (Congested Scene Recognition Network), a state-of-the-art deep learning model, for crowd counting in challenging environments characterized as “images in the wild.” CSRNet’s dilated convolutions enlarge the receptive field, allowing it to capture contextual information and handle high crowd densities without sacrificing spatial resolution. We evaluate the model’s performance on diverse datasets, including aerial imagery and real-world riot scenarios, focusing on its adaptability to dynamic, unstructured environments. The results demonstrate that CSRNet can provide accurate crowd density estimates under adverse conditions, offering critical insights for public safety applications. By addressing the technical challenges of deploying CSRNet in these contexts, this study contributes to the advancement of deep learning-based crowd counting in high-stakes, real-world events such as riots. Future work aims to further enhance the model’s robustness and applicability across operational settings.
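As background, the dilated-convolution back-end the abstract refers to can be sketched as below. This is an illustrative PyTorch sketch, not the authors' implementation: the layer widths follow the back-end configuration reported in the CSRNet paper (Li et al., CVPR 2018), while the input tensor shape and the `DilatedBackEnd` name are placeholders assumed here for demonstration.

```python
import torch
import torch.nn as nn

class DilatedBackEnd(nn.Module):
    """Sketch of a CSRNet-style back-end: stacked 3x3 convolutions with
    dilation rate 2 enlarge the receptive field without any pooling, so
    the spatial resolution of the feature map is preserved; a final 1x1
    convolution produces a single-channel crowd density map."""

    def __init__(self, in_channels: int = 512):
        super().__init__()
        layers = []
        # Channel widths follow the CSRNet paper's back-end configuration;
        # in_channels=512 assumes VGG-16 front-end features.
        widths = [in_channels, 512, 512, 512, 256, 128, 64]
        for c_in, c_out in zip(widths[:-1], widths[1:]):
            # padding == dilation keeps the spatial size unchanged for 3x3 kernels
            layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=2, dilation=2),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(64, 1, kernel_size=1))  # density-map head
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Placeholder input standing in for VGG-16 features of a 256x256 frame
# (downsampled 8x by the front-end's pooling stages).
features = torch.randn(1, 512, 32, 32)
density = DilatedBackEnd()(features)
count = density.sum().item()  # estimated count = integral of the density map
print(density.shape)  # torch.Size([1, 1, 32, 32])
```

Note that the output density map has the same 32×32 spatial size as the input features: because dilation (rather than pooling) supplies the context, the count is obtained simply by summing the density map.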

Author Biography

Muhamad Amirul Haq, Universitas Muhammadiyah Surabaya

Computer Science Department

References

D. Helbing, L. Buzna, A. Johansson, and T. Werner, “Self-Organized Pedestrian Crowd Dynamics: Experiments, Simulations, and Design Solutions,” Transp. Sci., vol. 39, no. 1, pp. 1–24, Feb. 2005, doi: 10.1287/trsc.1040.0108.

C. Celes, A. Boukerche, and A. A. F. Loureiro, “Crowd Management: A New Challenge for Urban Big Data Analytics,” IEEE Commun. Mag., vol. 57, no. 4, pp. 20–25, Apr. 2019, doi: 10.1109/MCOM.2019.1800640.

F. Yang, H. Fan, P. Chu, E. Blasch, and H. Ling, “Clustered Object Detection in Aerial Images,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Oct. 2019, pp. 8310–8319, doi: 10.1109/ICCV.2019.00840.

S. Hardaha, D. R. Edla, and S. R. Parne, “A Survey on Convolutional Neural Networks for MRI Analysis,” Wirel. Pers. Commun., vol. 128, no. 2, pp. 1065–1085, 2023, doi: 10.1007/s11277-022-09989-0.

D. C. Duives, W. Daamen, and S. P. Hoogendoorn, “Quantification of the level of crowdedness for pedestrian movements,” Phys. A: Stat. Mech. Appl., vol. 427, pp. 162–180, Jun. 2015, doi: 10.1016/j.physa.2014.11.054.

M. A. Khan, H. Menouar, and R. Hamila, “Visual crowd analysis: Open research problems,” AI Mag., vol. 44, no. 3, pp. 296–311, Sep. 2023, doi: 10.1002/aaai.12117.

Y. Jeon, W. Chang, S. Jeong, S. Han, and J. Park, “A Bayesian convolutional neural network-based generalized linear model,” Biometrics, vol. 80, no. 2, Mar. 2024, doi: 10.1093/biomtc/ujae057.

Y. Chen, J. Yang, B. Chen, and S. Du, “Counting Varying Density Crowds Through Density Guided Adaptive Selection CNN and Transformer Estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 3, pp. 1055–1068, 2023, doi: 10.1109/TCSVT.2022.3208714.

B. R. Pandit et al., “Deep learning neural network for lung cancer classification: enhanced optimization function,” Multimed. Tools Appl., vol. 82, no. 5, pp. 6605–6624, 2023, doi: 10.1007/s11042-022-13566-9.

P. Zhu, L. Wen, X. Bian, H. Ling, and Q. Hu, “Vision Meets Drones: A Challenge,” arXiv preprint arXiv:1804.07437, 2018. [Online]. Available: http://arxiv.org/abs/1804.07437.

Y. Li, X. Zhang, and D. Chen, “CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2018, pp. 1091–1100, doi: 10.1109/CVPR.2018.00120.

P. Thanasutives, K. Fukui, M. Numao, and B. Kijsirikul, “Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting,” in 2020 25th International Conference on Pattern Recognition (ICPR), Jan. 2021, pp. 2382–2389, doi: 10.1109/ICPR48806.2021.9413286.

J. Zhang, S. Chen, S. Tian, W. Gong, G. Cai, and Y. Wang, “A Crowd Counting Framework Combining with Crowd Location,” J. Adv. Transp., vol. 2021, pp. 1–14, Feb. 2021, doi: 10.1155/2021/6664281.

M. A. Khan, H. Menouar, and R. Hamila, “Revisiting crowd counting: State-of-the-art, trends, and future perspectives,” Image Vis. Comput., vol. 129, p. 104597, Jan. 2023, doi: 10.1016/j.imavis.2022.104597.

R. Gouiaa, M. A. Akhloufi, and M. Shahbazi, “Advances in Convolution Neural Networks Based Crowd Counting and Density Estimation,” Big Data Cogn. Comput., vol. 5, no. 4, p. 50, Sep. 2021, doi: 10.3390/bdcc5040050.

A. Chrysler, R. Gunarso, T. Puteri, and H. L. H. S. Warnars, “A literature review of crowd-counting system on convolutional neural network,” IOP Conf. Ser. Earth Environ. Sci., vol. 729, no. 1, p. 012029, Apr. 2021, doi: 10.1088/1755-1315/729/1/012029.

A. Beucher, C. B. Rasmussen, T. B. Moeslund, and M. H. Greve, “Interpretation of Convolutional Neural Networks for Acid Sulfate Soil Classification,” Front. Environ. Sci., vol. 9, Jan. 2022, doi: 10.3389/fenvs.2021.809995.

T. Li, J. Liu, W. Zhang, Y. Ni, W. Wang, and Z. Li, “UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2021, pp. 16261–16270, doi: 10.1109/CVPR46437.2021.01600.

Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, “VGGFace2: A dataset for recognising faces across pose and age,” in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018, pp. 67–74, doi: 10.1109/FG.2018.00020.

F. Yu and V. Koltun, “Multi-Scale Context Aggregation by Dilated Convolutions,” arXiv preprint arXiv:1511.07122, 2015. [Online]. Available: http://arxiv.org/abs/1511.07122.

Published

2024-11-30

Issue

Section

Intelligent Systems