Mobile Surveillance System using Unmanned Aerial Vehicle for Aerial Imagery
DOI: https://doi.org/10.18196/eist.v5i2.24837

Keywords: Deep learning, unmanned aerial vehicle, surveillance system, crowd counting

Abstract
Crowd counting plays a vital role in public safety, particularly during riot scenarios, where understanding crowd dynamics is crucial for effective decision-making and risk mitigation. Accurate crowd estimation in such environments enables authorities to monitor the situation in real time, allocate resources efficiently, and prevent potential escalation. However, counting individuals in a riot presents unique challenges: the scene is chaotic, crowd densities vary widely, and movement and environmental factors cause occlusion. Traditional methods struggle to produce reliable results under these conditions, necessitating more advanced solutions. This study explores the implementation of CSRNet (Congested Scene Recognition Network), a state-of-the-art deep learning model, for crowd counting in challenging environments characterized as "images in the wild." CSRNet leverages dilated convolutions to capture contextual information and handle high crowd densities without sacrificing spatial resolution. We evaluate the model's performance on diverse datasets, including aerial imagery and real-world riot scenarios, focusing on its adaptability to dynamic, unstructured environments. The results demonstrate CSRNet's potential to provide accurate crowd density estimates under adverse conditions, offering critical insights for public safety applications. By addressing the technical challenges of deploying CSRNet in these contexts, this study contributes to the advancement of deep learning-based crowd counting and underscores its significance in high-stakes events such as riots. Future work aims to further enhance the model's robustness and applicability to diverse operational settings.
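Because the abstract leans on CSRNet's dilated-convolution design, a compact sketch may make the mechanics concrete. The following is a minimal, illustrative PyTorch rendering of the architecture described in the CSRNet paper (Li et al., CVPR 2018): a VGG-16 front end followed by a dilated-convolution back end that regresses a density map whose sum gives the crowd count. The class name, dummy input size, and initialization details are assumptions for illustration, not the authors' actual implementation.

```python
# A minimal CSRNet-style sketch in PyTorch. The front end is the first ten
# convolutional layers of VGG-16 (through conv4_3, three max-pools); the
# back end stacks dilated 3x3 convolutions (dilation=2) so the receptive
# field grows without any further down-sampling -- the property the abstract
# credits for handling dense scenes without sacrificing spatial resolution.
import torch
import torch.nn as nn
from torchvision import models


def dilated_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # padding=2 with dilation=2 keeps the feature-map size unchanged.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=2, dilation=2),
        nn.ReLU(inplace=True),
    )


class CSRNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # ImageNet-pretrained VGG-16 front end, as in the CSRNet paper
        # (pass weights=None to run offline with random weights).
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.frontend = nn.Sequential(*list(vgg.features.children())[:23])
        # Back-end channel widths follow the paper's "B" configuration.
        self.backend = nn.Sequential(
            dilated_block(512, 512), dilated_block(512, 512),
            dilated_block(512, 512), dilated_block(512, 256),
            dilated_block(256, 128), dilated_block(128, 64),
        )
        # A 1x1 convolution emits a one-channel density map at 1/8 input size.
        self.output_layer = nn.Conv2d(64, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.output_layer(self.backend(self.frontend(x)))


if __name__ == "__main__":
    model = CSRNetSketch().eval()
    frame = torch.randn(1, 3, 384, 512)  # stand-in for one aerial frame
    with torch.no_grad():
        density = model(frame)        # shape: (1, 1, 48, 64)
        count = density.sum().item()  # headcount = integral of density map
    print(f"estimated count: {count:.1f}")
```

Because the back end never pools, the 1/8-scale density map preserves the spatial layout of the input, so per-region counts for an aerial frame can be read off by summing sub-windows of the map rather than re-running the model per region.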
Copyright
Authors should be aware that by submitting an article to this journal, the article's copyright is fully transferred to the Journal of Emerging Information Science and Technology. Authors may submit their manuscript to another journal or intentionally withdraw it only if both parties (the Journal of Emerging Information Science and Technology and the authors) have agreed on the matter. Once the manuscript has been published, authors may use their published article under the Journal of Emerging Information Science and Technology's copyright.
All authors are required to sign the license-transfer agreement when they submit a manuscript to the Journal of Emerging Information Science and Technology. By signing the agreement, authors attribute copyright to the journal, which protects the intellectual material on the authors' behalf. Authors remain free to share, copy, and redistribute the material in any medium and under any circumstances, so as to give the work appropriate credit and a wide readership.
License
Articles published in the Journal of Emerging Information Science and Technology are licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. You are free to:
- Share — copy and redistribute the material in any medium or format.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
This license is acceptable for Free Cultural Works. The licensor cannot revoke these freedoms as long as you follow the license terms, which are:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.