Advanced Surveillance With YOLOv12: Fusion-Based Detection Of Threatening Objects
Keywords:
YOLOv12, Threat Detection, Intelligent Surveillance, Object Detection, Sensor Fusion, Multimodal Fusion, Deep Learning, Computer Vision, Real-Time Detection, Public Safety, Smart Surveillance, Weapon Detection, Infrared Imaging, Security Monitoring, Artificial Intelligence.
Abstract
The rapid evolution of intelligent surveillance systems has accelerated the adoption of deep learning-based object detection models for situational awareness, automated threat recognition, and proactive security monitoring. This study presents an advanced surveillance framework that uses YOLOv12 for fusion-based detection of threatening objects, such as firearms, knives, explosives, and unattended suspicious baggage, in complex real-world environments. The system fuses data from multiple sensors, combining RGB visual imagery with infrared sensing to improve detection under low-light conditions, partial occlusion, crowded scenes, and adverse environmental conditions.
The framework builds on the capabilities of YOLOv12, including optimized detection heads, transformer-based attention modules, adaptive anchor strategies, and enhanced feature aggregation mechanisms, to achieve a better speed–accuracy trade-off than conventional object detection models. A multimodal feature fusion module exploits complementary spatial and thermal information, improving robustness and reducing both false positives and false negatives in threat detection.
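The idea of weighting complementary RGB and thermal features can be illustrated with a minimal sketch. It is not the fusion module of this study: the function names, the scalar per-modality quality scores, and the softmax weighting are all assumptions made for illustration, and the features are plain Python lists rather than network tensors.

```python
import math
from typing import List, Tuple


def modality_weights(rgb_score: float, ir_score: float) -> Tuple[float, float]:
    """Softmax over two scalar quality scores (hypothetical inputs).

    Returns (w_rgb, w_ir), two non-negative weights that sum to 1.
    """
    top = max(rgb_score, ir_score)  # subtract the max for numerical stability
    e_rgb = math.exp(rgb_score - top)
    e_ir = math.exp(ir_score - top)
    total = e_rgb + e_ir
    return e_rgb / total, e_ir / total


def fuse_features(
    rgb_feat: List[float],
    ir_feat: List[float],
    rgb_score: float,
    ir_score: float,
) -> List[float]:
    """Element-wise weighted sum of two aligned feature vectors."""
    if len(rgb_feat) != len(ir_feat):
        raise ValueError("feature vectors must have the same length")
    w_rgb, w_ir = modality_weights(rgb_score, ir_score)
    return [w_rgb * r + w_ir * t for r, t in zip(rgb_feat, ir_feat)]


# Equal quality scores give each modality half the weight.
fused = fuse_features([1.0, 0.0], [0.0, 1.0], rgb_score=0.0, ir_score=0.0)
```

In this sketch a low-light scene would be expressed by a higher `ir_score` than `rgb_score`, which shifts the fused vector toward the thermal features.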
Furthermore, a threat prioritization mechanism classifies detections and generates alerts in real time according to threat severity, enabling rapid response in critical surveillance scenarios. The system is implemented with Python-based deep learning frameworks and designed for scalable deployment in smart surveillance infrastructures. Experimental analysis indicates that the fusion-based YOLOv12 model outperforms traditional single-sensor and conventional YOLO-based approaches in precision, recall, mean Average Precision (mAP), inference speed, and detection robustness. These results make it suitable for surveillance applications in public safety, transportation hubs, border security, defense monitoring, and smart city security networks.
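Severity-based prioritization of detections can be sketched as follows. The severity weights, thresholds, class labels, and function names are hypothetical values chosen for illustration and are not taken from this study.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical severity weight per class; not values reported by the study.
SEVERITY = {
    "explosive": 1.0,
    "firearm": 0.9,
    "knife": 0.7,
    "unattended_baggage": 0.5,
}


@dataclass
class Detection:
    label: str
    confidence: float  # detector confidence in [0, 1]


def priority(det: Detection) -> float:
    """Priority score: class severity scaled by detector confidence."""
    return SEVERITY.get(det.label, 0.1) * det.confidence


def alert_level(score: float) -> str:
    """Map a priority score to an alert tier (thresholds are assumptions)."""
    if score >= 0.7:
        return "critical"
    if score >= 0.4:
        return "high"
    return "low"


def prioritize(
    dets: List[Detection], min_confidence: float = 0.25
) -> List[Tuple[str, str]]:
    """Drop weak detections, then rank the rest by descending priority."""
    kept = [d for d in dets if d.confidence >= min_confidence]
    ranked = sorted(kept, key=priority, reverse=True)
    return [(d.label, alert_level(priority(d))) for d in ranked]


alerts = prioritize([
    Detection("knife", 0.9),
    Detection("firearm", 0.8),
    Detection("unattended_baggage", 0.6),
    Detection("firearm", 0.1),  # below min_confidence, so it is dropped
])
```

Scaling severity by confidence means a confident firearm detection ranks above an equally confident knife detection, while very uncertain detections are filtered out before any alert is raised.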
The proposed framework contributes toward next-generation intelligent surveillance by combining real-time deep learning detection, multimodal sensor fusion, and adaptive threat intelligence into a unified security architecture capable of supporting autonomous and large-scale surveillance ecosystems.
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.










