Journal of Artificial Intelligence and Metaheuristics JAIM 2833-5597 10.54216/JAIM https://www.americaspg.com/journals/show/3858 2022 Real-Time Violence Detection in Smart Cities Using Lightweight Spatiotemporal Deep Learning Models School of Mathematical Sciences, Jiangsu University, Jiangsu 212013, China Muhammad Muhammad
The growth of smart city infrastructure and the complexity of urban environments increase the need for automated systems that detect violence in surveillance footage as it happens. Current CCTV systems depend on human operators, which becomes impractical at large deployment scale when fast response times are required. This research develops a deep learning architecture for automated detection of violence and weapon-related activity in real-world CCTV surveillance, using the Smart-City CCTV Violence Detection (SCVD) dataset. The system uses MobileNetV2 as its convolutional backbone, applied to each frame of an input video sequence through TimeDistributed layers to extract spatial features. These features are passed to a stacked Long Short-Term Memory (LSTM) network that captures the temporal dependencies within violent actions. The system processes video sequences of 15 frames at a resolution of 128×128 pixels to balance operational efficiency and representational capability. Batch Normalization and Dropout are applied throughout the network as regularization to improve generalization and limit overfitting. The pipeline ends with fully connected dense layers followed by a sigmoid activation function that produces a binary output. Experiments on the SCVD dataset yielded strong results: the model achieved 99.58% accuracy with a cross-entropy loss of 0.0139.
Per-class metrics were similarly strong: the standard (non-violent) class achieved 0.99 precision, 0.99 recall, and 0.99 F1-score, and the violent class scored 1.00 on every metric. The model detects and classifies violent activity reliably under diverse and complex surveillance settings. The research shows that real-time deployment of deep learning models in smart city surveillance is achievable with robust, compact solutions. The design combines spatial and temporal feature extraction in a lightweight architecture, making it suitable for deployment on edge devices such as smart cameras and embedded systems. By bridging academic models and practical deployment, this study contributes to safer urban environments through AI-driven public safety technologies.
2025 19 36 10.54216/JAIM.090202 https://www.americaspg.com/articleinfo/28/show/3858
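The per-class precision, recall, and F1-score quoted in the abstract, along with the binary cross-entropy loss, follow standard definitions. The following is a minimal sketch in plain Python of how such figures are typically computed; the function names, the confusion-matrix counts, and the example probabilities are hypothetical and chosen for illustration only, not taken from the paper.

```python
import math


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and sigmoid outputs."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical counts and predictions (assumptions, not results from the paper).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
loss = binary_cross_entropy([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f} loss={loss:.4f}")
```

A class reaches 1.00 on all three metrics only when it has no false positives and no false negatives, which is why a perfect per-class score can coexist with an overall accuracy slightly below 100% only if rounding is involved or the errors fall within rounding tolerance of the other class.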