ASPG

Hybrid Ensemble Learning for Flow-Level IoT Traffic Classification Using ACI Dataset: Towards Scalable and Real-Time Threat Detection

El-Sayed M. El-Kenawy, Sini Raj Pulari, Shriram K Vasudevan

Internet of Things (IoT) devices, now spread across consumer, industrial, and critical infrastructure domains, have increased the volume, diversity, and frequency of network traffic. As IoT networks scale, securing their heterogeneous data flows becomes harder, which threatens system performance and manageability. Traditional traffic analysis based on signature matching and rule-based detection is ineffective against new traffic patterns and system behaviors. Intelligent, data-driven classification systems that can process IoT traffic quickly and at large operational scale are therefore essential. This paper presents a flow-level machine learning pipeline for IoT traffic classification that uses features extracted from the Army Cyber Institute (ACI) IoT dataset. The dataset encompasses statistical, temporal, and protocol-specific attributes for benign and malicious network flows. Our methodology begins with a strict preprocessing stage (data cleaning, normalization, label encoding, and feature correlation analysis) that gives the learning algorithms a clean, balanced dataset. Several baseline classifiers were trained, including Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Naive Bayes, SGD classifiers, and other statistical learners. Our proposed hybrid ensemble combines a deep neural network, a Random Forest model, and an XGBoost classifier through weighted voting to overcome the limitations of single classifiers. The ensemble is intended to make the system more resilient, lower bias, and better capture the variety of IoT traffic patterns.
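The weighted voting step described above can be sketched as follows. This is a minimal illustration, assuming soft voting over class probabilities; the abstract does not say whether voting is hard or soft, and the probabilities and weights below are hypothetical, not values from the paper.

```python
from typing import List, Sequence


def weighted_soft_vote(member_probs: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Fuse per-class probabilities from several models by weighted average."""
    if len(member_probs) != len(weights):
        raise ValueError("one weight per ensemble member is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(member_probs[0])
    fused = [0.0] * n_classes
    for probs, weight in zip(member_probs, weights):
        if len(probs) != n_classes:
            raise ValueError("all members must score the same classes")
        for k, p in enumerate(probs):
            fused[k] += weight * p / total  # normalise so the result sums to 1
    return fused


def predict_label(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=lambda k: fused[k])


# Hypothetical outputs for one flow, ordered as [P(benign), P(malicious)].
dnn_probs = [0.40, 0.60]   # deep neural network
rf_probs = [0.70, 0.30]    # Random Forest
xgb_probs = [0.20, 0.80]   # XGBoost
weights = [0.5, 0.25, 0.25]  # illustrative weights (assumption)

fused = weighted_soft_vote([dnn_probs, rf_probs, xgb_probs], weights)
label = predict_label(fused)  # 0 = benign, 1 = malicious
```

In this toy case the Random Forest alone would call the flow benign, but the fused score favors the malicious class because the other two members outweigh it.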
The models were assessed with a comprehensive set of evaluation metrics: accuracy, precision, recall, F1-score, Hamming loss, Matthews correlation coefficient (MCC), Cohen’s Kappa, balanced accuracy, and log loss. These metrics capture model performance from both global and fine-grained perspectives, which matters when classes are imbalanced and legitimate and malicious traffic show similar patterns. Experimental results show that the ensemble outperforms the individual classifiers on every performance metric. In complex network environments, model fusion is particularly effective on hard-to-classify traffic patterns. The ensemble generalizes well and maintains high accuracy while adapting continuously, which makes it suitable for real-time IoT deployments. The proposed framework contributes to intelligent IoT traffic analysis research and demonstrates how deep learning and traditional machine learning methods complement each other within an ensemble. The result is a scalable and transparent quantitative solution that can be applied to advanced network security systems and traffic monitoring in smart cities, industrial settings, and critical infrastructure.
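To show why metrics such as MCC, Cohen's Kappa, and balanced accuracy are reported alongside plain accuracy, the sketch below computes them from a binary confusion matrix. The counts are hypothetical and only illustrate an imbalanced benign/malicious split; they are not results from the paper.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute imbalance-aware metrics from binary confusion-matrix counts.

    Assumes both classes occur at least once (tp + fn > 0 and tn + fp > 0).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    recall_pos = tp / (tp + fn)          # true positive rate
    recall_neg = tn / (tn + fp)          # true negative rate
    balanced_accuracy = (recall_pos + recall_neg) / 2

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's Kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts: 120 malicious flows and 880 benign flows.
metrics = binary_metrics(tp=90, fp=10, fn=30, tn=870)
```

With these counts accuracy is 0.96, while balanced accuracy, MCC, and Kappa are all noticeably lower because a quarter of the malicious flows are missed.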

Vol. 9 Issue. 2 PP. 01-18, (2025)

Real-Time Violence Detection in Smart Cities Using Lightweight Spatiotemporal Deep Learning Models

Muhammad Ahsan

The growth of smart city infrastructure and the complexity of urban environments increase the need for automated systems that detect violence in surveillance footage immediately. Current CCTV systems depend on human operators, which is impractical for large-scale deployments that require fast response times. This research develops a deep learning architecture for the automated detection of violence and weapon-related activity in real-world CCTV footage, using the Smart-City CCTV Violence Detection (SCVD) dataset. The system uses MobileNetV2 as its convolutional backbone, applied through TimeDistributed layers to extract spatial features from each frame of the input video sequence. These features are passed to a stacked Long Short-Term Memory (LSTM) network that captures the temporal dependencies within violent actions. The system processes sequences of 15 frames at a resolution of 128×128 pixels to balance operational efficiency and representational capacity. Batch Normalization and Dropout are applied throughout the network to improve generalization and limit overfitting. The pipeline ends with fully connected dense layers followed by a sigmoid activation that produces a binary output. Experiments on the SCVD dataset gave strong results: the model reached 99.58% accuracy with a cross-entropy loss of 0.0139. The standard class achieved 0.99 precision, 0.99 recall, and 0.99 F1-score, and the violent class scored 1.00 on every metric. The model detects and classifies violent activity reliably under diverse and complex surveillance settings.
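The final stage described above, a sigmoid output scored with cross-entropy loss, can be illustrated with a small standalone sketch. The logits and labels are hypothetical stand-ins for the network's last dense layer on four clips, and the 0.5 decision threshold is an assumption that the abstract does not state.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true: Sequence[int],
                         y_prob: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical raw outputs (logits) for four 15-frame clips.
logits = [4.0, -3.5, 2.2, -5.0]
labels = [1, 0, 1, 0]            # 1 = violent, 0 = standard

probs: List[float] = [sigmoid(z) for z in logits]
preds = [int(p >= 0.5) for p in probs]   # assumed 0.5 threshold
loss = binary_cross_entropy(labels, probs)
```

A low loss such as the one reported in the abstract means the predicted probabilities sit close to 0 or 1 on the correct side, not merely that the thresholded labels are right.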
The research shows that real-time deployment of deep learning models for smart city surveillance is achievable with robust, compact architectures. Because the design combines spatial and temporal feature extraction in a lightweight model, it is suitable for edge devices such as smart cameras and embedded systems. By connecting academic models with practical deployment, this study contributes to AI-driven public safety technologies for safer urban environments.

Vol. 9 Issue. 2 PP. 19-36, (2025)

Interpretable Rainfall Forecasting Using SHAP-Enhanced Machine Learning: A Case Study on U.S. Urban Climate Data (2024–2025)

Khaled Sh. Gaber, Mahmoud Elshabrawy Mohamed

Accurate rainfall prediction is fundamental to climate resilience, sustainable agriculture, water distribution planning, and disaster risk reduction. Rainfall is difficult to predict because it behaves nonlinearly and depends on many meteorological factors across different timescales and spatial areas. These properties defeat conventional forecasting techniques, which predict short-duration precipitation patterns with insufficient accuracy. Rising climate variability calls for data-driven predictive frameworks that perform strongly and remain human-understandable. In this paper, we establish a machine learning framework that predicts next-day rainfall from a refined dataset of detailed weather measurements spanning 20 United States metropolitan areas from 2024 to 2025. The dataset contains six atmospheric variables (temperature, humidity, wind speed, cloud cover, pressure, and precipitation) and a binary label indicating whether it rains the following day. The models trained and evaluated were Random Forest, Support Vector Machine with an RBF kernel, K-Nearest Neighbors (KNN), Logistic Regression, Naive Bayes, and Linear SVM. SHAP (SHapley Additive exPlanations) was integrated to make predictions easier to interpret and trust. SHAP values quantify and visualize the contribution of each input variable to individual predictions. The SHAP analysis identified precipitation and humidity as the most influential features, which is consistent with meteorological theory and indicates that the model's decisions are physically plausible. Random Forest achieved the highest performance of all models, with precision, recall, and F1-score all equal to 1.00.
The RBF SVM and KNN models also performed strongly, with F1-scores of 0.97 and 0.94, respectively. Logistic Regression, Linear SVM, and Naive Bayes achieved satisfactory results, with F1-scores between 0.76 and 0.77. The SHAP-based diagnostics showed that Random Forest combined its classification performance with consistent feature weighting across diverse locations. Pairing the Random Forest model with SHAP interpretation therefore yields a rainfall forecasting solution that is both highly predictive and explainable. The model achieves perfect scores on the reported metrics together with precise explanations, which supports trust in its use in real deployments. The results suggest that weather-sensitive sectors such as agriculture, urban planning, and disaster response can integrate transparent machine learning systems of this kind into their decision-support pipelines. The approach could serve as a template for future predictive analyses in meteorology and environmental science.
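The Shapley values behind SHAP can be illustrated by brute force on a tiny model. The sketch below averages each feature's marginal contribution over all feature orderings, replacing absent features with a single baseline value. It is not the `shap` library and not the paper's Random Forest; the scoring function, inputs, and baseline are hypothetical.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values by enumerating every feature ordering.

    Features not yet added take their baseline value. The cost grows
    factorially, so this only suits a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]            # reveal feature i
            value = model(current)
            phi[i] += value - previous   # marginal contribution of i
            previous = value
    return [total / len(orderings) for total in phi]


def toy_rain_score(features: Sequence[float]) -> float:
    """Hypothetical score with an interaction term (not a fitted model)."""
    precipitation, humidity, wind = features
    return (0.6 * precipitation + 0.3 * humidity
            + 0.1 * precipitation * humidity - 0.05 * wind)


x = [1.0, 0.8, 0.5]          # one day's scaled measurements (made up)
baseline = [0.0, 0.5, 0.5]   # reference input (made up)
phi = shapley_values(toy_rain_score, x, baseline)
```

The contributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that makes per-feature explanations like those in the abstract possible.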

Vol. 9 Issue. 2 PP. 37-53, (2025)

Multi-Classification of Brain Tumor MRI Images Hybrid VGG16 Support Vector Machine Model

Asifa Iqbal

Brain tumor research is essential for timely diagnosis and appropriate treatment. Identifying tumors is difficult because tumor morphology varies in size, location, and surface texture, and visual features are inconsistent across medical imaging modalities. This research applies a hybrid methodology to detect brain tumors from MRI images. The model uses three publicly accessible datasets containing 3,966 T1-weighted contrast-enhanced magnetic resonance images (T1-w MRI) divided into glioma, meningioma, pituitary tumor, and no-tumor classes. The pipeline starts with preprocessing and data augmentation, which improve data quality and increase variability. The core of the system uses a VGG16 deep convolutional neural network as a feature extractor and a Support Vector Machine (SVM) as the classifier. The output of the modified VGG16 serves as the SVM input, which delivered the best results while keeping computation time reasonable. According to the experimental results, the proposed hybrid model outperforms the existing methods from the literature that were analyzed. The model reached a success rate of 97.2% on the test set. Comparisons with standard machine learning classifiers (XGBoost, AdaBoost, Decision Tree, and K-Nearest Neighbors) show that using SVM as the final classifier improves performance on this dataset.

Vol. 9 Issue. 2 PP. 54-71, (2025)

LightGBM-Driven Earthquake Magnitude Prediction: A Comparative Machine Learning Framework Using Global Seismic Data

Nima Khodadadi

Earthquakes are among the most destructive natural hazards, causing severe damage to entire communities and loss of life. Because seismic events strike unpredictably and cause massive economic losses, researchers have worked for decades to develop better forecasting tools. Classical seismological methods and statistical and physical earthquake models do not adequately capture the complex spatial and temporal characteristics of earthquake data. Machine learning (ML) methods have attracted wide interest for prediction because they learn from large data collections and can produce accurate results without explicit physical rules. This work examines several ML models that predict earthquake magnitude using an open-access global earthquake dataset from 2023. Five predictive models are evaluated: Light Gradient Boosting Machine (LightGBM), Support Vector Regression (SVR), K-Nearest Neighbors (KNN), Ridge Regression, and Extra Trees Regressor. Training used stratified cross-validation and hyperparameter optimization for every model. Performance was assessed with Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Bias Error (MBE), Coefficient of Determination ($R^2$), Nash–Sutcliffe Efficiency (NSE), Willmott Index (WI), Pearson's Correlation Coefficient ($r$), and Relative Root Mean Squared Error (RRMSE). LightGBM outperformed all other evaluated models, attaining the lowest MSE of 0.0474 and an $R^2$ score of 0.9241. LightGBM's leaf-wise tree growth, scalability, and built-in regularization enabled it to generalize well to unseen data without reducing computational speed.
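The regression metrics listed above can be computed as in the following sketch. It assumes commonly used definitions (for example, MBE as the mean of predicted minus observed, and RRMSE as RMSE divided by the observed mean); the paper may use variants, and the magnitudes below are made-up values for illustration.

```python
import math
from typing import Dict, Sequence


def regression_metrics(obs: Sequence[float],
                       pred: Sequence[float]) -> Dict[str, float]:
    """Compute common regression metrics for observed vs predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    errors = [p - o for o, p in zip(obs, pred)]

    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((o - mean_o) ** 2 for o in obs)      # total sum of squares
    ss_pred = sum((p - mean_p) ** 2 for p in pred)

    mse = ss_res / n
    rmse = math.sqrt(mse)
    r2 = 1.0 - ss_res / ss_tot
    # Willmott index of agreement (original squared-difference form).
    wi_denominator = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                         for o, p in zip(obs, pred))
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))

    return {
        "MSE": mse,
        "RMSE": rmse,
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
        "R2": r2,
        "NSE": r2,  # same formula as R2 under the 1 - SS_res/SS_tot definition
        "WI": 1.0 - ss_res / wi_denominator,
        "r": covariance / math.sqrt(ss_tot * ss_pred),
        "RRMSE": rmse / mean_o,
    }


# Hypothetical observed and predicted magnitudes.
observed = [4.5, 5.0, 5.5, 6.0, 6.5]
predicted = [4.7, 4.9, 5.6, 5.8, 6.6]
scores = regression_metrics(observed, predicted)
```

Note that under this definition NSE and $R^2$ coincide, so reporting both adds information only if $R^2$ is instead taken as the squared Pearson correlation.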
The experimental outcomes validate LightGBM as a powerful tool for recognizing subtle patterns in high-dimensional seismic data, with potential use as a predictive modeling instrument in earthquake-prone zones. The results indicate that ML-based forecasting systems have the capability to change earthquake prediction practice. Combined with other advanced ML systems, LightGBM can enhance real-time early warning systems, leading to shorter emergency response times, better planning decisions, and fewer human and economic losses from earthquakes. Together with open-access datasets, this approach supports seismic risk mitigation through transparency, collaborative innovation, and reproducible modeling strategies.

Vol. 9 Issue. 2 PP. 72-87, (2025)

Journal of Artificial Intelligence and Metaheuristics
