Earthworm Optimization with Deep Transfer Learning Enabled Aerial Image Classification Model in IoT Enabled UAV Networks
Dr.R.PANDI SELVAM
Assistant Professor & Head, PG Department of Computer Science,
Vidhyaa Giri College of Arts & Science
Puduvayal, Karaikudi- 630 108,
Sivaganga District, Tamilnadu, India.
pandiselvamraman@gmail.com
Abstract
Unmanned aerial vehicles (UAVs) can be deployed effectively to offer high-quality services for Internet of Things (IoT) networks. They find use in several applications such as smart cities, smart healthcare, surveillance, environment monitoring, and disaster management. Classification of images captured by UAV networks, i.e., aerial image classification, is a challenging task that can be addressed by artificial intelligence (AI) techniques. Therefore, this article presents an Earthworm Optimization with Deep Transfer Learning Enabled Aerial Image Classification (EWODTL-AIC) model for IoT enabled UAV networks. The major intention of the EWODTL-AIC technique is to effectively categorize the different classes of aerial images captured by UAVs. The EWODTL-AIC technique initially employs the AlexNet model as a feature extractor for producing optimal feature vectors. Next, the hyperparameter values of the AlexNet model are tuned by the earthworm optimization (EWO) algorithm. Finally, the extreme gradient boosting (XGBoost) model is employed for the classification of aerial images. The experimental validation of the EWODTL-AIC model is performed on the benchmark UCM dataset. The comparative analysis shows that the EWODTL-AIC technique reaches an accuracy of 98.81% and outperforms the existing techniques considered.
Keywords: Unmanned aerial vehicles, Internet of things, Aerial images, Image classification, Deep learning, Parameter optimization.
Introduction
In recent years, the Internet of Things (IoT) has become an active research topic and has received considerable attention from researchers because of the wide range of services and applications it offers. At the same time, cloud computing (CC) technologies help to support IoT applications and provide benefits such as low latency, location awareness, and scalability [1]. Unmanned aerial vehicle (UAV) technology has also developed considerably and is used in many application areas. UAVs provide fast, cost-effective, and safe solutions for many civil and military applications [2]. The popularity of autonomous UAVs and their applications, including search and rescue operations, surveillance, and detection tasks, has grown enormously in recent years. UAVs integrated into IoT systems are capable of delivering IoT services which, combined with cloud computing and deep learning methods, can advance the capacity for automated operations [23]. There has been considerable effort toward smart flying IoT devices. However, computer vision and deep learning enabled algorithms for UAV enabled IoT networks have not yet been adequately realized, for example for the geospatial mapping of urban and rural areas, which is particularly relevant to smart city development. Once aerial images are acquired, they undergo aerial image classification.
The images are divided into subregions that cover several ground objects and a variety of land cover types carrying dissimilar semantic class labels. Hence, aerial image classification is an important process for real-world applications such as computer cartography, urban planning, remote sensing, and resource management [4]. In general, some identical object classes or land cover types are shared by many scenes. For instance, commercial and residential are two main scene classes that may both contain roads, buildings, and trees. However, these two scene classes differ in the spatial distribution and density of the three object classes. Aerial image classification therefore depends on structural and spatial patterns that are easily confused, which makes it a challenging problem [5]. Deep learning (DL) [7] is highly beneficial in solving conventional problems such as object recognition and detection and natural language processing (NLP).
Kyrkou et al. [11] focused on the design of an efficient image classifier for aerial images in UAV networks, intended for emergency response applications. In particular, a dedicated dataset is presented and a comparison study is made with recent methods. In that work, a new DL model called EmergencyNet is devised to classify images in emergency situations. Next, Hua et al. [12] proposed a class-wise attention based convolutional and bidirectional LSTM (BiLSTM) model to classify aerial images. It comprises three major elements: feature extraction, a class attention layer, and a BiLSTM model. Specifically, the feature extractor is designed to derive fine-grained semantic feature maps, whereas the attention layer aims to capture the distinct class features.
In [13], a CNN model was developed for the classification of aerial images into four class labels, namely tree, grass, bare land, and others. It learns the spectral and spatial features of the images from the provided ground truth images. Lin et al. [14] proposed an effective multilabel classifier to categorize aerial images. It first employs labelled correction for particular datasets together with ConceptNet. Second, a graph neural network (GNN) is utilized in a dedicated technique known as the multi-label concept graph (ML-CG), which aims to construct a concept graph that defines the semantic correlation across all datasets. Finally, in [15], an AI enabled model was developed for the automatic identification of litter of distinct sizes on beaches. It makes use of a DL model that performs pixel-wise categorization (semantic segmentation) of beach images.
This article presents an Earthworm Optimization with Deep Transfer Learning Enabled Aerial Image Classification (EWODTL-AIC) model for IoT enabled UAV networks. The EWODTL-AIC technique initially employs the AlexNet model as a feature extractor for producing optimal feature vectors. Next, the hyperparameter values of the AlexNet model are tuned by the earthworm optimization (EWO) algorithm. Finally, the extreme gradient boosting (XGBoost) model is employed for the classification of aerial images. The experimental validation of the EWODTL-AIC model is performed on a benchmark dataset, and the comparative analysis shows better outcomes for the EWODTL-AIC technique than for the existing techniques considered.
The Proposed Model
In this study, a new EWODTL-AIC model has been developed for IoT enabled UAV networks. The major intention of the EWODTL-AIC technique is to effectively categorize the different classes of aerial images captured by UAVs. The EWODTL-AIC technique follows a three-stage process, namely AlexNet based feature extraction, EWO based hyperparameter tuning, and XGBoost based classification. Fig. 1 shows the overall process of the EWODTL-AIC technique.
Fig. 1. Overall process of EWODTL-AIC technique
AlexNet based feature extraction
At the initial stage, the AlexNet model is applied as a feature extractor for producing optimal feature vectors. AlexNet was the first deep architecture to popularize convolutional networks in computer vision, and it considerably improved classification performance on the ImageNet ILSVRC2012 challenge compared with conventional techniques. The AlexNet model [16] comprises eight learnable layers: five convolution layers and three fully connected (FC) layers. The convolution and FC layers use the rectified linear unit (ReLU) as the nonlinear activation. Adding nonlinearity through ReLU helps the CNN train faster. The first and second convolution layers are followed by local response normalization (LRN) and max-pooling layers, whereas only a max-pooling layer is used after the fifth convolution layer. The first convolution layer has 96 kernels of size 11 × 11 with a stride of four pixels. Fig. 2 depicts the framework of AlexNet.
The stride of the remaining convolution layers is fixed to one pixel. The second layer has 256 kernels of size 5 × 5. The third, fourth, and fifth layers have 384, 384, and 256 kernels of size 3 × 3, respectively. The max-pooling layers apply nonlinear down-sampling: they retain the dominant features and reduce the number of parameters that the network needs to learn, which lowers its computational cost. The first two FC layers contain 4096 neurons each and the final one contains 1000 neurons, matching the number of ImageNet classes. Each neuron in an FC layer is connected to every neuron in the preceding layer. The final layer provides the classification output through softmax, which is a well-established way of representing a categorical distribution. The softmax function, mainly used in the output layer, is a normalized exponential of the output values [16]. The function is differentiable and expresses the outputs as probabilities. Furthermore, the exponential component raises the probability of the maximal value:
$$ o_i = \frac{e^{z_i}}{\sum_{j=1}^{M} e^{z_j}} \tag{1} $$
where $o_i$ denotes the $i$th softmax output, $z_i$ represents the $i$th output before the softmax, and $M$ indicates the total number of output nodes.
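As a small illustration of Eq. (1), the following Python sketch computes the softmax of a list of pre-activation outputs. The function name and the example values are illustrative only; subtracting the maximum before exponentiation is a common numerical safeguard and is not part of the formula itself.

```python
import math

def softmax(z):
    """Return the softmax probabilities o_i for the pre-activation outputs z_i."""
    # Subtracting max(z) leaves the result unchanged but avoids overflow in exp().
    z_max = max(z)
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values for M = 3 output nodes.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The outputs sum to one and preserve the ordering of the inputs, with the largest input receiving a disproportionately large share of the probability mass.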
Fig. 2. Framework of AlexNet
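To make the layer description concrete, the sketch below traces the feature-map sizes through the convolution and pooling stages. The kernel counts, kernel sizes, and the stride of four follow the text; the 227 × 227 input size, the paddings, and the 3 × 3 pooling windows with stride two are assumptions taken from the commonly used AlexNet configuration and are not stated in this article.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer on an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, output channels); None keeps the channel count (pooling).
# Paddings and pooling settings are ASSUMED from the common AlexNet setup.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]

def trace_shapes(input_size=227, channels=3):
    """Return (name, channels, height, width) after every layer in LAYERS."""
    shapes = []
    size = input_size
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes

shapes = trace_shapes()
feature_length = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
for name, c, h, w in shapes:
    print(f"{name}: {c} x {h} x {w}")
print("flattened feature length:", feature_length)
```

Under these assumptions the last pooling layer yields a 256 × 6 × 6 map, i.e., a 9216-dimensional vector that feeds the first FC layer.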
EWO based Hyperparameter Tuning Model
Next, the hyperparameter values of the AlexNet model are tuned by the EWO algorithm [17], which helps to improve the classifier results. The EWO approach is inspired by the reproduction process of earthworms (EWs) and is used to solve optimization problems [17]. It is based on the following rules: (i) every EW in the population produces offspring through two, and only two, kinds of reproduction; (ii) each child EW contains genes of the same length as its parent EW; (iii) the best EW individuals of the previous generation are passed directly to the next iteration without alteration. In Reproduction 1, a single parent EW generates a child EW by itself. Reproduction 1 is defined as follows:
$$ u_{i1,k} = u_{\max,k} + u_{\min,k} - \alpha\, u_{i,k} \tag{2} $$
Eq. (2) describes how the $k$th component of child EW $i1$ is generated from parent EW $i$. Here $u_{i1,k}$ and $u_{i,k}$ are the $k$th components of EWs $i1$ and $i$, $u_{\max,k}$ and $u_{\min,k}$ are the upper and lower bounds of the $k$th component of each EW, and $\alpha$ denotes the similarity factor within [0, 1].
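A minimal Python sketch of Reproduction 1 is given below. It applies Eq. (2) component by component; the function name, the bounds, and the value of α are illustrative assumptions.

```python
def reproduction_1(parent, lower, upper, alpha):
    """Eq. (2): build child u_i1 component-wise from a single parent u_i."""
    return [u_max + u_min - alpha * u
            for u, u_min, u_max in zip(parent, lower, upper)]

# Illustrative two-dimensional search space with bounds [0, 1] and [0, 10].
child = reproduction_1(parent=[0.2, 8.0], lower=[0.0, 0.0],
                       upper=[1.0, 10.0], alpha=1.0)
print(child)
```

With α = 1 the child is the mirror image of the parent inside the search bounds, so a parent near one bound produces a child near the opposite bound.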
Reproduction 2 uses an improved kind of crossover operator. The number of parent EWs ($N$) is an integer greater than 1. Here, uniform crossover is performed with $N = 2$ parents and $M = 1$ offspring. The two parent EWs $P_1$ and $P_2$ are chosen by roulette wheel selection:
$$ P = \begin{bmatrix} P_1 \\ P_2 \end{bmatrix} \tag{3} $$
First, two offspring $u_{12}$ and $u_{22}$ are produced from the two parents. Let $rand$ denote a random number within [0, 1]. The $k$th components of $u_{12}$ and $u_{22}$ are generated as follows. When $rand > 0.5$,
$$ u_{12,k} = P_{1,k}, \qquad u_{22,k} = P_{2,k} \tag{4} $$
Otherwise,
$$ u_{12,k} = P_{2,k}, \qquad u_{22,k} = P_{1,k} \tag{5} $$
Finally, the EW $u_{i2}$ produced by Reproduction 2 is determined. Let $rand_1$ be another random number within [0, 1]:
$$ u_{i2} = \begin{cases} u_{12}, & \text{if } rand_1 < 0.5 \\ u_{22}, & \text{otherwise} \end{cases} \tag{6} $$
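Eqs. (3) to (6) can be sketched in Python as follows. The sketch assumes that fitness values are positive and that larger is better, so that roulette wheel selection can use them directly as weights; for a minimization problem they would first have to be transformed. All function names are illustrative.

```python
import random

def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its (positive) fitness."""
    pick = rng.uniform(0.0, sum(fitness))
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if pick <= running:
            return individual
    return population[-1]  # guard against floating-point round-off

def uniform_crossover(p1, p2, rng):
    """Eqs. (4)-(5): each component is taken from P1 or P2 with equal probability."""
    u12, u22 = [], []
    for a, b in zip(p1, p2):
        if rng.random() > 0.5:
            u12.append(a)
            u22.append(b)
        else:
            u12.append(b)
            u22.append(a)
    return u12, u22

def reproduction_2(population, fitness, rng):
    """Eqs. (3)-(6) with N = 2 parents and M = 1 offspring."""
    p1 = roulette_select(population, fitness, rng)
    p2 = roulette_select(population, fitness, rng)
    u12, u22 = uniform_crossover(p1, p2, rng)
    return u12 if rng.random() < 0.5 else u22

rng = random.Random(0)
population = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [5.0, 5.0, 5.0]]
fitness = [0.5, 0.3, 0.2]
child = reproduction_2(population, fitness, rng)
print(child)
```

Passing the random generator explicitly keeps the sketch reproducible; the two intermediate offspring are always complementary, so every parent gene appears in exactly one of them.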
After the EWs $u_{i1}$ and $u_{i2}$ have been generated, the EW $u_i'$ for the next generation is calculated by
$$ u_i' = \beta\, u_{i1} + (1 - \beta)\, u_{i2} \tag{7} $$
where $\beta$ is the proportional factor. It controls the proportion of $u_{i1}$ and $u_{i2}$ so that the global and local search capabilities are kept in balance:
$$ \beta^{t+1} = \gamma\, \beta^{t} \tag{8} $$
where $t$ denotes the current generation. Initially, at $t = 0$, $\beta = 1$. $\gamma$ is a constant that acts as a cooling factor. The solutions also need to escape from local optima. Hence, Cauchy mutation (CM) is performed, which improves the search capability of EWO. The CM operator is defined by
$$ W_k = \frac{\sum_{i=1}^{N_{pop}} u_{i,k}}{N_{pop}} \tag{9} $$
Here, $W_k$ is the weight for the $k$th element, averaged over the population, and $N_{pop}$ is the population size. The $k$th element of the final EW is
$$ u_{i,k}'' = u_{i,k}' + W_k \cdot Cd \tag{10} $$
where $Cd$ is a random value drawn from the Cauchy distribution with $\tau = 1$, and $\tau$ is the scaling parameter.
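The combination step, the cooling schedule, and the Cauchy mutation of Eqs. (7) to (10) can be sketched as below. The value γ = 0.9, the population, and the choice to draw a fresh Cauchy value for every component are assumptions made for illustration; clipping the mutated EW back into the search bounds is omitted.

```python
import math
import random

def combine_offspring(u_i1, u_i2, beta):
    """Eq. (7): weighted sum of the offspring from Reproduction 1 and 2."""
    return [beta * a + (1.0 - beta) * b for a, b in zip(u_i1, u_i2)]

def cool(beta, gamma):
    """Eq. (8): shrink the proportional factor for the next generation."""
    return gamma * beta

def cauchy_sample(rng, tau=1.0):
    """Draw from a zero-centred Cauchy distribution with scale tau (inverse CDF)."""
    return tau * math.tan(math.pi * (rng.random() - 0.5))

def cauchy_mutation(candidate, population, rng, tau=1.0):
    """Eqs. (9)-(10): perturb each component by W_k times a Cauchy value."""
    n_pop = len(population)
    mutated = []
    for k, value in enumerate(candidate):
        w_k = sum(ind[k] for ind in population) / n_pop  # Eq. (9)
        mutated.append(value + w_k * cauchy_sample(rng, tau))  # Eq. (10)
    return mutated

rng = random.Random(42)
population = [[0.2, 0.8], [0.6, 0.4], [1.0, 0.0]]
beta, gamma = 1.0, 0.9  # beta starts at 1 (per the text); gamma = 0.9 is assumed
child = combine_offspring([0.8, 0.2], [0.4, 0.6], beta)
child = cauchy_mutation(child, population, rng)
beta = cool(beta, gamma)
print(child, beta)
```

Because β starts at 1 and decays geometrically, early generations rely mostly on Reproduction 1 and later generations increasingly on the crossover offspring, while the heavy tails of the Cauchy distribution occasionally produce large jumps.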
XGBoost based Classification
At the last stage, the XGBoost [18] model is employed for the classification of aerial images. XGBoost has recently shown its advantages in ML studies. These include the strong performance of the boosting approach, efficient handling of sparse data, and support for distributed and parallel computation. For a given $n \times m$ feature matrix of training data, the predictor uses $K$ additive functions to produce the ensemble result:
$$ \hat{y}_i = F(x_i) = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \phi \tag{11} $$
where $x_i$ denotes the $i$th sample ($i = 1, 2, \ldots, n$), $\phi = \{ f(x) = w_{s(x)} \}$ ($s: \mathbb{R}^m \to \{1, \ldots, Tr\}$, $w \in \mathbb{R}^{Tr}$) denotes the space of trees, each tree $f(x)$ is defined by its structure $s$ and its leaf weights $w$, $w_j$ indicates the weight of the $j$th leaf, $Tr$ is the number of leaves in the tree, $K$ is the number of trees used in the ensemble, and $\hat{y}_i$ is the predicted label.
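The additive form of Eq. (11) can be illustrated with a toy ensemble of depth-one trees. The stumps, thresholds, and leaf weights below are made-up values; a real XGBoost model learns them from data.

```python
def make_stump(feature, threshold, w_left, w_right):
    """A depth-1 tree: the structure s(x) routes x to a leaf and its weight is returned."""
    def f(x):
        return w_left if x[feature] < threshold else w_right
    return f

def predict(trees, x):
    """Eq. (11): the ensemble output is the sum of the K tree outputs."""
    return sum(f(x) for f in trees)

# K = 2 illustrative trees acting on a 2-dimensional feature vector.
trees = [make_stump(0, 0.5, -1.0, 1.0), make_stump(1, 2.0, 0.25, -0.25)]
score = predict(trees, [0.7, 1.0])
print(score)
```

For classification, the summed score is a raw margin that is subsequently mapped to class probabilities.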
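At iteration $t$, the tree $f_t$ is chosen to minimize the second-order approximation of the objective:
$$ L^{(t)} = \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \tag{12} $$
$$ g_i = \partial_{\hat{y}^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right) \tag{13} $$
$$ h_i = \partial^2_{\hat{y}^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right) \tag{14} $$
$$ \Omega(f_t) = \gamma\, Tr + \frac{1}{2} \lambda \lVert w \rVert^2 \tag{15} $$
Here $g_i$ and $h_i$ denote the first- and second-order gradient statistics of the loss function $l(\cdot)$. The last term $\Omega(f_t)$ is the penalty, where $\gamma$ and $\lambda$ are the parameters that control the complexity of the tree. This regularization term is applied to avoid overfitting by smoothing the final learned weights.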
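The quantities in Eqs. (12) to (15) can be evaluated directly once a loss function is fixed. The sketch below assumes the binary logistic loss, which is not specified in this article, and uses illustrative function names; the constant term $l(y_i, \hat{y}^{(t-1)})$ is dropped because it does not depend on $f_t$.

```python
import math

def logistic_grad_hess(y, y_hat_prev):
    """Eqs. (13)-(14) for the logistic loss, label y in {0, 1}, raw score y_hat_prev."""
    p = 1.0 / (1.0 + math.exp(-y_hat_prev))
    return p - y, p * (1.0 - p)

def regularizer(leaf_weights, gamma, lam):
    """Eq. (15): gamma * (number of leaves) + 0.5 * lambda * ||w||^2."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)

def second_order_objective(ys, prev_scores, tree_outputs, leaf_weights, gamma, lam):
    """Eq. (12) without the constant loss terms that do not depend on f_t."""
    total = 0.0
    for y, s, f in zip(ys, prev_scores, tree_outputs):
        g, h = logistic_grad_hess(y, s)
        total += g * f + 0.5 * h * f * f
    return total + regularizer(leaf_weights, gamma, lam)

# One positive sample with previous raw score 0 (probability 0.5).
g, h = logistic_grad_hess(1, 0.0)
objective = second_order_objective([1], [0.0], [1.0], [1.0], gamma=0.0, lam=0.0)
print(g, h, objective)
```

A negative objective value means that adding the candidate tree lowers the approximated loss; larger γ and λ make such improvements harder to obtain and therefore favour simpler trees.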
$$ L_{split} = \frac{1}{2} \left[ \frac{\left(\sum_{i \in I_L} g_i\right)^2}{\sum_{i \in I_L} h_i + \lambda} + \frac{\left(\sum_{i \in I_R} g_i\right)^2}{\sum_{i \in I_R} h_i + \lambda} - \frac{\left(\sum_{i \in I} g_i\right)^2}{\sum_{i \in I} h_i + \lambda} \right] - \gamma \tag{16} $$
where $I = I_L \cup I_R$, and $I_L$ and $I_R$ denote the sample sets of the left and right nodes after the split. The importance of a split variable in a tree is obtained from the relative variable importance used with the XGBoost technique:
$$ \hat{I}_j^2(Tr) = \sum_{t=1}^{J-1} \hat{i}_t^2\, \mathbb{1}(v_t = j) \tag{17} $$
Here $Tr$ denotes a single tree, the sum runs over its $J - 1$ internal nodes, $\mathbb{1}(\cdot)$ is the indicator function, $v_t$ represents the split variable associated with node $t$, and $\hat{i}_t^2$ represents the empirical improvement in squared error produced by the split, defined as:
$$ \hat{i}_t^2 = i^2(R_l, R_r) = \frac{w_l w_r}{w_l + w_r} \left(\bar{y}_l - \bar{y}_r\right)^2 \tag{18} $$
Here $\bar{y}_l$ and $\bar{y}_r$ are the response means of the left and right child nodes of $t$, and $w_l$ and $w_r$ are the corresponding sums of weights. For a collection of decision trees $\{Tr_m\}_1^M$ obtained by boosting, the importance is generalized by averaging over all the trees in the sequence:
$$ \hat{I}_j^2 = \frac{1}{M} \sum_{m=1}^{M} \hat{I}_j^2(Tr_m) \tag{19} $$
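The split gain and the variable importance can be sketched in a few lines of Python. The gain follows the standard XGBoost form, in which the parent-node term is subtracted from the sum of the two child terms, and the improvement of Eq. (18) uses the difference of the child means. The representation of a tree as a list of (split variable, improvement) pairs is a simplification chosen for illustration.

```python
def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Eq. (16): gain of splitting node I into I_L and I_R."""
    def score(gs, hs):
        return sum(gs) ** 2 / (sum(hs) + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma

def squared_improvement(w_l, w_r, mean_l, mean_r):
    """Eq. (18): improvement in squared error produced by one split."""
    return w_l * w_r / (w_l + w_r) * (mean_l - mean_r) ** 2

def tree_importance(splits, n_features):
    """Eq. (17): add up the improvements of all splits made on each variable j."""
    scores = [0.0] * n_features
    for feature, improvement in splits:
        scores[feature] += improvement
    return scores

def ensemble_importance(trees, n_features):
    """Eq. (19): average the per-tree importances over the M trees."""
    per_tree = [tree_importance(splits, n_features) for splits in trees]
    return [sum(column) / len(trees) for column in zip(*per_tree)]

# Two samples go left and two go right; gradients have opposite signs.
gain = split_gain([-1.0, -1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], lam=0.0, gamma=0.0)
# Two illustrative trees described by (split variable, improvement) pairs.
importance = ensemble_importance([[(0, 2.0), (1, 1.0)], [(0, 4.0)]], n_features=2)
print(gain, importance)
```

A split is only worthwhile when its gain is positive, so γ acts as the minimum improvement required before a node is divided.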
Results and Discussion
This section examines the aerial image classification outcomes of the EWODTL-AIC model using the UCM dataset [19]. The results are inspected under various measures. A few sample images are illustrated in Fig. 3.
Table 1 and Fig. 4 compare the accuracy of the EWODTL-AIC model with that of recent CNN models. The results indicate that the VGG-16 and PlacesNet models attain the lowest accuracy values of 80.84% and 77.83%, respectively. The CaffeNet model achieves a slightly higher accuracy of 84.88%, and the AlexNet model a moderately higher accuracy of 89.38%. Although the SSCapsNet model reaches a reasonable accuracy of 94.17%, the EWODTL-AIC model offers the best performance with an accuracy of 98.81%.
Fig. 3. Sample images
Table 1 Comparative analysis of EWODTL-AIC technique with CNN Methods
Methods Accuracy (%)
EWODTL-AIC 98.81
SSCapsNet 94.17
AlexNet 89.38
CaffeNet 84.88
VGG-16 80.84
PlacesNet 77.83
Fig. 4. Comparative analysis of EWODTL-AIC technique with CNN Methods
Table 2 and Fig. 5 compare the precision, recall, and F-score of the EWODTL-AIC method with those of existing approaches. The results indicate that the VGGRBFNN and CAGNLSTM techniques attain the lowest values of precision, recall, and F-score. The RNet-50 model achieves slightly higher precision, recall, and F-score values of 88.14%, 89.20%, and 86.41%, respectively. The CARNLSTM model achieves moderately higher precision, recall, and F-score values of 90.37%, 91.85%, and 90.46%, respectively. Although the SSCapsNet model reaches reasonable precision, recall, and F-score values of 93.90%, 95.36%, and 95.08%, the EWODTL-AIC model offers the best performance with precision, recall, and F-score values of 98.24%, 98.48%, and 98.86%, respectively.
Table 2 Comparative analysis of EWODTL-AIC technique with existing approaches
Methods Precision (%) Recall (%) F-score (%)
EWODTL-AIC 98.24 98.48 98.86
SSCapsNet 93.90 95.36 95.08
CARNLSTM 90.37 91.85 90.46
RNet-50 88.14 89.20 86.41
CAGNLSTM 83.57 84.76 82.75
VGGRBFNN 80.66 82.75 79.88
Fig. 5. Comparative analysis of EWODTL-AIC technique with existing approaches
Table 3 and Fig. 6 compare the accuracy of the EWODTL-AIC model with that of other recent methods. The results indicate that the SCK and SPM models attain the lowest accuracy values of 83.74% and 78.31%, respectively. The SC-Pooling model achieves a slightly higher accuracy of 87.90%, and the MOPSO model a moderately higher accuracy of 91.98%. Although the SSCapsNet model reaches a reasonable accuracy of 94.87%, the EWODTL-AIC model offers the best performance with an accuracy of 98.81%.
Fig. 6. Comparative analysis of EWODTL-AIC technique with recent methods
Table 3 Comparative analysis of EWODTL-AIC technique with recent Methods
Methods Accuracy (%)
EWODTL-AIC 98.81
SSCapsNet 94.87
MOPSO 91.98
SC-Pooling 87.90
SCK Model 83.74
SPM Model 78.31
Finally, the running time (RT) of the EWODTL-AIC model is compared with that of recent methods in Fig. 7 and Table 4. The results show that the PlacesNet approach has the highest RT of 109.02 s. The VGG-16 and CaffeNet models have slightly lower RTs of 99.79 s and 92.65 s, respectively. The AlexNet and SSCapsNet models reach reasonable RTs of 82.15 s and 75.42 s, respectively. However, the EWODTL-AIC model achieves the best result with the minimal RT of 69.47 s. From the detailed results and discussion, it is evident that the EWODTL-AIC model obtains the best aerial image classification performance among the compared methods.
Table 4 Running time analysis of EWODTL-AIC technique with recent approaches
Methods Running Time (sec)
EWODTL-AIC 69.47
SSCapsNet 75.42
AlexNet 82.15
CaffeNet 92.65
VGG-16 99.79
PlacesNet 109.02
Fig. 7. Running time analysis of EWODTL-AIC technique with recent approaches
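The running-time margins implied by Table 4 can be made explicit with a few lines of arithmetic. The values below are copied from Table 4; the helper function is illustrative.

```python
# Running times in seconds, copied from Table 4.
running_time = {
    "EWODTL-AIC": 69.47,
    "SSCapsNet": 75.42,
    "AlexNet": 82.15,
    "CaffeNet": 92.65,
    "VGG-16": 99.79,
    "PlacesNet": 109.02,
}

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

proposed = running_time["EWODTL-AIC"]
for method, seconds in running_time.items():
    if method != "EWODTL-AIC":
        print(f"{method}: {relative_reduction(seconds, proposed):.1f}% lower running time")
```

The reduction ranges from roughly 8% relative to SSCapsNet to roughly 36% relative to PlacesNet.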
Conclusion
In this study, a novel EWODTL-AIC model has been developed for IoT enabled UAV networks. The major intention of the EWODTL-AIC technique is to effectively categorize the different classes of aerial images captured by UAVs. The EWODTL-AIC technique follows a three-stage process, namely AlexNet based feature extraction, EWO based hyperparameter tuning, and XGBoost based classification. At the initial stage, the AlexNet model is applied as a feature extractor for producing optimal feature vectors. Next, the hyperparameter values of the AlexNet model are tuned by the EWO algorithm, which helps to improve the classifier results. At the last stage, the XGBoost model is employed for the classification of aerial images. The experimental validation of the EWODTL-AIC model was performed on the benchmark UCM dataset, where it reached an accuracy of 98.81% and outperformed the existing techniques considered. In future, advanced DL models can be utilized to improve the classifier results.
References
Kyrkou, C. and Theocharides, T., 2019, June. Deep-Learning-Based Aerial Image Classification for Emergency Response Applications Using Unmanned Aerial Vehicles. In CVPR Workshops (pp. 517-525).
Boursianis, A.D., Papadopoulou, M.S., Diamantoulakis, P., Liopa-Tsakalidi, A., Barouchas, P., Salahas, G., Karagiannidis, G., Wan, S. and Goudos, S.K., 2020. Internet of things (IoT) and agricultural unmanned aerial vehicles (UAVs) in smart farming: a comprehensive review. Internet of Things, p.100187.
Islam, N., Rashid, M.M., Pasandideh, F., Ray, B., Moore, S. and Kadel, R., 2021. A review of applications and communication technologies for internet of things (Iot) and unmanned aerial vehicle (uav) based sustainable smart farming. Sustainability, 13(4), p.1821.
Zhang, H. and Hanzo, L., 2020. Federated learning assisted multi-UAV networks. IEEE Transactions on Vehicular Technology, 69(11), pp.14104-14109.
Gebrehiwot, A., Hashemi-Beni, L., Thompson, G., Kordjamshidi, P. and Langan, T.E., 2019. Deep convolutional neural network for flood extent mapping using unmanned aerial vehicles data. Sensors, 19(7), p.1486.
Li, Y., Qian, M., Liu, P., Cai, Q., Li, X., Guo, J., Yan, H., Yu, F., Yuan, K., Yu, J. and Qin, L., 2019. The recognition of rice images by UAV based on capsule network. Cluster Computing, 22(4), pp.9515-9524.
Munawar, H.S., Ullah, F., Qayyum, S., Khan, S.I. and Mojtahedi, M., 2021. UAVs in disaster management: Application of integrated aerial imagery and convolutional neural network for flood detection. Sustainability, 13(14), p.7547.
Cai, W., Wei, Z., Song, Y., Li, M. and Yang, X., 2021. Residual-capsule networks with threshold convolution for segmentation of wheat plantation rows in UAV images. Multimedia Tools and Applications, 80(21), pp.32131-32147.
Saha, A.K., Saha, J., Ray, R., Sircar, S., Dutta, S., Chattopadhyay, S.P. and Saha, H.N., 2018, January. IOT-based drone for improvement of crop quality in agricultural field. In 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 612-615). IEEE.
Haq, M.A., Rahaman, G., Baral, P. and Ghosh, A., 2021. Deep learning based supervised image classification using UAV images for forest areas classification. Journal of the Indian Society of Remote Sensing, 49(3), pp.601-606.
Kyrkou, C. and Theocharides, T., 2020. Emergencynet: Efficient aerial image classification for drone-based emergency monitoring using atrous convolutional feature fusion. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 13, pp.1687-1699.
Hua, Y., Mou, L. and Zhu, X.X., 2019. Recurrently exploring class-wise attention in a hybrid convolutional and bidirectional LSTM network for multi-label aerial image classification. ISPRS journal of photogrammetry and remote sensing, 149, pp.188-199.
Kareem, R.S.A., Ramanjineyulu, A.G., Rajan, R., Setiawan, R., Sharma, D.K., Gupta, M.K., Joshi, H., Kumar, A., Harikrishnan, H. and Sengan, S., 2021. Multilabel land cover aerial image classification using convolutional neural networks. Arabian Journal of Geosciences, 14(17), pp.1-18.
Lin, D., Lin, J., Zhao, L., Wang, Z.J. and Chen, Z., 2021. Multilabel aerial image classification with a concept attention graph neural network. IEEE Transactions on Geoscience and Remote Sensing, 60, pp.1-12.
Hidaka, M., Matsuoka, D., Sugiyama, D. and Murakami, K., 2022. Pixel-level image classification for detecting beach litter using a deep learning approach. Marine Pollution Bulletin, 175, p.113371.
Lu, S., Wang, S.H. and Zhang, Y.D., 2021. Detection of abnormal brain in MRI via improved AlexNet and ELM optimized by chaotic bat algorithm. Neural Computing and Applications, 33(17), pp.10799-10811.
Ghosh, I. and Roy, P.K., 2019, March. Application of earthworm optimization algorithm for solution of optimal power flow. In 2019 International Conference on Opto-Electronics and Applied Optics (Optronix) (pp. 1-6). IEEE.
Ogunleye, A. and Wang, Q.G., 2019. XGBoost model for chronic kidney disease diagnosis. IEEE/ACM transactions on computational biology and bioinformatics, 17(6), pp.2131-2140.
UC Merced Land Use Dataset. http://weegee.vision.ucmerced.edu/datasets/landuse.html