A Tagging Model using Segmentation Proposal Network

Suha Dh. Athab^1*, Abdulamir A. Karim^2

^1,2Department of Computer Science, University of Technology, Baghdad, Iraq

Emails: suha.athab@gmail.com; 110004@uotechnology.edu.iq

Abstract

This paper presents a tagging model that uses the segmentation map as reference regions. The proposed model combines an encoder-decoder architecture with a proposal layer and dense layers for object tagging and segmentation. A pre-trained VGG16 encoder extracts high-level features from input images, and a decoder network then reconstructs the image. A proposal layer generates a binary map indicating the presence or absence of objects at each location in the image. The proposal map is integrated with the decoder output and refined by a convolutional layer to produce the final segmentation. Two dense layers predict object classes and bounding box coordinates. The model is trained with a custom loss function that combines categorical cross-entropy loss and mean squared error loss. Experimental results demonstrate the effectiveness of the proposed model in achieving accurate object tagging and segmentation.

Keywords: Tagging; Encoder-decoder; Semantic segmentation; Object detection
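
As an illustration of the custom loss mentioned in the abstract, the following is a minimal pure-Python sketch that adds a categorical cross-entropy term for the class prediction to a mean squared error term for the bounding box coordinates. The function names and the `class_weight` and `box_weight` parameters are assumptions made for this example; the abstract does not state how the two terms are weighted, so equal weights are used by default.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted class probabilities."""
    # Clamp probabilities at eps so that log(0) is never evaluated.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def mean_squared_error(box_true, box_pred):
    """Mean squared error over the bounding box coordinates."""
    return sum((t - p) ** 2 for t, p in zip(box_true, box_pred)) / len(box_true)


def combined_loss(y_true, y_pred, box_true, box_pred,
                  class_weight=1.0, box_weight=1.0):
    """Weighted sum of the classification and box-regression terms.

    The weights are hypothetical; equal weighting is an assumption,
    not something stated in the paper.
    """
    return (class_weight * categorical_cross_entropy(y_true, y_pred)
            + box_weight * mean_squared_error(box_true, box_pred))


# Example: three classes, one box given as four normalized coordinates.
loss = combined_loss(
    y_true=[0.0, 1.0, 0.0],
    y_pred=[0.2, 0.7, 0.1],
    box_true=[0.1, 0.2, 0.5, 0.6],
    box_pred=[0.1, 0.2, 0.5, 0.8],
)
```

In this sketch the classification term depends only on the probability assigned to the true class, while the regression term penalizes each coordinate error quadratically, so a single weight pair controls the trade-off between tagging accuracy and box localization.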