A Tagging Model using Segmentation Proposal Network

doi:https://doi.org/10.54216/FPA.130212

Full Length Article

Volume 13 • Issue 2 • PP: 136-144 • 2023

Suha Dh. Athab ¹*, Abdulamir A. Karim ¹

¹ Department of Computer Science, University of Technology, Baghdad, Iraq

* Corresponding Author.

Received: April 15, 2023 • Revised: July 26, 2023 • Accepted: October 08, 2023

Abstract

This paper presents a tagging model that uses the segmentation map as reference regions. The model combines an encoder-decoder architecture with a proposal layer and dense layers for object tagging and segmentation. A pre-trained VGG16 encoder extracts high-level features from input images, and a decoder network reconstructs the image. A proposal layer generates a binary map indicating the presence or absence of objects at each location in the image. This map is integrated with the decoder output and refined by a convolutional layer to produce the final segmentation. Two dense layers predict object classes and bounding box coordinates. The model is trained with a custom loss function that combines categorical cross-entropy loss and mean squared error loss. Experimental results demonstrate the effectiveness of the proposed model in achieving accurate object tagging and segmentation.
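The combined training objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the two terms are simply summed, and the `box_weight` balancing factor, the function names, and the sample values are hypothetical and chosen only for illustration.

```python
import math


def categorical_cross_entropy(probs, one_hot, eps=1e-12):
    """Cross-entropy between predicted class probabilities and a one-hot target."""
    # Clamp probabilities to eps so that log(0) is never evaluated.
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, one_hot))


def mean_squared_error(pred_box, true_box):
    """Mean squared error over the bounding box coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, true_box)) / len(true_box)


def combined_loss(probs, one_hot, pred_box, true_box, box_weight=1.0):
    """Classification loss plus weighted box-regression loss.

    box_weight is a hypothetical balancing factor; the abstract does not
    state whether or how the two terms are weighted.
    """
    class_term = categorical_cross_entropy(probs, one_hot)
    box_term = mean_squared_error(pred_box, true_box)
    return class_term + box_weight * box_term


# Illustrative values only: three classes and a normalized (x1, y1, x2, y2) box.
loss = combined_loss(
    probs=[0.7, 0.2, 0.1],
    one_hot=[1, 0, 0],
    pred_box=[0.10, 0.20, 0.50, 0.60],
    true_box=[0.10, 0.25, 0.55, 0.60],
)
print(f"combined loss: {loss:.4f}")
```

In a sketch like this, the cross-entropy term penalizes the class prediction of the first dense layer and the squared-error term penalizes the box coordinates of the second, so a single scalar drives both outputs during training.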

Keywords

Tagging; Encoder-decoder; Semantic segmentation; Object detection

Cite This Article

Athab, Suha Dh., and Abdulamir A. Karim. "A Tagging Model using Segmentation Proposal Network." Fusion: Practice and Applications, vol. 13, no. 2, 2023, pp. 136-144. DOI: https://doi.org/10.54216/FPA.130212
