[1] Bochkovskiy A, Wang CY, Liao HYM (2020) YOLOv4: Optimal speed and accuracy of object detection. Computer Science, Computer Vision and Pattern Recognition, arXiv preprint arXiv:2004.10934. https://doi.org/10.48550/arXiv.2004.10934
[2] Borkar S, Ghutke P, Patil W, Joshi S, Sorte S (2023) A review of pick and place robots for the pharmaceutical industry. 11th International Conference on Emerging Trends in Engineering & Technology-Signal and Information Processing (ICETET-SIP), IEEE, Nagpur, India, 1-6. DOI: 10.1109/ICETET-SIP58143.2023.10151652
[3] Cai J, Chen G, Yin J, Ding C, Suo Y, Chen J (2024) A Review of Autonomous Berthing Technology for Ships. Journal of Marine Science and Engineering 12(7): 1137. https://doi.org/10.3390/jmse12071137
[4] Cavegn S, Haala N, Nebiker S, Rothermel M, Tutzauer P (2014) Benchmarking high density image matching for oblique airborne imagery. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 40(3): 45. https://doi.org/10.5194/isprsarchives-XL-3-45-2014
[5] Chai J, Zeng H, Li A, Ngai EW (2021) Deep learning in computer vision: A critical review of emerging techniques and application scenarios. Machine Learning with Applications 6: 100134. https://doi.org/10.1016/j.mlwa.2021.100134
[6] Chen B, Ghiasi G, Liu H, Lin TY, Kalenichenko D, Adam H, Le QV (2020) MnasFPN: Learning latency-aware pyramid architecture for object detection on mobile devices. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 13607-13616
[7] Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The Cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 3213-3223
[8] Deng J, Dong W, Socher R, Li LJ, Li K, Li FF (2009) ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 248-255. DOI: 10.1109/CVPR.2009.5206848
[9] Duan K, Bai S, Xie L, Qi H, Huang Q, Tian Q (2019) CenterNet: Keypoint triplets for object detection. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 6569-6578
[10] Durlik I, Miller T, Cembrowska-Lech D, Krzemińska A, Złoczowska E, Nowak A (2023) Navigating the sea of data: a comprehensive review on data analysis in maritime IoT applications. Applied Sciences 13(17): 9742. https://doi.org/10.3390/app13179742
[11] Er MJ, Chen J, Zhang Y, Gao W (2023) Research challenges, recent advances, and popular datasets in deep learning-based underwater marine object detection: A review. Sensors 23(4): 1990. https://doi.org/10.3390/s23041990
[12] Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision 88(2): 303-338. https://doi.org/10.1007/s11263-009-0275-4
[13] Girshick R (2015) Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 1440-1448
[14] Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 580-587
[15] Hackel T, Savinov N, Ladicky L, Wegner JD, Schindler K, Pollefeys M (2017) Semantic3D.net: A new large-scale point cloud classification benchmark. Computer Science, Computer Vision and Pattern Recognition, arXiv preprint arXiv:1704.03847. https://doi.org/10.48550/arXiv.1704.03847
[16] Han X, Zhao L, Ning Y, Hu J (2021) ShipYolo: an enhanced model for ship detection. Journal of Advanced Transportation 2021(1): 1060182. https://doi.org/10.1155/2021/1060182
[17] He J, Erfani S, Ma X, Bailey J, Chi Y, Hua XS (2021) α-IoU: A family of power intersection over union losses for bounding box regression. 35th Conference on Neural Information Processing Systems, 1-13
[18] He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(9): 1904-1916. DOI: 10.1109/TPAMI.2015.2389824
[19] He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 770-778
[20] Henderson P, Ferrari V (2016) End-to-end training of object class detectors for mean average precision. Asian Conference on Computer Vision, Springer, Cham, 198-213. https://doi.org/10.1007/978-3-319-54193-8_13
[21] Howard A, Sandler M, Chen B, Wang W, Chen LC, Tan M, Chu G, Vasudevan V, Zhu Y, Pang R, Adam H, Le Q (2019) Searching for MobileNetV3. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 1314-1324
[22] Hussain M, Saher N, Qadri S (2022) Computer vision approach for liver tumor classification using CT dataset. Applied Artificial Intelligence 36(1): 2055395. https://doi.org/10.1080/08839514.2022.2055395
[23] Iancu B, Soloviev V, Zelioli L, Lilius J (2021) ABOships—An inshore and offshore maritime vessel detection dataset with precise annotations. Remote Sensing 13(5): 988. https://doi.org/10.3390/rs13050988
[24] Idrees H, Tayyab M, Athrey K, Zhang D, Al-Maadeed S, Rajpoot N, Shah M (2018) Composition loss for counting, density map estimation and localization in dense crowds. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 532-546
[25] Islam MA, Mobarak MH, Rimon MIH, Al Mahmud MZ, Ghosh J, Ahmed MMS, Hossain N (2024) Additive manufacturing in polymer research: Advances, synthesis, and applications. Polymer Testing 132: 108364. https://doi.org/10.1016/j.polymertesting.2024.108364
[26] Ismail N, Malik OA (2022) Real-time visual inspection system for grading fruits using computer vision and deep learning techniques. Information Processing in Agriculture 9(1): 24-37
[27] Jocher G (2020) YOLOv5 by Ultralytics (Version 7.0). Computer software. https://doi.org/10.5281/zenodo.3908559
[28] Karas V, Schuller DM, Schuller BW (2023) Audiovisual affect recognition for autonomous vehicles: Applications and future agendas. IEEE Transactions on Intelligent Transportation Systems 25(6): 4918-4932. DOI: 10.1109/TITS.2023.3333749
[29] Kaur R, Singh S (2023) A comprehensive review of object detection with deep learning. Digital Signal Processing 132: 103812. https://doi.org/10.1016/j.dsp.2022.103812
[30] Khan W, Zaki N, Ali L (2021) Intelligent pneumonia identification from chest x-rays: A systematic literature review. IEEE Access 9: 51747-51771. DOI: 10.1109/ACCESS.2021.3069937
[31] Lenka AK, Tripathy HK (2024) Computer vision for medical diagnosis and surgery. Healthcare Big Data Analytics: Computational Optimization and Cohesive Approaches, De Gruyter, Berlin, 101-124. https://doi.org/10.1515/9783110750942-005
[32] Li Y, Moreau J, Ibanez-Guzman J (2023) Emergent visual sensors for autonomous vehicles. IEEE Transactions on Intelligent Transportation Systems 24(5): 4716-4737. DOI: 10.1109/TITS.2023.3248483
[33] Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: Common objects in context. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T (Eds.). Computer Vision-ECCV 2014 (ECCV 2014). Springer, Cham, 740-755. https://doi.org/10.1007/978-3-319-10602-1_48
[34] Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2980-2988
[35] Liu S, Gao C, Chen Y, Peng X, Kong X, Wang K, Xu R, Jiang W, Ma J, Wang M (2023) Towards vehicle-to-everything autonomous driving: A survey on collaborative perception. Computer Science, Computer Vision and Pattern Recognition, arXiv preprint arXiv:2308.16714
[36] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: Single shot multibox detector. European Conference on Computer Vision, Springer, Cham, 21-37. https://doi.org/10.1007/978-3-319-46448-0_2
[37] Liu Y, Lu B, Peng J, Zhang Z (2020) Research on the use of YOLOv5 object detection algorithm in mask wearing recognition. World Scientific Research Journal 6(11): 276-284. DOI: 10.6911/WSRJ.202011_6(11).0038
[38] Liu Z, Luo P, Wang X, Tang X (2018) Large-scale celebfaces attributes (celeba) dataset. Retrieved August 15(2018): 11
[39] Long X, Deng K, Wang G, Zhang Y, Dang Q, Gao Y, Wen S (2020) PP-YOLO: An effective and efficient implementation of object detector. arXiv preprint arXiv:2007.12099. https://doi.org/10.48550/arXiv.2007.12099
[40] Manakitsa N, Maraslidis GS, Moysis L, Fragulis GF (2024) A review of machine learning and deep learning for object detection, semantic segmentation, and human action recognition in machine and robotic vision. Technologies 12(2): 15. https://doi.org/10.3390/technologies12020015
[41] Menze M, Geiger A (2015) Object scene flow for autonomous vehicles. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 3061-3070
[42] Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 779-788
[43] Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(6): 1137-1149. DOI: 10.1109/TPAMI.2016.2577031
[44] Shao Z, Wu W, Wang Z, Du W, Li C (2018) SeaShips: A large-scale precisely annotated dataset for ship detection. IEEE Transactions on Multimedia 20(10): 2593-2604. DOI: 10.1109/TMM.2018.2865686
[45] Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. Computer Science, Computer Vision and Pattern Recognition, arXiv preprint arXiv:1409.1556
[46] Tan M, Le Q (2019) EfficientNet: Rethinking model scaling for convolutional neural networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, 6105-6114
[47] Tan M, Pang R, Le QV (2020) EfficientDet: Scalable and efficient object detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 10781-10790
[48] Voulodimos A, Doulamis N, Doulamis A, Protopapadakis E (2018) Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience 2018(1): 7068349. https://doi.org/10.1155/2018/7068349
[49] Yan B, Peng H, Fu J, Wang D, Lu H (2021) Learning spatio-temporal transformer for visual tracking. Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 10448-10457
[50] Yu J, Zhang C, Wang S (2021) Multichannel one-dimensional convolutional neural network-based feature learning for fault diagnosis of industrial processes. Neural Computing and Applications 33(8): 3085-3104. https://doi.org/10.1007/s00521-020-05171-4
[51] Zhang R, Ji X, Pan M (2022) Diversified assessment benchmark of vision dataset-based perception in ship navigation scenario. Proceedings of the 2022 5th International Conference on Signal Processing and Machine Learning, Dalian, China, 282-287. https://doi.org/10.1145/3556384.3556427
[52] Zhang YF, Ren W, Zhang Z, Jia Z, Wang L, Tan T (2022) Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing 506: 146-157. https://doi.org/10.1016/j.neucom.2022.07.042
[53] Zhou B, Lapedriza A, Khosla A, Oliva A, Torralba A (2017) Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(6): 1452-1464. DOI: 10.1109/TPAMI.2017.2723009
[54] Zhou Z, Sun J, Yu J, Liu K, Duan J, Chen L, Chen CP (2021) An image-based benchmark dataset and a novel object detector for water surface object detection. Frontiers in Neurorobotics 15: 723336. https://doi.org/10.3389/fnbot.2021.723336