Cow-YOLO: Automatic cow mounting detection based on non-local CSPDarknet53 and multiscale neck

De Li, Junhao Wang, Zhe Zhang, Baisheng Dai, Kaixuan Zhao, Weizheng Shen, Yanling Yin, Yang Li

Abstract


Mounting behavior is a significant manifestation of estrus in cows. Timely detection of mounting behavior allows cows to be bred in time, thereby improving milk production and the economic benefits of the pasture. Existing methods for detecting mounting behavior struggle to achieve precise detection under occlusion and severe scale changes while also meeting real-time requirements. Therefore, this study proposes Cow-YOLO, a model for detecting cow mounting behavior. To meet real-time requirements, YOLOv5s is used as the baseline model. To address the difficulty of detecting mounting behavior in occluded environments, the CSPDarknet53 backbone of YOLOv5s is replaced with Non-local CSPDarknet53, which enables the network to capture global information and improves its ability to detect mounting cows. Next, the neck of YOLOv5s is redesigned as a Multiscale Neck, reinforcing the model's multi-scale feature fusion capability to handle detection under dramatic scale changes. Then, to further increase detection accuracy, a Coordinate Attention Head is integrated into YOLOv5s. Together, these improvements form a novel cow mounting detection model, Cow-YOLO, which is well suited to detecting mounting behavior in occluded and drastically scale-changing environments. Cow-YOLO achieved a precision of 99.7%, a recall of 99.5%, a mean average precision of 99.5%, and a detection speed of 156.3 f/s on the test set. Compared with existing methods for detecting cow mounting behavior, Cow-YOLO achieves higher detection accuracy and faster detection speed under occlusion and drastic scale changes. Cow-YOLO can help ranch breeders monitor cow estrus in real time, enhancing the economic efficiency of the ranch.
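The non-local operation underlying Non-local CSPDarknet53 lets every spatial position attend to all other positions, giving each feature global context that survives partial occlusion. A minimal NumPy sketch of an embedded-Gaussian non-local block follows; the weight matrices here are hypothetical stand-ins for the learned 1×1 convolutions used inside the backbone, and the actual Cow-YOLO layer details may differ.

```python
import numpy as np

def nonlocal_block(x, theta, phi, g, w_out):
    """Simplified embedded-Gaussian non-local block on one feature map.

    x:      (C, H, W) feature map
    theta, phi, g: (C', C) embedding matrices (stand-ins for 1x1 convs)
    w_out:  (C, C') output projection back to C channels
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                 # (C, N), N = H*W positions
    q = theta @ flat                           # (C', N) query embeddings
    k = phi @ flat                             # (C', N) key embeddings
    v = g @ flat                               # (C', N) value embeddings
    att = q.T @ k                              # (N, N) pairwise similarities
    att = np.exp(att - att.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)      # softmax over all positions
    y = v @ att.T                              # (C', N) global context per position
    out = flat + w_out @ y                     # residual connection
    return out.reshape(c, h, w)
```

Because of the residual connection, the block degrades gracefully: with zero embedding weights it passes the input through unchanged, so it can be dropped into an existing backbone without disturbing pretrained behavior.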
Keywords: cows mounting, automatic detection, Cow-YOLO, computer vision, CSPDarknet53, multiscale neck
DOI: 10.25165/j.ijabe.20241703.8153

Citation: Li D, Wang J H, Zhang Z, Dai B S, Zhao K X, Shen W Z, et al. Cow-YOLO: Automatic cow mounting detection based on non-local CSPDarknet53 and multiscale neck. Int J Agric & Biol Eng, 2024; 17(3): 193-202.


Copyright (c) 2024 International Journal of Agricultural and Biological Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.