Table 5. Comparison of the improved Mask R-CNN with other models for pig posture detection, in terms of per-posture AP and overall mAP.

Model | AP (Standing, Sitting, Lying, Eating) | mAP | References
YOLOv5s 99.4 98.7 98.0 86.8 [59]
YOLOv5 + EfficientNet 0.67 0.81 0.899 [60]
YOLOv3 0.97 0.96 0.88 0.918 [61]
Faster R-CNN + NASNet 0.81 0.78 0.802 [45]
Faster R-CNN 0.90 0.84 0.891 [62]
Faster R-CNN + ResNet101 0.87 0.86 0.856 [44]
R-FCN + ResNet101 0.88 0.88 0.881
SSD + Inception V2 0.69 0.70 0.693
R-FCN + ResNet101 0.95 0.90 0.73 0.872 [42]
Faster R-CNN + ResNet50 0.86 0.91 0.84 0.845 [63]
Mask R-CNN + ResNeXt101 (piglet) 0.97 0.92 0.89 0.95 0.937 This study
Mask R-CNN + ResNeXt101 (pig) 0.97 0.91 0.88 0.96 0.935 This study
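AP, average precision; mAP, mean average precision; YOLO, you only look once; R-CNN, region-based convolutional neural network; R-FCN, region-based fully convolutional network; SSD, single shot multibox detector.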