Abstract: To address the low detection rates of gesture recognition algorithms caused by complex indoor environments, diverse hand appearances, and variable recognition angles, and to facilitate deployment on mobile devices, we propose SA-YOLOv8, a novel gesture recognition algorithm. First, an improved CB-ShuffleNet V2 lightweight network serves as the backbone for extracting gesture features, preserving accuracy while reducing model parameters and computational load, thereby enabling real-time recognition on smart home devices. Second, an Asymptotic Feature Pyramid Network (AFPN) is integrated into the Neck layer for multi-scale fusion of gesture features; its adaptive spatial fusion operations mitigate interference from complex backgrounds and preserve detailed hand information, improving the model's robustness. Finally, the Shape-IoU loss function is introduced in the loss calculation phase, increasing the model's sensitivity and accuracy for irregular and small-scale gestures at a distance. Experiments show that SA-YOLOv8 achieves a mean average precision (mAP50) of 99.8% on the ASL dataset, a 4.5% improvement over the original YOLOv8, with an 80.18% reduction in parameters and a 77.46% decrease in computational demand. The improved algorithm thus delivers significantly better gesture recognition performance while being more lightweight, making it well suited for deployment on mobile devices.
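To illustrate the kind of shape-aware bounding-box loss the abstract refers to, the sketch below computes a standard IoU and adds center-distance and shape-mismatch penalties in the spirit of Shape-IoU. This is a minimal illustration, not the paper's implementation: the weighting factors, the `scale`/`theta` constants, and the 0.5 coefficient on the shape term are assumptions chosen for readability.

```python
import math

def iou(a, b):
    # a, b: axis-aligned boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def shape_iou_loss(pred, gt, scale=0.0, theta=4.0):
    # Illustrative shape-aware IoU loss (constants are assumptions).
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    # shape weights derived from the ground-truth box aspect
    ww = 2 * (wg ** scale) / ((wg ** scale) + (hg ** scale))
    hh = 2 * (hg ** scale) / ((wg ** scale) + (hg ** scale))
    # center distance, normalized by the enclosing box diagonal
    cx, cy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    dist = (hh * (cx - gx) ** 2 + ww * (cy - gy) ** 2) / (cw ** 2 + ch ** 2)
    # shape mismatch between predicted and ground-truth width/height
    ow = hh * abs(w - wg) / max(w, wg)
    oh = ww * abs(h - hg) / max(h, hg)
    shape = (1 - math.exp(-ow)) ** theta + (1 - math.exp(-oh)) ** theta
    return 1 - iou(pred, gt) + dist + 0.5 * shape
```

For a perfect prediction the loss is zero (IoU is 1 and both penalties vanish); a shifted or mis-shaped box is penalized more heavily along the dimension the shape weights emphasize, which is the property the abstract credits for better handling of irregular and small-scale gestures.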