The project uses a DFRobot FireBeetle ESP32 board with a camera module to monitor sitting posture and give voice feedback on bad posture in real time. It is based primarily on the official TensorFlow Lite example for pose estimation, which uses the MoveNet model. It also adds a simple fully connected neural network for pose classification, which assigns the person's posture to a category such as "standard sitting posture," "cross-legged," or "forward head and hunched back." The pose classification network is trained on a dataset of labeled images, which are processed to extract the landmark coordinates detected by the MoveNet model.
Motivation: The project aims to enable real-time pose estimation using the DFRobot FireBeetle ESP32 and a camera module, coupled with a server built with FastAPI. The FireBeetle ESP32 handles image capture and feedback while the server supplies the computational power for inference, which makes for a cost-effective approach to real-time pose estimation.
Functionality:
- The DFRobot FireBeetle ESP32 with the camera module captures an image.
- The captured image is sent to the FastAPI server via an HTTP POST request.
- The FastAPI server receives the image and performs pose estimation using the MoveNet model.
- The detected keypoints are used as input to the TensorFlow Lite model on the server to predict the pose class.
- The predicted pose class is sent back to the FireBeetle ESP32 via an HTTP response.
- The FireBeetle ESP32 receives the pose class prediction and uses it for the voice feedback.
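The server-side part of the steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `handle_frame`, `estimate_pose`, and `classify_pose` are hypothetical names, the two callables stand in for the MoveNet interpreter and the pose classification model, and the JSON field names and the order of the class list are assumptions. The only detail taken from MoveNet itself is that it returns 17 keypoints, each as a (y, x, score) triple.

```python
import json

# Class names come from the project description; their order is an assumption.
POSE_CLASSES = [
    "standard sitting posture",
    "cross-legged",
    "forward head and hunched back",
]

NUM_KEYPOINTS = 17  # MoveNet returns 17 keypoints, each as (y, x, score)


def flatten_keypoints(keypoints):
    """Turn 17 (y, x, score) triples into one 51-value feature vector."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    return [value for point in keypoints for value in point]


def handle_frame(image_bytes, estimate_pose, classify_pose):
    """Process one uploaded frame and build the response body.

    estimate_pose and classify_pose are hypothetical callables standing in
    for the MoveNet interpreter and the pose classification model.
    """
    keypoints = estimate_pose(image_bytes)                 # 17 x (y, x, score)
    scores = classify_pose(flatten_keypoints(keypoints))   # one score per class
    best = max(range(len(scores)), key=scores.__getitem__)
    # This JSON body is what the FireBeetle ESP32 would read from the HTTP response.
    return json.dumps({"pose_class": POSE_CLASSES[best], "score": scores[best]})
```

In a FastAPI application, a function like this would typically be called from the route that accepts the POST request, with the uploaded image bytes passed in as `image_bytes`.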
Training the machine learning model requires data, and finding suitable data for a posture detection system proved challenging. Existing sources like Kaggle or Google Image Repository did not have relevant images of people sitting in chairs, so the images had to be collected manually. A Google Form was created, and friends were asked to provide images following specific guidelines. Gaming chair advertisements and illustrations of good and bad postures were used as well. Despite the limitations imposed by the pandemic, around 90 images were gathered. This is a small dataset, but techniques like image augmentation can increase its size and variability. The dataset was also reasonably balanced between labels, with about 40 images showing good posture and 50 showing bad posture.
The dataset was augmented using techniques such as resizing, cropping, and rotation to increase variability. Preprocessing steps included normalization, landmark extraction, and encoding of target labels for training the machine learning model.
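As an illustration of the normalization and label-encoding steps, here is a minimal sketch in plain Python. It assumes one common way of normalizing pose landmarks: centering them on the hip midpoint and scaling by torso length, so the features do not depend on where the person sits in the frame or how far they are from the camera. The write-up does not say which normalization was actually used, so treat this as one plausible reading. The function names and the label order are hypothetical; the keypoint indices follow MoveNet's keypoint ordering.

```python
import math

# MoveNet keypoint indices for the joints used below.
LEFT_SHOULDER, RIGHT_SHOULDER = 5, 6
LEFT_HIP, RIGHT_HIP = 11, 12

# Label names come from the project description; their order is an assumption.
LABELS = ["standard sitting posture", "cross-legged", "forward head and hunched back"]


def midpoint(a, b):
    """Midpoint of two (y, x) points."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def normalize_landmarks(points):
    """Center (y, x) landmarks on the hip midpoint and scale by torso length."""
    hips = midpoint(points[LEFT_HIP], points[RIGHT_HIP])
    shoulders = midpoint(points[LEFT_SHOULDER], points[RIGHT_SHOULDER])
    torso = math.hypot(shoulders[0] - hips[0], shoulders[1] - hips[1])
    scale = torso if torso > 0 else 1.0  # avoid dividing by zero on degenerate poses
    return [((y - hips[0]) / scale, (x - hips[1]) / scale) for y, x in points]


def one_hot(label):
    """Encode a label name as a one-hot vector over LABELS."""
    vector = [0.0] * len(LABELS)
    vector[LABELS.index(label)] = 1.0
    return vector
```

After this step the hip midpoint sits at the origin and the shoulder midpoint is exactly one unit away from it, whatever the image resolution was.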
Model Training and Evaluation: The machine learning model is a feed-forward neural network. It was trained on the compiled dataset with the categorical cross-entropy loss function and the Adam optimizer, and accuracy was monitored during training. The trained model was then evaluated on a separate test set to assess how well it generalizes and how accurately it classifies sitting postures.
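To make the loss and the metric concrete, the sketch below computes categorical cross-entropy and accuracy by hand in plain Python. It is only an illustration of the two quantities, not the project's training code, which would normally rely on a framework's built-in loss, metric, and Adam optimizer. The function names are hypothetical, and the example assumes the network ends in a softmax layer.

```python
import math


def softmax(logits):
    """Convert raw network outputs into class probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probabilities, eps=1e-12):
    """Loss for one sample: -sum(t_i * log(p_i)) over the classes."""
    return -sum(
        t * math.log(max(p, eps))  # clamp to avoid log(0)
        for t, p in zip(one_hot_target, probabilities)
    )


def accuracy(targets, predictions):
    """Fraction of samples whose most probable class matches the target class."""
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)

    hits = sum(argmax(t) == argmax(p) for t, p in zip(targets, predictions))
    return hits / len(targets)
```

For an untrained three-class model that outputs equal logits, `softmax` gives each class a probability of 1/3 and the loss for any sample is ln 3, roughly 1.10. That value is a useful baseline when reading the first few epochs of a training log.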
The experimental results indicate that the proposed system can classify sitting postures accurately. The trained model achieved high classification accuracy on the test set, which suggests it is suitable for real-time posture classification. The system's real-time feedback gives people a chance to correct their sitting posture promptly, which can support better ergonomic habits and a reduced risk of musculoskeletal disorders.