Duplicate bridge is a logic game played with a deck of 52 cards. Regularly organized tournaments are often broadcast live to fans around the world. For this purpose, each play is recorded manually by an operator, which is monotonous and repetitive work. The system presented here is designed to automate this process using an embedded vision system with two cameras placed above the table, each directed at an opposite part of the table. Our goal is to propose a solution that combines high accuracy and efficiency with the ability to perform the task in real time.
Duplicate bridge consists of two main stages: the auction and the play. The auction uses 38 different bidding calls, and the play phase uses 52 cards. A key part of the bridge game reconstruction system is the detection of cards and bidding calls. To achieve high accuracy, we chose a solution based on deep neural networks (DNNs). However, such a solution has high computational complexity. To address this, we run a hardware-accelerated DNN on the Kria KV260 platform using the AMD Xilinx Vitis AI DPU. This solution can accelerate different types of DNN architectures and layers. For detection we applied YOLOv4, a state-of-the-art object detector. The card and bidding-call detection systems were prepared analogously, so we present only the solution for card detection.
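The two class counts above follow directly from the rules of bridge: 13 ranks in 4 suits give 52 cards, and 7 levels in 5 strains give 35 bids which, together with pass, double and redouble, make 38 calls. The sketch below enumerates both label sets. The label strings (such as "TS" or "1NT") and the function names are illustrative assumptions, not the class names used in the described system.

```python
from itertools import product

# Hypothetical label spelling; the source does not give the class names.
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "T", "J", "Q", "K", "A"]
SUITS = ["C", "D", "H", "S"]
STRAINS = SUITS + ["NT"]  # four suits plus no-trump


def card_labels():
    """Return one label per playing card, e.g. 'AS' for the ace of spades."""
    return [rank + suit for suit, rank in product(SUITS, RANKS)]


def call_labels():
    """Return one label per bidding call: 35 bids, then pass, double, redouble."""
    bids = [f"{level}{strain}" for level in range(1, 8) for strain in STRAINS]
    return bids + ["PASS", "X", "XX"]


CARD_CLASSES = card_labels()
CALL_CLASSES = call_labels()

if __name__ == "__main__":
    print(len(CARD_CLASSES), "card classes")
    print(len(CALL_CLASSES), "bidding-call classes")
```

Keeping the two label sets separate mirrors the statement that the card and bidding-call detectors were prepared analogously but as distinct systems.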
The network parameters first need to be trained. We performed this task with an automatically generated dataset of images showing various cards on different backgrounds and under different lighting conditions. We achieved an accuracy of over 99% mAP@0.5. The next step is quantization with the Vitis AI Quantizer. The resulting quantized neural network was then compiled into a form executable by the DPU. Finally, we needed to implement a suitable processing pipeline. We decided to join the images from both cameras (placed over the game table). The received images are split into 3 parts of 640x640x3 each. The YOLOv4 network returns 3 feature maps with 3 anchors each. Applying DNNs also requires pre- and post-processing. We used multi-threaded processing as a form of coarse-grained parallelism. We implemented an application that detects cards and bidding calls in a given image or video stream. The highest image processing speed achieved by the DPU is 6 FPS. Because bridge is played at a relatively slow pace, this performance is sufficient to reconstruct the course of the game in real time.
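The tiling and output layout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the joined frame is 1920x640 pixels with the three tiles side by side, and it assumes the usual YOLOv4 strides of 8, 16 and 32, neither of which the text confirms. The function names are hypothetical.

```python
TILE = 640            # tile side in pixels, from the text (640x640x3)
NUM_ANCHORS = 3       # anchors per feature map, from the text
STRIDES = (8, 16, 32)  # assumption: standard YOLOv4 downsampling factors


def tile_boxes(frame_width, frame_height, tile=TILE):
    """Return (left, top, right, bottom) crop boxes for side-by-side tiles.

    Assumes the joined frame is exactly one tile high and a whole number
    of tiles wide.
    """
    if frame_height != tile or frame_width % tile != 0:
        raise ValueError("frame must be one tile high and a multiple of the tile width")
    return [(x, 0, x + tile, tile) for x in range(0, frame_width, tile)]


def head_shapes(num_classes, tile=TILE):
    """Return the (grid_h, grid_w, channels) shape of each feature map.

    Each anchor predicts 4 box values, 1 objectness score and one score
    per class, so a map has NUM_ANCHORS * (5 + num_classes) channels.
    """
    channels = NUM_ANCHORS * (5 + num_classes)
    return [(tile // s, tile // s, channels) for s in STRIDES]


if __name__ == "__main__":
    for box in tile_boxes(1920, 640):
        print("tile crop:", box)
    for shape in head_shapes(num_classes=52):
        print("feature map:", shape)
```

Under these assumptions each tile yields feature maps of 80x80, 40x40 and 20x20 cells for the 52-class card detector, so the post-processing stage has to decode three such maps per tile before merging the detections back into the coordinates of the joined frame.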