Hailo Demonstrates Accelerated LLM-Based Speech Recognition on the Raspberry Pi AI HAT+

OpenAI's Whisper is put to work recognizing speech from a USB microphone, performing inference on the AI HAT+'s 26 TOPS Hailo-8 chip.

Gareth Halfacree
13 days ago • Machine Learning & AI

Machine learning and artificial intelligence (ML and AI) acceleration specialist Hailo has showcased a new ability for its Hailo-8 coprocessor running on a Raspberry Pi single-board computer: accelerating OpenAI's Whisper, a speech recognition model based on large language model (LLM) technology.

"With this simple set up, Hailo delivers live transcription with impressive efficiency, an early look at how AI can be embedded into everyday devices," a Hailo spokesperson explains of the brief video demo. "This advancement has far-reaching implications for the future of AI-driven voice technology. It brings speech recognition and transcription directly to compact, low-power devices, making AI more accessible, efficient, and secure for industries like assistive tech, smart homes, industrial automation, and more."

Hailo has shown off its accelerator speeding up on-device speech recognition on a Raspberry Pi 5, using the Raspberry Pi AI HAT+. (📹: Hailo)

Hailo has been promoting its eponymous family of machine learning accelerators for years now, but has only recently turned its attention to the maker and hobbyist market, partnering with Raspberry Pi for an "AI Kit" bundle, which includes an M.2 HAT+ for the Raspberry Pi 5 single-board computer and a Hailo-8L M.2 accelerator module delivering a claimed 13 tera-operations per second (TOPS) of minimum-precision compute. This was followed by the single-board Raspberry Pi AI HAT+, which puts a Hailo-8L or faster 26 TOPS Hailo-8 on a dedicated expansion board.

Previous projects using the Hailo-8 on a Raspberry Pi have focused on vision models, performing tasks like object detection or scene segmentation. The company's latest demo, though, showcases its use for running generative AI (gen AI) systems based on large language model (LLM) technology: specifically, OpenAI's open-source Whisper speech recognition model, which is small enough to fit into the accelerator's on-board memory.

The demo shows the Raspberry Pi using its Hailo-8 accelerator to offload the work of inference, recognizing speech from a microphone connected to a USB port and displaying the resulting transcript in a web app window. Although the result is not displayed in real time (the recording appears to be passed to Whisper once complete, rather than streamed live), the recognition stage completes in seconds, something Hailo says it is working to further optimize.
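
For readers curious what a record-then-transcribe flow of this kind might look like, here is a minimal Python sketch. It is not Hailo's pipeline, which has not been released. The `fake_microphone` and `fake_transcribe` functions are hypothetical placeholders standing in for the USB microphone capture and the accelerated Whisper call, and the names `record_until_done`, `to_wav`, and `transcribe_recording` are invented for illustration. The sketch assumes 16-bit mono audio at 16 kHz, the sample rate Whisper models work with.

```python
import io
import math
import struct
import wave
from typing import Callable, Iterable, Iterator

SAMPLE_RATE = 16_000  # Whisper models work with 16 kHz mono audio


def record_until_done(chunks: Iterable[bytes]) -> bytes:
    """Buffer every chunk from the capture source into one recording."""
    return b"".join(chunks)


def to_wav(pcm: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def transcribe_recording(
    chunks: Iterable[bytes], transcribe: Callable[[bytes], str]
) -> str:
    """Batch flow: capture the whole recording, then run inference once."""
    return transcribe(to_wav(record_until_done(chunks)))


def fake_microphone(seconds: float = 1.0, chunk_frames: int = 1600) -> Iterator[bytes]:
    """Hypothetical stand-in for a USB microphone: yields a 440 Hz tone."""
    total = int(seconds * SAMPLE_RATE)
    for start in range(0, total, chunk_frames):
        frames = range(start, min(start + chunk_frames, total))
        yield b"".join(
            struct.pack(
                "<h", int(12000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
            )
            for n in frames
        )


def fake_transcribe(wav_bytes: bytes) -> str:
    """Hypothetical placeholder for the accelerated Whisper call."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return f"[{duration:.1f} s of audio received]"


if __name__ == "__main__":
    print(transcribe_recording(fake_microphone(seconds=1.0), fake_transcribe))
```

Because inference only starts once the capture source is exhausted, the transcript cannot appear until the speaker has finished, which matches the delay seen in the video. A streaming design would instead hand audio to the model in overlapping windows as it arrives, at the cost of more complicated buffering.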

The video demo is available embedded above and on Hailo's YouTube channel; at the time of writing, the company had not released its pipeline for others to try.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.