
Shranav Palakurthi Taps into the Espressif ESP32-S3's SIMD Powers, Doubles TinyML CV Performance

Tweaked version of the FAST feature detector runs on-device at 11.2 megapixels per second.

Gareth Halfacree
6 months ago • Machine Learning & AI

Self-described "aspiring computer engineer" Shranav Palakurthi has been working with tiny machine learning (tinyML) computer vision on an Espressif ESP32-S3 microcontroller, and has leveraged single instruction multiple data (SIMD) instructions to more than double its performance.

"For its price, the ESP32-S3 is a powerhouse of a microcontroller. Within its unassuming plastic package lies a dual-core CPU running at a maximum of 240MHz with a slew of peripherals, including Wi-Fi and Bluetooth Low Energy radios," Palakurthi writes. "While digging through its technical reference manual I discovered that the chip supports a limited set of SIMD instructions. For silicon that's cheaper than the average coffee, that's pretty cool."

Single instruction multiple data (SIMD) extensions are designed to speed up tasks where a single operation needs to be carried out on more than one datum: rather than executing the instruction on the first datum, then again on the next, and so on, SIMD allows one execution to target multiple data, which can dramatically improve performance.
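As a rough illustration of the idea, and not of the ESP32-S3's own SIMD instructions, the following C sketch adds four 8-bit values packed into a single 32-bit word, first one lane at a time and then all four at once. The function names `add_u8x4_scalar` and `add_u8x4_packed` are hypothetical, and the packed version relies on a portable bit-masking trick rather than any real vector hardware.

```c
#include <stdint.h>

/* Scalar reference: unpack each 8-bit lane, add it on its own, repack.
 * Each lane wraps around modulo 256, independently of its neighbours. */
uint32_t add_u8x4_scalar(uint32_t a, uint32_t b)
{
    uint32_t out = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        out |= (uint32_t)(uint8_t)(x + y) << (8 * lane);
    }
    return out;
}

/* Packed version: one pass over the whole word handles all four lanes.
 * The low 7 bits of every lane are added together first, so no carry
 * can cross into the next lane; the top bit of each lane is then fixed
 * up with XOR, which is addition without carry-out. */
uint32_t add_u8x4_packed(uint32_t a, uint32_t b)
{
    uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    return low7 ^ ((a ^ b) & 0x80808080u);
}
```

Both functions return the same word for any pair of inputs; the packed version simply gets there without a per-lane loop, which is the same trade that dedicated SIMD instructions make in hardware across wider registers.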

The SIMD capabilities of the Espressif ESP32-S3, which is powered by two Tensilica Xtensa LX7 cores, are "relatively unknown," Palakurthi explains, but well-suited to parallel tasks like computer vision. Working with the FAST feature detector, Palakurthi was able to create a corner pre-test and scoring function that used SIMD for acceleration, delivering a roughly 120% performance gain that boosted the throughput of the feature detector from 5.1 megapixels per second (MP/s) to 11.2MP/s on the same hardware.
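Palakurthi's SIMD code is not reproduced here, but the kind of check being accelerated can be sketched in plain scalar C. The following assumes the textbook FAST high-speed test, in which a pixel survives only if at least three of the four compass points on its radius-3 circle are all brighter or all darker than the center by a threshold. The function name `fast_pretest` and its parameters are hypothetical, and the project's actual pre-test may well differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Textbook FAST high-speed pre-test (an assumption, not the project's code).
 * img    : 8-bit grayscale image, row-major
 * stride : bytes per image row
 * x, y   : candidate pixel; the caller must keep it at least 3 pixels
 *          away from every image border
 * t      : brightness threshold
 * Returns true if the pixel is still a corner candidate. */
bool fast_pretest(const uint8_t *img, ptrdiff_t stride,
                  ptrdiff_t x, ptrdiff_t y, uint8_t t)
{
    const uint8_t *p = img + y * stride + x;
    const int center = *p;
    const int hi = center + t;   /* ring pixel must exceed this to be "brighter" */
    const int lo = center - t;   /* ring pixel must be below this to be "darker" */

    /* North, east, south and west points of the radius-3 circle. */
    const int ring[4] = { p[-3 * stride], p[3], p[3 * stride], p[-3] };

    int brighter = 0;
    int darker = 0;
    for (int i = 0; i < 4; ++i) {
        if (ring[i] > hi) {
            ++brighter;
        } else if (ring[i] < lo) {
            ++darker;
        }
    }
    return brighter >= 3 || darker >= 3;
}
```

Because the same comparison is applied independently to every candidate pixel, a test of this shape is a natural fit for SIMD: many neighbouring pixels can, in principle, be compared against their thresholds in a single instruction.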

"This," Palakurthi claims of the final performance figures, "is well within the acceptable range of performance for real-time computer vision tasks, enabling the ESP32-S3 to easily process a 30fps [frames per second] VGA stream. Not bad for $2!"
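The arithmetic behind that claim is easy to check. The short C sketch below assumes VGA means 640×480 pixels and that a megapixel is one million pixels; the helper names are hypothetical, and the calculation considers only the feature detector's pixel rate, not image capture or any other work the chip must do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pixels that must be processed each second for a given video stream. */
uint32_t pixels_per_second(uint32_t width, uint32_t height, uint32_t fps)
{
    return width * height * fps;
}

/* True if a detector running at throughput_mps megapixels per second
 * can keep up with the stream. */
bool can_sustain(double throughput_mps,
                 uint32_t width, uint32_t height, uint32_t fps)
{
    return throughput_mps * 1e6 >= (double)pixels_per_second(width, height, fps);
}

/* Relative improvement from `before` to `after`, as a percentage. */
double percent_gain(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

A 30fps VGA stream works out to 640 × 480 × 30 = 9,216,000 pixels per second, or about 9.2MP/s. That is under the 11.2MP/s reported for the SIMD version but well over the 5.1MP/s baseline, and `percent_gain(5.1, 11.2)` comes out at just under 120%, in line with the quoted figure.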

Palakurthi's full write-up is available on his website, while the source code for the project has been published to GitHub under an unspecified license.
