Raspberry Pi's AI Eye: Hands-On with the Raspberry Pi AI Camera Module

Sony's IMX500 "Intelligent Vision Sensor" lets you run computer vision models entirely on-camera, freeing up your Raspberry Pi's CPU.

10 months ago • AI & Machine Learning / HW101 / Photos & Video

Raspberry Pi has been showing increasing interest in on-device artificial intelligence of late, partnering with Hailo to produce the Raspberry Pi AI Kit bundle for the Raspberry Pi 5 — and now with Sony for the Raspberry Pi AI Camera, an RGB camera module with a difference: it's also an accelerator for computer vision models.

Built around a Sony IMX500 "Intelligent Vision Sensor," the Raspberry Pi AI Camera Module can handle a range of model types that run on-sensor using an integrated accelerator and 8MB of dedicated memory — enough for real-time object detection, segmentation, pose estimation, and more, and all while keeping the Raspberry Pi's CPU and GPU cores free for other work.

Does the Raspberry Pi AI Camera Module deliver? Let's find out.

Hardware

Sensor: Sony IMX500 Intelligent Vision Sensor (12.3 megapixels, 7.857mm diagonal, 1.55μm pixel size, rolling shutter)
Resolution: 4,056×3,040
Framerate: 10 frames per second full-resolution, 30fps 2×2 binned
Focus: Manual 20cm-∞, 4.74mm focal length
Field of View: 66° horizontal, 52.3° vertical (±3°)
F-stop: f/1.79
Output: Bayer, RAW10, YUV, RGB, Regions of Interest (ROI), metadata
Maximum Tensor Resolution: 640×640
Precision: INT8
Memory: 8,388,480 bytes (~8MB) shared between firmware, weights, working RAM
Size: 25×24×11.9mm (around 0.98×0.94×0.47")
Price: $70

First teased at Embedded World earlier this year, the Raspberry Pi AI Camera Module features Sony's IMX500 image sensor, a 12.3 megapixel color sensor with 1.55μm square pixels. There's a manual-focus lens already fitted, with no support for interchangeable lenses, and the sensor board mates with the camera module board proper, where an on-board Raspberry Pi RP2040 microcontroller handles interfacing the machine learning accelerator with the host Raspberry Pi.

The Raspberry Pi AI Camera Module is both a 12.3 megapixel camera and a fully-functional machine learning accelerator in one. (📷: Gareth Halfacree)

While this is Raspberry Pi's first camera module to feature the IMX500, it's not Sony's first IMX500 product to target the Raspberry Pi: the company's official standalone evaluation kits for the IMX500 all come with a conversion board to connect them to a Raspberry Pi as the driving device. It is, however, the smallest: the Raspberry Pi AI Camera Module has the same footprint and mounting points as the Raspberry Pi Camera Module 3, meaning it's drop-in compatible with many existing mounts and cases — though it's slightly deeper, so fully-enclosed cases designed for the Camera Module 3 may not fit the AI Camera Module.

The module connects to the host Raspberry Pi over its MIPI Camera Serial Interface (CSI) port, as with earlier non-AI camera modules, and two flat flexible circuits (FFC) are included. The first is only suitable for the latest Raspberry Pi 5 and the Raspberry Pi Zero range, narrowing as it does at one end; anyone with a Raspberry Pi 4 Model B or other variant with a full-size CSI connector will need to use the second cable, the same size at both ends.

The module is compatible with any Raspberry Pi model, bar the CSI-lacking Raspberry Pi 400, though some models need a different FFC. (📷: Gareth Halfacree)

There are no real surprises on the module itself, although the use of a manual-focus lens is one of the few disappointments: the Raspberry Pi Camera Module 3 was the first official module to feature motorized autofocus capabilities, and it's a shame that hasn't carried across to the new model. Likewise, the IMX500's rolling shutter means it's likely to struggle with extremely fast motion — but that's hardly on Raspberry Pi as, at the time of writing, Sony had yet to announce a global-shutter equivalent to the IMX500.

Intelligent vision

Installing the camera is as easy as any other model: connect one end of the FFC to one of the combined CSI/DSI ports on the Raspberry Pi 5, or the dedicated CSI port on older models, and the other end to the camera. Make sure both socket flaps are secure, then power on.

Those used to running computer vision tasks on a Raspberry Pi will, at this point, be expecting to spend quite some time getting everything set up: installing the required software, pulling a model from wherever, setting up the camera, and finally connecting the video feed to the model — during which time you'll see at least one of the Raspberry Pi's CPU cores redline and stay there, unless you're using an external accelerator.

Models like MobileNet SSD (pictured) run on-camera in real-time, taking up no CPU time on the Raspberry Pi. (📷: Gareth Halfacree)

The Raspberry Pi AI Camera Module isn't like that. The software is integrated into the libcamera stack, and getting started is as easy as running rpicam-hello with your choice of two pre-installed models: the MobileNet SSD object detection network or the PoseNet pose estimation network. Pick your poison and wait: the model will be transferred to the camera module over the CSI connection, in a process that's handled by the on-board RP2040, and loaded into 8MB of on-device RAM.
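As a rough illustration of what "running rpicam-hello with your choice of model" looks like in practice, the sketch below assembles the command line in Python. The asset directory and JSON file names are assumptions based on where Raspberry Pi OS packages have placed the IMX500 post-processing files; check your own installation before relying on them. The helper name build_preview_command is purely illustrative.

```python
import shlex

# Assumed install location of the bundled post-processing files; verify locally.
ASSET_DIR = "/usr/share/rpi-camera-assets"

# Assumed file names for the two pre-installed models mentioned in the text.
MODELS = {
    "mobilenet_ssd": "imx500_mobilenet_ssd.json",
    "posenet": "imx500_posenet.json",
}


def build_preview_command(model: str, timeout: str = "0") -> list[str]:
    """Return an rpicam-hello argument list for the chosen model.

    A timeout of "0" asks rpicam-hello to keep the preview open indefinitely.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return [
        "rpicam-hello",
        "-t", timeout,
        "--post-process-file", f"{ASSET_DIR}/{MODELS[model]}",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so this is safe off-device.
    print(shlex.join(build_preview_command("mobilenet_ssd")))
```

On a Raspberry Pi with the camera attached, the printed command could be pasted into a terminal or handed to subprocess.run().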

For our pre-release testing, we had to install selected packages from a non-public repository; this shouldn't be required by the time the camera module hits retail. You will need the latest version of Raspberry Pi OS, and only a couple of models are included in the built-in software stack; more can be found in Raspberry Pi's IMX500 "model zoo" GitHub repository, which also acts as a handy reference for working with the camera in Python.

A model model

There's a bit of a delay when you first pick a new model, as it's transferred across to the camera. Once it's on there a preview window will appear exactly as though you were using rpicam-hello as normal, showing a live view from the camera — except this time it includes the output of your chosen model too, as both machine-readable metadata and an optional visible overlay.

Aside from generating the overlay based on the streamed metadata, this all happens on-sensor: open up a second terminal window and fire up the resource monitor of your choice and you'll see the Raspberry Pi AI Camera Module is taking up no more CPU time than the non-AI Raspberry Pi Camera Module 3 — all of which is going into image processing for the live preview, with none required to run the model.
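Because the host only receives metadata, the work left for the Raspberry Pi is light: typically little more than discarding low-confidence results before drawing or acting on them. The sketch below shows that step in isolation. The layout used here (parallel lists of boxes, scores, and class IDs) and the names Detection and filter_detections are assumptions for illustration, not the format the camera stack actually emits.

```python
from typing import NamedTuple, Sequence


class Detection(NamedTuple):
    box: tuple[float, float, float, float]  # hypothetical (x0, y0, x1, y1)
    score: float
    class_id: int


def filter_detections(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    class_ids: Sequence[int],
    threshold: float = 0.55,
) -> list[Detection]:
    """Keep only the detections whose confidence meets the threshold."""
    return [
        Detection(tuple(box), score, class_id)
        for box, score, class_id in zip(boxes, scores, class_ids)
        if score >= threshold
    ]


if __name__ == "__main__":
    # Made-up values standing in for one frame's metadata.
    kept = filter_detections(
        boxes=[(0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.4, 0.4)],
        scores=[0.9, 0.3],
        class_ids=[0, 76],
    )
    for det in kept:
        print(det)
```

A loop like this costs the CPU next to nothing per frame, which is consistent with the negligible load seen in the resource monitor.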

New models are transferred to the camera's accelerator over CSI, using the module's on-board RP2040 microcontroller. (📷: Gareth Halfacree)

Anyone who's experimented with tinyML — running machine learning models on microcontrollers and other resource-constrained devices — will, at this point, be expecting to hear that the frame rate is measured in seconds-per-frame rather than frames-per-second. Thankfully, that's not the case: Sony has achieved something truly remarkable with the IMX500, and it's capable of running both the MobileNet SSD and PoseNet models in real-time overlaid on a video feed running at a smooth 30 frames per second.

Real-time computer vision without tying up a CPU core and without a separate dedicated accelerator is impressive; that Sony and Raspberry Pi have been able to squeeze the technology into the same footprint as the regular Raspberry Pi Camera Module 3 is nothing short of incredible. It's a minor power hog, however: while actively running a model and streaming video to the host, the Raspberry Pi AI Camera Module accounts for about 1.87W of an overall 4.12W power draw on a Raspberry Pi 5 2GB. The accelerator is also always-active: if a model has been loaded into memory it will always run and stream metadata along with the video, though you can disable visualization by excluding the matching post-processing file from your command.
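To put those power figures in context, the short calculation below works out the camera's share of the total draw and what that might mean for runtime. The 10Wh battery is a hypothetical figure chosen for round numbers, and treating the remaining 2.25W as a "no camera" baseline is a simplification: it ignores converter losses and whatever the Raspberry Pi would draw with a different camera attached.

```python
CAMERA_W = 1.87  # measured draw attributed to the AI Camera Module
TOTAL_W = 4.12   # measured overall draw on a Raspberry Pi 5 2GB


def camera_share(camera_w: float, total_w: float) -> float:
    """Fraction of the total power draw attributable to the camera."""
    return camera_w / total_w


def runtime_hours(battery_wh: float, draw_w: float) -> float:
    """Idealised runtime from a battery, ignoring conversion losses."""
    return battery_wh / draw_w


if __name__ == "__main__":
    battery_wh = 10.0  # hypothetical battery capacity
    print(f"Camera share of draw: {camera_share(CAMERA_W, TOTAL_W):.1%}")
    print(f"Runtime with camera:  {runtime_hours(battery_wh, TOTAL_W):.2f} h")
    print(f"Runtime without it:   {runtime_hours(battery_wh, TOTAL_W - CAMERA_W):.2f} h")
```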

Dear zoo

Unless you have very basic requirements, you'll soon tire of watching the camera overlay a stick-figure on your gyrating body or identify a pair of scissors sat on your desk. The Raspberry Pi AI Camera Module, however, is not a two-trick pony: the hardware's launch comes alongside the opening of a "model zoo," with three core Python examples for image classification, object detection, and segmentation.

These examples can be used with one of a range of models, including EfficientNet v2, MobileViT XS, ResNet18, SqueezeNet, and YOLOv8n. Some, like YOLOv8n, can run at the maximum 640×640 resolution available to models on the camera; others are limited to 320×320, 260×260, 256×256, or as low as 224×224. As you'd expect, the more demanding the network the lower the limit.
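To get a feel for how much detail survives at those input sizes, the sketch below computes a generic "letterbox" fit: scaling a 4:3 frame down to a square tensor while preserving aspect ratio and padding the remainder. This is a common preprocessing approach in computer vision, shown here only as an illustration. Whether a given model on the camera is fed a letterboxed, cropped, or stretched image is not covered by this review, so treat the function as an assumption rather than a description of the IMX500 pipeline.

```python
def letterbox(src_w: int, src_h: int, size: int) -> tuple[int, int, int, int]:
    """Fit a src_w x src_h frame inside a size x size square.

    Returns (scaled_width, scaled_height, pad_x, pad_y), where the padding is
    the border added on each side to centre the scaled image.
    """
    scale = min(size / src_w, size / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = (size - new_w) // 2
    pad_y = (size - new_h) // 2
    return new_w, new_h, pad_x, pad_y


if __name__ == "__main__":
    # Full sensor resolution from the spec list, against two tensor sizes.
    for size in (640, 224):
        print(size, letterbox(4056, 3040, size))
```

At 640×640 the full-resolution frame shrinks to roughly 640×480 of useful image, and at 224×224 to roughly 224×168, which is why small or distant objects are harder for the lower-resolution models to pick out.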

A "model zoo" provides Python examples for classification, object detection, and segmentation and a range of ready-to-use models. (📷: Raspberry Pi)

The Raspberry Pi AI Camera, then, isn't a magic wand: its 8MB of RAM has to be shared between its firmware, the loaded model weights, and working memory. While tools are offered for adapting your own projects to run on the camera, compatible with both TensorFlow and PyTorch, don't go into things expecting to be able to run massive models: the camera has no access to the Raspberry Pi's RAM, so that 8MB is all you have regardless of whether you're connecting it to a 512MB Raspberry Pi Zero 2 W or an 8GB Raspberry Pi 5.
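A back-of-the-envelope budget makes the constraint concrete. The sketch below takes the 8,388,480-byte figure from the spec list and checks whether a model's weights fit alongside firmware and working memory. The firmware and working-memory sizes and the parameter counts are hypothetical placeholders, and the one-byte-per-parameter estimate is a naive reading of INT8 precision: the conversion tools may compress a model differently, so treat the result as a first guess only.

```python
TOTAL_BYTES = 8_388_480  # on-sensor memory, from the spec list


def weights_bytes(num_params: int, bits_per_param: int = 8) -> int:
    """Naive weight footprint: parameter count times bit width, in bytes."""
    return (num_params * bits_per_param + 7) // 8


def fits_on_sensor(
    num_params: int,
    firmware_bytes: int,
    working_bytes: int,
    total_bytes: int = TOTAL_BYTES,
) -> bool:
    """True if weights, firmware, and working memory fit in the shared pool."""
    needed = weights_bytes(num_params) + firmware_bytes + working_bytes
    return needed <= total_bytes


if __name__ == "__main__":
    # Hypothetical overheads, chosen purely for illustration.
    firmware, working = 1_000_000, 2_000_000
    for params in (3_000_000, 11_000_000):
        verdict = "fits" if fits_on_sensor(params, firmware, working) else "too big"
        print(f"{params:>10,} params: {verdict}")
```

Note that the host's RAM appears nowhere in the calculation, which is the point: the answer is the same on a Raspberry Pi Zero 2 W as on an 8GB Raspberry Pi 5.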

Conclusion

At $70, the Raspberry Pi AI Camera Module is nearly three times as expensive as the Raspberry Pi Camera Module 3 — for obvious reasons. If you're not interested in low-power edge AI, it's an easy choice: stick with the Camera Module 3 and enjoy the far broader benefits of motorized autofocus. This goes, too, for those working on battery-powered projects, where the near-2W draw of the AI Camera Module could prove an unwelcome drain if you're primarily interested in just plain video streaming or still-image capture.

If you have any interest at all in on-device computer vision, though, the Raspberry Pi AI Camera Module comes highly recommended. Being able to use all your CPU and GPU cores for other work while an object recognition model runs in real-time is a real boon, and as everything happens over the CSI connection you've got all the Raspberry Pi's USB ports, and the Raspberry Pi 5's PCI Express lane, free for other hardware.

Anyone with even the slightest interest in computer vision and $70 to spend should pick up the AI Camera Module. (📷: Gareth Halfacree)

For those already experimenting with artificial intelligence on the Raspberry Pi, the new camera module is a must-have addition to your setup. It can be used alongside existing accelerators like the Hailo-8 or the Google Coral Edge TPU, and there's nothing to stop you running multiple models at the same time — one on the camera, one on the accelerator, one on the Raspberry Pi itself.

While it might have been nice to see a global shutter, the rolling shutter is far from a deal-breaker — and while the models themselves are limited to a maximum input resolution of 640×640, having access to a quality 4,056×3,040 image sensor when you need it is nothing to be sniffed at.

The Raspberry Pi AI Camera Module is available to buy now from the company's official resellers, priced at $70.

Gareth Halfacree

Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.