A Picture Is Worth a Thousand Verses

The Poetroid, a new twist on instant cameras, captures scenes as poems using machine learning, blending nostalgia with modern technology.

Nick Bild
10 months ago • Photos & Video
The Poetroid prints poems, not photographs (📷: stopsendingmejunk)

Polaroid’s instant cameras of yesteryear may not be the number-one must-have item on many people’s wish lists these days, but they were revolutionary when they first appeared on the commercial market. Whether for reasons of nostalgia, or because people want a physical copy of a picture in their hands instead of just a digital copy on their phones, a new generation of these cameras has emerged and captured some interest.

For many people, being introduced to an interesting but antiquated technology for the first time, or for the first time in a long time, can trigger thoughts of what-if scenarios. What if this technology were still around? What would it look like today? And so on. A hacker who goes by the handle stopsendingmejunk imagined a modern version of the instant camera that blends the old concept of instant photographic prints with the latest in machine learning. The result is the Poetroid, a camera that captures a scene not as a photograph, but as an instant printout of a poem that describes the scene. Given the unique nature of this device, I thought it only fitting that it be described in the form of poetry. Ahem…

In a world of lenses and shutters tight,

A camera emerged, a poet's delight.

No snapshots captured in pixels or frames,

But verses spun, inspired by its aim.

With every click, not a shutter's sound,

But stanzas whispered, poetry profound.

A lens that weaves words in the golden light,

Crafting verses in the softest night.

Okay, okay, I won’t quit my day job. The Poetroid is built into a small, suspiciously lunchbox-like case that contains an Orange Pi Zero 3 single-board computer, a webcam, a thermal printer, and a small display. When a button is pressed, the webcam captures an image and passes it to a multimodal large language model along with a request to compose a poem about the scene. When the response is returned, it is printed out on the onboard thermal printer.

Ideally, stopsendingmejunk would like the camera to run the language model on-device, but that proved to be too challenging. Existing multimodal language models simply require too much computing power for the camera to be called “instant” in any sense. So, until technology catches up with the idea, the system uses an external computer (with an NVIDIA GeForce RTX 3090 GPU) to run the language model, and the Poetroid communicates with it wirelessly to send requests and retrieve the responses. Even so, the camera can take 20 to 30 seconds to compose a poem, but it looks like it is well worth the wait.
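For the curious, the two ends of that round trip can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the prompt text, the JSON field names, and the 32-character paper width are all assumptions made up for the example, and the network call to the external computer is left out entirely. The first function packages a captured frame with the poem request, and the second wraps the returned poem to fit a narrow strip of thermal paper.

```python
import base64
import json
import textwrap

# Hypothetical values: the article does not name the prompt or the paper width.
PROMPT = "Write a short poem describing this scene."
PRINTER_COLUMNS = 32  # assumed characters per line on the thermal paper


def build_request(image_bytes: bytes, prompt: str = PROMPT) -> str:
    """Package a captured frame and the poem request as a JSON string."""
    payload = {
        "prompt": prompt,  # assumed field name
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),  # assumed field name
    }
    return json.dumps(payload)


def format_for_printer(poem: str, columns: int = PRINTER_COLUMNS) -> list[str]:
    """Wrap each line of the poem so it fits the printer's width."""
    lines = []
    for line in poem.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep blank lines explicitly.
        lines.extend(textwrap.wrap(line, width=columns) or [""])
    return lines
```

In a real build, the string from `build_request` would be sent over the wireless link to whatever server hosts the model, and each line from `format_for_printer` would be handed to the printer driver.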

The Poetroid comes with a few other nice touches. It has a speaker to play a sound like the clicking of a shutter when a digital image is captured for an authentic-feeling experience. This speaker is also used to play some fun sounds, like dial-up modem sounds, for entertainment while the poem is being generated — that way you can really hear it working, or something.

This is a unique build. Be sure to follow the project so that you can keep up to date on the latest happenings.

Nick Bild
R&D, creativity, and building the next big thing you never knew you wanted are my specialties.