Johannes Stelzer Navigates the "Latent Space" with This Knob-Controlled Generative AI Demo

Using a real-time text-to-image generative AI model, this interactive experiment lets you twiddle the knobs of "reality."

Johannes Stelzer, an artist who has embraced generative artificial intelligence (AI) with gusto, has shown off just what's possible with the technology by using a knob board, originally designed for a MIDI system, to tweak variables on the fly.

"[This is a] new way to navigate latent space," Stelzer writes of the experiment. "It preserves the underlying image structure and feels a bit like a powerful style-transfer that can be applied to anything. The trick is to selectively alter the embeddings in the decoder part of the diffusion process."
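As a rough illustration of what "selectively alter the embeddings in the decoder" could mean, the sketch below blends a second embedding into the conditioning for decoder-side blocks only and leaves every other block on the original embedding. It is one possible reading of Stelzer's description, not his implementation: the block names, the `embeddings_per_block` helper, and the toy two-element vectors are all hypothetical, and a real model would use full tensors rather than short lists.

```python
def lerp(a, b, t):
    """Linearly interpolate between two equal-length vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def embeddings_per_block(base, modifier, weight, blocks):
    """Return the embedding each block would be conditioned on.

    Assumption: blocks whose names start with "up" stand in for the
    decoder half of the network. Only those receive the blended
    embedding; all other blocks keep the unmodified base embedding.
    """
    per_block = {}
    for block in blocks:
        if block.startswith("up"):
            per_block[block] = lerp(base, modifier, weight)
        else:
            per_block[block] = list(base)
    return per_block


# Toy example: hypothetical block names and tiny stand-in vectors.
blocks = ["down_0", "down_1", "mid", "up_0", "up_1"]
base = [1.0, 0.0]       # stand-in for the original prompt embedding
modifier = [0.0, 1.0]   # stand-in for a "style" prompt embedding
result = embeddings_per_block(base, modifier, 0.5, blocks)
```

With a weight of 0.5, the `up_*` entries land halfway between the two vectors while `down_*` and `mid` are unchanged. Leaving the encoder-side conditioning untouched is one plausible reason the image structure stays intact, though the source does not spell this out.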

Stelzer's creation uses SDXL Turbo, a text-to-image generative AI model from Stability AI, designed to trade some of the image quality of the company's better-known Stable Diffusion models for performance suitable for real-time use. "SDXL Turbo is based on a novel distillation technique called Adversarial Diffusion Distillation (ADD)," the company explained when it launched the model, "which enables the model to synthesize image outputs in a single step and generate real-time text-to-image outputs while maintaining high sampling fidelity."

Stelzer puts the model's real-time performance to good use: variables can be tweaked and their impact on the generated image seen without delay, using the Lunar Tools toolkit, which Stelzer's Lunar Ring cooperative developed for building interactive exhibitions.

Lunar Ring has experimented with the concept of latent space before, with Stelzer contributing to an interactive installation, which opened in January. (📹: Lunar Ring)

Rather than tweaking these variables by hand, though, Stelzer has opted to tie each to a different knob on a control surface originally developed for MIDI use — though the artist hasn't entirely abandoned the idea of tying audio into the project too. "[I'm] working on coupling this to real time audio generation," he notes, "leveraging synesthesia & immersion."
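For readers curious how a knob might drive a variable, here is a minimal sketch of the general idea: a MIDI control change message carries a controller number and a 7-bit value (0 to 127), which can be scaled into whatever range a parameter expects. The controller numbers, parameter names, and ranges below are invented for illustration and are not taken from Stelzer's setup or from Lunar Tools.

```python
from dataclasses import dataclass

MIDI_MAX = 127  # MIDI control change values are 7-bit: 0 to 127


@dataclass
class KnobBinding:
    """Hypothetical link between one knob and one named parameter."""
    name: str
    low: float
    high: float

    def scale(self, cc_value: int) -> float:
        """Map a raw knob value onto the parameter's range."""
        clamped = max(0, min(MIDI_MAX, cc_value))
        return self.low + (self.high - self.low) * clamped / MIDI_MAX


# Illustrative bindings only: controller number -> parameter.
BINDINGS = {
    16: KnobBinding("style_strength", 0.0, 1.0),
    17: KnobBinding("noise_mix", 0.0, 0.5),
}


def apply_control_change(params: dict, cc_number: int, cc_value: int) -> dict:
    """Return a copy of params updated by one control change message."""
    binding = BINDINGS.get(cc_number)
    if binding is None:
        return params  # knob is not bound to anything; ignore it
    updated = dict(params)
    updated[binding.name] = binding.scale(cc_value)
    return updated


# Turning knob 16 fully clockwise sets its parameter to the top of its range.
params = apply_control_change({}, 16, 127)
```

In a live setup, each incoming message would update the parameters just before the next image is generated, which is what makes a single-step model a good fit for this kind of control.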

More information is available in Stelzer's Twitter thread; Lunar Tools is published on GitHub under the permissive BSD Three-Clause license.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.