Dmitry Grinberg's ROMRAM Adds 8MB of External RAM to the Humble Raspberry Pi RP2040

Not content with 264kB of RAM, Grinberg's "nasty hacks" — including an abused error handler — up the total to 8MB.

Gareth Halfacree
2 years ago · HW101

The dual-core Raspberry Pi RP2040 microcontroller packs a surprising punch for its price point, but for those who find the 264kB of RAM limiting, Dmitry Grinberg has a solution: adding a whopping 8MB of RAM over the quad-SPI bus.

"RP2040 is a rather versatile chip. One of its most convenient features is support for flash XIP [Execute in Place] via SSI [Synchronous Serial Interface]," Grinberg explains. "SSI is quite configurable and can support all sorts of flash chips. It is, of course, not entirely bug free (try to configure it for SPI commands and QPI addresses, for example, see how that goes), but a large memory with a fast cache is super nice. There is only one issue: RP2040's XIP mode only supports read and execute accesses, not writes."

A read-only RAM isn't much use — you might as well simply use a ROM for that — and doesn't help to extend the chip's 264kB of on-board RAM, which is where Grinberg's work comes in. The first step: adding a dual-OR gate and NAND gate, plus two resistors, to build an external circuit which allows the RP2040 to boot from flash storage then copy its contents to a RAM chip with XIP enabled.

"This is quite useless," Grinberg admits, "since we cannot write to it directly. Oh, sure, we can issue SSI commands, but this is (1) annoying, (2) boring, and (3) will not allow unmodified software that needs a few megabytes of RAM to run. How do we make this better? With nasty hacks, of course!"

Those "nasty hacks" start with write-protecting a region of memory so that any attempt to write to it triggers a HardFault. When that happens, a custom fault handler decodes the instruction responsible for the fault, emulates the write so that it succeeds, flushes the cache, and resumes the program. The secret: a "super fast partial ARMv6M emulator," which can emulate any write instruction, including STMIA, which writes up to eight words at any word-aligned address and so has to be handled quickly.
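To make the fault-and-emulate idea concrete, here is a minimal host-side sketch in C of how a handler might decode and emulate a single Thumb STR (immediate) instruction. It is not Grinberg's code: the `cpu_state_t` structure, the `ext_ram` array standing in for the external chip, and the function names are all hypothetical, and the sketch assumes the external RAM is mapped at the RP2040's XIP window base of 0x10000000. A real handler would pull registers from the exception stack frame, issue the write over SSI, and flush the XIP cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: external RAM appears in the RP2040 XIP window. */
#define EXT_RAM_BASE 0x10000000u
#define EXT_RAM_SIZE (8u * 1024u * 1024u)

/* Hypothetical model of the register state a fault handler would recover. */
typedef struct {
    uint32_t r[16]; /* r0-r15; r[15] holds the address of the faulting instruction */
} cpu_state_t;

/* Stand-in for the external chip; real hardware would be written over SSI. */
static uint8_t ext_ram[EXT_RAM_SIZE];

static void ext_ram_write(uint32_t offset, const void *src, uint32_t len)
{
    assert(offset <= EXT_RAM_SIZE - len);
    memcpy(&ext_ram[offset], src, len);
    /* On real hardware: flush the affected XIP cache line here. */
}

/* Emulate "STR Rt, [Rn, #imm5*4]" (Thumb encoding 01100 imm5 Rn Rt).
 * Returns true if the instruction was recognised and emulated. */
static bool emulate_str_imm(cpu_state_t *cpu, uint16_t insn)
{
    if ((insn >> 11) != 0x0Cu) {
        return false; /* not STR (immediate); a fuller emulator would try other stores */
    }
    uint32_t imm  = ((insn >> 6) & 0x1Fu) * 4u; /* word-scaled offset */
    uint32_t rn   = (insn >> 3) & 0x7u;         /* base register */
    uint32_t rt   = insn & 0x7u;                /* source register */
    uint32_t addr = cpu->r[rn] + imm;

    if ((addr & 3u) != 0u) {
        return false; /* unaligned word store: a genuine fault */
    }
    if (addr < EXT_RAM_BASE || addr - EXT_RAM_BASE > EXT_RAM_SIZE - 4u) {
        return false; /* outside the emulated region */
    }
    ext_ram_write(addr - EXT_RAM_BASE, &cpu->r[rt], 4u);
    cpu->r[15] += 2u; /* step past the 16-bit instruction before resuming */
    return true;
}
```

The same decode-then-dispatch shape would extend to the other store encodings, with STMIA looping over its register list.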

"RP2040's SSI seems to ignore the programmed 'NDF' value for write-only transactions," Grinberg explains. "Once it has started a write, it will raise nCS anytime the TX FIFO is empty. This means that you need to fill it just fast enough to keep it busy. There was also an issue I found with writing to the SSI FIFO too fast (even when it is empty) and a NOP was needed. Do not ask..."

The resulting device, which Grinberg calls "ROMRAM," offers native performance for memory reads and executions but is relatively slow for writes, taking 363 clock cycles for an STR(immediate) instruction. "Some of this could be cut a little with some creative work (eg: by overlapping more of the SSI write and the data-getting,)" Grinberg notes. "This optimization is also left as a exercise to the reader."
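For a rough sense of scale, the 363-cycle figure can be converted into wall-clock time. The sketch below assumes a 125MHz system clock, which is not stated in the article; the helper names are hypothetical. Under that assumption one emulated STR takes about 2.9 microseconds, or roughly 344,000 stores per second.

```c
#include <stdint.h>

/* Convert a cycle count into nanoseconds at the given clock frequency. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t clock_hz)
{
    return (cycles * 1000000000ull) / clock_hz;
}

/* Upper bound on back-to-back emulated stores per second (integer division). */
static uint64_t stores_per_second(uint64_t cycles_per_store, uint64_t clock_hz)
{
    return clock_hz / cycles_per_store;
}

/* Figures used here: 363 cycles per STR (from the article),
 * 125000000 Hz system clock (an assumption). */
```

Calling `cycles_to_ns(363, 125000000)` and `stores_per_second(363, 125000000)` gives the two figures quoted above; a different clock setting scales both proportionally.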

Grinberg's full write-up is available on his website, along with the source code under the permissive BSD two-clause license. "I am too lazy (and disgusted) to turn this into some sort of an Arduino or a MicroPython plugin," he notes, "but I am sure someone else will. My provided code will build standalone with no dependency on anything."
