Tenstorrent Launches Its RISC-V-Powered AI Inference Acceleration Boards, the Grayskull e75 and e150
Designed for 64-bit x86 Ubuntu Linux hosts, the Grayskull boards promise energy-efficient acceleration for both existing and new models.
Machine learning acceleration specialist Tenstorrent has announced the launch of two new inference-only Grayskull cards, designed to speed up inference on PCI Express-capable 64-bit x86 hosts: the Grayskull e75 and e150, both of which support the company's TT-Buda top-down and TT-Metalium bottom-up software stacks.
"Today we are officially launching our Grayskull Dev Kit, available for purchase on our website," Tenstorrent writes in its announcement, brought to our attention by Linux Gizmos. "This is our first gen AI [Artificial Intelligence] PCIe [PCI Express] card — an inference-only hardware kit we are releasing alongside TT-Metalium — our open source software stack."
The Grayskull boards, e75 and e150, are built specifically to accelerate inference, though the company's underlying technology should be compatible with training too. Each board's processor is made up of a grid of Tensix cores, each of which is in turn built from five processor cores based on the free and open-source RISC-V instruction set architecture. The Tensix cores also feature a tensor array math unit, a single-instruction multiple-data (SIMD) unit, and dedicated hardware accelerators for network operations and compression/decompression.
The entry-level board, the Grayskull e75, features 96 Tensix cores running at 1GHz, 96MB of on-chip static RAM (SRAM), and 8GB of external LPDDR4 memory. The half-height card connects to a host — with only 64-bit x86 hardware supported at launch — over a 16-lane PCIe Gen. 4.0 link and draws a claimed 75W, requiring a bundled active cooling kit.
The Grayskull e150, meanwhile, offers improved performance, though still only for inference workloads, from 120 Tensix cores running at a higher 1.2GHz clock speed. Those additional cores bring the total SRAM to 120MB, though the board offers the same 8GB of LPDDR4 as the e75, albeit with a higher peak transfer rate of 118.4GB/s, compared with 102.4GB/s on the e75. They also mean a higher power draw, with the full-height card pulling 200W at peak load.
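To put the two spec sheets side by side, here is a minimal Python sketch that works through the arithmetic using only the figures quoted above. The `Board` class, the helper names, and the "cores × clock" figure are illustrative assumptions for this comparison, not anything published by Tenstorrent, and cores × clock is a crude proxy rather than a measured performance number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    """Spec-sheet figures as quoted in the article (hypothetical container)."""
    name: str
    tensix_cores: int
    clock_ghz: float
    sram_mb: int
    dram_bandwidth_gbs: float
    power_w: int  # 75W is the claimed draw for the e75; 200W is peak for the e150


E75 = Board("Grayskull e75", 96, 1.0, 96, 102.4, 75)
E150 = Board("Grayskull e150", 120, 1.2, 120, 118.4, 200)


def sram_per_core_mb(board: Board) -> float:
    """On-chip SRAM divided evenly across the Tensix cores."""
    return board.sram_mb / board.tensix_cores


def core_ghz(board: Board) -> float:
    """Cores multiplied by clock: a rough scaling proxy, not a benchmark."""
    return board.tensix_cores * board.clock_ghz


def compare(base: Board, other: Board) -> dict:
    """Ratios of `other` to `base` for the headline figures."""
    return {
        "cores": other.tensix_cores / base.tensix_cores,
        "core_ghz": core_ghz(other) / core_ghz(base),
        "bandwidth": other.dram_bandwidth_gbs / base.dram_bandwidth_gbs,
        "power": other.power_w / base.power_w,
    }


if __name__ == "__main__":
    for board in (E75, E150):
        print(f"{board.name}: {sram_per_core_mb(board):.2f} MB SRAM per core")
    for key, value in compare(E75, E150).items():
        print(f"e150 vs e75 {key}: {value:.2f}x")
```

On these numbers, both boards work out to 1MB of SRAM per Tensix core, and the e150 offers 1.25 times the cores, 1.5 times the cores × clock figure, and roughly 15.6% more memory bandwidth, for about 2.67 times the quoted power. The two power figures are stated differently (claimed draw versus peak load), so that last ratio is indicative at best.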
Both boards require a 64-bit x86 host running Ubuntu 20.04 LTS with 64GB of RAM and at least 100GB of available storage, with 2TB or more recommended. The Grayskull e75 also needs a power supply with a free PCIe six-pin power connector, while the more powerful e150 needs a six-pin and a six-plus-two pin connector.
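For anyone wanting to check a machine against that list before ordering, the following is a rough pre-flight sketch using only the Python standard library. It is not a Tenstorrent tool: the function names and thresholds are assumptions taken from the requirements quoted above, sizes are counted in decimal gigabytes, and the power connectors cannot be checked from software.

```python
import os
import platform
import shutil

# Thresholds taken from the requirements quoted in the article.
REQUIRED_RAM_GB = 64
REQUIRED_DISK_GB = 100
RECOMMENDED_DISK_GB = 2000  # "2TB or more recommended"


def check_host(arch, distro_id, version_id, ram_gb, free_disk_gb):
    """Return a list of unmet requirements (empty if the host qualifies)."""
    problems = []
    if arch.lower() not in ("x86_64", "amd64"):
        problems.append(f"unsupported architecture: {arch}")
    if distro_id != "ubuntu" or version_id != "20.04":
        problems.append(f"expected Ubuntu 20.04 LTS, found {distro_id} {version_id}")
    if ram_gb < REQUIRED_RAM_GB:
        problems.append(f"{ram_gb:.0f}GB RAM is below {REQUIRED_RAM_GB}GB")
    if free_disk_gb < REQUIRED_DISK_GB:
        problems.append(f"{free_disk_gb:.0f}GB free disk is below {REQUIRED_DISK_GB}GB")
    return problems


def read_os_release(path="/etc/os-release"):
    """Parse ID and VERSION_ID from os-release; empty strings if unavailable."""
    fields = {}
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                key, sep, value = line.strip().partition("=")
                if sep:
                    fields[key] = value.strip('"')
    except OSError:
        pass
    return fields.get("ID", ""), fields.get("VERSION_ID", "")


def physical_ram_gb():
    """Physical memory in decimal GB, or 0.0 if the platform cannot report it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1e9
    except (ValueError, OSError, AttributeError):
        return 0.0


if __name__ == "__main__":
    distro_id, version_id = read_os_release()
    free_gb = shutil.disk_usage(os.getcwd()).free / 1e9
    issues = check_host(platform.machine(), distro_id, version_id,
                        physical_ram_gb(), free_gb)
    for issue in issues:
        print("FAIL:", issue)
    if not issues:
        print("Host meets the listed minimum requirements.")
    if free_gb < RECOMMENDED_DISK_GB:
        print(f"Note: {RECOMMENDED_DISK_GB}GB or more of storage is recommended.")
```

Keeping the comparison logic in `check_host` separate from the system probing means the thresholds can be exercised with made-up values, for example `check_host("aarch64", "ubuntu", "22.04", 32, 50)`, which returns four entries, one per unmet requirement.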
On the software front, Tenstorrent is offering two software stacks. TT-Buda is, the company says, designed for those who want to "run models right away," offering a top-down approach that takes existing models from popular frameworks like PyTorch and TensorFlow and makes them executable on the Grayskull hardware.
TT-Metalium, by contrast, is a bottom-up low-level programming framework built around the TT-NN neural network library, designed for those building models from scratch to get the best possible performance — or for anyone interested in experimenting with running non-machine-learning workloads on the Tensix cores.
Both boards are available on the Tenstorrent store now at $599 for the Grayskull e75 — described by the company as "limited-availability" — and $799 for the Grayskull e150. The company has also designed a pair of network-enabled accelerators, the Wormhole n150 and Wormhole n300, with pricing and availability yet to be announced.
The TT-Buda and TT-Metalium sources, meanwhile, are available on their respective GitHub repositories under the permissive Apache 2.0 license.