Ruff Day? Let Spot Fetch You a Donut.
Who needs delivery when this AI-powered robo-dog can trot across town and fetch you a donut?
Boston Dynamics’ distinctive dog-like robot named Spot has become something of an icon in the world of robotics. The powerful, agile robot is capable of performing complex, autonomous tasks in construction, manufacturing, utilities, mining, and beyond. With a price tag in the neighborhood of $75,000, Spot is well out of reach for almost all hobbyists — but if you could get your hands on one, what would you do with it? Automate tedious work around the house? Create a watchdog that prowls your property, keeping watch for intruders?
YouTuber Dave's Armoury did get his hands on a Spot robot, and what grand idea did he have planned for it? He taught his robo-dog to fetch apple fritters from his favorite neighborhood bakery. That may sound like a trivial use of Spot, but it really is not when you consider everything needed to make it happen: navigating from Dave's home to the bakery, avoiding obstacles, watching out for cars when crossing streets, communicating with the workers at the bakery, and so on. When you really dig in, a lot goes into picking up an apple fritter.
To accomplish the goal, the stock Spot needed to be outfitted with some additional hardware. First, an NVIDIA Jetson Orin was added to collect and process sensor data and to run machine learning models. The Orin is a beast, with roughly ten times the computational horsepower of the previous-generation Jetson Xavier NX. With so much power onboard, all computation could be performed locally, avoiding the latency and privacy issues that arise when relying on cloud-based resources.
The MicroStrain 3DMGQ7 navigation solution, with centimeter-level accuracy, was included so that Spot could navigate from one location to another. This unit provides position and orientation information with the help of dual GPS antennas and an RTK unit. If GPS signals are lost, for example while passing beneath a tree, an IMU temporarily takes over to keep Spot on track.
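The article doesn't detail how the GPS-to-IMU handoff works internally, but the idea is simple: trust the RTK fix while it is available, and dead-reckon from the IMU's velocity estimate during dropouts. Here is a minimal sketch of that fallback logic; `fuse_position` and all of its parameters are hypothetical names, not part of the MicroStrain or Spot APIs.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    x: float  # meters east of the start point
    y: float  # meters north of the start point

def fuse_position(gps_fix, imu_velocity, last_pose, dt):
    """Return an updated pose, preferring GPS whenever a fix is available.

    gps_fix      -- (x, y) tuple from the RTK-GPS receiver, or None if
                    the signal is lost (e.g. while passing under a tree)
    imu_velocity -- (vx, vy) velocity estimate derived from the IMU
    last_pose    -- the previous Pose estimate
    dt           -- seconds elapsed since the last update
    """
    if gps_fix is not None:
        # Trust the centimeter-level RTK fix directly.
        return Pose(*gps_fix)
    # GPS dropout: dead-reckon from the IMU until the fix returns.
    vx, vy = imu_velocity
    return Pose(last_pose.x + vx * dt, last_pose.y + vy * dt)
```

A real INS blends the two sources continuously (typically with a Kalman filter) rather than switching abruptly, but the priority order is the same.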
Spot already comes equipped with five depth cameras: two in the front, one on each hip, and one in the back. Each camera has dual lenses and infrared emitters, giving Spot an accurate measure of how far away every object it sees is. With five such cameras, the robot has a 360-degree view of its environment. This expansive view helps Spot avoid running into things, and also keeps the ground in view to help determine optimal foot placement while walking.
To simplify processing of the 3D visual data, the point cloud is collapsed into a 2D cost map that represents nearby objects and shows where the robot can and cannot go. When Spot approaches a road, camera data is also fed into NVIDIA’s DashCamNet model, which detects automobiles, people, and other objects. This allows the robot to determine whether or not it is safe to cross. Dave noted that in some cases cars were detected late, which could make poor Spot go splat. He traced the delay to the camera’s resolution; a higher-resolution device would detect vehicles sooner.
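Collapsing a point cloud into a cost map essentially means projecting each 3D point down onto a grid of ground cells and marking cells that contain obstacle-height points as blocked. The sketch below illustrates that idea in plain Python; the function name, grid dimensions, and height band are illustrative assumptions, not details from Dave's build.

```python
def point_cloud_to_costmap(points, cell_size=0.5, grid_dim=50,
                           min_z=0.1, max_z=1.5):
    """Collapse a 3D point cloud into a 2D occupancy-style cost map.

    points      -- iterable of (x, y, z) points in the robot frame, meters
    cell_size   -- edge length of each grid cell in meters
    grid_dim    -- the map covers grid_dim x grid_dim cells, robot at center
    min_z/max_z -- only points in this height band count as obstacles,
                   filtering out the ground plane and overhanging structure
    """
    half = grid_dim // 2
    grid = [[0] * grid_dim for _ in range(grid_dim)]
    for x, y, z in points:
        if not (min_z <= z <= max_z):
            continue  # ground or overhead point: not an obstacle
        col = int(x / cell_size) + half
        row = int(y / cell_size) + half
        if 0 <= row < grid_dim and 0 <= col < grid_dim:
            grid[row][col] = 1  # cell is blocked
    return grid
```

A planner can then treat the map as a simple grid search problem, steering Spot through cells marked 0.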
When Dave wanted a snack, he instructed Spot via voice commands with the help of NVIDIA Riva, a GPU-accelerated SDK for building speech AI applications. Riva transcribed his speech into text and interpreted it, sending the dog on its journey. When Spot arrived at the bakery, Riva was once again used to vocalize the request for an apple fritter. Fetching a pastry may seem like a frivolous task for such powerful hardware, but as all of these details show, picking up an apple fritter is genuinely complicated. Remember that the next time you head to your local donut shop: it is quite an accomplishment, and you should be proud of your hard work. Go ahead and get sprinkles; you earned them.
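Once Riva's ASR has produced a transcript, something still has to map that text to an action. The article doesn't say how Dave did this, so here is a deliberately toy sketch of the kind of intent parsing that could sit downstream of the speech-to-text step; the menu contents and the `parse_fetch_command` helper are made up for illustration and are not part of the Riva SDK.

```python
import re

# Hypothetical list of items Spot knows how to ask for at the bakery.
MENU = {"apple fritter", "donut", "croissant"}

def parse_fetch_command(transcript):
    """Return the requested menu item, or None if the utterance
    isn't a recognizable fetch command."""
    text = transcript.lower().strip()
    # Look for a verb like "fetch"/"get"/"bring", then the item.
    match = re.search(r"(?:fetch|get|bring)(?: me)? (?:an? )?(.+)", text)
    if not match:
        return None
    item = match.group(1).rstrip(".!?")
    return item if item in MENU else None
```

Production systems would use a trained intent classifier (Riva also offers NLP services for this) rather than regular expressions, but the shape of the problem is the same: transcript in, structured command out.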