The Death of a Salesman
Have products like the Humane AI Pin or Rabbit R1 failed? Or are they the next underlying model around which our computing will be built?
The transcript of a talk I gave at the Particle Spectra conference in 2024, during which I talked about machine learning, edge computing, technological readiness, and how abstraction changes computing.
If you think about your phone, it’s mostly just a bundle of sensors and antennas. It reports your location back to the cloud, and in return you get information — train times, bank balances, messages from your loved ones. Without a network connection to the computers in the cloud our phones aren’t all that useful. How many of the apps on your phone — the ones you use every day — work at all when you don’t have any signal? Probably not that many.
But it doesn’t necessarily have to be that way, and the way computing architectures change with time (at least so far!) seems to be cyclic. Historically the bulk of our compute power and storage has either been hidden away in racks of distant servers, or spread across a mass of distributed systems much closer to home. A cycle between thick and thin client architectures.
We started off with mainframes, and ended up with desktop computers. But then things changed again, and the bulk of our compute moved back up and into the cloud.
What was left behind were laptops, tablets, and of course telephones — still tied to the cloud. They gather data, you interact with them, but the data is stored and processed far away from you, on someone else’s computers.
However, right now we’re seeing the first signals that we’re at the beginning of a swing back, back towards distributed systems once again. Away from the cloud, and towards the edge.
This is from Alan Kay, widely lauded as one of the pioneers of the digital age, who back in 1972 anticipated the black rectangle of glass and brushed aluminium that lives in all of our pockets today — 1972, the same year the C programming language was first released.
But while Kay’s prediction of the existence of the smartphone was almost prophetic, it was also in a way naive. He imagines users still writing programs for their “carry anywhere” device. Living down in the guts of the beast as he was, he was working at a level of abstraction far removed from the one we work at today.
Because historically the thing that drives the shifts, these changes in how and where our computing lives, is our technology. Technological readiness and levels of abstraction.
The move from mainframe to desktop was driven by Moore’s Law, our ability to produce compute in denser and denser packages, ever more cheaply.
The move from desktop to the cloud was driven, in part at least, because Moore’s Law was slowing. Failing. But mostly because there was a new technology in play. The network. Connectivity was driving our computing, not processor power.
Computing became about the network, not the location. While the computing moved, the users did not.
I started my career in the dying gasps of the mainframe era, when the mainframe was a temple, and bringing your punch cards to it was a very similar experience to attending church. It was ritualistic.
This time around we’re seeing a hybrid model. There is a swing back towards distributed computing, back towards local processing of data as well as data gathering, but with some of the heavy lifting being left behind in the cloud.
We saw this first with the Internet of Things, which started off heavily reliant on the cloud — the temperature sensors in your home unable to talk to your thermostat except by routing messages through a data centre in Wisconsin — but this is now changing.
In part this is due to machine learning, allowing us to move the smarts closer to the data. We no longer have to appeal to big iron in the cloud to make decisions; instead, interpreting sensor data can be done at the edge.
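Just to make the architectural point concrete, here’s a toy sketch — in Python, with entirely made-up numbers — of a thermostat decision being made locally on the device rather than in that data centre in Wisconsin. A real deployment would swap the fixed rule for a small machine learning model, but the shape is the same: the data is interpreted where it is gathered.

```python
# A minimal sketch of "smarts at the edge": a sensor hub deciding locally,
# with no cloud round trip. The setpoint, hysteresis, and readings below are
# hypothetical; the point is that inference happens on the device itself.
from statistics import mean

SETPOINT_C = 20.0
HYSTERESIS_C = 0.5

def decide_heating(recent_temps_c):
    """Decide the thermostat action from a short window of local readings."""
    current = mean(recent_temps_c[-5:])  # smooth out sensor noise locally
    if current < SETPOINT_C - HYSTERESIS_C:
        return "heat_on"
    if current > SETPOINT_C + HYSTERESIS_C:
        return "heat_off"
    return "hold"

# The edge device loops over its own sensor readings and acts immediately,
# rather than forwarding every sample to a data centre and waiting for a reply.
print(decide_heating([19.1, 19.2, 19.0, 18.9, 19.1]))  # -> "heat_on"
```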
But it’s not just the Internet of Things anymore.
We’ve become used to yearly, or faster, upgrade cycles. A new phone each year, a new laptop every couple of years, and each year the slabs of aluminium, plastic, glass, and silicon get a little bit thinner and weigh a little bit less.
But the underlying model around which our computing is built doesn’t change quite as rapidly as the computing itself.
Until of course it does. For the first time since the iPhone really did “change everything” back in 2007 — and that’s seventeen years ago now — there’s the possibility that that model, how computing is offered to us and how we interact with it, is changing.
Which brings us to what is now being advocated, at least by some, as the next form factor: wearable AI. The Humane AI Pin.
The Humane AI Pin was seeking to replace the most successful computing platform ever invented, the smartphone. That black rectangle that lives in everyone's pocket, including yours.
Shrouded in mystery before it finally launched a product, Humane is (perhaps more accurately was) a fascinating case study in hubris, and the launch reminds me very much of Kamen’s launch of the Segway.
Around the time of the company’s Series C round last year — when they raised (another) 100 million dollars, bringing the total pre-product investment to 230 million dollars — John Gruber, writing about a leaked copy of Humane’s 2021 pitch deck, said: “The deck describes something akin to a Star Trek communicator badge, with an AI-connected always-on camera saving photos and videos to the cloud, and lidar sensors for world-mapping and detecting hand gestures.”
The idea is simple: it’s a phone without a screen. Instead of getting your phone out of your pocket if you want to make a phone call — does anyone actually do that anymore? — or send a message, or ask a question, you just ask the AI Pin.
The pin is online all the time, and up in the cloud an AI model (or more accurately, probably a collection of models) tries to answer your questions and execute your commands. It’s not an app; it’s all the apps you have on your cellphone.
However — after disastrous reviews during the pre-launch campaign — the Pin was dead-on-arrival when it shipped to its first users back in April this year, with the Verge saying, “…it just doesn’t work.”
The founders are now trying to sell the company rather than the product the company built.
The problem here is technological readiness. This may well be the next big thing, but it isn’t ready yet.
Of course we’ve seen this before. Who here remembers the Apple Newton?
Introduced in May 1992, it started shipping more than a year later in August 1993. Realistically this was Apple’s first attempt at building the iPhone. This was Alan Kay’s “carry anywhere” device, or at least a rather interesting first attempt.
However, just like Humane’s AI Pin today, the Newton was widely mocked in popular culture; it became a poster child for expensive but useless gadgets.
But while it was widely criticized as being “too expensive,” the Newton retailed in 1993 at $699. That’s just over $1,500 in today’s money — about the same price as a high-end iPhone 16 Pro.
I’m not sure if that tells us more about the Newton, or the iPhone. But for a high end piece of technology aimed at early adopters, that’s not entirely out of line.
But if it wasn’t the price, then, why did it fail? The technology wasn’t ready.
Back in 1987 when Apple started working on the Newton they commissioned AT&T to design a low-power version of its CRISP CPU, which became known as the AT&T Hobbit. Unfortunately, the Hobbit was “rife with bugs, ill-suited for our purposes, and overpriced,” according to Larry Tesler, Apple’s Chief Scientist at the time.
Prototypes of the Newton built around the Hobbit needed three of AT&T’s CPUs, and cost upwards of $6,000 each.
Apple eventually turned to Acorn and — along with VLSI — took a 47% stake in the company. This cash infusion allowed Acorn to develop the processor for the first Newton.
As an aside, Apple’s sale of their 800 million dollar stake in ARM in 1999 — the company created when Acorn spun off its microprocessor business — funded the development of the iPod.
Which arguably saved Apple from bankruptcy. There literally wouldn’t be an iPhone without the Newton, for more than one reason.
But it wasn’t just the hardware that wasn’t ready.
The Newton's handwriting recognition software was unreliable and inaccurate, and the media widely criticized it for misreading characters.
It was so bad that it was parodied in both The Simpsons, where “Beat up Martin” became “Eat up Martha,” and in Garry Trudeau’s highly influential Doonesbury comic strip, where the Newton misreads the words “Catching on?” as “Egg Freckles.”
At the heart of the Newton’s failure was the “second-stroke problem.” Each time a user’s pen lifted off the tablet and set back down, the Newton detected a pause and became uncertain. It couldn’t be sure whether the next stroke was part of the current letter, the start of a new letter, or even a new word.
It turns out, many (most?) letters of the alphabet need multiple strokes, or at least the way many (most?) people write them. The letters "T" and "X" normally have two strokes. "H" needs three. Add to this user hesitancy, pauses for thought, and handwriting recognition is a hard problem. And that's just English.
…and because Newton's recognition engine was so unsure, so often, it routinely threw a list of possible words at the user. This proved more than somewhat inconvenient.
Worse, if you wanted the Newton to learn a word outside its native database, you had to train it. You first had to write the new word out longhand, and then painstakingly type it letter by letter using an on-screen keyboard, each keystroke done with the stylus.
Later versions of the Newton had better, marginally at least, handwriting recognition, but it was too late.
Rather than ask users to adapt their writing habits, the Newton forced them to train the device. It was an admission that the Newton wasn’t capable of performing its core function.
Jeff Hawkins, and Palm, who built the first really successful handheld, did the opposite. They changed the human, not the device. Instead of the device having to learn how you write, you learned how to write for it. No second-stroke problem, because Palm’s Graffiti alphabet eliminated it.
The first Palm Pilot was released just a few years after the Apple Newton, in 1996.
It suffered from some of the same flaws as the Newton, like a slow data connection — the 802.11 WiFi standard would not even be published until the following year, 1997, and cellular phones were still using analog signals — but the core functionality of the device worked.
You could write on it, and it knew what you were writing. The technology of the time was ready for this. It was more limited, but it did what it claimed to do without problems, unlike the Newton.
There was a time when I could write in Graffiti faster than I could write English normally. For me Graffiti lived on, well past the time when I gave up my Palm Pilot, in my notes written on real paper.
Until of course the iPhone changed everything in 2007.
The original iPhone was a fairly limited device. Network coverage — and download speeds — were limited compared to the iPhone of today. Copy and paste didn’t come along until two years later, in 2009. It was immature technology, but it was good enough.
Technology had finally caught up with Alan Kay’s “carry anywhere” device.
We saw this same pattern happen again with Google Glass in 2013.
Glass had short battery life, slow upload times, poor camera quality, and spotty voice recognition abilities. The technology just wasn’t ready to build the device Google was trying to build.
But it honestly might have succeeded anyway; like the original iPhone, it might have been good enough. Glass had an additional problem, though — sociological, not technological.
Google Glass lasted less than a year on the market and failed mostly because nobody was sure what it was really for, or what problems it actually solved that existing technology didn’t solve almost as well, or better. Glass couldn’t compete with the faster processors and superior cameras that already existed in your smartphone.
Alongside this, people pushed back against a product that didn’t do what they wanted — or perhaps more correctly, did things they didn’t want. It became the very public face of the hype cycle. People wanted it to fail.
We can see the same technological readiness problems with the Humane AI Pin as we saw with Apple’s Newton and Google’s Glass.
Humane spent the year in the lead up to the launch of the Pin trying to make the case that it was the beginning of a post-smartphone future, where we’d spend more time back in the real world, and less time looking at a screen. How that might work, whether it’s something we wanted, and whether it’s even possible are very much open questions.
A major problem with the Pin is the latency between making a request and its response. Network latencies we might find acceptable when dealing with our smart phones, when interacting with an app, are entirely unacceptable in a voice interface.
When we talk to our devices we need immediate feedback. That’s why voice-controlled smart speakers like Amazon’s Echo have “wake word” capability: the onboard hardware has a small machine learning model that runs offline, without having to talk to the cloud. While it’s certainly possible to run wake word models on even microcontroller-level hardware, the Pin didn’t.
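For what it’s worth, the wake word loop itself isn’t complicated. Here’s a rough Python sketch of the pattern — the “wake_word.tflite” model and its raw-audio input format are assumptions of mine, not Humane’s or Amazon’s actual implementation — where everything stays on the device until the trigger fires.

```python
# A minimal sketch of on-device wake-word spotting. Assumes a hypothetical
# "wake_word.tflite" model that takes one second of 16 kHz mono audio and
# outputs a single wake-word score; microcontroller deployments follow the
# same loop, just without an operating system underneath.
import sounddevice as sd
from tflite_runtime.interpreter import Interpreter

SAMPLE_RATE = 16_000
THRESHOLD = 0.9

interpreter = Interpreter(model_path="wake_word.tflite")  # hypothetical model
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

while True:
    # Record a one-second window locally; nothing leaves the device yet.
    audio = sd.rec(SAMPLE_RATE, samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()
    interpreter.set_tensor(inp["index"], audio.reshape(inp["shape"]))
    interpreter.invoke()
    score = float(interpreter.get_tensor(out["index"])[0][0])
    if score > THRESHOLD:
        # Only now do we wake the radio and hand the full query to the cloud.
        print("Wake word detected — handing off to the cloud assistant")
```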
Alongside these processing and networking problems is the UX problem.
The Pin’s “Laser Ink” projector was sold as better than a screen. Summoned by tapping on the Pin or asking it to “show me” something, the green monochromatic 720p image — vaguely reminiscent, for those of us that were around, of the glow of a VT100 terminal — was more or less invisible in bright light, but mostly described as “good enough” by reviewers.
Interacting with it, however, was described as “bananas” and involved moving your hand forwards and backwards and pinching your thumb and forefinger together. You scroll by tilting your hand, rolling it around like you’re trying to balance a ball in your palm.
It felt to many like the decision had been made early on that the Pin couldn’t have a screen, no matter what, and nobody was willing to walk that decision back when they couldn’t build a good user experience around their technology.
With technology not quite ready to allow them to build what they wanted to build, they tried to be Palm — to change the human rather than the device — and they couldn’t quite pull off gesture based Graffiti.
Of course the Humane AI Pin isn’t the only product in the new wearable AI space. This is the Rabbit R1.
Unlike Humane, Rabbit chose a different path. It’s not even exactly a wearable, but it does have a screen.
Plagued by security problems — including hard coded API keys giving attackers access to pretty much everything done on the device, including every response ever given by any R1 device — the Rabbit too failed.
Five months after launch, and despite the hype after its CES debut, only 5 thousand out of the 100 thousand people that bought the R1 are still using the device. A depressing number, but one straight from the mouth of Rabbit’s founder Jesse Lyu — who has also explained that it was launched before it was ready in order to beat other companies to the punch.
And the design of the Rabbit R1 has a lot of hints that it was a prototype that escaped into the world. The hardware is just a generic Android phone with the touchscreen disabled, you can even run Doom on it.
Both it and the AI Pin feel like they were rushed to market unfinished, and there are echoes of both General Magic and the Segway — along with the Sinclair C5 before that — in the way they have come to market.
Like Google Glass, they are the very public face of the last two years of hype, and the search for market fit, for large language models. We have AI everywhere, but we haven’t quite convinced ourselves it’s all that useful yet.
But despite these high profile failures, people have not stopped trying. This is the Friend.
The company behind it raised 2.5 million dollars in venture capital, and in what has been described as a “bold move” spent 1.8 million dollars to fund the purchase of the friend.com domain name.
The small puck-shaped Friend pendant hangs around your neck, records every word you say, and then responds via text message. Is it a “Tamagotchi with a soul,” or an episode of Black Mirror?
It’s hard to decide, and hard to tell how seriously to take it. Either way, it seems little more than Eliza with a Bluetooth connection rather than an actual contender.
Interestingly though, none of these devices are the Apple Newton of the wearable AI category. That privilege goes to the now almost totally forgotten Pebble Core.
Part of the company’s final Kickstarter all the way back in 2016 — well before large language models were a thing, and before the company was bought by Fitbit and stripped down for parts — the Pebble Core was a smartphone in a box without a screen. It had access to Amazon’s Alexa, letting you ask it to perform tasks on your behalf, and it was supposed to serve as the hub of your personal computing: a platform for other wearables and device manufacturers to build around.
Brought down for the most part by their own hubris, and by totally misjudging the wearables market, Pebble had real potential and a track record of building usable devices. The Pebble Core was a prototype that just never made it to become a product, but it was still there first.
Just like with Apple’s Newton, so far there seems to be a real divide between what was promised and what was delivered.
The Humane AI Pin can’t set an alarm or a timer. It can’t add things to your calendar, or tell you what’s already there. Forget about translation — one of the most hyped features — because it just doesn’t seem to do it at all.
Every time the Pin tries to do pretty much anything, it has to process your query through Humane’s servers. There is no local processing. That leads to long and unacceptable latency, and a lot of just flat out failures.
It feels like a device where the hardware simply can’t keep up, and a good indicator of that is that it’s pretty much constantly warm to the touch. Not necessarily physically uncomfortable, but enough to make you feel a bit mentally uncomfortable about having a lithium-ion battery pinned to your chest.
To give the device credit it is fairly aggressive about shutting down when it overheats. Which happens a lot.
As does running out of battery. Like Google’s Glass before it, the device’s battery life doesn’t seem to be enough to use it day to day, every day.
But technological readiness problems aside — and there are obvious technological readiness problems — are there also sociological problems? Interface problems?
If you can’t see how it works, is it even possible to figure out how to use it in the first place? Our phones constantly feed back to us, the UI subtly guiding our choices and actions. There isn’t any of that with a spoken interface.
There also seems to be a growing backlash against people using spoken interfaces in public; arguably Glass failed in part because it was seen as socially unacceptable. How socially acceptable is standing on the street muttering to yourself?
Even taking a call on AirPods can get you sideways looks.
And yet. There seems to be something here.
Setting aside the comparison with Apple’s Newton, I’m tempted to make a comparison to the early days of the iPhone. Because the original iPhone wasn’t actually very good.
I was involved with the iPhone ecosystem in the early days and spent a lot of time down in the guts working with sensors — the accelerometer, the magnetometer, and later the gyroscope — to try and eke out the battery life of the phone when using the GPS for positioning… using a combination of sensor data for dead reckoning, and to figure out whether there had been significant movement since last time we knew the user’s location for sure.
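The heuristic itself was nothing sophisticated. Something along the lines of the following sketch — the threshold and the stubbed-out GPS call are illustrative, not the code we actually shipped — deciding from cheap accelerometer data whether waking the power-hungry GPS was worth it at all.

```python
# A rough sketch of the trick described above: use cheap accelerometer data
# to decide whether to wake the expensive GPS. Threshold and stub are
# hypothetical stand-ins.
import math

MOTION_THRESHOLD = 0.15  # in g; in practice tuned empirically per device

def start_gps_fix():
    """Stub standing in for a real (slow, power-hungry) GPS fix request."""
    return ("<fresh latitude>", "<fresh longitude>")

def likely_moved(samples):
    """Crude significant-motion check over a window of (x, y, z) readings in g."""
    # Subtract the ~1 g of gravity from each sample's magnitude; whatever is
    # left over is (roughly) movement.
    residuals = [abs(math.sqrt(x * x + y * y + z * z) - 1.0) for x, y, z in samples]
    return max(residuals) > MOTION_THRESHOLD

def update_location(samples, last_fix):
    if likely_moved(samples):
        return start_gps_fix()   # expensive: only when movement looks likely
    return last_fix              # cheap: reuse the last known position

# Sitting still: the stale fix is good enough, and the GPS stays asleep.
print(update_location([(0.0, 0.0, 1.01), (0.01, 0.0, 0.99)], ("51.5", "-0.1")))
```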
It’s been interesting to see iOS mature, and the level of abstraction that developers working in the ecosystem need in order to make use of the underlying hardware change with it. It wasn’t really until the release of the iPhone 5 in 2012 that using the GPS in the background became practical without an understanding of what was going on under the hood.
It’s tempting to say that we’re going to see a similar maturing ecosystem when it comes to machine learning, artificial intelligence, on the edge.
Right now with machine learning there is a split that can be made between development and deployment.
Initially an algorithm is trained on a large set of sample data — that generally needs a fast, powerful machine or cluster — but then that trained network is deployed into an application that needs to interpret real data in real time, and that’s an easy fit for lower-powered distributed systems. Sure enough, this deployment, or “inference,” stage is where we’re seeing the shift to local processing, or edge computing, right now.
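To make that split concrete, here’s a minimal sketch of the two halves. The toy data and the tiny model are placeholders, but the shape is the real one: train on big hardware, then ship only the compact forward pass to the edge.

```python
# Development/deployment split: train once on powerful hardware, export a
# compact model, and the edge device only ever runs the forward pass.
import numpy as np
import tensorflow as tf

# --- Development: done once, on a fast machine or cluster -----------------
x_train = np.random.rand(1000, 16).astype("float32")      # stand-in sensor features
y_train = (x_train.sum(axis=1) > 8.0).astype("float32")   # stand-in labels

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x_train, y_train, epochs=5, verbose=0)

# Convert the trained network to TensorFlow Lite for deployment at the edge.
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()
open("edge_model.tflite", "wb").write(tflite_model)

# --- Deployment: this is all the edge device ever has to run --------------
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.random.rand(1, 16).astype("float32"))
interpreter.invoke()
print("edge prediction:", interpreter.get_tensor(out["index"]))
```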
Partly this shift is driven, yet again, by technological readiness.
All the way back in 2019 I spent a lot of time looking at machine learning on the edge. Over the course of about six months I published more than a dozen articles on benchmarking the then new generation of machine learning accelerator hardware that was only just starting to appear on the market.
A lot has changed in the intervening years, but after getting a recent nudge I returned to my benchmark code and — after fixing some of the inevitable bit rot — I ran it on the new Raspberry Pi 5.
However, perhaps the more impressive result is that, while inferencing on the Coral accelerator hardware is still faster than using full TensorFlow models on the Raspberry Pi 5, the Raspberry Pi 5 running TensorFlow Lite shows essentially the same inferencing speeds as the Coral TPU.
The conclusion is that custom accelerator hardware may no longer be needed for some inferencing tasks at the edge, as inferencing directly on the Raspberry Pi 5 CPU — with no GPU acceleration — is now on a par with the performance of the Coral TPU.
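For the curious, the benchmark itself is nothing exotic. The sketch below is the shape of the loop rather than my original code: the model path is a placeholder (a quantized, uint8-input MobileNet-style model is assumed), and on Coral hardware you’d additionally load the Edge TPU delegate, which I’ve left out here.

```python
# Time repeated TensorFlow Lite invocations of an image model and report the
# average. "mobilenet_v2.tflite" is a placeholder path; point it at whatever
# model you're comparing. Assumes a quantized model with uint8 input.
import time
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="mobilenet_v2.tflite")  # placeholder model
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Random data is fine for timing; we care about inference speed, not accuracy.
frame = np.random.randint(0, 256, size=inp["shape"], dtype=np.uint8)

runs, total = 100, 0.0
for _ in range(runs):
    interpreter.set_tensor(inp["index"], frame)
    start = time.perf_counter()
    interpreter.invoke()
    total += time.perf_counter() - start

print(f"average inference time: {1000 * total / runs:.1f} ms over {runs} runs")
```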
But this shift is also being driven by an awareness of hardware limitations. It was fascinating to watch younger developers — who had never had to struggle through the era of more limited resources — trying to fit their code inside the limitations of the early iPhone models. We’re starting to see similar things with AI.
The latest large language models, like OpenAI's flagship GPT-4o, live up to their name. They are anything but small. Smaller alternatives, colloquially known as small language models (SLMs), like Microsoft's Phi-3, can — depending on the task — be just as capable, but can be run using much less computing power and hence much more cheaply.
But what if you're trying to operate at the edge with not just much less, but almost no, compute power? Then you might need to take a different approach. For instance, a new approach from Edge Impulse is to use GPT-4o to train a small AI model — one that's two million times smaller than the original GPT-4o LLM — which will run directly on device at the edge.
This is a very different approach to something like Picovoice's recently released PicoLLM framework, which is intended to be chained with existing tinyML models used as triggers for the more resource-intensive SLM.
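To be clear, the following isn't Picovoice's actual API — just the cascade pattern they describe, with trivial stand-ins for both models: a tiny always-on gate in front of a comparatively expensive on-device SLM.

```python
# The cascade pattern: a cheap, always-on trigger model gates a resource-
# hungry on-device language model. Both "models" below are trivial stand-ins.
def tiny_trigger(audio_energy: float) -> bool:
    """Stand-in for an always-on tinyML model: cheap, runs on every frame."""
    return audio_energy > 0.2        # hypothetical threshold

def small_language_model(transcript: str) -> str:
    """Stand-in for the expensive SLM, invoked only when the gate fires."""
    return f"(SLM answer to: {transcript!r})"

def handle(audio_energy: float, transcript: str):
    if not tiny_trigger(audio_energy):
        return None                  # most frames cost almost nothing
    return small_language_model(transcript)

print(handle(0.05, "what's the weather like?"))   # gate closed -> None
print(handle(0.40, "what's the weather like?"))   # gate open -> SLM runs
```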
Both approaches allow you to move inferencing out of the cloud and to the edge, but the Edge Impulse approach potentially allows you to reduce the amount of compute needed much further.
What they're doing is not exactly the same as the architectures we've seen before now, which use tinyML models to select key frames to feed into a larger SLM or LLM. Instead we're using a full-scale LLM running in the cloud to classify and label data to train a "traditional" tinyML model, such as a MobileNetV2, that can be deployed to the edge and run on microcontroller-sized hardware, inside a couple of hundred KB of RAM.
It's a genuinely fascinating shortcut to use the larger, more resource-intensive model as a labeler for training a much smaller tinyML model that can then be used on device. It's going to be intriguing to see if models trained this way perform differently — have different perceptual holes — to models trained directly on human-labeled data. Whether these AI-trained models are more or less flexible when presented with different and divergent data than their human-trained counterparts.
Of course you have to ask yourself, given how the current generation of devices have performed — rather poorly — whether it is a good idea at all. Is what has happened to the AI Pin and the R1 a marketing failure, or a technology failure?
I’d argue it is both. If the AI Pin had worked, if it had performed as promised, or on the flip side if they had promised less to start off with — things within its capabilities — would it have succeeded, or at least not failed? Remember that original iPhone.
Because of course there are alternatives.
With much-touted AI integration coming to both iOS and Android phones over the next few months, undercutting what makes the new wearable AI unique, it could be a long time before the ubiquitous black rectangle that all of us carry in our pockets gets replaced.
Apple Intelligence is deeply integrated into iOS 18, and macOS Sequoia. Google’s Gemini models, and generative AI, are coming to Android.
Are our phones still good enough? A shift in the model of computing requires that the next big thing, the something different, fulfils a need that isn’t being met. It also needs to fulfil that need in a way that’s significantly better than the current model. If we can just get by with our phones, AI wearables are going to stall.
Which, Google Glass aside, is arguably what has happened to what I’d consider the main competition: smart glasses.
We’ve seen several iterations of this technology since Glass was withdrawn from the market. The closest I’ve seen to something that might work out was Focals by North back in 2019.
Focals by North looked (more or less) like regular eyeglasses. They had a beautiful UI with sparse, clean screen elements, and a simplified controller.
They really felt like a serious push beyond Glass. It was AR that didn’t look silly. But the display technology was fussy; they were suffering from those technological readiness problems again.
Google acquired them a year later, and apart from a concept video at Google I/O a couple of years back they haven’t been heard from again.
When you’re developing technology you have to ask: does it solve a problem we actually have? But you also have to remember that end users don’t care about what technology is used to solve the problem. We do. They don’t, and if you rely on the technology to sell your solution, you’re probably going to fail.
There are good ideas here, but arguably the technology is not ready for the problem we’re trying to solve.
When developers and companies see an emerging technology and don't know what to do with it, they tend to build platforms rather than products. If that goes on too long, then we have a problem. The IoT had — and still has to an extent — a huge problem with too many platforms, and you can see it elsewhere in the industry.
Right now, however, I think AI wearables could do with a bit of a platform problem. We need to build out the infrastructure to allow us to do these sorts of things, to make it less costly to take shots at building AI-based products.
Almost fifty years ago now two men named Steve built a business out of hardware that — at least for a while — was the most valuable company the world has ever seen.
Time passed, and technology became more complicated — so much more complicated that it became much harder to do that. But that’s beginning to change again. The dot-com revolution happened because, for a few thousand or even just a few hundred dollars, anyone could have an idea and build a software startup.
Today, for the same money, you can build a business selling things — actual goods — and the secret there is that you don’t have to train a whole generation of people to realise that physical objects are worth money, the way people had to be trained to realise that software was.
Behind every successful idea is the same idea done by someone else, just too early. Where Apple failed, at least the first time around, before the iPhone changed everything, Palm succeeded. This time around we know only two things.
None of the wearables we’ve seen so far have succeeded, but despite that some interesting people still have faith in the concept. For instance, after the recent New York Times profile, we know that Jony Ive is working along with Sam Altman on an AI hardware project.
Perhaps the next platform will have better luck. But for now? I’ll be hanging on to my phone.