Posted on August 15, 2024 in Technology

Inaccurate To a Fault

The single biggest issue facing generative AI is that no matter whose product it is, it almost never tells you “I don’t know” when you ask a question. It will make something up if it has to. These products are not trained on datasets where “I don’t know” is a viable response. This applies to every product that you, as the average consumer, can access. It doesn’t really seem to be improving, either. The biggest LLMs in the world have brief periods where they spit out nothing but garbage, seem to improve and become coherent again, and then dip back into “Cleveland, Ohio is a city located in Utah”.

The two assistants I’ve been writing about, the Humane AI Pin and the Rabbit R1, both use a mix of preexisting LLMs and a proprietary version developed explicitly for each device, and seem to handle themselves quite a bit better, although not flawlessly. Truthfully, their functionality is meant to be more like Siri or Alexa than ChatGPT. You ask them questions, you ask them to write a text for you, and they do those things. They’re not writing papers or imitating singers.

That said, when they do need to deliver, sometimes they don’t. They will sometimes say “I can’t do that yet”, a totally reasonable response, but when it comes to the more impressive technical pitches (‘how many calories are in this handful of almonds’, for example, initially demoed by the Humane AI Pin before the video was edited to better reflect its current abilities) they are still spitting out unreliable answers rather than telling you they cannot tell. The Engadget reviewer of the Humane pin tried to send a text to a friend, and rather than ask her what she wanted to tell him, it filled in a default text for her asking how his day was going and sent it without confirming first. To ask it to look at things, she had to begin commands with “look”. It’s inconsistent. Sometimes it wants you to hold its hand, other times it’s confident enough to send things off by itself.

Both devices are clearly trying to improve on the assistant programs already out there, but being less dependent on human input means making assumptions about what the human wants. When a device couldn’t possibly know that without looking at prior data, like your text history, it plays it safe and goes bland on you. That’s preferable to digging through your text history to formulate something, sure, but is it better than asking what the wearer wants written? When it can’t do even that (like when it can’t guess you want it to analyze whatever is in front of you) it gets stuck just like the ones already on the market.

It’s not impossible for these things to be better. It’s not impossible that a series of software updates could lead to Rabbit R1 or Humane outpacing the competition, but the competition is stiff. Catching up to Alexa or Siri or Google was already quite a feat; exceeding them to get to the next level will be even tougher.

https://www.yahoo.com/tech/humane-ai-pins-laser-display-083124742.html

https://www.engadget.com/the-humane-ai-pin-is-the-solution-to-none-of-technologys-problems-120002469.html

https://www.zdnet.com/article/humane-ai-pin-what-went-wrong-and-how-it-can-be-fixed-before-its-too-late

https://humane.com

https://www.tomsguide.com/ai/rabbit-r1-vs-humane-ai-pin-vs-limitless-pendant-which-ai-wearable-could-win