Robot developers keep making it seem like housebots are imminent when they’re decades away

The walking, talking, dancing Optimus robots at the recent Tesla demonstration generated huge excitement. But this turned to disappointment as it became apparent that much of what was happening was actually being controlled remotely by humans.

As much as this might still be a fascinating glimpse of the future, it’s not the first time that robots have turned out to be a little too good to be true.

Take Sophia, for instance, the robot created by Hong Kong-based Hanson Robotics back in 2016. She was presented by the company as essentially an intelligent being, prompting numerous tech specialists to point out that this was well beyond our capabilities at the time.

Similarly, we’ve seen carefully choreographed videos of pre-scripted action sequences like Boston Dynamics’ Atlas gymnastics, the English-made Ameca robot “waking up”, and most recently Tesla’s Optimus in the factory. Obviously these are still impressive in different ways, but they’re nowhere near the complete sentient package. Let Optimus or Atlas loose in a random home and you’d see something very different.

Another major problem is what we can call social AI. Leading generative AI programs such as Google DeepMind’s Gemini and OpenAI’s GPT-4 Vision may be a foundation for creative autonomous AI systems for humanoid robots in the future. But we should not be misled into believing that such models mean a robot is now capable of functioning well in the real world.

Interpreting information and problem solving like a human requires much more than just recognising words, classifying objects and generating speech. It requires a deeper contextual understanding of people, objects and environments – in other words, common sense.

To explore what is currently possible, we recently completed a research project called Common Sense Enhanced Language and Vision (CiViL). We equipped a robot called Euclid with commonsense knowledge as part of a generative AI vision and language system to assist people in preparing recipes. To do this, we had to create commonsense knowledge databases using real-world problem-solving examples enacted by students.

Euclid could explain complicated steps in recipes, give suggestions when things went wrong, and even point people to locations in the kitchen where utensils and tools might typically be found. Yet there were still situations it could not handle, such as what to do if someone has a bad allergic reaction while cooking. The problem is that it’s almost impossible to anticipate every possible scenario, yet that’s what true common sense entails.

This fundamental aspect of AI has got somewhat lost in humanoid robots over the years. Generated speech, realistic facial expressions, telemetric controls, even the ability to play games such as “rock paper scissors” are all impressive. But the novelty soon wears off if the robots are not actually capable of doing anything useful on their own.