Abstract

I started building microcomputers when I was about 13 or 14. I built an Acorn Atom, which is similar to an early Apple computer. Acorn, which built that machine, became Acorn RISC Machine later on. I learned to code. I was writing assembly, and then it came time to go to university and I couldn't really decide what to do. I liked making things that moved, that worked, things that were a kind of reflection on life, things that were imitating living things. I decided to study arts instead of engineering because I felt that I would have more freedom within an arts course. Which was true, but then I probably didn't get all the technical support that I would have gotten from an engineering course.
Then, when I left university, I worked in special effects, film, and TV for a while, also making robots. And what you can do on screen of course is very different from reality. And I wanted to make things that were real, things that were close-up. I would say that if you interact with a robot for real, close-up and you talk to it, it moves around you. Imagine you've gone to the cinema, and you've watched Jaws, the movie. It's a scary film all about sharks, but when you get up close, if you were in a swimming pool with a shark or you were in the sea and there was a shark there, that is a very different thing.
I wanted that kind of experience. I wanted to make machines that really connected with people, that you really felt an impact from. That was what I was moving toward. It's always been there, this idea to make something that appears to be alive, and I think that is the fascination for me. Can we make a machine that appears to be alive?
It seemed like a good idea for automation, and it was popular, and it worked, but I can't say that we had a super solid business plan. The response was good, and it grew organically from there. Ameca is our latest generation of humanoid robot. Ameca was conceived as a platform for not just us, but for other people to use. Many of the first Amecas have gone to university laboratories, quite a few in Germany. There are some fantastic advances happening with large language models, computer vision, machine learning, but what we were looking at is how can you integrate all of these things? How can you put them together and make an experience that connects with people? It is quite a difficult challenge. Robotics is really an integration problem.
You have the actuation, the hardware, and then you have many different software modules that all have to work together. Debugging a complex system like that was just a huge headache. If you can't visualize the data and what's happening within a system, it is almost impossible to debug it. A lot of the company's focus has been on how to build the framework for AI, for robotics, and how do you do that integration and how can you rapidly iterate, try out different things. Of course, we've tried GPT-3.
There are many others: GPT-Neo, GPT-J, different computer vision libraries, MediaPipe, TensorFlow, and lots of different components and things that you want to try. How do you tie them all together? It is a communications problem. You need fast protocols. There is a lot of data moving around. You need to have low latency. Imagine a robot that is tracking objects. You need to do that pretty fast, otherwise you have got a lag in the eyes of the robot. And that's very, very noticeable. Also consider things like response time in conversation.
These are some of the problems that we spend our time solving, but it is always with the aim of building a better experience for the person interacting with the robot. It is always about the people.
It's a very, very tiny market. It's low volume, quite high value. I hope one day that we'll get to a wider volume, lower cost. You need to get unit cost down. If you think about where expensive complex robots are successful, think of factory automation. You might have a million-dollar robot, but it makes 10 million automobile parts, so the cost per part is insignificant. If we are going to make a robot that is going to interact with people, it needs to interact with tens of thousands of people to justify the cost of it. We have always focused on public venues, science museums, theme parks, visitor attractions, that kind of thing. Places where you have a high volume of people, then the cost of a robot is quite easy to justify, and you can make a business model around that.
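As a rough worked example of that arithmetic, the sketch below spreads a one-off robot cost over the number of parts or people it serves. The factory figures are the ones quoted above; the venue price and visitor count are hypothetical placeholders, chosen only to show the shape of the calculation.

```python
def cost_per_unit(machine_cost: float, units: int) -> float:
    """Spread a one-off machine cost evenly over the units it serves."""
    if units <= 0:
        raise ValueError("units must be positive")
    return machine_cost / units


# Factory example from the text: a $1M robot making 10M parts.
factory = cost_per_unit(1_000_000, 10_000_000)

# Hypothetical venue figures (assumptions, not quoted prices or attendance).
ASSUMED_ROBOT_COST = 200_000
ASSUMED_VISITORS = 50_000
venue = cost_per_unit(ASSUMED_ROBOT_COST, ASSUMED_VISITORS)

print(f"factory: ${factory:.2f} per part")
print(f"venue:   ${venue:.2f} per visitor")
```

With these assumed numbers the factory robot works out at $0.10 per part and the venue robot at $4.00 per visitor. The only point is that the per-person cost falls in direct proportion to footfall.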
The connection you see between the people and the machines is just astounding. And why are we so captivated by that? That's the real question. And it has nothing to do with utility, so do not fall into this, I call it, the utility trap, “This robot's going to assemble pipes in my car factory or whatever.” These things are not significant. What is significant is how we feel about the technology.
So as soon as you start putting high gear ratios into a robot, it is very very hard to make that robot behave in a human-like way or an animalistic way, a biological way, because that is not how we are. We operate with generally antagonistic pairs. There are two muscles working against each other. Now, some engineers would look at that and say, “Hey, I can use one motor instead of two muscles because my one motor goes forwards and backwards.” But it's not a position problem, it's a force and compliance problem. The great thing about having two muscles is they can both relax. If I relax my wrist, it just flops around. And if I wave my arm, I can make my wrist move around, but I'm not using the muscles that are directly coupled to my hand.
I'm just waving my arm and I'm following that dynamic motion. And if you look at efficiency in walking gait or grasping objects, it's all about this ability to be very highly compliant, to have this relaxed floppy state, and you are not doing it with a very highly geared drive chain. Now bring in pneumatics. What does pneumatics give us? It gives us a high power density in a small space. You'll note some early Boston Dynamics robots were pneumatic. Mostly, I think Atlas is pretty much hydraulic. Spot is BLDC, brushless DC. We've seen big advances in the power density that you can achieve with a brushless motor. There's an idea, I think it's credited to MIT, the kind of quasi-direct drive. We are not quite at the point where you can direct drive with a brushless motor, but if you put in a very low gear reduction, quasi, you can pretty much get there.
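A small numeric sketch of the trade-off behind a low gear reduction, assuming an ideal lossless gearbox: output torque grows linearly with the reduction ratio, while the rotor inertia felt at the output grows with its square, which is part of what makes highly geared joints hard to back-drive. The motor torque and rotor inertia below are made-up placeholders, not data for any real motor.

```python
def output_torque(motor_torque_nm: float, ratio: float) -> float:
    """Torque at the joint for an ideal (lossless) gearbox."""
    return motor_torque_nm * ratio


def reflected_inertia(rotor_inertia_kgm2: float, ratio: float) -> float:
    """Rotor inertia as felt at the joint: scales with the ratio squared."""
    return rotor_inertia_kgm2 * ratio ** 2


# Hypothetical motor parameters, for illustration only.
MOTOR_TORQUE_NM = 2.0
ROTOR_INERTIA_KGM2 = 1e-4

for ratio in (1, 5, 100):
    torque = output_torque(MOTOR_TORQUE_NM, ratio)
    inertia = reflected_inertia(ROTOR_INERTIA_KGM2, ratio)
    print(f"{ratio:>3}:1  torque {torque:6.1f} Nm  reflected inertia {inertia:.4f} kg m^2")
```

Going from 5:1 to 100:1 in this idealized model gives 20 times the torque but 400 times the reflected inertia. Real gearboxes add friction on top of that.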
If you have four to one, five to one, something like that, you can get enough torque without sacrificing too much of the drive transparency, which is a very, very desirable characteristic. A lot of the quadruped robots that you see use this quasi-direct drive principle. What if you could reduce the load much further? We have an industrial robot arm called Poiser that we have some patents on and haven't talked about very much. It's parallel pneumatic and brushless DC: the pneumatic component takes about 95% of the load and the brushless DC component takes the other 5%, but also provides the precision in the drive. It's quite analogous to skeletal muscle. We have a fast twitch and a slow twitch muscle. I think there are more than seven kinds of human muscle, but those are the two main skeletal types.
One is really good at sustained high loads, the other is very good at fast response. Now, you need to couple these two properties together. There isn't one actuator that solves everything. With hydraulics, you can get a lot of the properties you need. And I can see why Boston Dynamics is using hydraulics for Atlas, because it really does solve a lot of the problems, but it also creates a heap of headaches as well. Leaking oil, for example, is not an easy one to deal with. There's also no inherent energy storage in the system. With pneumatics, you effectively have an air spring, so you can have some energy storage in the system. With hydraulics, that's much harder. You could have parallel springs, but then how do you allow the hydraulics to work in such a way that you can be transparent enough to store the energy in the springs as well? Generally, hydraulic robots are inefficient, and they run very hot. I've never stood next to an Atlas robot, but I've heard it's a hot experience if you do, because they dissipate many hundreds, even thousands, of watts.
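Going back to the parallel pneumatic and brushless DC arrangement, here is a toy sketch of how a slow, strong stage and a fast, precise stage might share a load. It is not how Poiser is actually controlled: the first-order lag, the `alpha` coefficient and the "motor covers the residual" rule are all assumptions made for illustration. Only the 95% share comes from the text.

```python
PNEUMATIC_SHARE = 0.95  # share of the load quoted in the text


def step(demand: float, pneumatic_now: float, alpha: float = 0.2):
    """One control tick of a hypothetical parallel actuator.

    The pneumatic stage creeps toward its share of the demand
    (assumed first-order lag); the electric stage supplies whatever
    is still missing, so the two always sum to the demand.
    """
    target = PNEUMATIC_SHARE * demand
    pneumatic_next = pneumatic_now + alpha * (target - pneumatic_now)
    electric = demand - pneumatic_next
    return pneumatic_next, electric


pneumatic = 0.0
for tick in range(40):
    pneumatic, electric = step(100.0, pneumatic)

print(f"pneumatic {pneumatic:.1f}, electric {electric:.1f}")
```

In this model the electric stage carries most of the load for the first few ticks and then settles to roughly the remaining 5%, which is one way to picture fast twitch and slow twitch working together.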
And then picking up on behaviors of the eyelids. Look at yourself in the mirror, look up and down and you'll see that your upper eyelids track your iris almost perfectly. Now, when you blink, the blinking behavior overrides that. We look at these kinds of human features, the way the eyelids move in relation to eyeballs, the rapid eye movements, the way we focus on different objects and the way we move between details. We try to emulate that.
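The eyelid behavior described here can be sketched as a simple priority rule: the upper lid follows the gaze pitch unless a blink is in progress, in which case the blink wins. The neutral opening and gain values are invented for illustration, and the function name and the normalized 0 to 1 opening scale are assumptions, not part of any real control software.

```python
from typing import Optional

NEUTRAL_OPENING = 0.6  # assumed lid opening when looking straight ahead
GAIN_PER_DEG = 0.02    # assumed change in opening per degree of gaze pitch


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def upper_lid_opening(gaze_pitch_deg: float,
                      blink_opening: Optional[float] = None) -> float:
    """Upper-lid opening in [0, 1] (0 = closed, 1 = wide open).

    gaze_pitch_deg: positive when the eye looks up.
    blink_opening: None when no blink is active, otherwise the opening
    commanded by the blink trajectory.
    """
    if blink_opening is not None:
        # A blink in progress overrides gaze tracking.
        return clamp(blink_opening, 0.0, 1.0)
    return clamp(NEUTRAL_OPENING + GAIN_PER_DEG * gaze_pitch_deg, 0.0, 1.0)


print(upper_lid_opening(10.0))                     # lid rises with upward gaze
print(upper_lid_opening(10.0, blink_opening=0.0))  # blink closes it regardless
```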
More recently we've been working on lips, lip sync. Lip sync is a big challenge because there's so much flexibility. It's hard to apply engineering to something as squidgy and soft as a human mouth, something so difficult to define. We started out by making a lot of scans, so we have a photogrammetry rig here, with which in a thousandth of a second we can capture a human face.
We then look at the different shapes, the extremes that a given human face can make. We take marker points on the skin surface and track what the motion would be. We then try to emulate that motion with a mechanism under the skin. Generally, we're using small brushed DC motors. The reason for not using brushless is just the amount of wires, to be honest. It's harder to integrate. Some people will push back hard against using brushed DC motors. If you use the very high-quality ones, they are absolutely fine.
Those are the kinds of things that we are working on. Again, it comes back to parallel actuation, so we're looking at hybrid, pneumatic, and electric. We are going to have some really good stuff to show. I've already seen some stuff in the lab that's super, super encouraging, so I think it's the way to go.
From the mechanical point of view, we are looking for smooth biological motion, so the motion curves are important. The compliance of the robot is important, how back-drivable it is. We try not to get too lost in technical detail that doesn't have much impact. The big challenge on the brain side, on the cognition, the conversation, is integration. There are many great components and of course we use a lot of open-source code. We use some proprietary, paid-for components, but the real challenge is how you put all of that together. We have a component called Viz, which is a 3D visualization, something like RViz if you've used Robot Operating System, or similar to Gazebo maybe.
That allows us to see not only the robot's body pose, but also any conversational input. What the robot's heard, what the robot's estimation of the people around it is, how many people are there, where are they, what are their facial expressions, what did they say last, what is their attention to the robot. You've got a very large number of parameters and also a large number of contributing software modules. You might have a computer vision module, you might have automated speech recognition (ASR), you might have, say, also LiDAR (light detection and ranging) for location estimation, and all of these are running as separate modules. How do you aggregate all of that? How do you do the sensor fusion? That's a big problem and the visualization is key to that. You need to be able to just look at a 3D scene and see basically what the robot thinks is going on. You do not have a hope of debugging something like this on the command line, watching a bunch of strings of text whiz past on a screen.
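To make the aggregation problem concrete, here is a minimal sketch of separate modules writing partial observations into one shared picture of the people around the robot. Every name in it (`WorldState`, `PersonEstimate`, the field names) is hypothetical, and it assumes the modules already agree on a person ID, which in practice is a hard part of the problem. It is plain aggregation, not real sensor fusion: nothing here weighs conflicting estimates against each other.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class PersonEstimate:
    position: Optional[Tuple[float, float, float]] = None  # e.g. vision or LiDAR
    expression: Optional[str] = None                       # e.g. vision
    last_utterance: Optional[str] = None                   # e.g. ASR
    updated_at: float = 0.0


@dataclass
class WorldState:
    people: Dict[str, PersonEstimate] = field(default_factory=dict)

    def update(self, person_id: str, timestamp: float, **fields) -> None:
        """Merge a partial observation from one module into the shared state."""
        person = self.people.setdefault(person_id, PersonEstimate())
        for name, value in fields.items():
            if not hasattr(person, name):
                raise AttributeError(f"unknown field: {name}")
            setattr(person, name, value)
        person.updated_at = max(person.updated_at, timestamp)


world = WorldState()
world.update("person-1", 1.0, position=(1.2, 0.4, 0.0))  # from a vision module
world.update("person-1", 1.3, last_utterance="hello")    # from an ASR module
world.update("person-2", 1.4, expression="smiling")
print(world.people)
```

A 3D view would then render `world.people` rather than each module's raw output, so that one scene shows what the robot currently believes.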
The same is true for visualization of the actuation of the robot. We have a kind of virtual oscilloscope application called probe, which allows us to monitor any particular signal on the robot. And that can be anything from hardware to software, so it could be the current in a motor, the position of a motor, or it could be an event fired by a facial recognition program, or it could be the location of a face within an image. Whether it's a piece of software running remotely or whether it's a piece of hardware on the robot itself, we can visualize all of that data and that makes it much easier to debug.
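The idea of one tool that can watch any signal, hardware or software, can be sketched as a named, time-stamped trace buffer. This is an illustrative toy, not the actual probe application: the class, its methods and the signal names are all made up.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple

Sample = Tuple[float, Any]  # (timestamp in seconds, value)


class SignalProbe:
    """Keeps a bounded history of samples for each named signal."""

    def __init__(self, history: int = 1000) -> None:
        self._history = history
        self._traces: Dict[str, Deque[Sample]] = {}

    def record(self, name: str, timestamp: float, value: Any) -> None:
        trace = self._traces.setdefault(name, deque(maxlen=self._history))
        trace.append((timestamp, value))

    def trace(self, name: str, since: float = float("-inf")) -> List[Sample]:
        """Samples for one signal at or after `since`, oldest first."""
        return [(t, v) for t, v in self._traces.get(name, ()) if t >= since]


probe = SignalProbe(history=3)
# Hardware and software signals go through the same interface.
probe.record("neck.motor.current_a", 0.00, 0.41)
probe.record("neck.motor.current_a", 0.01, 0.44)
probe.record("vision.face_detected", 0.01, {"x": 312, "y": 140})
print(probe.trace("neck.motor.current_a"))
print(probe.trace("vision.face_detected"))
```

Because motor currents and recognition events share one timeline, a plot of both can show, for example, whether a motor reacted before or after a face was detected.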
There is the hydraulically amplified self-healing electrostatic (HASEL) actuator (I think it's called the dielectric membrane type), and there is electroactive polymer. There is shape memory alloy, SMA. There are a lot of different ideas, a lot of technologies that people have tried. Nothing comes close. One of the big limitations on most artificial muscles is contraction ratio. The contraction ratio of human muscle, the difference between its relaxed state and its tense or contracted state, can be up to 75%, more typically around 50%. Now, if you look at something like a fluidic muscle or McKibben-type actuator, you are lucky to get over 25% and really the theoretical maximums are around 30%. You might think this is not a problem, you can just change the leverage of the attachment points, but having spent a decade working on that, I can tell you it is a big problem.
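A short calculation shows why contraction ratio is so limiting and why changing the leverage does not make the problem go away. The sketch treats contraction ratio as stroke divided by relaxed length and models the joint as a simple pulley; the joint range, moment arms and torque are hypothetical numbers picked for illustration.

```python
import math


def stroke_for_joint(moment_arm_mm: float, joint_range_deg: float) -> float:
    """Tendon travel needed to sweep a joint, using a simple pulley model."""
    return moment_arm_mm * math.radians(joint_range_deg)


def required_resting_length(stroke_mm: float, contraction_ratio: float) -> float:
    """Relaxed actuator length needed to deliver a given stroke."""
    return stroke_mm / contraction_ratio


def force_for_torque(torque_nmm: float, moment_arm_mm: float) -> float:
    """Pulling force needed to produce a joint torque at a given moment arm."""
    return torque_nmm / moment_arm_mm


JOINT_RANGE_DEG = 90.0   # assumed joint range
TORQUE_NMM = 2000.0      # assumed joint torque (2 Nm)

for arm_mm in (20.0, 10.0):  # assumed attachment distances from the joint
    stroke = stroke_for_joint(arm_mm, JOINT_RANGE_DEG)
    print(f"moment arm {arm_mm:.0f} mm: stroke {stroke:.1f} mm, "
          f"force {force_for_torque(TORQUE_NMM, arm_mm):.0f} N, "
          f"length at 50% {required_resting_length(stroke, 0.50):.0f} mm, "
          f"length at 25% {required_resting_length(stroke, 0.25):.0f} mm")
```

At 25% contraction the actuator has to be twice as long as one that manages 50% for the same stroke. Halving the moment arm halves the stroke it needs, but doubles the force it must produce for the same torque.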
The other problem is attachment, the way that you can distort the muscles. For pneumatic McKibben-type muscles, they cannot be twisted, they cannot be misaligned. Now, it is typical in the human body that our muscles do twist, and they attach in a very fluid and organic way and this doesn't cause a problem. But with the artificial muscles we have, we don't have good end attachments. And the way that they fuse to the skeletal structure underneath is clunky. It's difficult to emulate a biological system, so I wish we did have it. Many people have come to us and said can we do it, but no amount of money can do it. There is a fundamental breakthrough that needs to happen and it's not here yet.
We'll talk about the energy part first. Ameca, the non-mobile versions that we're making at present, are powered from a mains electrical outlet, so autonomy is not a consideration. Now, obviously as soon as we are adding mobility, autonomy is a big consideration. And this is another reason that we're very interested in parallel actuation and energy storage. This is why I prefer the hybrid pneumatic electrical approach, because the pneumatics are actually an excellent way of storing energy. A hydraulic robot is poor at storing energy. With a motor-driven robot, theoretically you can back-drive the motors to generate some power and store it, but, in reality, that's actually pretty inefficient and difficult to do. Storing energy just in a spring is a pretty good way to go.
At the moment, if you meet Ameca for a second time, it's got no recollection of the first time you met. How can we build the memory? How can we make it more human-like in that way? Those are the goals. It's about making a machine that feels intuitive and easy to use. It's as easy as talking to a person. And make that experience as fun as possible. That's what we're aiming toward.
