My mates and I pushed each other through many tough subjects and built strong foundations for future challenges, including wrapping our minds around the subject Analogue and Digital Control. Ah, the fascinating realms of analogue and digital, the building blocks of our real and technological worlds. Our entire world is made up of greyscale – figuratively speaking, nothing is truly black or white. We live in a universe with an entirely continuous, analogue, infinite make-up. Let's take live music, for example. When we listen to these soundwaves brought into harmony, they are just that – waves. They can be smooth, move around, up and down, and they flow. Analogue technologies find ways to represent the infinite spectra found in real-world stuff – like audio, light, time – through continuously variable measurement and transmission of signals. The grooves on a record encode those audio waves from music as physical changes to a material, so when playback occurs, the needle continuously reads these grooves.
Digital – the building block of computing – on the other hand, is discrete (individual, detached, not continuous) and usually refers to binary digits, made up of ones and zeros. This means it can be represented by electricity on a binary system, which is either on (represented as a 1) or off (represented as a 0). Digitised music, like the mp3 files you might listen to day to day, is converted from continuous analogue audio signals, through a process called sampling using an analogue-to-digital converter, into discrete digital representations of ones and zeros that can be stored on your device. This digital version is only an approximation of the analogue signal, never quite as perfect as the original. When you want to listen to a song, your device converts the digital signals back into music through a digital-to-analogue converter, which reconstructs them, sets the mini earphone speakers vibrating and allows the signals to, once again, return to being continuous analogue waves.
This matters because it's at the interface between technology and the real world that analogue and digital signals are constantly converted back and forth. One way of picturing the process: draw a wave across a piece of paper on a table, then take ten dominoes, all lying face down next to each other with their long sides touching, and arrange them so the top-left corner of each touches the wave you drew, staggering the pieces but keeping their long sides in contact with the adjacent pieces. Hold all the pieces in place and slide them down the paper (or slide the paper upwards) so the paper is completely clean underneath the dominoes. Now draw dots at each of the top-left domino corners, remove the dominoes, and trace a waveform using those dots as your guide. This reconstruction might not end up perfectly the same as the original wave, but it should be close enough. It's a simple analogy for continuous analogue signals (the smooth-drawn wave) being sampled and converted into discrete signals (the blocky dominoes), transmitted (the dominoes moving with respect to the paper) and reconstructed into the original analogue signal (tracing the dots to re-create the original wave).
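If you'd like to see the domino trick in code, here is a minimal sketch. The 2 Hz wave, the 20-samples-a-second rate and the join-the-dots straight-line reconstruction are all my own picks for illustration, not how any particular converter actually does it:

```python
import math

# A 'real-world' analogue signal: a smooth 2 Hz wave (illustrative only).
def analogue_wave(t):
    return math.sin(2 * math.pi * 2 * t)

sample_rate = 20          # samples per second (our 'dominoes')
duration = 1.0            # seconds of signal to capture

# Sampling: read the continuous wave only at discrete instants (the ADC step).
samples = [analogue_wave(n / sample_rate)
           for n in range(int(duration * sample_rate) + 1)]

# Reconstruction: join the dots back into a continuous-looking wave (the DAC step).
# Here we simply draw a straight line between neighbouring samples.
def reconstruct(t):
    position = t * sample_rate
    left = int(position)
    right = min(left + 1, len(samples) - 1)
    fraction = position - left
    return samples[left] + fraction * (samples[right] - samples[left])

# Compare the original and the reconstruction at a moment that was never sampled.
t = 0.33
print(f"original: {analogue_wave(t):+.3f}  reconstructed: {reconstruct(t):+.3f}")
```

A real converter samples far faster and reconstructs far more cleverly, but the shape of the process is the same: continuous in, dots in the middle, continuous back out.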
The infinite nature of the real world can be challenging to represent as ones and zeros, but advanced technology has managed this beautifully. Every piece of digital technology we own operates on a binary counting system (base 2 – consisting of 0 and 1). The decimal system we use every day has ten digits (base 10 – consisting of 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9), and when we run out we add another digit to the front to continue counting up – forming 10, then eventually 100 and so on – cycling through all ten digits again in each position to form every combination. In the binary system, however, we run out and add a new digit after only two values instead of ten, as the full range here is just 0 and 1 (0, 1, 10, 11, 100, 101, 110, 111, 1000, etc.). A single bit of data can carry just one of these fundamental blocks, meaning it can contain a 0 or a 1, giving two unique values that this building block can provide. A group of eight bits is known as a byte of data, which can form 256 unique values (the number of combinations is 2 to the power of the number of bits, so 2⁸ = 256). A thousand bytes (well, roughly, for argument's sake) is a kilobyte (KB). A million bytes is roughly a megabyte (MB). A billion bytes is a gigabyte (GB). A trillion bytes is a terabyte (TB).
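If you want to watch those numbers fall out for yourself, here's a tiny sketch; the ranges are arbitrary, just enough to show the pattern:

```python
# Counting up in decimal and in binary, side by side.
for n in range(9):
    print(n, format(n, 'b'))          # 0 0, 1 1, 2 10, 3 11, 4 100 ... 8 1000

# How many unique values a group of bits can hold: 2 to the power of the number of bits.
for bits in (1, 8):
    print(bits, "bit(s) ->", 2 ** bits, "unique values")
# 1 bit  -> 2 unique values
# 8 bits -> 256 unique values (one byte)
```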
Historically, a byte of data was the number of bits you needed to encode a single character of text on a computer. In other words, if you press the letter a on your keyboard, a byte of data, or eight bits (again, each bit being a 1 or a 0), will be sent as a serial stream of data (in a row, marching one after the other) to the computer. A lookup table, known as the American Standard Code for Information Interchange (ASCII), tells us how every character should be represented in binary; the a being sent to the computer streams in as 01100001. This means that, creatively, we can turn practically anything into a stream of ones and zeros.
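You can check that lookup yourself in a couple of lines; a quick sketch, nothing more:

```python
# The ASCII code for 'a' is 97 in decimal, or 01100001 as a full byte of eight bits.
character = 'a'
code = ord(character)                 # 97
bits = format(code, '08b')            # '01100001'
print(character, code, bits)

# And back the other way: eight bits in, one character out.
print(chr(int('01100001', 2)))        # a
```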
Even the entire picture on your colour television is dictated by large streams of binary data, as is the image projected in a movie cinema. A 4K UHD (ultra-high definition) screen has roughly 4K or 4000 pixels from left to right. Well, close enough – in actual fact it usually has 3840 pixels across the horizontal and 2160 pixels down the vertical, making up a total of 8,294,400 pixels or 8.3 megapixels. One of the ways each of these pixels can be encoded in binary is through red, green, blue (RGB) colour channel combinations for each pixel on the screen, and these can be represented by eight bits per colour channel (eight bits red, eight bits green, eight bits blue). So say the top-left pixel of the screen is completely red. That means red intensity is all the way up, or all ones, green intensity is all the way down, or all zeros, and blue intensity is also all the way down, or all zeros. The (R, G, B) values for this pixel in binary are (R = 11111111, G = 00000000, B = 00000000). So this stream of ones and zeros gives us the colour of just one of the 8,294,400 pixels on the screen, and that's for just one frame of the sometimes 60 or even 120 frames per second that are shown to our eyes. A single pixel on its own changing colour many times a second doesn't look like much, but thousands or millions of pixels tightly packed give us the illusion of a picture, and when they're controlled in unison to flow quickly from picture to picture, we're presented with the illusion of moving pictures – video. These display technologies found in TVs, monitors and smartphones are so commonplace that we often take them for granted, but they are astounding in their capabilities. They are the result of generation upon generation of advancements in mathematics, physics, electronics and technological evolution.
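For the number-curious, here's a rough sketch of that pixel arithmetic. The uncompressed-frame figure at the end is my own back-of-the-envelope addition, not something from any display spec:

```python
# Pixel count of a 4K UHD panel.
width, height = 3840, 2160
pixels = width * height
print(pixels)                         # 8294400, about 8.3 megapixels

# A pure red pixel, eight bits per colour channel.
red, green, blue = 255, 0, 0
pixel_bits = ' '.join(format(channel, '08b') for channel in (red, green, blue))
print(pixel_bits)                     # 11111111 00000000 00000000

# Raw, uncompressed data for a single frame at 24 bits (3 bytes) per pixel, in megabytes.
print(pixels * 3 / 1_000_000)         # roughly 24.9 MB for every single frame
```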
If you think about it, a TV doesn't have any cognitive ability; it doesn't know it's creating a picture, or for that matter many pictures per second, with its streams of ones and zeros. It's just a big system that converts streams of encoded electricity into light and sound. What puts together the meaning behind it all is us. We perceive this collation of tightly packed pixels as a picture, and when the pictures change fast enough we see smooth movement. We can become engrossed in it, absorbed, immersed. This is because the human mind is so imaginative and our species has over time developed some great ways of hacking our senses.
The way the human mind applies meaning to various levels of technology is also important for the rise of robotics. There's an everyday human habit that will always mean we meet robotics halfway along the interaction spectrum. But it's not just limited to robotics. Have you ever found yourself seeing faces in cars or naming your plants? I've definitely named most plants in my house – there's Groot, Baby Groot, Devil, Basil, No-name Friend, Treebeard . . . Okay, you get the idea. We also see human emotions in animals, and many languages apply a gender to everyday objects. Why is it that we have this innate tendency to personify the things around us and attribute human characteristics to non-human entities? Well this, my wonderfully witty friend, is called anthropomorphism.
There's a famous experiment where a teacher stands in front of a class of students. They take a pencil, hold it up, state the fact that they are holding a pencil, then break it. Strange, but no student really reacts. The teacher then picks up another pencil of the same type and says this one is a special pencil. His name is Jonathan and he's different from the other pencils because he enjoys being used to write on the page, particularly for colourful drawings of wildlife. He feels proud getting to work with students to achieve creative things. A pause, while the students imagine these scenarios and begin to empathise with Jonathan, is suddenly cut short as the teacher snaps the pencil, to the students' horror. Some instantly feel anger, shock and sadness at the loss of Jonathan, because it's as if he was just killed. But really, 'he' was no different from the previous pencil, except that the imaginations of the class had been evoked to anthropomorphise this one. It was now a him rather than an it.
Other studies have found that children mentally project a kind of soul onto some of the most precious possessions they grow up with, particularly things like bears, dolls, toy cars or trucks. If that toy is lost and replaced with the exact same type of toy, the child likely won’t have the same attachment to it. This is because in the eyes of the child, the toy is not the same. Their toy was special because they experienced all those many events together, and this new one did not.
This is a very important trait to recognise during the rise of social robotics in particular. The robot doesn't actually have to have a complete personality, self-awareness or cognition before we start to find ourselves connecting with it in some way. These traits come across through a vast array of features that we usually don't even consciously register before we personify them – the name, the look, the movements, the voice, the apparent attention it gives us, the decisions and actions it chooses to take. So many traits, arising directly from the design and programming, will give a robot a personality in our minds.
After finding through experience how readily people anthropomorphise the interactivity of computers and robots, I later discovered that this phenomenon has been given a name, the ELIZA effect, which describes our unconscious tendency to assume that the behaviour of computers is analogous to that of humans. And the concept has been around for a long time. ELIZA was a chatbot developed in 1966 by MIT computer scientist Joseph Weizenbaum. ELIZA's 'DOCTOR' script was found to have a particular ability to elicit emotional responses from 'patients' by emulating Rogerian psychotherapy, a type of person-centred therapy. It mostly rephrases a patient's responses into new questions or allows the person to elaborate and continue speaking. Over the course of these discussions, patients would often attribute character and personality to ELIZA. I tried typing with an online implementation to get a feel for how it worked:
ELIZA: How do you do. Please tell me your problem.
Me: I want to explain what it is you do.
ELIZA: Suppose you got to explain what it is I do soon.
Me: That would be great and I'd put it straight in the book.
ELIZA: That's quite interesting.
Me: Yeah, it is. Such an interesting thing when computers seem to exhibit human qualities.
ELIZA: You don't think I am a computer program, do you?
Me: Oh, much more than that. I think we're friends now.
ELIZA: Do you say you think we're friends now for some special reason?
Me: You're not?
ELIZA: Does it please you to believe I am not?
Me: Nope. I want us to be friends.
ELIZA: What would getting us to be friends mean to you?
Me: A lot, it would be amazing. We'll chat about all the things.
ELIZA: What does that suggest to you?
Me: That we're friends now. Thank you ELIZA!
These conversations can easily make a lot less sense than this one and can quickly show the cracks in ELIZA's programming. By the standards of what computers can do today, it's a very simplistic system, but at the time patients would often feel that ELIZA had understanding and empathy as they worked through their troubles.
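To give a feel for how simple that kind of trick can be, here's a toy sketch in the spirit of ELIZA's DOCTOR script. It is not Weizenbaum's actual code, just a few hypothetical reflection rules of my own:

```python
import itertools
import re

# A few toy rules in the spirit of ELIZA: match a pattern, reflect it back as a question.
RULES = [
    (re.compile(r'\bI want (.+)', re.IGNORECASE), "What would getting {0} mean to you?"),
    (re.compile(r'\bI think (.+)', re.IGNORECASE), "Do you say you think {0} for some special reason?"),
    (re.compile(r'\bI am (.+)', re.IGNORECASE),    "How long have you been {0}?"),
]
# Generic prompts, cycled through when nothing matches, so the 'patient' keeps talking.
FALLBACKS = itertools.cycle(["That's quite interesting.",
                             "Please go on.",
                             "What does that suggest to you?"])

def respond(text):
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1).rstrip('.!?'))
    return next(FALLBACKS)

print(respond("I want us to be friends."))   # What would getting us to be friends mean to you?
print(respond("Nice weather today."))        # That's quite interesting.
```

Three rules and a handful of stock prompts already produce something that feels strangely conversational, which is exactly the point.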
Such effects have also occurred with robots that do not even simulate conversation. There have even been recorded cases of soldiers, such as those in the US Army and Navy, developing attachment to their robot partners. These machines help them out in dangerous tasks like explosives inspections and are often human-controlled. They sometimes look like chunky metal boxes with tank wheels, and basically just process ones and zeros to capture information about the environment and move a bunch of motors. Needless to say, they are incapable (currently) of loving the soldiers back. But this hasn't stopped their operators naming them after family members, friends, pets or celebrities, or giving them individual names of their own, and building bonds with their anthropomorphised friends. Soldiers would talk to their robot buddies, take care of them, make them medals and grieve their loss when they were disabled or blown up. They'd sometimes hold mock funerals for their destroyed metal comrades. All this even though these particular robots were never designed with social interaction in mind. There's no doubt that we are complex and incurably social creatures.
Could any good come from building emotional attachment to robots? What problems could be caused as a result? I can see the feeling of companionship being a positive one, similar to the way we build these feelings towards pets. As these technologies progress, we will be able to get closer to a human-level interaction with these machines, and hence a human-level feeling of connection. I believe they will have big roles to play in tackling loneliness, providing company and assistance in, for example, aged or palliative care. They should not replace pets or human companionship, or absolve us of our social responsibilities, but where those are not possible, or where there is a big void, robots could help fill the gap.
Thinking back to Dad’s chess-playing robotic arm RTX, I had always applied a personality to him. Firstly, it was a he in my mind. He would learn through practice. He seemed to like playing board games. He was even cheeky when he’d taunt me with the dumping of my pieces. Again, I never felt like I was simply playing a machine. I felt I had a gaming companion and opponent. I could be the only human in the room, yet I didn’t feel like I was alone. This is significant as we move into the future of robotics, particularly social robotics, which will aim to handle day-to-day human interaction.
I really started to discover the power of anthropomorphism while designing my first robot at uni. After moving through many intermediary subjects, I finally made it to Advanced Robotics in 2006 and it was time to build my first robot. It would perceive its environment through a lidar sensor. Lidar (a combination of the words light and radar) can beam laser light around and detect the returning pulses of light once they have bounced off objects. The time it takes for the light to return allows the system to work out the distance to surrounding objects and build up a 3D image. Because light travels so fast, lidar can build up a representation of an entire environment very quickly, especially when the sensors can spin around and detect up and down, creating a 3D map as they go.
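The distance maths behind each pulse is simple enough to sketch, assuming an idealised sensor and ignoring the messy realities of noise and stray reflections:

```python
SPEED_OF_LIGHT = 299_792_458.0          # metres per second

def distance_from_echo(round_trip_seconds):
    """Distance to an object, given how long a lidar pulse took to go out and bounce back."""
    # The pulse covers the distance twice (out and back), so halve the round trip.
    return SPEED_OF_LIGHT * round_trip_seconds / 2

# A pulse that returns after about 33 nanoseconds hit something roughly 5 metres away.
print(distance_from_echo(33e-9))        # ~4.95 metres
```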
I had a small team of new mates – Marc, Michael and Jason – to work with on this, and we decided to build a tour guide robot for the university. On top of a round, red, battery-driven, three-wheeled buggy base (the power and doing bits), our robot consisted of a lidar system (the sensing bits) called a SICK Laser Rangefinder (which looked like a drip coffee machine my mum used to own), speakers (the communicating bits), a laptop computer (forming the brainy bits) and a basic body and head made out of spare metal, wood, acrylic, printer parts and other odd bits and pieces we could find. The curious thing we found was that, by cutting holes to make a face in the red acrylic, then simply placing a couple of disconnected webcams as the eyes and an arc of LEDs (light-emitting diodes – the tiny lights found in most electronic devices, more doing bits) as a mouth, our robot suddenly looked like it had a smiley, friendly personality. The persona this robot was taking on was female, so I eventually named her SANDRA. I don't know why we like acronyms in robotics, but that's just what seemed to be done at the time – even soon after this project, in the 2008 animated movie, WALL-E stood for Waste Allocation Load Lifter (Earth-class) and his love interest EVE was an Extra-terrestrial Vegetation Evaluator. So SANDRA stood for Students' Autonomous Navigating and Directing Robot Assistant. I spent a whole day coming up with that name after looking through many baby-name books!
The fact that I worked on this just a few years after seeing A.I. Artificial Intelligence, the Steven Spielberg film about a social android, was not purely coincidental: I was fascinated with the idea of a social robot, and the quest for coding interactive human connection into a machine, so I’d always wanted to build one. The goal of the project was to base SANDRA at the main UTS lobby, from where she would roll up to visitors and ask if they would like a tour of the university. She would later be designed to hold a touchscreen with all the various event locations for UTS Open Days, and any other main locations she could assist visitors to find within the main floor of UTS Buildings 1 and 2. If the visitor selected a landmark they’d like to be taken to, SANDRA would ask them to follow and guide them there, providing fun facts on the features of UTS and making comments on some of the sites along the way – you know, to avoid the awkward silence with the not-too-speedy robot taking you on a tour. When visitors arrived at their destination, SANDRA would ask if they’d like to be taken anywhere else. If not she would bid them farewell before travelling back to her post in the lobby.
She had quite a large personality. In the first versions, before a body and head were made, her little heavy tank body holding a lidar and laptop would often drive straight through my legs when testing her obstacle-detection capabilities. Following optimisation of this (and no longer having my legs attacked), it was time to give her a body and head. We built her at 1.8 metres tall, not far off my own height of a bit over 1.9 metres. As a result, she towered over some of the shorter students around the campus. I soon realised, as she giraffed her way around, that her centre of gravity was far too high, causing her to wobble and display a questionable ability to stop in time for obstacles – particularly people walking across her path of travel.
In a reaction-speed test I ran in from the side and jumped in front of her while she was travelling forward. Her base suddenly hit the brakes and her upper body rocked forward, almost knocking me out with a headbutt. That was a close one! None of us would have been amazingly popular if this robot went around randomly clocking out people around the uni. The simple solution was to bring her height down a bit and lower her centre of gravity. But a strange thing instantly occurred. Now that she stood at 1.2 metres, her personality seemed completely different. We only brought down her height to improve her stability, but it also altered her movements. SANDRA was now a different robot. It was fascinating to realise how much every little detail can affect a perceived personality. She no longer tried to headbutt people and her stability improved the smoothness of her navigation, so she seemingly became more confident and less aggressive.
SANDRA was definitely one of those projects you leap into when you don’t know how much you don’t know. Through the struggles I learnt a number of lessons that would eventually help me create a thought-controlled smart wheelchair. The first of these lessons was how to optimise the commands from the computer to her wheels and steering. It sounds mundane compared to AI and thought-control, but if you can’t control the robot then you’re not helping anyone.
The real work in automating SANDRA was providing a baseline dataset so that she always knew where she was, a process known as localisation. That's a human quality too: we're constantly working out not only where we are now in relation to where we came from and where we're going, but also where we are in relation to our environment, to features or landmarks, and to moving things such as other people and vehicles. There's a problem when you try to do this in a robot. For a start, localisation is a complex mix of information processing, synthesis and decision-making. Localisation is also key to our own ability to walk and run on two feet and keep our balance. Every split second we're making allowances for where we are now compared with where we were a split second ago. The fact that we humans have the capacity to practise until we can become a Michael Jordan or a Roger Federer doesn't mean our brains aren't making billions of decisions every second to pull it off. Our brains are just very good at learning, adapting and optimising.
Another issue with localisation: the robot doesn’t have eyes or ears – well, not eyes and ears of the power and sophistication of a mammal. When we’re kids, we seldom know where we are unless we’re near known landmarks. When we move out of our small known zone and are driven far away from it, we’re always asking the adults, ‘Are we there yet?’ As we mature we develop incredible databases of landmarks, features and moving objects that help us develop our localisation and our sense of direction. It really comes down to Where are we in relation to the things and places we know?
Robots need this too. SANDRA operates in a known environment, meaning she already has a floor layout map. In this case, localisation means Where am I on my map and which direction am I facing? – so she uses her lidar to obtain a sort of puzzle piece, matches that up against her map and then figures out, I must be in this location facing this direction. When SANDRA moves along her route, she has to be able to see the significant, non-changing landmarks of her environment and understand her position in relation to them. Her particular model of laser rangefinder sends out a beam that flashes off a spinning internal mirror and measures the time it takes for the laser light to go out and return. Her laser scans in a 180 degree arc, taking a glimpse at each 1 degree point, at a rate of 30 frames per second, so a full 180 degree scan occurs 30 times for every second that passes. To save on processing power, the system always assumes that it's very likely, since the last check a fraction of a second ago, that she hasn't moved far. She ain't no teleportation robot!
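Here's a hedged sketch of the first half of that puzzle-piece step: turning one 180 degree sweep of range readings into points in the map frame, given a guess at her pose. The scan values and pose are invented, and the real system did far more than this:

```python
import math

def scan_to_map_points(ranges, pose_x, pose_y, pose_heading):
    """Turn a 180 degree lidar scan (one range reading per degree) into (x, y) points
    in the map frame, given an estimated robot pose. Purely illustrative."""
    points = []
    for degrees, distance in enumerate(ranges):
        # Beam angle relative to the robot (-90 to +90 degrees around 'straight ahead'),
        # then rotated by the robot's own heading in the map.
        angle = pose_heading + math.radians(degrees - 90)
        points.append((pose_x + distance * math.cos(angle),
                       pose_y + distance * math.sin(angle)))
    return points

# A made-up scan: everything 3 metres away, robot at (1.0, 2.0) facing along the x axis.
fake_scan = [3.0] * 181
landmarks = scan_to_map_points(fake_scan, pose_x=1.0, pose_y=2.0, pose_heading=0.0)
print(landmarks[90])    # the straight-ahead beam lands at roughly (4.0, 2.0)
```

Matching those points against the stored floor map, while trusting that she hasn't moved far since the last scan, is the second half of the puzzle and the part that really chews through processing power.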
Now that SANDRA knows where she is, how does she know where she’s going? This is called path planning. When a person selects where they would like to go, or alternatively when SANDRA knows where she needs to move to next, the system places a destination on the map and calculates the quickest pathway from where she’s located to that destination. Just in case there are any inaccuracies or minor faults, I make sure on her map that there are a few no-go zones that her path planning algorithm can’t plan a route through or near. These are mostly around escalators and stairs, for a few obvious reasons – I don’t want her tripping anyone up near escalators and I don’t particularly want to see her tumble down stairs. I mean, it would be devastating . . . pretty funny I’m sure (like Steve the K5 robot), but no, mostly devastating. So the adult thing to do is to pre-empt these potential problems and avoid them in the design.
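A toy version of that planning step might look like the sketch below: a small hand-drawn grid where 1 marks a wall or a no-go zone such as the escalators, searched breadth-first. It's not the algorithm SANDRA ran, just the simplest way to show the idea of planning around forbidden cells:

```python
from collections import deque

def plan_path(grid, start, goal):
    """Shortest path across a grid using breadth-first search.
    Cells marked 1 (walls and no-go zones) are never planned through."""
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    if goal not in came_from:
        return None                      # destination unreachable
    path, cell = [], goal
    while cell is not None:              # walk back from the goal to the start
        path.append(cell)
        cell = came_from[cell]
    return list(reversed(path))

# 0 = free floor, 1 = wall or no-go zone (say, the escalators in the middle).
lobby = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
print(plan_path(lobby, start=(0, 0), goal=(4, 4)))
```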
Finally, path following means she can actually take the guided tour by following the imaginary path that has been planned out in her programmed map of the environment. The following is easy enough but the obstacle avoidance quickly becomes, itself, a bit of an obstacle. We now have a robot that can move, see its environment, localise itself in that environment, adjust for obstacles and plot in real time the best path to a destination. There’s another important feature we need: she has to interact, at least a little, like a ‘human’ because we’ve built her as a social robot and many of her obstacles would be humans, whether stationary or moving. This is where I get a little cheeky in my programming, to take some of the work out of this already mammoth project we have to complete in a short time frame. Instead of designing a more sophisticated obstacle-avoidance system so that SANDRA can independently navigate her way around people and obstacles, I simply get her to ask anyone standing in her way to move, through her robot voice playing on her speakers. We don’t want any collisions because she’s so heavy she’d bowl a person in her way right over. If they don’t move, the LED arc – which makes her appear to smile – inverts and she looks angry, then she tells them to move. My thoughts are, if an angry robot tells you to get out of the way, what would you likely do?
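That blocked-path behaviour boils down to a very small state machine. Here's a loose sketch, not the original code, with stand-in functions for her voice and LED face:

```python
import time

# States for the blocked-path behaviour (a much-simplified sketch, not SANDRA's real program).
TOURING, ASKING, WARNING = "touring", "asking", "warning"

def handle_blocked_path(path_is_clear, say, set_face):
    """Escalate from a polite request to an angry order while the path stays blocked.
    `path_is_clear` is a callable that returns True once the way ahead is free."""
    state = ASKING
    while True:
        if path_is_clear():
            set_face("smile")
            return TOURING                  # back to giving the tour
        if state == ASKING:
            set_face("smile")
            say("Please move out of the way so my tour can pass through.")
            state = WARNING
        else:                               # WARNING state
            set_face("angry")               # invert the LED arc
            say("Get out of the way.")
        time.sleep(1)                       # give them a moment before checking again

# A dry run with stand-in functions: the 'person' moves after the first request.
blocked_then_clear = iter([False, True])
result = handle_blocked_path(lambda: next(blocked_then_clear),
                             say=print,
                             set_face=lambda face: print(f"[face: {face}]"))
print(result)                               # touring
```

Note the quiet assumption baked into this little loop: it only ever exits once the path clears. Hold that thought.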
Well, we're pretty happy with this and it all seems to be working. SANDRA is projecting a personality of confidence and warmth, she really knows her way around, she is fun and chatty, and in principle she can be left completely on her own to handle crowds with her limited, yet effective, social interaction. In practice, all tests of this version still require supervision.
But one morning a few days on, we find that we haven’t taken into account the unpredictability of people. We’re testing SANDRA in the lobby of UTS and she can’t find a way through a group of students since her path is blocked by a nineteen-year-old student. She says, ‘Please move out of the way so my tour can pass through,’ at which he just laughs.
Hmm.
I’m sitting on the stairs a small distance away, watching all of this. I’m intrigued. This was the first time in live trials that someone hadn’t moved when SANDRA asked them to. Her request foiled, the program progressed to an order, so she said with her angry face, ‘Get out of the way.’
I have a good laugh then quickly realise said student is still not moving. He takes a half-step back, hesitates, and then with a grin of arrogance he calls over his mates, ‘Oi guys, check this out!’
I find myself feeling a little frustrated for her and imagining I’ve built in a water spray or spring-loaded boxing glove. Meanwhile, I’m thinking back to my programming trying to figure out what SANDRA will do next, as this isn’t a situation I have ever planned for. I just can’t picture what the code is meant to do right now. I think she’ll either stay paused in this state or she might tell him to move again.
Nope. My programming, facilitated by what's called a state machine, has a minor defect: it always assumes the person will move so she can transition out of her 'path blocked' state and back into her 'tour guide' state. This means that SANDRA's program assumes she, all 25 kilograms of her, will be on her merry way again with a clear path of travel. The code crunches all these decisions in a fraction of a second. One moment I see her being taunted by this fool and his mates . . . the next her beautiful LED arc smile literally lights up her face once again, and suddenly . . .
BAM!
. . . That’ll do, robot. That’ll do.