Speaker 1: 00:00 [inaudible] Speaker 2: 00:07 Every time I come over to your apartment, I see this weird, like, electronic stick with wheels on it. What the heck is that? "Electronic stick with wheels." Ooh, that's what I should have named it. You're a wonder with words. That's Balance Bot, James. Balance Bot. It has a name. It reminds me of Forky from the new Toy Story 4 movie trailer, if you've seen that. Oh no. Forky. So he's a fork-type toy in Toy Story 4? I don't know how to spell that. Forky: F-O-R-K-Y. Y? Why not? It's a spork. Wow. And it's the child's new favorite toy, and it's like running away. And that's what it kind of reminds me of. It looks a little hideous. Do kids like this, or are kids afraid of this? I'm scared of it, because whenever I come over to your house, I get scared by this... Speaker 2: 01:03 ...monstrosity of cables and cords. Like, what is this thing, Balance Bot? Okay, well, it's just a robot, James. It's just a robot that I work on from time to time, because we all need a break from our day jobs. That's my hobby. I like to do electrical engineering as my hobby. But the truth is, it's also just practice, and it's fun. It's a game. And I like to see if I can build something big. I'm not interested in little robots running around on the floor that you can step on. I'm interested in big things, human-sized. And so I'm always trying to make big stuff that can destroy your house. Yeah. Cause you say you're not interested in the "look at this little tiny robot" that, you know, can drive around, a little monkey can sit on top, or maybe it can suction up stuff, you know, from the floor. Speaker 2: 01:57 That doesn't interest you at all? No. Well, my floor's too messy for those little suction bots anyway. No, but if you could imagine what my goal here is: imagine, like, a Segway but without someone on it. So it's just got two wheels at the bottom and a big stick and a little head. And I guess, if I'm ever trying to give it a justification for existence, you know, what's that great Rick and Morty part where he creates a little robot and it passes the butter, and it says, "What's my purpose in life?" and he says, "Your purpose is to pass the butter." Well, what is its purpose in life, then? It has no purpose. The purpose is for me to learn how to build a robot. But James, I've made up purposes. It could be a camera bot. Speaker 2: 02:44 It could follow you around and help you record your podcasts and your video casts. Not so much your podcasts; your video casts, your web blog, and your philosophy and all the things that I know you want to do. I know that you want to be walking around your apartment while you record your streams. I know it. It definitely seems like an ideal use case. Now, that being said, I don't quite understand how you plan to do that with this super long... okay, people, I want to paint you a picture. Individuals, imagine a three-foot-tall stick. It's like a tube. Okay. And then one... dowel. Dowel, thank you, that's the word. It's a dowel. It's a dowel robot. Now, on the bottom are the two smallest wheels you've ever seen in your entire life. Like, they're tiny. We can get into the problems of sourcing your parts from Amazon. And then, yes, on top there is a huge... it's the size of my head, essentially...
Speaker 2: 03:49 A huge, like, I don't know, battery and electronics with cords coming off of it, and it has a smiley face on it, and it's supposed to balance. Frank, how is that supposed to balance and move? I don't understand, because there's only two wheels on it. I have a bicycle, but the two wheels go the other way, which makes sense. These ones don't make any sense. I'm confused. There's a technical term for that, I can't remember it, when the wheels are in front of and behind each other. Anyway. Well, James, you know that my favorite transportation device is the Onewheel. It's literally got one wheel. So you could say, how does that balance? And it's the same answer: the magic of control algorithms. These are digital controllers that are making micro adjustments to the motor, to the torque, to keep itself upright. Speaker 2: 04:43 In the exact same way that a unicyclist can stay upright on a unicycle after they've practiced for months (I'm sure, cause it's not easy), with just the right amount of torque, you only need one wheel. I put on two just to have lateral, horizontal stability. I didn't want to work in two dimensions. I just wanted a one-dimensional robot that could go forwards and backwards; steering comes later. So then it's going to... yeah, something like when you're on a unicycle, your body and the torque of the wheel, like the cranks, are the ones that are balancing, or have control. And then when you're on a Onewheel (I've tried to get on your Onewheel and I've basically died), you're balancing. It's like a skateboard, too, which has four wheels, but you still need to balance and lean left and right to do the turn. Speaker 2: 05:35 So I imagine on a bicycle I'm doing the same thing, correct? I am, yeah, exerting a certain force to either go forward or go back, and to stay upright in general I have to keep a certain amount of... if I stop moving, then I will fall over. Right. In general, there's actually a really cool term for this. This is a general problem in controls. It's very well studied, and it's called the inverted pendulum problem. So if you imagine what a pendulum is: you have a ball hanging off of a rope, and it swings back and forth. Fine. Okay. Now imagine you make that rope strong, rigid, like a wooden dowel. And now you put the ball above your hand and you try to balance the ball using just the bottom of the wooden dowel. It's a much more fun game. Did you ever balance a pole on your hand as a kid? Speaker 3: 06:29 Oh, always. Yeah. You would take it and see how long it would stay upright for, and you would move around and try to jiggle all over the place, and then it would fall, and then you'd go do it all over again, and that's a fun afternoon. Speaker 2: 06:42 It is. It really is. I loved doing that as a kid, and you would find bigger and bigger poles, like, what can I balance, to the point where you can barely lift it. And it's actually the same problem that rockets have when they're trying to leave the Earth. You can imagine that a rocket is a big heavy thing, and below it is this thruster that can kind of wiggle its bottom around a little bit, and it's solving the inverted pendulum problem. There's some mass, but the thing that you control is below that mass, in a gravitational well, with gravity. We're doing this on Earth, everyone. Yeah, we're in it. So it's the same problem, sort of, over and over and over again then. Yeah.
And that's why it has this wonderful name, the inverted pendulum problem: because it applies to so many different things. Speaker 2: 07:27 Now here's the real trip. Ready for this one? I'm ready. This is a fundamental step to making a walking robot, because it turns out the way we walk is more like you put your foot down on the ground and then you pivot over it. You put your next foot down, you pivot over it. It's a constant switching of falling. Walking and running is actually falling: you do a thrust and a fall, a thrust and a fall. Guess what, James? That's the inverted pendulum problem. So we are practicing that on a daily basis. Humans are awesome at the inverted pendulum problem. That's why we find robots so ridiculous when they fall over, because they can't just stand up. Can't you solve the inverted pendulum problem? Speaker 3: 08:14 The problem I have, Frank, is that when we take a robot and make it solve that problem, like Boston Dynamics or whatever, and they do it, it's the freakiest thing in the entire world. I don't understand what's going on. Speaker 2: 08:28 Uh, my mind is blown, because all of a sudden it looks natural, because that's how humans move. That's how animals move. Okay? Speaker 3: 08:34 Now, the uncanny valley thing, is that what's kicking in here, or...? Speaker 2: 08:39 No, no. You're out of the valley at this point. You're on the other side. This is the fear part. This is like, please God, no one put a gun on that machine, Speaker 3: 08:47 because then we're all... Terminator, RoboCop, classic eighties movies, basically, at this point. Speaker 2: 08:56 I really love the direction this episode has gone. You're welcome. You're welcome. Yes, yes, yes. So I'm very interested in the inverted pendulum problem because it applies to all of these things. And I eventually want to make walking robots too. And so I have to get really good at the inverted pendulum. You know, baby steps; you have to step before you can run, I guess. Got it. But the other thing (I'm sorry, I'm having fun nerding out and I have to continue) is a fun difference when we talk about how we're running here, where it's a constant motion of jumping and falling, jumping and falling. It's a thing called dynamic stability instead of static stability. So a lot of times when you see robots move these days, they're achieving static stability. They're stable, and then they move a foot and get stable again. They move a foot and get stable again, move a foot and get stable again. And that's one way to make something move. But a very much more fun way is that you keep it kind of always in motion. You have dynamic stability. It's never stable at one point. If you hit it, it would probably fall over or something like that, if it doesn't have feedback. But the idea is, instead of always being stable, we just keep it moving and it's dynamically stable. Speaker 3: 10:15 Yeah. So often I'll watch these videos, because as soon as a robot like Boston Dynamics is doing something, the Internet explodes with "oh my goodness." And I remember seeing the first ones: they sort of moved one foot, stopped, moved one foot, stopped. And then you started to see the bigger robots that were in constant motion. And I have to imagine that that is a far more complex problem to solve, because our brains are doing that constantly, nonstop.
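To make the "micro adjustments to the torque" idea concrete, here is a minimal sketch, in Python, of a wheeled inverted pendulum being kept upright by a simple feedback controller. This is not Balance Bot's actual code; the simplified physics and every constant (the gains, the rod length, the loop rate) are illustrative assumptions.

```python
# A minimal sketch of the inverted pendulum balancing problem discussed above.
# Not Balance Bot's real code: the physics are simplified and the constants
# (gains, length, loop rate) are made-up, illustrative values.
import math

G = 9.81            # gravity, m/s^2
LENGTH = 0.5        # axle-to-center-of-mass distance, m (assumed)
DT = 0.01           # control loop period, s (a 100 Hz loop, assumed)
KP, KD = 40.0, 4.0  # hand-tuned PD gains (illustrative)

theta = math.radians(5.0)   # tilt from vertical; start leaning 5 degrees
theta_dot = 0.0             # tilt rate, rad/s

for step in range(300):
    # Controller: wheel torque proportional to tilt and tilt rate, i.e. the
    # "micro adjustments" that keep the dowel upright. Torque is expressed
    # here as the angular acceleration it produces (torque / inertia).
    torque = -(KP * theta + KD * theta_dot)

    # Simplified physics: gravity tips the dowel over, the wheels push back.
    theta_ddot = (G / LENGTH) * math.sin(theta) + torque
    theta_dot += theta_ddot * DT
    theta += theta_dot * DT

    if step % 50 == 0:
        print(f"t={step * DT:4.2f}s  tilt={math.degrees(theta):7.3f} deg")
```

A hand-tuned controller like this is essentially the classical PID-style approach mentioned later in the episode; the reinforcement learning discussion further down is about replacing these fixed gains with learned behavior.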
And to attempt to do that while balancing probably hundreds of thousands of dollars of equipment and sensors and motors... like, it's weird that your brain just knows how to do it, and you just do it. You don't have to think about it. But we have to teach these robots how to do it. Speaker 2: 11:07 More importantly, you do learn to walk. So your body is set up to walk; our physiology is set up to walk. Like, there are triggers in your spinal cord that are gonna make your legs twitch. But you don't have a sense of balance when you're first born, or, most importantly, your muscles just aren't strong enough. You're just a chubby little baby and your legs can't really hold you up. And when you talk about advancements in robotics, it always comes down to that: what is our power source, and how powerful are our motors? So a device like the Onewheel wasn't possible until, one, we developed these new kinds of motors called synchronous DC motors (they're fancy, they're awesome, they use permanent magnets), and also until our battery technology got strong enough. So it's always an issue of power, too. We have control algorithms; we've known how to do digital controls for a very long time now. And in the case of Boston Dynamics, they have very, very, very complicated digital controls. But we know how to do it. We've known how to do it since at least the eighties. So what we've really lacked is powerful actuators and motors. In fact, the very first Boston Dynamics robots were pneumatic-based. They used compressed air as all the muscles and everything, because electric motors weren't strong enough at the time. Yeah. Speaker 3: 12:29 Oh yeah. Because, I think of the actuators... I remember, like, when you walk into IKEA and they're stress testing Speaker 2: 12:37 the... Speaker 3: 12:39 they're stress testing their couches, and they have these actuators applying all the torque down on them. Is that the same type of actuator that they were using? Speaker 2: 12:50 Similar, even simpler: just a piston, you know, like you would see on a crane. You have a cylinder, and a cylinder inside of it, and one pushes the other out. Yeah. Yep. So, I mean, the neat thing about that is it's very analogous to how a human muscle or an animal muscle works: you have a piston attached to two bones. But it's not as flexible, and there are other issues with it. But it works out well. It turns out you can copy human or animal biology quite well. In fact, Boston Dynamics has been quite an inspiration for me. They don't talk a lot about how their technology works. But the cool thing was, they started out at MIT, and back when they were at MIT, they did used to publish how things worked. They used to work in something called the Leg Lab, and in 1983 they published a book on how their legs worked. And you would think in '83 that it would be very primitive, but no. Speaker 2: 13:52 In '83 you can basically see the path from where they were to what they would eventually become: today's Boston Dynamics and the Atlas robot and all that. Here's the problem, James: the stupid book is $1,000 on Amazon. No one's got a copy of it. My library didn't have a copy of it. I actually had to drive up to Vancouver and go to the library at the University of British Columbia. Had to go to, like, the special books section and, you know, have people check it out.
I wasn't allowed to leave the room with it, but I sat my butt down and I read the entire book on how their stuff worked, and it inspired me to get into all this. Speaker 3: 14:33 Do you have a favorite Boston Dynamics robot, then? Speaker 2: 14:38 Not really. I think if I had a favorite, it would be their very first original one... nope, back when they were at MIT. It had, James, one leg. So it was a little bouncing robot. I saw this robot when I was a kid on TV and I just thought, what a ridiculous thing, how stupid. But once I read this book and what that robot represented and how it worked... that was the first robot that achieved this idea of an inverted pendulum with dynamic stability, that running is actually a more natural movement than slowly walking, things like that. This was a very inspirational robot. You can go on YouTube, look up the MIT Leg Lab bouncy single-leg robot, and you'll see it. It's terrible. But I think that's still my favorite, because it was like: here, we solved the problem, the rest is up to your imagination. And we're all still trying to catch up with them. Speaker 3: 15:41 Their website is super duper fun. I mean, they go back to the RHex at least, which is probably, like, 2010, I want to say. But it was definitely super cool to see all of the information, right? Like the original RHex, R-H-E-X, which was a six-legged robot with high mobility. It looks, yeah, 14 centimeters in height, 12 kilograms in weight. It's battery powered, its actuation is, like, six joints, and it has a teleoperation camera for perception. Kind of cool that they talk about it and give you some videos and information on it. Very open. But, you know, they have lots of money. Frank, how do I build an RHex robot? Like, how do I build a Balance Bot and all? How do I do this? Speaker 2: 16:30 And they love to show you those pictures, but they never, ever talk about their software and how it actually works. So yeah, there are two issues to building a robot, James. Let's say you want to do it. First, you have the hardware problem: you need a robot. And then you have a software problem: you need to write some code that makes the robot do interesting things. Boy, it's hard on both ends. I decided that I've got to start from scratch. I've got to pretend that I'm in the MIT Leg Lab and learn all my lessons the hard way, the only way I ever learn lessons, you know. Start building. But I guess if you had a lot of money you could always go buy a robot. They're out there. Speaker 3: 17:15 Expensive. I did buy a little robot that can go around and, you know, sweep up my floor, so that is a robot. It's $100. That's not bad. Speaker 2: 17:26 Ah, okay. But you can't reprogram that one, Speaker 3: 17:29 right? No, it just sweeps floors. It's, like, done. Speaker 2: 17:33 Boring. Not going to carry a camera around, not going to go fetch you a beer from the refrigerator. So, where did I start? Well, honestly, you start with the worst sketch in the world of a robot on a piece of paper. I learned that I draw like a two-year-old. I put a little smiley face on it, two big wheels, and I said, well, that's all the design. Fortunately, I have a 3D printer, so I 3D printed the chassis of the robot: the different parts that needed to hold the motors, and the parts that need to hold the electronics. Very slowly,
I have developed a skill using TinkerCAD online. It's... I get these companies confused, Autodesk or Adobe, one of them makes TinkerCAD. Those are companies. Speaker 2: 18:30 They are, and they both do graphics, so it's really hard to keep them straight. Either way: Tinkercad, Google it. It's kind of the best 3D software I've seen within a reasonable budget, that reasonable budget being $0. So I kinda love it. Like, free. Free is good. Yeah. Yeah. And it's a web UI, but you can deal with it. It's not too flaky. And the truth is, I do the iterative design approach. I did a version one of Balance Bot and it was just mistake after mistake after mistake. For example, I printed this very flat piece of plastic and just bolted the motors onto it. Well, that plastic bent. Okay. So then I tried with a wooden stick that was, like, a meter and a half tall, two meters tall, far too tall, and the stick kept bending and eventually broke the chassis. Speaker 2: 19:25 So basically I just keep building and learning my lessons and breaking things. I think you just have to accept that you're going to break things and work that into your budget. Don't buy expensive things; always buy things in, you know, quantities of ten, not two. Uh, yeah, it's hard. But there is a little bit of design, and unfortunately I do all my design on Amazon, because eventually you have to buy motors. It's that problem I was talking about earlier about sourcing actual good components, and it's hard. No one writes their spec sheets the same way. They all give the properties in different units, you know, different physical units. You have to do a bunch of conversions. You buy them, you realize that you can't use them at all because of this reason or that reason. It's a real mess. You have to have a little bit of a budget for it. Speaker 2: 20:18 But the good news is most of this stuff is rather cheap, so you can play around a bit. Oh, that's nice. That's good. Yeah. So you can pick up some parts and, ideally, like you said, buy in bulk, because you're probably going to mess up. That's kind of what I'm hearing. Constantly. I have broken whole Raspberry Pis. I've broken my wall. I've broken motors. I've broken wheels. You were making fun of my wheels, James. Let me explain my wheel situation here. Go ahead. Go ahead. Cause, I mean, they're cute. I like Speaker 3: 20:48 it. Speaker 2: 20:48 Yeah. Thank you. Thank you. The problem with buying all your parts on Amazon is that the hobbyist robotics community is more into small and cute, and less into industrial wall-breaking. And I am in the other camp. You know, I'm into some industrial stuff. Problem is, industrial stuff's very expensive. So I bought these wheels, and yes, they are tiny, but they at least had a metal axle and a metal wheel hub. I have learned over time that plastic does not work, especially when you take it outside, especially when you take it around. You know, I want a robot that can go through the forest, not one that is going to be stuck on perfectly level ground all the time. And so metal, you've got to buy metal, and Amazon is sorely lacking in sheet metal parts. Speaker 3: 21:36 Got it. Yeah, man, everything's going to be plastic, and metal is also heavy to ship. So that's also a Speaker 2: 21:42 problem. Yeah, I would assume. It's very annoying. I'm actually trained as a machinist.
I know how to operate a mill, and I would love to have a giant machine that I could grind all my parts out of, like, solid steel, or at least aluminum or something like that. But I'm not at that stage. I told myself I have to get good at plastic, and then I'll be allowed to have an awesome mill. Speaker 3: 22:05 So those are the parts, but then what powers them? Like... oh, okay, so you need a processor. There are two problems, like hardware, and then I assume some sort of Speaker 2: 22:16 like hardware-software combination. Correct. Now, yep, now we're into the software part. This is fun. You have to choose a microcontroller. James, what microcontrollers do you know? Name them all. Speaker 3: 22:30 There's the Raspberry Pi, right? Is that... so is that a microcontroller, would you consider, Speaker 2: 22:36 for this argument? Yes. Let's go with it. Some pedants might yell, but yes. Speaker 3: 22:41 There's, like, a Netduino. And an Arduino, a Raspberry Pi. Raspberry Pi's pretty popular. A pi of raspberry. Speaker 2: 22:52 Yep. You got it. That's about all I know. There's also another nice chip out there called the ESP8266, or the ESP32, and these are all good things for robotics. But James, am I going to be satisfied with just making a robot? No. I need to give it a neural network. Speaker 3: 23:12 Gosh, here we go. Speaker 2: 23:15 Machine learning. AI. Yeah. Speaker 3: 23:18 It would not be a Merge Conflict without a little Speaker 2: 23:20 machine learning. Yes, yes, yes. This is the exciting part for me, because technically I went to school and I learned how to control these things. I have a degree in digital control theory, so I should be able to take something as basic as an Arduino and make this thing balance. But I'm more interested these days in getting AIs to learn controls. And so not only do I want this to be a Balance Bot, but I want it to be a self-learning Balance Bot. Neat, huh? Speaker 3: 23:53 So as this robot moves around, it will... apparently, my thought here is that it will do some things correct and some things wrong, and if it has sensors by which it knows it is moving too far one way or the other way, it would learn that and then be able to restabilize itself until it's in a perfect, harmonious, never-fall Speaker 2: 24:21 [inaudible] Speaker 3: 24:21 percentile. Like, how does that work? Perfect. Yeah, you pretty much nailed it, man. Great. Speaker 2: 24:27 "Moves around and sometimes does something wrong"... I would like to rephrase that into "it's constantly falling over and never does anything right." Yeah. Okay. Well, cup half full, cup half empty sort of scenario. Gotcha. But I've done a lot of catching of this robot falling over, so I do not want to give it any credit at all for anything. Got it. Okay. Okay. Okay. So this technique is called reinforcement learning. And it's all the rage in, I don't know, my world. I don't know what world that is. For the Franks, the world of Franks. And what it is, it's actually just an architecture, James. Here we go, architectures. It's how to consider building a robot. And it's very simple. You have your robot itself. The robot is allowed to take a set of actions on the world. In my case, it's move the motors forward, move the motors backwards. And then it's allowed to experience the world, like you said, through the sensors. And in my case, I have basically the same sensors we've had on the iPhone forever: an accelerometer and a gyro.
And they tell me mostly how the robot is moving, but not well; we can talk about that later. And then here's the trick, though. So I mentioned those things: there's a robot that acts on the world, and there's a world that you can sense. And then the third thing is you get a reward. So you have to reward the robot for when it's doing a good job. And that is... Speaker 3: 26:00 Isn't this the... what's the one with the dog, where you can train... like, they train it based on, like, treats or whatever? That's the same, right? You have a dog, and it does something good, and you give it a reward when it does it, and it wants to get the reward in response. Pavlovian. Yep. But then, did you say you want to give it the rewards all the time, or not all the time? Speaker 2: 26:33 Oh, these are really big open questions in this field of reinforcement learning. And the neat thing is, this is a broadly applicable thing. It's not just robots; even dogs, like you said, obey it. So what you're talking about is continuous rewards versus sparse rewards, or, to rephrase it even differently, continuous rewards versus maybe even sparse punishments. So when my robot is active, maybe I'm not ever giving it a reward, but if it falls over, I give it a little negative reward, a little punishment. It's a bit sad to think of it that way, I guess. Don't anthropomorphize. Yeah. But, to show you how general these things are, reinforcement learning is used to beat video games right now. StarCraft in particular, and Dota. I don't know what Dota is. Do you know what Dota is? Speaker 3: 27:25 Dota... yes, it is a... I'm not an... mm, what is it? Speaker 2: 27:32 It's one of those, like, action-packed, kind of StarCraft II... Speaker 3: 27:36 It's a MOBA. It's a multiplayer online battle arena. MOBA. It is based on... sorry, Warcraft III, and it's a community-created mod on top of it. And I'm not a Dota player. I was never really a Warcraft III player. I have modded it back in my day. But I'm more of a Fortnite-type player, so... Speaker 2: 28:00 Well, here's what's really cool. This reinforcement learning concept has been applied to video games without cheating. It's just allowed to look at the screen, and it's given the reward. And guess what the reward is in a video game? It's your score. So, just given your score and the screen (the screen is the sensor, the reward is your score), you just let this thing learn. And right now it beats all top players. If you look into Google's DeepMind (I don't know, all their projects are Deep-something), you can find it: reinforcement learning, Google. And they are just crushing it. They're beating any game they care to beat, pretty much, at this point, with this simple algorithm: act on the world, sense the world, get a reward, and iterate. Reinforcement learning. Speaker 3: 28:52 Yeah. They have been doing this with chess and Go and a bunch of classic tabletop games, which is always mind-blowing when that happens. Speaker 2: 29:03 Yeah. Those aren't necessarily reinforcement learning. But the video game ones, if you zoom into those, it's mostly StarCraft and Dota, but they beat every Atari game and they were like, well, that was dumb. So now they're moving on to contemporary, modern games and trying to win those. It's pretty fantastic.
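The act, sense, reward loop described above can be summarized in a short sketch. All of the names here (Robot, the policy interface, the method signatures) are hypothetical stand-ins rather than the actual Balance Bot code, and the reward is left as a placeholder since the conversation goes on to discuss several candidates.

```python
# A rough sketch of the reinforcement learning loop described above:
# act on the world, sense the world, get a reward, iterate.
# Every class and method name here is a hypothetical stand-in.

class Robot:
    """Hardware interface: drive the motors, read the accelerometer and gyro."""
    def apply_action(self, action): ...   # e.g. a motor command in -1.0 .. 1.0
    def read_sensors(self): ...           # e.g. (tilt angle, tilt rate)
    def has_fallen(self): ...

def compute_reward(state, action, fell):
    # Placeholder reward: a sparse punishment for falling over.
    # The discussion that follows walks through better choices.
    return -1.0 if fell else 0.0

def run_episode(robot, policy, max_steps=1000):
    state = robot.read_sensors()
    for _ in range(max_steps):
        action = policy.choose_action(state)               # act on the world
        robot.apply_action(action)
        next_state = robot.read_sensors()                  # sense the world
        fell = robot.has_fallen()
        reward = compute_reward(next_state, action, fell)  # get a reward
        policy.learn(state, action, reward, next_state)    # iterate / update
        if fell:
            break                                          # go pick the robot up again
        state = next_state
```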
Speaker 3: 29:25 Well, so how do you then teach this balance? Bought the, this reinforcement learning because you can't really give it a treat. I mean, what is the treat for the robot? Like what's the reward? Speaker 2: 29:38 This James is probably the hardest part of all this. Um, I'd have gone back and yeah, I know. Like I w the reward is hard, so I can tell you rewards I've tried and the problems with them. Ready for this? I'm ready. Okay. So let's say I tell it, I want you to stand upright for, you know, that's my definition. Balance. Stand upright and don't budge, you know, be upright. That's, and as much as you're upright, that's how much I'm going to reward you. So if you're perfectly upright, I'll give you a treat. If you're 50% upright, I'll give you 50% of a tree. Okay guys, quickly that, yeah, so that made a robot that could kind of stand up. But here's the issue. The moment it starts driving forward. Okay. Imagine this weird tall robot on two wheels driving forward in your head. Okay. Speaker 2: 30:34 And now it wants to change directions. Well, the way to stably do that is you actually throw your body over a little bit. Think about how you would do it if you're running forward and then you needed to go the in the exact opposite direction. You need to do a crazy pivot on your own legs. You need to throw your weight over in a way that's not going to break your knees or your ankles or anything like that. So the robot actually needs to be able to get into these very strong angles from time to time. Uh, imagine if someone punches their robot, very meanly punches my robot. It needs to be able to recover from that. Yes. But if the report is always just always tried to stand up straight, then it's never going to practice these dynamic maneuvers and it becomes a very rigid and fragile robot. So it stands up, but it's fragile and it never learns how to control the dynamic environment very well. Speaker 3: 31:29 Yeah. The moment that something happens out of its element. It's some, it's sort of, if you have is like if I always stay, it's that experience where for instance, I see it all the time in Arizona when I lived there, no, no shade on Arizona cause it's sunny all the time. But the first time as an Arizona, uh, I was driving around and you know it's, it's super sunny like 300 some odd days a year and people are really good drivers there. They are great at merging. They drive super duper fast and just really like this great grid system. But then the moment than an outside element like liquid falls from the sky, all everything goes to hell. It is just a, it is, it is like what people are pulling off to the right. People forget everything and that sort of a real world sort of, we know that that thing exists and we know the impact of it, but we don't know how to cope with it. So like the pushing, if you have never been pushed over, you don't know how to react to the pushover. That's, so those are like my, correct me if I'm wrong, but that's how I'm bridging the, our worlds together with the world. Speaker 2: 32:40 Yeah, absolutely. And you can even think of that as you just haven't experienced falling over very much. So of course he can't deal with it. You've doubt your majority of your life experience has been standing perfectly upright. So it's just, it's just you don't have the experience. And this is all about experience and learning. It's important. Okay. So that reward wasn't great. It kind of worked, but it wasn't great. Can you just, well, I'm not going to quiz. 
Ready for my next one? I'm ready. I'm ready. Yeah. So I'm like, let's look to the animals, James. How do animals balance? The way we pretty much do 99% of our activities, even our physical activities in this world, is we're trying to minimize energy use. Conserve energy, everything. The way we stick out our leg, you know, anything, is usually just because we're trying to reduce the amount of energy we have to use. Speaker 2: 33:35 We are the lazy. We are the lazy, but all animals are. And it's a guiding principle of a lot of things. So I decided, you know what, let's forget this balance thing. I'm just gonna tell you: minimize the amount that you use the motors. Crazy, huh? Minimize the amount in which you use the motors. And that's not quite enough, so then if you fall over, I punish you. No simple way of saying it. But yeah. So anytime you use the motors, you get a light punishment, and anytime you fall over, you get a big punishment. It's not very much a reward anymore; it's a negative reward. But you can think of it as a reward if you invert all those sentences, I guess. This works pretty well, I hate to say it. It's a good lesson to learn, to just minimize the motors. Speaker 2: 34:34 Because a lot of times, if I was to do this with a more classical controller, like a PID controller, you get a lot of jitter. Like, the motors are going back and forth, back and forth, back and forth, trying to hold that upright position. But if I tell it to minimize the energy, then it's like, ooh, I don't really want to move unless I absolutely have to. And that's where the problem comes in: it does get a little bit lazy and a little bit laggy. But it's still nice, because it dampens down the system a little bit. I like that. And that's almost part of its body, to some extent. The reward versus not... like me, it's almost like, do I want to stand, or do I want to sit, or lean? Even when I'm standing, am I leaning on something, am I shifting my body weight back and forth, or am I just on one leg? Speaker 2: 35:25 We lean on one leg because we're really good at the inverted pendulum problem. You can just balance on one leg. It's fine. We don't need both. Hmm. Interesting. Hmm. But this one had its own problems, and they're a little hard to get into. Basically, again, it wasn't acting like a good dynamic robot. So the truth is, you blend all of these together and you give them different weights. And then there's just one more that I want to throw out at you, because you nailed it when you first talked about this: you do give it a reward, and you give it a reward for experiencing new things. So every time it's like, oh, I've never been in this state, or I've never had that happen to me, no one's ever hit me with a hockey stick before, you actually reward it. And this makes a more curious robot, and one that actually learns quicker, because it's just trying to seek out new experiences. Cool, huh?
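Pulling together the reward terms just walked through (uprightness, a light punishment for using the motors, a big punishment for falling over, and a bonus for new experiences), a blended reward might look roughly like the sketch below. The weights and the novelty measure are invented for illustration; they are not the actual values used on Balance Bot.

```python
# A sketch of blending the reward terms discussed above into one signal.
# The weights and the 0..1 novelty score are illustrative assumptions.
import math

W_UPRIGHT = 1.0    # reward for staying near vertical
W_ENERGY  = 0.1    # light punishment for using the motors (stay lazy)
W_FALL    = 10.0   # big punishment for falling over (sparse)
W_CURIOUS = 0.5    # bonus for states the robot hasn't visited before

def blended_reward(tilt_rad, motor_command, fell, novelty):
    """tilt_rad: angle from vertical; motor_command: -1..1; novelty: 0..1."""
    reward = W_UPRIGHT * math.cos(tilt_rad)   # 1.0 when perfectly upright
    reward -= W_ENERGY * abs(motor_command)   # minimize energy use
    if fell:
        reward -= W_FALL                      # sparse punishment for falling
    reward += W_CURIOUS * novelty             # reward new experiences
    return reward
```

The curiosity term is what the TV-in-the-maze story that follows is about: weight novelty too heavily and the agent starts optimizing for novelty instead of the task.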
That curiosity is actually really important to learning for these robots. They've created a maze and they did this with videos. So this is a camera controlled robot. My robot doesn't have a camera. It will someday, but it doesn't right now I don't have enough computing power, but they put, um, and this is all virtual, but they put the robot in amaze and said, you know, figure out the maze, figure out how to get around, figure things out. Does a glorious job because it's curious. It's looking around every corner doing this and that and it develops a very fluid way of moving. It's a great robot, except they put a TV on one of the walls that just showed a new picture every frame. And that by definition was how they were. That was their definition of curiosity. Experience something different every chance you can. And so the stupid robot would sit there and watch TV instead of following the maze because the TV was more interesting than the world around it. If you want your real world corollary, James, there it is. We thrive on curiosity and new experiences and we glue ourselves to TVs for that. Speaker 3: 37:58 So, and that is what robots have taught us. So really when people are scared about the row robot takeover I got to do is turn on the television and Bam, problem solved for about at nine. I'm going to move. Speaker 2: 38:10 Just wow. I thought you were going to talk about the human. The human would be pacified hot so you don't have to worry. Here's some new experiences. Speaker 3: 38:17 Wow. Mean some would argue that I do watch too much television already am it could be like it's the Netflix Syndrome that there's so much choice and so many options that everything is new all the time that you can't not do it. And then also it's the net, like the autoplay is like worst thing ever. Because what autoplay does is it gives you, I would, I would consider this reward system, what we would contribute to like a dopamine hit. Yeah, right. Sure. 100% direct corollary. Yup. And, and you get that dopamine hit often when you experiences new things like Ooh, this is really core. You get excited about something and that's what Netflix tricks us to do. It's the same thing in video games, especially mobile games where they're always popping up banners and rewarding with little bits or coins all the time. You get that dopamine hit, you're being rewarded. Even if it's, if it's ancillary and the Netflix autoplay is the same exact thing, which is, oh it's going to start in 10 seconds and like they show you a quick preview of the next thing. Like you're getting a preview in between thing. There's, there's a dopamine hit cause you just successfully finished something. Then you get a quick preview of something and then you actually get to see the next thing and there's a countdown. So there's anticipation like they've gamified and dopamine defied the entire process of watching television in that way. Speaker 2: 39:40 This is turning into my favorite episode by the way, that you see there's a new dark mirror out. New experiences. Speaker 3: 39:47 I can now I did the, did you do the Bandersnatch dark mirror? I wanted, I did. I did. How did you feel about it? Speaker 2: 39:56 Well, I watched it with my mom, so it was a little awkward. So I think it was okay. The choose your own adventure. Par. I wasn't too down with because I really felt like I was locked into a very narrow path and every time I tried to diverged they would just reset my game. So I, I ended up like dying multiple times. Yeah. 
And so I was like, this is getting a little boring. And so finally I was just picking the option that I knew they wanted me to pick, so I could get through it and get the whole story. It was all right. I'd give it a B-plus for effort. Yes. Yeah. Speaker 3: 40:27 Cool idea. I wanted to experience the story, and I wish there was a way to just go through it and say, all right, give me the story as you want it, even though that's not what they want, that's not the concept. But towards the end I started to give it the answers that I thought it wanted, and then it wasn't... oh, you were thrown off. I was. I think I might've been down some rabbit hole, and then I looked it up and there were hundreds of... like, not hundreds, maybe like a hundred different possibilities, things to go through. So whatever you saw might've been completely different from me. And then there are some really tough decisions too. If you haven't played it... I mean, I'm a huge Black Mirror fan. Heather cannot watch it. She hates it. She enjoyed the first few, and then... well, not that far. I didn't like the very first episode; that one was the worst. Yeah. We didn't watch that one; I made sure to skip it when I introduced the show to her. But then there were some that just got really strange, because it was new. I needed to sort of watch them first before I introduced them to her, and she was like, I don't like the show anymore, I can't do it. So Speaker 2: 41:33 I get it. But I love it; I think it's very interesting. I can't wait for the new episodes. Yeah, the last couple of seasons were good. So we'll get that. See, we got distracted by TV; that's how powerful it is. All right, so does this thing work? I want to know, like, the end result. We're not going there yet, James. We're not going there yet. Okay. We can go there, I know we're going to, but I just want to talk about training real quick. Yeah, hit me. Because I did give myself a hard problem here with this reinforcement learning. It doesn't do the experience once or twice and learn from it. It needs to do it 10,000 times before it learns. Yeah, these algorithms are good, but they're not that good. And lessons I've learned over this: different neural networks have different learning rates, all sorts of things like that. Speaker 2: 42:21 But I should mention, my biggest pet peeve is I don't want this thing falling over 10,000 times. It's tiring. It's literally exhausting to have to keep picking it up. And so I tried to do training on the computer via simulation. So I wrote a simulator for the robot, a physical simulator, and tried to make it as accurate as possible. And that was such a rabbit hole, but still kind of fun, because then you get, like, a virtual representation of your robot. Oh, that's cool. Huh. Now the problem, though, is that all simulations are lies. They're all terrible. So I would do all this training on the simulation, get it all tuned up, be like, yep, that robot can balance, it can do these cool tactical maneuvers, it can throw its body over when I ask it to. Perfect. Uh-huh. And then I install it on the physical robot and I say, go, little guy, have fun in your world. Speaker 2: 43:17 And it jitters, full throttles, and flings itself into the wall. And I'm like, good job exploring. Exploration. Yeah. So the truth is (and this led me down a very long rabbit hole) I read so many different techniques on how to write physical simulators. I've had to go back to my old college textbooks.
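The train-in-simulation idea described above (let the policy fall over thousands of times virtually instead of on the living room floor) might look roughly like the sketch below. The toy physics reuse the constants from the earlier sketch, and the simulator class, reward, and policy interface are hypothetical stand-ins, not the actual simulator.

```python
# A toy sketch of pre-training in simulation so the real robot doesn't have
# to fall over 10,000 times. Everything here is an illustrative stand-in.
import math
import random

G, LENGTH, DT = 9.81, 0.5, 0.01   # same illustrative constants as before

class SimulatedBalanceBot:
    """Stand-in physics model. Real simulators are 'lies', so a policy trained
    here still has to be fine-tuned on the physical robot afterwards."""
    def reset(self):
        self.theta = math.radians(random.uniform(-5.0, 5.0))  # random initial lean
        self.theta_dot = 0.0
        return (self.theta, self.theta_dot)

    def step(self, motor_command):
        torque = 25.0 * motor_command                  # assumed motor strength
        theta_ddot = (G / LENGTH) * math.sin(self.theta) + torque
        self.theta_dot += theta_ddot * DT
        self.theta += self.theta_dot * DT
        fell = abs(self.theta) > math.radians(45)      # past 45 degrees counts as fallen
        reward = -1.0 if fell else -0.01 * abs(motor_command)
        return (self.theta, self.theta_dot), reward, fell

def pretrain(policy, episodes=10_000):
    """Run thousands of virtual episodes before deploying to the hardware."""
    sim = SimulatedBalanceBot()
    for _ in range(episodes):
        state = sim.reset()
        for _ in range(1000):                          # cap episode length
            action = policy.choose_action(state)
            next_state, reward, done = sim.step(action)
            policy.learn(state, action, reward, next_state)
            if done:
                break
            state = next_state
```

Of course, the gap between a hand-written model like this and the real hardware is exactly the "all simulations are lies" problem described here, which is why the data collected on the real robot feeds back into improving the simulation.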
I've tried a little bit of everything, and what I've concluded is: all simulations are terrible. And so I started doing something crazy. I'm like, well, you know what, machine learning to the rescue. So I started training a neural network to simulate the robot, based upon the past experiences of the robot. So now I have this very interesting scenario where I have a neural network training a neural network so that it can be installed on the robot. Hmm. Which I thought was ridiculous, but the more I looked into how people actually do this stuff, you do end up creating that kind of pipeline, where it's a constant feedback loop: you run the hardware, collect data, use that data to improve your simulation, use your simulation to pre-train a model, put that model back on the robot, let it improve on the robot, and keep continuing this way. Speaker 2: 44:36 And I think that's been the biggest lesson I've learned through all this: how to develop that pipeline of constantly improving. And that is truly the overall learning algorithm, like, the high-level version of it. Mind-boggling. Speaker 3: 44:53 You lost me, basically, is what I heard. Right? See, what happened is, we started to talk about Netflix and Black Mirror and I was like, oh, this is great, I was making all these great analogies, and then Frank went full Frank on me, and then boom. Speaker 2: 45:07 All right. Okay, let me end on a punchline then. No, it doesn't balance. Speaker 3: 45:13 I've seen it almost balance, Frank. Almost. And it actually kind of did. Speaker 2: 45:18 Yeah, it's kind of okay. It does balance a little bit. It can stay up for, like, around five seconds; that's kind of its happy spot right now. And the truth is, this is all the fault of the motors. The motors just aren't powerful enough. So if it does start to lean in one direction, it really needs to, like, grip the ground and really accelerate into it, and it just doesn't have enough power to do that. So eat your Cheerios in the morning, I guess, is the human analogy for: make sure you have strong muscles. Sure. Pushups. Sure, sure. Speaker 3: 45:51 An apple a day keeps the doctor away. But all right, so we're done? Bring it back to Netflix. Okay. Oh, we're done. We're out of here. Well, I think that may do it for this week's podcast. You feel pretty good, Frank? You feel pretty good about the podcast? Speaker 2: 46:04 Thanks for letting me talk about this. I've wanted to talk about this robot on this podcast forever. I can't believe you finally let me. Speaker 3: 46:11 Absolutely. Well, you know, last week I got to talk, it was a James talk show, and this time it was a "Frank, let me interview you." So I had fun, because I've always really wanted to get into robot building and understand the different microcontrollers, and it's something that has boggled my mind for a while. So I appreciate the breakdown of not only what you kind of need as a background for building it, but what goes into building it, and then also trying to train it and make it smart. So instead of just having it be remote controlled, it's a little bit more than that. So I appreciate that. Right? Speaker 2: 46:41 Yeah. I had remote control cars as a kid. Now I want cars that can kill me. Oh gosh. Just trying to help the robots in their uprising. Speaker 3: 46:50 There you go. All right, well, I think that's going to do it for this week's podcast. Frank, thank you so much.
Thanks to all of our listeners. You can, of course, find us everywhere on the Internet. Go to mergeconflict.fm. There's a contact button, and there are places to follow us on all your favorite podcast players. Hit subscribe, tell a friend, and do all the things. But that's going to do it for this week. Speaker 4: 47:10 Until next time, I'm James Montemagno. And I'm Frank Krueger. And thanks for listening.