[00:00:00] Alright, Chinmay, welcome to the show, man. Thanks for coming on. Hey Jeff, thank you for having me. And happy Friday. Happy Friday the 13th. Actually, it won't be Friday or the Friday the 13th when it's published, but hopefully we have a good show. It is an auspicious time to have you on, 'cause one piece of good luck is Waymo is making its way to Boston. There's been a few cars spotted around the city, and our Reddit boards are lighting up with sightings and all sorts of stuff. So welcome to Boston, even if you're not here. Thank you. I mean, we would love to serve more and more people across the US, across the world, and Boston is of course an extremely important city for us to be in. I do wanna talk about Waymo, obviously, but you've done a lot more than that. You were at YouTube for quite a while during a very key part of that, a McKinsey consultant before that. Could you just give us the TLDR on, like, how did you get here and what were the key things that have led to where you are now at Waymo? Yeah, absolutely. I've been at Waymo for almost [00:01:00] seven years now, more than seven years. Before that, I spent some time at YouTube. I was working with the creator team, and before that I worked on my own startup and spent some time at McKinsey as a management consultant. And I think all of that did come together to get me to Waymo. Like, in 2018, when I joined Waymo, it was still much smaller. And to be honest, we were still in a phase where it was not clear if this could actually become a good product. But as I was thinking about it, it was something that was still cutting edge. And I mean, again, if we could make it successful, then it's definitely civilization-changing, right? Yeah. And so while I was at YouTube, I got to know about Waymo and got extremely excited about it, and now I'm here. It seems like you were at YouTube at a time of huge growth for the company, and it was a really major, important thing, and it was just going really strong. How did you think through, let's go to this startup, you know, a very early-stage startup at that point that, like you said, wasn't even sure if it would work? Sometimes you have tech companies where it's like, oh, it might be hard, but it's gonna [00:02:00] happen. This is like, we fundamentally don't know if the thing will ever happen or not. Yeah, absolutely. I mean, maybe it was a mix of being extremely excited about working on this advanced piece of technology that could save lives. And then second, I think I always have tried to go back and work on something going from zero to one, and Waymo presented me with that great opportunity to take it from zero to one. And maybe there was some naivety on my part to say, you know, you don't fully understand everything, so sometimes you have to take a chance. Yeah. And I think I took a chance. I feel like the world would be a very boring place if there wasn't a proper spot for a little bit of naivety and maybe hope every once in a while. Right? Yeah. You gotta kind of have that positive, this thing could work, and it's more likely to if I can help make it happen kind of thing. Absolutely. And I feel, to be honest, maybe Silicon Valley is one place where it is a lot more beneficial to be optimistic about the future. Exactly. And I think that's a great mindset to have here in the Valley. That's what [00:03:00] powers so much of what's been built here. Yeah. Is this possible? Who knows? But we're gonna make it happen. Yes. So I'm glad you joined it.
'Cause looking at where we are now, you guys are in a bunch of cities, but also beyond that, the impact of driverless cars is potentially, as you said, huge. Because I talk to a lot of people who haven't really kept up with this, and sometimes it's like, oh great, I get to take an Uber without a driver in it. Oh, maybe I don't have to talk to a driver if I'm feeling ornery that day. It's so much more than that. It's that humans are fallible by definition. They don't pay attention. Sometimes they get distracted. I mean, it's a huge, like, safety thing. Right. And more than that. Absolutely. I mean, there are so many aspects. Of course people like it for the privacy. Yeah. But there are also aspects of it being 90% safer than human drivers. Right? Yeah. There are, I think, if I remember correctly, 40,000 fatalities every year in the US because of traffic accidents. Right. And that's something that Waymo could really help improve. And not just that, also think about the improvement in accessibility it provides. Mm-hmm. People who have accessibility challenges, Waymo is a great way for them to [00:04:00] now be free. Yeah. Get their own freedom. And so I think there are a lot more benefits to Waymo, and I do believe as we grow, as people use us, more and more people will find even more use cases. Digging in, one thing that's interesting for you guys, and across AI, is this idea of evaluating models, right? And probably, you know, look, if ChatGPT's model regresses a little bit, what's the worst that's gonna happen? I'm gonna get, like, an em dash where I don't want it, or a sentence might not be as exacting and compelling as I hoped. But here, any shortfall in anything can be pretty serious. What was it like going from YouTube, which is very test heavy, you know, very digital, stakes aren't that high from a life-or-death standpoint, over to Waymo, where there's probably a very different process for evaluation? How do you guys kind of look at that and think that through, and how do you still run tests when the downsides can be bad? I mean, that's probably one of the most important things I learned after coming to Waymo. So as you mentioned, at YouTube also, you launch something, you do A/B testing, you understand how [00:05:00] users are responding to a change, and based on that learning you launch or you don't launch a particular feature. Yeah. So in that sense, you are testing across all the digital tools, software. One difference, of course, at Waymo is we are a physical AI company. Yeah. Our products are not digital products. They're out there in the world, so it's not as easy or safe or cost effective to do all the testing on the ground. Mm-hmm. So you have to have a system which is really well tested before it goes out. The second thing, of course, is, and this is I think becoming a lot more obvious to people post the LLM revolution, though we understood it even before LLMs, that if you largely depend on machine learning systems, then you need to have extremely strong evaluation. And in some sense, evaluation is your way to define what needs to happen, and to test if that is exactly what is happening, because machine learning is not as logical or as deterministic as old software. So I think those two things are [00:06:00] extremely important to keep in perspective, and that's why evaluation at Waymo is extremely important in making sure we can successfully deploy things in the physical world. Mm-hmm. Thinking about that a little more deeply, how are you actually running evaluation? How are you
running tests, or quantifying, does a new model, model is probably not the right word, a new version, does it improve, does it not? Because you can't just throw it on the street. Are you guys, like, running a simulation? Is there just some giant Waymo video game, basically, that's really high quality? Or... There's not one thing that we do. That's the other thing that's important to understand, right? So, like, in the whole development cycle, the testing is basically part of the development cycle, and it happens as, let's say, a researcher or an engineer is trying to improve something; they keep evaluating it. Then we have this big software that's ready to go out. We have to run huge simulation and testing to test that out. Mm-hmm. And then even in the physical world, before going fully driverless, we'll have testing done with drivers in the driver's seat. [00:07:00] Mm-hmm. And so there are multiple layers of testing, and you can think of it as, like, multiple layers of Swiss cheese, which is, a problem maybe was not caught at the first level, but will be caught at the second or third. And so if you have a lot of evaluation coming together, it makes it very easy to catch, as you're saying, maybe there is a regression that's possibly there. Some issue that should absolutely not go into the physical world, we end up catching long before. Right. Are you saying there are kind of multiple potential evaluation paths going on, and maybe something dangerous or risky gets caught on two of them, not one? But that's exactly how it's meant to be done, which is why you kind of have... Yeah, of course, multiple, you know, evaluation techniques might catch the same thing. Absolutely. Yeah. We do have, of course, an overall framework. But as I'm saying, you'll do the evaluation that helps you understand, you know, the behavior with respect to a framework, but also you will do evaluation at various steps of the development life cycle. And at every step, the key point is to understand, if I'm making a change, do I understand what it does? [00:08:00] And so you need to have those feedback loops at various levels to help people understand, you know, developers understand, product managers understand, our external stakeholders understand how the behavior is changing. What do those outcomes look like when you're saying we're trying to change this behavior or we're trying to improve X factor? Is it as blatant as, we wanna be better about stopping when someone's in a crosswalk, or is it more fundamental than that? It could be at various levels, to be honest. Mm-hmm. Yeah. I mean, the system is extremely complex. That makes me feel better, though. If it was like, no, it's just three decisions and that's it, I'd be really worried. Yeah, exactly. We take decisions at various levels. Yeah. I'll give you an example of how we go about evaluating. So let's say we have to make an improvement on how you get closer to the curb while doing the pullover. Yeah. And so how do you do that? I think the simplest thing, of course, is we'll have some old cases where we are not getting as close to the curb, and so you can simulate over them. You say, I have this new change now. If I see how the [00:09:00] software now performs on those old cases, is it getting closer to the curb? That's the simplest form of evaluation. And then you can have huge test sets, and if you try and do that, it can really give you confidence in how good the performance is now, and then you can do it at various levels. Again, the size of the test set could change.
Maybe it's very small initially when the engineer is working on it, but maybe it's much bigger when you're actually doing the whole release evaluation, because you need a lot more confidence in the capability. So I think that's how you go. You keep going. And then some of this, as I said, we could use our own driving logs to test. Mm-hmm. But we could also create fully simulated environments to test it. We have various tools to create these systems for validation. They have their own use cases. It's like a matrix of using all of them at various stages. But based on the use case, we'll end up using the right tool. It seems like almost everything now is an exercise in picking the right tool for what you're trying to do, whatever kind of application you're writing that's [00:10:00] leveraging AI. Usually it's probably not just the biggest, best model. There's usually something specific that's gonna be better. Has the growth and improvement in capability of AI helped advancement here? Because I have to assume at some point in history that parking at the curb was very deterministic: okay, as you pull up, calculate the exact angle of approach, and as you hit this point, do that. It used to be pretty strict about certain computational things. Approaching the curb now is a little bit more, here's the goal and a couple of parameters, but it can be done a little bit more fluidly. I'd answer in two parts. First, the close-to-the-curb problem may sound like a very simple thing. Oh no, I've parallel parked a lot. I think it's pretty hard. I think, again, the real world is very complex. Yeah. Like, the scene could change a lot. Maybe sometimes you don't want to get close to the curb. You want to be away from the curb because there's something there, right? Right. So, like, the complexity is what makes it harder for simple logic to help, and that's where we need more machine learning systems, which are more scalable, more generalizable. Even if a situation [00:11:00] changed a bit, a machine learning model will have a better answer for how to respond to something like that. Going back to your first question of whether machine learning, all this evolution in machine learning, helped us? Absolutely. I think all these LLMs became a big thing, you know, a couple of years back, maybe, was it around late 2021, 2022? Waymo, of course, has been using the same transformer technology for much longer, and we discovered that as we were using more machine learning systems, we were seeing a lot more improvement in the behavior of the car. Yeah. And so absolutely, as machine learning is evolving, as it's becoming more powerful, we do hope that we can leverage all of that to make even better drivers, even safer drivers, and more generalizable drivers. If you drive a lot, you just kind of go on automatic when you drive, right? Because I live in downtown Boston, which is notoriously hard. We just had a giant blizzard about two weeks ago, and the snow has stuck around until, I mean, I think it's still there. So there are still areas where, if you approach the curb doing what would have normally been a good [00:12:00] curb approach, basically the rider has to step out into two and a half feet of snow, but in other spots it might be easy. Those are all the little nuances that I think we as humans take for granted, that we just naturally process. Exactly. The human mind, again, has learned so much. Some things that are very obvious to us may not be as obvious to machines. Right.
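To make the replay idea above concrete, here is a minimal sketch of what a regression harness for the pullover example could look like. Everything in it, Scenario, simulate_pullover, the 5 cm tolerance, is a hypothetical illustration, not Waymo's actual tooling.

```python
# Editorial sketch (not Waymo code): re-run a candidate planner over logged
# pullover scenarios in simulation and check whether it gets closer to the
# curb than the old software did. All names here are invented.
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class Scenario:
    scenario_id: str
    baseline_curb_gap_m: float  # curb gap the shipped software achieved


def evaluate_pullovers(
    scenarios: List[Scenario],
    simulate_pullover: Callable[[str], float],  # new software's curb gap, meters
    max_regression_m: float = 0.05,             # assumed tolerance
) -> dict:
    """Replay every logged case and compare old vs. new curb gaps."""
    regressed = []
    new_gaps = []
    for s in scenarios:
        gap = simulate_pullover(s.scenario_id)
        new_gaps.append(gap)
        # Flag cases where the new software parks meaningfully farther
        # from the curb than the baseline did (a behavioral regression).
        if gap > s.baseline_curb_gap_m + max_regression_m:
            regressed.append(s.scenario_id)
    return {
        "mean_gap_old_m": mean(s.baseline_curb_gap_m for s in scenarios),
        "mean_gap_new_m": mean(new_gaps),
        "regressed_cases": regressed,
        "pass": not regressed,
    }
```

The same harness fits every stage the conversation describes: an engineer might run it over a handful of logged cases while iterating, and a release evaluation would run it over a much larger test set for more confidence.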
But that's where machine learning comes into play. Hopefully, learning from humans, things that are easy for humans kind of become easy for machines too. Thinking about how you actually train these, I mean, at a simple level, you just need miles, right? To train the models. I mean, the problem is, most of driving, it's really boring. You don't have people crossing the street at crosswalks, and curbs with snow on them, and people changing lanes, and cars stopping at the last minute, 99 point something percent of the time. Is it just, like, this huge amount of data that just kind of isn't really that important? How do you get enough of the really important data to do this? That's a great question. And so I was talking about, you know, building evaluation. There is also the evaluation aspect of all the driverless driving we are doing out there. [00:13:00] As you're saying, can we figure out what are the more interesting aspects of driving? Mm-hmm. Because 99, maybe even 99.9% of the driving is pretty boring. Right. You hope. And so, I mean, this is going back to something similar to evaluation, to say, do we know when a situation is interesting? Mm-hmm. Are there systems that can figure that out? And that's why evaluation needs to be important, because you also don't want any problems on the ground that you're not catching, right? So you need a very robust evaluation system that's not just catching problems beforehand, but also is great at catching anything that happens on the ground, which then helps us improve the system, helps us learn more. Then, as you're saying, filtering out the most interesting aspects of driving. And I mean, in some sense, this is, you would've heard of this term, the data flywheel approach, which is to say, okay, now if I have a way to find all the cases that I need to improve in an automated manner, then I can just find them and then feed that back to my machine learning system, and my machine learning system can improve on those. [00:14:00] So I think if you can create this whole flywheel, to say, the machine learning system did something, I evaluated it, I think it's great, let's go on the ground. And then I evaluate the on-the-ground driving. And if there is something that we need to improve, I bring it back to my machine learning system in the right manner, then we'll improve. And then this is a flywheel that can keep moving, and the faster it moves, the faster the improvement. And again, I mean, this is more of an ideal picture. There's a lot of effort that goes into making this flywheel work. Yeah. But ideally, in the future, if you can automate large parts of this flywheel, then, we are all talking about how fast machine learning is improving, right? The speed of improvement could be even faster and faster. That's what I was wondering, is, all the miles on the road, while a huge amount of them are boring, are they fundamentally important? Because that's how you find the cases that you really need to work on, and so you need some level of miles to go find that. But then I guess I hadn't quite thought how important, like, the simulation aspect of being able to just run those kinds of things in a much more dense [00:15:00] environment is. And both things are... Absolutely, both are important. Think about it this way, which is, again, what we learn more and more is that the physical world is chaotic. Most of the time it's normal, but then there are these edge cases that only happen 0.1% of the time.
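A minimal sketch of the filtering step of the data flywheel just described, assuming invented log fields and hand-tuned weights; a production system would presumably use learned novelty and risk models rather than a toy heuristic.

```python
# Editorial sketch (not Waymo code): one filtering pass of the data
# flywheel. Log fields and weights are invented for illustration.
from typing import Iterable, List


def interestingness(segment: dict) -> float:
    # Toy heuristic: hard braking, close approaches to other agents, and
    # novel scenes all make a segment worth keeping.
    return (
        2.0 * segment["max_decel_g"]            # hard braking events
        + 3.0 / (segment["min_gap_m"] + 0.1)    # close approaches
        + 1.5 * segment["scene_novelty"]        # e.g. distance to training data
    )


def mine_training_cases(segments: Iterable[dict], threshold: float = 5.0) -> List[dict]:
    # One turn of the flywheel: drive -> log -> filter -> feed back to training.
    return [s for s in segments if interestingness(s) >= threshold]


# Most segments are routine and get dropped; the rare edge case survives
# and goes back into the training set.
logs = [
    {"max_decel_g": 0.1, "min_gap_m": 12.0, "scene_novelty": 0.1},  # boring
    {"max_decel_g": 0.8, "min_gap_m": 0.6, "scene_novelty": 0.9},   # interesting
]
print(len(mine_training_cases(logs)))  # 1
```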
So to even know about them, you have to drive a lot on the ground. And if you keep driving, then you see all those situations, and then you understand how your system is performing in them. You can improve on that. Mm-hmm. And at the same time, you can take all those miles and then use a simulation system to 10x the learning, 100x the learning. And so both are important. You cannot just have very few miles and then try to learn from them, because you have not even seen a lot of edge cases. Once you see them, then of course you can dramatically increase the learning from the same set through simulation. You'd assume at some point you get to a level where it's no longer the ability to process and understand the road, from the standpoint of what holds you back from going city to city, but more just a capacity of [00:16:00] cars or something like that. Yes. I do have to say, I appreciate the leaning towards safety and being a little bit more conservative on that piece there, because, you know, everyone has varying levels of how excited they are, or, you know, how trepidatious they are, around just general driverless cars. But the more I have talked with you about it, both today and previously, it does make me feel a lot better about just how much goes into all of this and how much you guys are really, really evaluating and running through that before anything ever really happens that could be dangerous. I mean, again, safety is our top priority. Yeah. So we will make sure we are investing in the right manner, going in the right way, to make sure our product out there in the physical world is extremely safe. Yeah. But the reality is that the evaluation and the understanding of the behavior of the software requires us to look at many more aspects of driving. When you say there's a lot of other things you're looking at from a multi-layered evaluation there, what does that mean? Like, how do you think about it from a more broad stance as you look at something like that? At a high level, in some sense, it's [00:17:00] very easy, you know, what we want to measure, right? Like, one, is the software working, and is it working safely and within the constraints of the environment? Then, is the software working in the most efficient manner, in a fast manner, right? Mm-hmm. And finally, is it working in a cost-effective manner? Yeah. So you'll have to look at, in some sense, evaluation hopefully covers all of it. I keep cost-effective to the side for now because, really, the performance of the software is largely captured in the first three, right? Which is, is it working, is it working safely and within constraints, and is it working efficiently and effectively? And maybe you have a comparison there to say, is it working better than humans? Right. So I think those are the areas we would want to focus on. And this, I think, is true for, I would say, any AI agent out there today. You're trying to test the same thing. And so you have to test all these things. And under each of them, let's take safety as an example. Let's say we are doing a simulation to figure out whether a new software is safe. An easy way is we simulate millions and billions of miles. [00:18:00] We can actually find if there were any collisions and see how you improve on them. But of course, if the software is great, you'll find very few of these. So then the second layer would be, maybe we don't look for collisions, but you look for what you can call a denser signal, which is, let's say, a close call. Mm-hmm.
Just two vehicles getting close. They didn't have a collision, but they got close. And that's a very good metric to understand, to say, what could have led to a collision, right? Mm-hmm. And now what will happen is you'll find, hopefully, more such cases in the simulation. And what that helps with is that it helps the development team, the engineering team, find out what they could improve, which would then eventually lead to improvement in the collision rate. And this is true for most of the things we work on, which is, you'll have a very clear sense of what you want to get to. Mm-hmm. And then hopefully you can find denser metrics that actually represent the same thing. And then, because it's a denser metric, it's much easier to hill-climb on. When you look at it overall, beyond the concept [00:19:00] of model evaluation, from a more, like, higher-level, outward-facing stance, what is the goal of a ride? Is it a safe ride? Is it a fulfilling ride? I don't know if fulfilling would be a weird word, a fulfilling ride. Of course, Waymo's goal is to be the most trusted driver. Yeah. And of course, to be the most trusted driver, we have to be extremely safe. That's always the key thing. But we also have to be very good at driving, right? Which would be multiple things, to say driving should be smooth, we should have the right kind of routing, good ETAs, good pullovers. Mm-hmm. So I think a lot of it comes together, and that's where, as I was saying, I think evaluation becomes important, because it forces you to really concretely say, what is it that matters to the users, and can you measure it, and can you improve it? Does it get difficult at times? 'Cause, you know, if you think about how you build a normal SaaS company or something like that where you have users, so much of what we key in on, you know, here and at multiple other companies I've been at, is we got this really strong feedback about how amazed they were about something, [00:20:00] or someone's mind was blown about this thing we were able to do. And it feels like great driving by a Waymo car might be similar to, like, in a sporting event when the referee is great. The best version of a referee at a sporting event is you don't notice them. They just do it right, and it goes smoothly, and it's easy, and everything is just good. Usually when you notice the ref, it's 'cause they haven't done something great. Yes, absolutely. I agree with you. I mean, the best Waymo ride is probably one that you'll forget about. Yeah, maybe the first ride you would never forget. But once you become used to it, it should just be part of your day-to-day life. There's no reason to remember using a Waymo. You have been driving cars, sitting in cars, forever. Yeah. So, yes, I agree with you. I think the best outcome is it just becomes part of day-to-day life. You don't even notice. I can tell you, actually, frankly, I still remember the first time I was ever in a Waymo. It was in Austin, and I was taking video of the whole thing and sending it to my kids, like, look at this. This is insane. And their, you know, minds were blown. But even the third time I got in one, it's still impressive. I was, you know, on my phone [00:21:00] talking about something else, and just kind of... it's amazing how quickly you settle into it. Absolutely. And it's a good thing, right? At some point, again, long back, it was like, well, would humans even trust it if there's no driver in the driver's seat?
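Returning to the sparse-versus-dense metric point from earlier in this exchange, here is a small sketch of the idea: collisions are what you ultimately care about, but close calls occur often enough to hill-climb on. The half-meter threshold and data layout are assumptions for illustration.

```python
# Editorial sketch (not Waymo code): collisions vs. close calls as
# evaluation signals. Each entry in min_gaps_m is the minimum gap (meters)
# between the vehicle and any other agent over one simulated scenario.
from typing import List

CLOSE_CALL_GAP_M = 0.5  # assumed threshold: under half a meter is a close call


def score_simulation(min_gaps_m: List[float]) -> dict:
    # Sparse metric: what ultimately matters, but near zero for good
    # software, so it gives the development team little to optimize on.
    collisions = sum(1 for g in min_gaps_m if g <= 0.0)
    # Dense proxy: far more frequent, correlated with collision risk,
    # and therefore much easier to hill-climb on.
    close_calls = sum(1 for g in min_gaps_m if 0.0 < g < CLOSE_CALL_GAP_M)
    return {
        "collisions": collisions,
        "close_calls": close_calls,
        "close_call_rate": close_calls / max(len(min_gaps_m), 1),
    }


# A huge simulated test set might yield zero collisions but hundreds of
# close calls, which is a usable signal for improvement.
print(score_simulation([4.2, 0.3, 1.1, 0.0, 0.45]))
# {'collisions': 1, 'close_calls': 2, 'close_call_rate': 0.4}
```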
I mean, basically, of course, you know, computers versus humans, computers would be better at driving. They're always alert. They're not distracted. They're doing one job, and they're trying to, you know, do their best at it. Just that basic reason is why it's pretty clear that the Waymo Driver, because it'll keep improving and becoming better, is the right way to go. All right. I gotta ask you a couple quick questions, because I've got you from Waymo here, and there's been things I've been dying to know. What has been the hardest thing to actually teach the cars to do, that maybe you didn't expect? Like, what's the hardest maneuver? Actually, I'll give you two examples. One, something that looks simple but is hard, and one that I think even humans would find hard. Mm-hmm. I think one that even humans find hard would be an unprotected turn. Yeah. Because there is so much going on. There is a lot of [00:22:00] oncoming traffic, and so you have to make sure to be extremely safe. Also, you have to make sure you're giving the right indication. There's, like, a lot of negotiation happening in that. Hmm. And so it is extremely tough, and it's great that Waymo does really well at those. Yeah, and that's where I think the machine learning systems come into play, because we can really understand everyone's intent really well, but it is a hard one. The one that, let's say, looks simple is, again, I go back to the pullover one, which is, you know, you just have to go and pull over. But there are aspects that are very obvious to humans that machine learning systems are still learning, and this is, I mean, going back to, like, why can't a robot just fold a shirt? Like, it's so easy for humans. Yeah. But there are some aspects which are very easy for humans that machine learning systems have to really learn. Well, if you think about pickup, it's not just that you pull over. I've talked to so many Uber drivers and Lyft drivers and stuff. People get really, really picky about where they get picked up, where they get dropped off. Exactly. And then Uber drivers learned that [00:23:00] from years of experience, right? Of how riders behave. Right. Maybe someone at the corner just said, hi, I'm here. Yeah. And an Uber driver, out of the corner of their eye, could see them and say, oh, I guess that's my rider, kind of thing. Right. There are a lot of nuances about these simple things that we humans are very good at. It's not even always logical. Someone's probably just emotional. It's just, someone's gonna be kind of jerky about where they get picked up, and someone else is just happy to have the ride, so it's fine. Okay. Last one. When you look at the human side of driving, what's one thing that you've seen that humans do when they drive that, if they didn't do it, would make training these cars so much easier? It's a good question. I do think a lot of our effort also goes into predicting what other humans would do on the road. Mm-hmm. I mean, some simple things, like, well, humans running traffic lights, and so our systems are pretty good at predicting that and then avoiding an incident. Mm-hmm. But that, of course, takes us a lot of effort, to understand that intent. But again, if humans were like robots, where they were always [00:24:00] following the rules and driving really well, then our jobs would absolutely get easier. And that's not the reality.
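As a toy illustration of the red-light-runner prediction just described, the kinematic core of the idea can be sketched as follows; real systems use learned intent models over rich sensor features, and the thresholds here are invented.

```python
# Editorial sketch (not Waymo code): will an approaching car actually stop
# for its red light? Pure kinematics, with assumed thresholds.

COMFORTABLE_DECEL = 3.0  # m/s^2, a rough comfortable braking limit (assumption)


def likely_to_run_light(speed_mps: float, dist_to_line_m: float,
                        observed_decel: float) -> bool:
    """Predict a runner when stopping would require harder braking than
    drivers normally apply, and the car is not braking nearly that hard."""
    if dist_to_line_m <= 0:
        return True  # already past the stop line at speed
    required_decel = speed_mps ** 2 / (2.0 * dist_to_line_m)  # from v^2 = 2*a*d
    return required_decel > COMFORTABLE_DECEL and observed_decel < 0.5 * required_decel


# 15 m/s (~34 mph) with 20 m to the line needs ~5.6 m/s^2 to stop; if the
# driver is barely braking, assume they will run it and plan to yield.
print(likely_to_run_light(15.0, 20.0, observed_decel=1.0))  # True
```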
I'll be honest, I have, like, a hundred more questions I've written out, but I also realize that you have to go back to helping build the future of driverless cars, and that's much more important than just answering a bunch of my questions. So it's been great having you on, man. Thank you so much for taking the time. If people are curious or just wanna get in touch and say thanks or hi, is LinkedIn a good spot, or is there a better place to reach out? LinkedIn is a great spot, and they can absolutely reach out with any questions. And yeah, I mean, it was great fun talking to you, Jeff. Always happy to talk more about Waymo. It's, of course, something that I deeply love. Great chatting with you. Awesome. Well, thank you so much for coming on. It's been great having you on. Have a good rest of your day, and yeah, hopefully talk soon, man. Good to see you. Good to see you.