Sven Mawson: We don't know anything about these people. They've never talked to us. They just picked this thing up and are using it. That part is just crazy to me.

Eric Anderson: This is Contributor, a podcast telling the stories behind the best open-source projects and the communities that make them. I'm Eric Anderson.

Eric Anderson: We are live today with Sven Mawson. He is a Founder, or one of the founders, of the Istio Project and a Senior Staff Engineer at Google. We'll be talking today about the Istio Project. Sven, thanks for joining us.

Sven Mawson: Hey, great to be here. Glad to get a chance to talk about Istio.

Eric Anderson: So I guess let's just start at the beginning. Maybe you could tell us how this all began, the origin story of Istio?

Sven Mawson: Sure. Istio was born out of actually a number of different pieces. So I'm mostly going to talk about the Google pieces. It's a shared project between Google and IBM, as the founding companies. On the Google side, we had this project, actually, we still have it, called Cloud Endpoints. It's kind of like Amazon API Gateway, but different in a lot of ways. It actually uses a sidecar instead of a managed proxy. In the process of building that, we learned a bunch of different things.

Sven Mawson: One of the things we learned was that people wanted it to be open-source, and another thing people asked for was actually a lot of additional functionality that wasn't really API management at all, but was more about service management, about what we now think of as a mesh. And we looked at that and realized we had a lot of expertise in building this stuff because Google had actually been doing this sort of thing for years and years. So Google had a very comprehensive security system around microservices called [inaudible 00:01:54], that there's various things on the internet about. We had tracing and monitoring and logging, and all those systems have copies in open-source now.
Things like Prometheus, which is based on some of the early monitoring work. Right?

Eric Anderson: Right.

Sven Mawson: So Google had these pieces. We actually had built a proxy on top of Nginx, that we used as the sidecar, and we were looking at, "Okay, let's build an open-source version of this." And at the same time, Kubernetes was really taking off. So this was two and a half, three years ago, I guess, that this was happening. And so Kubernetes was not as crazy as it is today, but it was really starting to hit the growth curve and win the orchestration market. We sat near some folks from the Kubernetes team at Google, and we were talking to them about their needs and what the next level of things Kubernetes wanted, and actually, a lot of those things were also the same set of things. They were security, they were load balancing, and especially L7 load balancing, application load balancing, traffic management and observability at the service level.

Sven Mawson: So we looked at all that stuff and we said, "Wait a second. This is way too great an opportunity to pass up," so we started really thinking about the idea of how do we actually kick off a project? We saw that IBM had been building this project called Amalgam8, which was doing the networking pieces, also based on Nginx actually. And so we met up with them, I believe this actually was at KubeCon 2016 in Seattle. We met up with them and talked about it and got an agreement to join forces on a new project together. They would bring in their expertise on the traffic management pieces that they'd built in Amalgam8, we would bring in the security pieces and then the observability and build it all together. And we kicked it off that way.

Eric Anderson: Fascinating. A couple followup questions now. IBM and Google are, on some level, competitors. Was there any tension in that agreement or is this all open-source goodness in the meeting at KubeCon, for example?
Sven Mawson: No, there was not actually a lot of tension. The biggest tension actually that I had been worried about, was we were basically coming to them and asking them to drop their existing project in some sense. I worried a lot about them being really resistant to that, but actually they were on board very quickly. It didn't actually take very much convincing for them to want to join forces.

Sven Mawson: I guess the other thing I should mention, is that we had both built these Nginx-based systems and neither of us were very happy with the proxy. Nginx is super fast and actually really well-optimized, but working with Nginx is not great. They don't have a very open community. They also have a commercial offering based on it, that had a lot of the functions we were trying to add. So there's this misaligned incentives problem, where we kept wanting to come to them and say, "Hey, can you add this new load balancing thing and this new thing and make it all free?" And they have a lot of that functionality, but they're trying to get money for it.

Sven Mawson: So we needed a proxy that didn't have that, and right at that time as we were looking around, is when Envoy got announced. And so it was like, that was another thing that just ... Just the right time, the right place. We were like, "Wow, this is exactly the proxy we're looking for." We did not want to build our own.

Eric Anderson: Yeah, yep.

Sven Mawson: Right? It was just perfect. So I think IBM also had a similar situation, so I think that helped them in being willing to jump over to a new project, based on Envoy, add a bunch of new features, add security, get Google to help them with it. Right?

Eric Anderson: Exactly. And then you've been saying "we" a lot, and I believe that's the Cloud Endpoints engineering team. Is that right? Or does that not quite describe the group here?

Sven Mawson: No. So actually, it was yeah, an offshoot of the Cloud Endpoints engineering team.
So there's actually a couple of us that had been working on API management at Google for probably six plus years now. I can't even really count that far. We built actually multiple generations of API management at Google, we finally had taken what we built and spun it off as Cloud Endpoints, as an external project.

Sven Mawson: And part of that team came along to build Istio, but actually part of the team stayed on Cloud Endpoints. And part of the team has been working on both actually. There's some people that have been working still on Cloud Endpoints and still on Istio. But we actually brought in a bunch of new people, but the leadership, my co-founder at Google, Louis Ryan, and I, we've been working on API management forever, and we are the two founders, at least initial founders, but very quickly it was a pretty good team effort to build it.

Eric Anderson: We've talked before Sven, and you've mentioned that starting this work also was kind of associated with you taking some time off?

Sven Mawson: Yeah. I have a bit of a history of taking on new and exciting projects after coming back from paternity leave. So this was actually after ... That's why I know when this happened, right, because I just think about how old my son, my third child, is, and he's turning three in March. So it was basically the summer of 2016 that this was all kicking off. I'd come back ... I had managed to hand a bunch of stuff off to other people before leaving, because I was going to be gone for a while, and that's the right thing to do when you go on paternity. But it's actually a great opportunity when you come back to think about, "Okay, what do I actually want to do? What's going on?"

Sven Mawson: Things were going well in the current stuff we had built out, the existing projects, and my sweet spot is on the beginnings of projects, not necessarily the maintenance and keeping it going and adding new features and stuff. I really like to be there for the beginning of things.
I like to come up with new ideas. And so I came back and was looking around, and there was all this stuff going on and I'm like, "I'm going to take this. I'm going to run with it, see what happens." Yeah, it was so much fun just to have that time and be able to carve something out like this.

Eric Anderson: Great. So, okay, so take us back now. You and the IBM team, KubeCon, agree to work together, you produce a bunch of code. How does that get us to the launch or is there a launch for Istio?

Sven Mawson: Yeah, and actually, I want to mention a little fun part about this was actually we got hallway passes to KubeCon. We, at the last minute, decided to try to kick this stuff off and everything had been sold out. So we had these hallway passes, which means we couldn't go to keynotes, we couldn't go to any sessions. We were just allowed to be in the building. So our meetings were in the bar of whatever hotel that was.

Eric Anderson: Right.

Sven Mawson: We were meeting with people, meeting with the IBM team, we actually met with some potential users that were interested in this space. We rented a coworking space two blocks away and had two hour power hour with a bunch of people, including Matt Klein from Lyft, he built Envoy, to talk about how all this stuff could come together. And Sriram from IBM was there, and Louis and I talking about all this stuff. It was really actually a lot of fun, just being in the margins of this conference, trying to figure out how to create this project.

Sven Mawson: So the history going forward, it was a lot of Googlers and IBM working together early on to build this. I guess maybe that was December-ish. Actually, it was November. It was right around the election, 2016. Actually election night, we were talking to potential customers, and that was an interesting evening, let's say. But yeah, so about, I guess, whatever that is, six months later in May of 2017, was the first real release.
That was the 0.1, the first point release of Istio that showed what this thing could be. And it was very, very raw. It was more proof of concept than real usable system.

Sven Mawson: But then we started releasing monthly after that. We did seven point releases in the next year, until July of 2018 where we launched the 1.0. It's interesting that leading up to 1.0, the question we got the most when we talked about Istio anywhere was, "When is it going to be 1.0? When can I use it in production?" So we had a very clear focus on that for a long time of, "We got to get this thing out, we got to get it ready, let people start using it."

Eric Anderson: Got it. Forgive me for having us jump around, but we've had Matt Klein on this podcast to talk about Envoy proxy, and his description of how that took off and Google's involvement was fun to hear. So maybe if you have any more detail from the Google side, you had this meeting with Matt in KubeCon. Were there other meetings?

Sven Mawson: Yeah. Obviously we had lots of discussions. One of the interesting things is that Google is using Envoy for more than just the Istio Project, but a bunch of other things as well. And so there has been a wide array of integration work, and actually the majority of the work on Envoy from Google is not actually related to Istio at all. It's related to other projects, which I can't really talk about.

Sven Mawson: But Google has a very, very strong interest in it. It's kind of like if Google had tried to rewrite GFE as something that could be used as a sidecar and these other things and built ... It would be very, very similar to what Envoy was. And so the people in Google who were responsible for these things were like, "Wait, this is exactly what we want. We want to use this for a lot of different things." So we've worked very closely with Matt on not just Istio, but other projects as well.

Eric Anderson: Okay.
So, it's easy to imagine Matt being inundated by Googlers as soon as Envoy lands.

Sven Mawson: Yes, yes, yes, exactly. There was a lot of interest.

Eric Anderson: Yeah. I mean, before we get too far in the story, what are your aspirations or expectations as you set out to do this originally, and how did those evolve over time?

Sven Mawson: So I think early on, our aspiration was focused on Kubernetes. So it was focused on the Kubernetes ecosystem needed strong security between services. It just didn't exist, they actually had had a feature request for, I don't know, a year plus to try to do something in that space and had never been able to get around to it. They didn't have L7 load balancing. So the load balancing was all at L4, which meant you couldn't do per request load balancing, which is not great if you have long-lived requests. Long-lived connections I should say, sorry. So if you have long-lived connections and those are spraying lots of traffic, they're all going to go to one place and they can easily overload a single backend. There was no solution in Kubernetes for that. It was, " [inaudible 00:12:19] your own, do your own thing."

Sven Mawson: So they were missing that piece. They had the low level, again, networking level metrics, but no application level metrics. So they were looking for solutions for that, and we were very happy to step in and say, "Hey, Google has built all this stuff as libraries, but we think we can do it as a proxy instead. We have expertise in building sidecars, my team, and so we're going to do that." And that was where we started from. Like, let's just build this thing for Kubernetes to solve these known problems. And very quickly we realized, when we talked to people about this, that it can actually be used for a lot more. So you build a hammer and find out there's a lot of nails out there kind of thing. There was just so many different people that wanted the things we were building, wanted these tools.
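
Editor's illustration: the contrast Sven draws between L4 and L7 load balancing can be made concrete with a small simulation. The Python sketch below is not Istio or Kubernetes code. The backend names, the traffic numbers, and the functions `l4_balance` and `l7_balance` are all hypothetical, and plain round-robin stands in for whatever policy a real balancer would use. It only shows why picking a backend once per long-lived connection can pile traffic onto one backend, while picking per request spreads it out.

```python
from collections import Counter
from itertools import cycle


def l4_balance(connections, backends):
    """Connection-level (L4) balancing: choose a backend once per connection.

    `connections` is a list of request counts, one entry per long-lived
    connection. Every request on a connection lands on that connection's backend.
    """
    load = Counter({backend: 0 for backend in backends})
    picker = cycle(backends)
    for request_count in connections:
        backend = next(picker)  # decided at connect time, never revisited
        load[backend] += request_count
    return load


def l7_balance(connections, backends):
    """Request-level (L7) balancing: choose a backend for every request."""
    load = Counter({backend: 0 for backend in backends})
    picker = cycle(backends)
    for request_count in connections:
        for _ in range(request_count):
            load[next(picker)] += 1  # decided again for each request
    return load


# Hypothetical traffic: one very chatty long-lived connection and two quiet ones.
backends = ["backend-a", "backend-b", "backend-c"]
connections = [9000, 10, 10]

l4 = l4_balance(connections, backends)
l7 = l7_balance(connections, backends)
print("L4 load per backend:", dict(l4))
print("L7 load per backend:", dict(l7))
```

With these made-up numbers, the L4 run sends all 9,000 requests from the chatty connection to a single backend, while the L7 run leaves the three backends within one request of each other.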
Sven Mawson: One of the big examples here is actually people are moving to Kubernetes, but actually moving itself is rather difficult if you don't have the things we were bringing. Right? So security is a special thing where you want that, no matter what. But the tracing, monitoring, logging and traffic routing pieces all make migration to something like Kubernetes a lot easier, and just make dealing with VMs and Cloud Foundry and whatever other environments, actually a lot easier. And so quickly it became a, "Hey, actually, this is a really good cross-platform, cross-VMs, cross-containers, cross all these different types of things, way to connect all this stuff together and make it actually more understandable and easier to migrate things around and move." So that became our new focus, probably about a year ago, or I shouldn't say new focus, but the expanded scope was, "Hey, let's help people actually become cloud native, become modern." Not just in Kubernetes, but also on other platforms.

Eric Anderson: I see. And part of that is because it's advantageous to adopt Istio before Kubernetes, as it can help you with the migration.

Sven Mawson: Yes, absolutely.

Eric Anderson: As opposed to you start with Kubernetes and you keep adding things, and one of those might be Istio.

Sven Mawson: Right. Yeah, exactly. Yeah. So it actually can help a lot. And right now, actually, to be honest, the VM story is not where I'd like it to be. It's pretty manual and there's a lot of steps. It's not well-automated. So we really want to actually work on that, work with partners, work with VMware, do stuff on Google Cloud, do stuff on other clouds to make it really easy to actually get Istio running on VMs. And then help people hook up their VMs to their containers, help them migrate things to containers, move things around, have it all work together.
Eric Anderson: It sounds like the first or early version of Istio, the vision had a lot of promise but it wasn't all there in those early versions. Tell me about those early years, if they were months, the delta between what you hoped to build and what was available to users? Were people picking it up and using it? Were people waiting?

Sven Mawson: Yeah, so we had an interesting strategy, in retrospect, which was ... So Istio has these three pillars, right? It has observability, it has security and it has networking and traffic management. We could have just done one of those and tried to get that out and get that productionized and get it done, but we thought that all of these pieces really go together and you need all of them to have the full platform, to have this full ability to run services in a modern way.

Sven Mawson: And so our strategy was actually to build these out and in each of these spaces build a minimal version that gets you what you need, and then get that into production-ready state. So the early versions actually had a lot of the features that we have now. There's a couple things that I can talk about that came in later versions, but really, most of the feature set of Istio was there in 0.1. It just was very rough. It was very raw, it had lots of bugs, it caused 503s in services, and things could get dropped. If you went from using no mutual TLS to using mutual TLS, you couldn't actually roll that out. You had to restart everything with the new thing, and during the process, connections would fail because the client didn't expect there to be mTLS and the server did.

Sven Mawson: And so all these kinds of things that just made it harder to use and harder to roll out and harder to actually use in production. So that was where we then focused really for a year, a year plus, was getting all this stuff production-ready and able to be used by real companies, in production, on critical services.
And that was a year plus of work of getting it all production-ready.

Eric Anderson: I imagine with coming out of Google and having the Kubernetes momentum, that it wasn't hard to find users I suppose. But was there a marketing effort?

Sven Mawson: So I actually was shocked at the amount of interest in Istio. I think all of us on the project were shocked by the amount of interest that it garnered. We expected some. We expected we'd have to actually do some marketing and try to find users, but we did not. We were basically having to turn away users and say, "No, no, we're not ready yet. Don't try it yet." I still am actually shocked when I find out about ... Some guys from our team went to KubeCon Shanghai. They had a session and they talked to companies there and did a survey of people, and the number of people that answered their survey, that are using Istio in production in China, it's like, "Whoa, wait. We don't know anything about these people. They've never talked to us. They just pick this thing up and are using it." That part is just crazy to me.

Sven Mawson: So no, we didn't have to do too much marketing. We did some, like with the 1.0 blast, we did a little bit. We were trying to get Istio associated with the service mesh space early on. And actually, the reason for that was mostly to get people to join us. We wanted people that are interested in this space to stop building their own stuff because everyone is building this. If they're not using a product, everyone was building their own system. So we talked to so many people that had bits and pieces of this, they weren't happy with it, they didn't like maintaining it, it didn't have all the features they wanted. And so just being able to tell them, "Hey, join forces with us, help us build this thing out, help us test it," that was the main reason for the marketing push. And we did a very, very small one around that 0.1, and we did a bigger one with 1.0.
But it was still pretty minor compared to the interest that it has garnered.

Eric Anderson: Help give us some color on how those communications happen. Are people sending you emails from all over the world to talk about how they can help out? Are there more formal channels?

Sven Mawson: It's a formal open-source project, right? People go to istio.io and they can follow the links to getting involved there. Or they can just go straight to github.com/istio. There's a whole community section there about how to get involved. There's a bunch of working groups people can join, start contributing ideas, start contributing code reviews, docs, whatever. So there's lots of people that have actually just jumped in to help that way.

Sven Mawson: We also have done some slightly more formal discussions with companies that are interested in working on Istio, where it's more of a whole company decision to bet on Istio. And that obviously is more formal, where more senior people are talking to each other about making sure everyone's on board and that no one gets left out in the cold type thing. But for the most part, it's pretty much a normal open-source community.

Eric Anderson: How did governance evolve? I imagine at the beginning this may have just been a Google organized, managed project. How has that shaped over time?

Sven Mawson: Yeah, so it was Google and IBM from day one. That was very much on purpose. And that was actually ... Why we didn't just start this on our own, is we actually really wanted to make sure that people understood that this was a community and a community project. We didn't want it to be just one company. There's lots of just one company open-source projects, that some of them have managed to build communities around it, but it's harder, and it's actually harder to get customers. Especially if it's Google doing it because there's a lot of, "Oh, this is really a Google thing." So we wanted to avoid that.
We wanted to make sure that people understood that we wanted this to be owned by the community and run by the community.

Sven Mawson: On the other hand, Istio is a little different in the way it is governed, in that Istio is governed by companies contributing to Istio primarily, not by individuals contributing. And that's at the steering level. So at the top level of how we run Istio, it's right now, mostly Google and IBM, and actually, we're in the middle of figuring out how to bring more of the new partners into the steering of Istio, and make sure that it's a bigger community there as well. But at that level, it's based on company contribution. So the more a company contributes to Istio, the more say they get to have in where it's going. But that's only at the marketing of Istio and the top level steering of it.

Sven Mawson: At the technical level, it's all based on people's technical contributions. So if you come in to Istio and you do a lot of work in, let's say security, then maybe you'll become one of the leads of the security working group. So for example, Spike from Tigera, has been awesome and done a ton of work on Istio and helped drive a lot of the early security thinking, and so he's been one of the working group leads on security. There's some other folks from Cisco and from other areas that have been helping on some of the other working groups, and it's based on what people do to help out with the community.

Sven Mawson: And that actually goes all the way up through the technical oversight committee, which is also based on people's contributions, and those are people that have those seats. So as an example, Sriram, who was one of the founders of Istio from IBM, he actually joined VMware and that immediately jumped VMware way up on the contribution list because he is very prolific. But at the same time, IBM still maintained ownership of the project because it's IBM that was investing in it, not Sriram, right? He was being paid to do that work.
Sven Mawson: It is a little interesting and a little different than a lot of other ones, although it's actually, it's probably more in line with a lot of the single company based open-source projects, where it's the company has a huge interest in it. It's just this is a multi-company open-source project.

Eric Anderson: Along those lines, some open-source projects, you mentioned some earlier, have commercial ambitions, or at least the founding team do, and so there ends up being some decision-making around what's commercial and proprietary and what's open-source. Do you face this at all with Istio? And do you have to reconcile Google's desires for the project or IBM's and that sort of thing?

Sven Mawson: Yeah, absolutely. So I think there's a couple of different things going on here. So Google's interest in the project is, let's say, twofold. On the one hand, Istio, by its nature, as I was talking about, makes it easier to migrate things from VMs to Kubernetes, but it really, it makes it easier to migrate things in general and to manage things in general. Right?

Eric Anderson: Mm-hmm (affirmative).

Sven Mawson: It reduces the operational cost of running services. That's why Istio exists, to basically make people have less effort required to operate services and actually help them move faster too. So Istio, by doing that, by removing a lot of these roadblocks to migration and moving, levels the playing field for us. So it levels the playing field for Google. It makes it easier for Google to get people onto Google Cloud, right?

Eric Anderson: Yeah.

Sven Mawson: That's why Google Cloud exists, is to sell Google Cloud. But the other part of it, is once you have Istio, there's a lot of higher level things you can build on top of it. There's a lot more management layers, UIs, and actually a lot of machine learning you can do based on the data that Istio can generate, that you can build really cool products on top of it.
So I think basically all of the companies that are involved in Istio, that are vendors at least, are thinking in that space. So VMware actually announced a mesh-based product, IBM has one, Cisco had one built with Google. There's all these different people that are involved in Istio, are building on top of it. It's a base layer that everyone can build on top of and can make money from building really great value on top of it.

Sven Mawson: Related to this, one of the critical things about Istio is actually we really want to commoditize a lot of the base stuff that we don't think actually people should be charging for. And that's where, of course, we got into trouble with trying to discuss with Nginx because they want to charge for a lot of these things. We think they should be a commodity. We think L7 load balancing, and gateways, and basic collection of metrics, and basic security, that all should be a commodity that everyone should get because it's important. It makes everyone's lives better, and it opens up a lot of room for these higher level features and services to build on top.

Eric Anderson: Got it. Tell us where we go from here. So Istio recently was 1.0, and what's the next steps?

Sven Mawson: I am very saddened to say that we still haven't launched the 1.1 release. We actually just posted to the community about this. So 1.0 was, as I mentioned, what people were asking us for, for them to start really using this. And it turns out once you get a large number of people using something, they find all kinds of problems with it.

Eric Anderson: What's new?

Sven Mawson: So we've been spending a ton of time, basically fixing bugs, stabilizing, improving performance. Actually a lot of work on performance on these corner cases we didn't know about, or scale we weren't anticipating people wanting to use. Actually, that's been something else probably that I did not anticipate, but we learned, was Istio solves a lot of problems that happen at scale.
And so it actually ...

Sven Mawson: Unlike a lot of projects where you can launch first and then scale later, it solves problems that people face when they scale, and so the people that have high scale really want it. And so now you have the, "Oh, okay, we have to handle these huge meshes with tens of thousands of services and ridiculous amount of QPS." Some of the system started falling over, and we were keeping the entire topology of the mesh in memory on every node. And that's just a pretty ridiculous amount of memory being used. So yeah, lots of bug fixes like that, lots of cleanup, trying to make this much more scalable, much more performant. So that's been the biggest focus for 1.1. There's been a couple of architectural changes we're doing too.

Sven Mawson: Going forward, I think the most interesting stuff is around architecture probably. We still have some work to do on this multi-cluster mesh thing. We haven't really talked too much about that, but one of the things Istio also makes really easy, as part of the making things easy to migrate, is that it makes it easy to do cross-cluster networking and communication without having to program that into each application.

Sven Mawson: So let's say you have one service in one cluster in one region and it's talking to some other service in the same region. For whatever reason, that one goes down or the cluster it's in goes down, but there's other replicas and other clusters available. You set up Istio right, you can actually just have that fail over to the other place, and maybe your latency goes up a little bit, but your whole system doesn't go down. So this whole notion of multi-cluster meshes is something that we still haven't quite nailed, at least to my satisfaction. We launched, maybe it's in beta, it might even be actually in stable now, but no, there's still some cleanup there to make that all work better.
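
Editor's illustration: the failover behavior Sven describes, where a caller prefers a replica in its own cluster and falls back to another cluster when the local one is down, can be sketched in a few lines of Python. This is a hedged toy model, not Istio's implementation or configuration format. The `Endpoint` type, the `pick_endpoint` function, the cluster names, and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Endpoint:
    address: str   # hypothetical address of one service replica
    cluster: str   # name of the cluster the replica runs in
    healthy: bool  # result of some health check, assumed to be known already


def pick_endpoint(endpoints: List[Endpoint], local_cluster: str) -> Optional[Endpoint]:
    """Prefer a healthy replica in the caller's cluster, else fail over.

    Returns None only when no healthy replica exists anywhere.
    """
    healthy = [e for e in endpoints if e.healthy]
    local = [e for e in healthy if e.cluster == local_cluster]
    if local:
        return local[0]    # normal case: stay in-cluster for low latency
    if healthy:
        return healthy[0]  # failover: higher latency, but the call still succeeds
    return None


# Hypothetical mesh: the local replica is down, a remote replica is still up.
endpoints = [
    Endpoint("10.0.0.5:8080", cluster="us-east", healthy=False),
    Endpoint("10.1.0.7:8080", cluster="us-west", healthy=True),
]
chosen = pick_endpoint(endpoints, local_cluster="us-east")
print(chosen)
```

The point of the sketch is that this decision sits outside the application: in a mesh the sidecar proxy would make the choice, so the calling service needs no failover logic of its own.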
Sven Mawson: But yeah, beyond that, the main stuff coming in the future for Istio, I think actually the biggest thing is what we're calling Mixer V2. So I don't know how familiar you are with the architecture of Istio, but there's this thing called Mixer, which runs as a separate service, and the Envoys call out to it for really two purposes. One, is to check whether an operation is allowed to occur. So that's like the access control quotas, other kinds of things like that. Should this be allowed? And then the other, is just reporting, reporting telemetry. So here are the metrics, there's the logs, here's the traces, all that kind of stuff.

Sven Mawson: We're actually in the process of taking all of that and making it a lot more easy to move around the system, including being able to run it actually in the proxy. So rather than calling out to a separate service, if you want to, you can just run it in the proxy, and that basically it just removes one of the services from Istio, makes it a little bit less complex. Istio is down to like three components at that point, and it makes it a lot simpler to think about, and we hope also, able to scale better. You don't have to scale up this separate service if you don't want to. It scales linearly with the number of your proxies, so you don't have to worry about that stuff. That's the biggest change planned right now.

Sven Mawson: There's a ton of other work around actually ... I did mention VMs. I want that to get better, make it a lot easier to move those things in. We have some more work we're trying to do to make the various bits of Istio easier to use in isolation, so easier to adopt pieces of it. It was built as a monolithic thing, where everything works together and you need all the parts for it to work. I mean, it's a bunch of microservices, but you need them all.
They didn't individually have their own APIs and contracts and all that stuff, and so we've been busy adding that stuff on, engineering hygiene stuff to make it easier for people to adopt. And people, like Pivotal, have actually started to help us with some of that because they want to use some of the pieces in their systems and make some of this stuff available, but make it integrated with their systems and integrated with their models.

Eric Anderson: And maybe further transitioning, where does this take you, Sven? I mean, earlier we heard that you're a project starter. I don't know if there's any anticipated paternity leaves coming up.

Sven Mawson: I actually just got back from one this week.

Eric Anderson: Oh, very good. But is this something that you could do for the rest of your life or how tied are you to Istio going forward?

Sven Mawson: So the rest of my life is a pretty long time. I'm not that old yet.

Eric Anderson: Yeah, yeah.

Sven Mawson: I've actually been thinking a lot about this because of my tendency to start new projects after coming back from paternity. It's not there yet. There's still so many really interesting problems to solve. And especially inside Google, I mentioned we are building products and services on top of that, and all that is very, very early. We have some amazing ideas for really cool stuff that we think people are going to love, but it's going to take us a while to get all that stuff out and all the things I want to get done. So yeah, I think there's at least another three or four years in there for me before I start to think about what's next. But who knows? You never know.

Eric Anderson: That's awesome. I want to thank you, Sven, so much for spending your time with us. Istio has been exciting to watch, from my vantage point, and it's even more interesting to hear where it came from and now where it's going. We'll stay tuned for further progress.

Sven Mawson: All right, thanks. It was great to talk to you.
Eric Anderson: You can find today's show notes and past episodes at contributor.fyi. Until next time, I'm Eric Anderson, and this has been Contributor.