Thomas Graf: I think what's helpful in this context is to be honest with your users, and from the start define what the scope of the open source project is. And then don't hold back within that scope.

Eric Anderson: This is Contributor, a podcast telling the stories behind the best open source projects and the communities that make them. I'm Eric Anderson.

Eric Anderson: I'm excited to have Thomas Graf on the show with us today. Thomas is one of the co-creators of Cilium. Thomas, why don't we start by having you tell us what Cilium is. I wanted to offer an explanation, I was going to say it has something to do with the new eBPF observability layer of Linux, but I think you would be more precise than I.

Thomas Graf: Sure. First of all, hi Eric, thanks for having me on your podcast. My name is Thomas. I'm one of the co-founders of the Cilium project. What is Cilium? Cilium is quite simply networking and security for the cloud-native age, for cloud-native Kubernetes clusters, using eBPF. So it brings the power of eBPF to Kubernetes and to the cloud-native world.

Eric Anderson: Awesome. And what we wanted to get into today is how this project came to be. I noticed that you've been doing Linux networking forever. I mean, not forever, but for quite a while.

Thomas Graf: Yeah, absolutely. I joined Red Hat over 15 years ago, and I was a kernel developer for about 10 years. I've been working on these kernel technologies for a long time, and what I often noticed is that 10 years later, the pieces of code that I would write would then be used in projects like OpenShift. I was there during the early days of OpenShift.

Thomas Graf: And the pattern that I noticed is that these technologies get used way later, 10 years later, because it takes such a long time for kernel technologies to become available to the user. And that was the starting point of Cilium.
That was the starting point of, can we do better than that? Can we stop writing kernel code at one point in time and then, 10 years later, using it for some new layer, for example networking or something else? Because at that point, obviously, that technology is very outdated. It was not designed for that age. And that was the starting point of Cilium: can we use eBPF to do something groundbreakingly new and leverage its true power to create something that is truly cloud-native?

Eric Anderson: Got it. So I guess the history of Cilium could start with your work on networking in Linux from a long time ago, and also maybe with the emergence of eBPF. I imagine eBPF came first and then Cilium. Is that fair?

Thomas Graf: Absolutely. I think eBPF as an idea goes all the way back to the foundations of the overall internet. The people who created BPF were the same people who were inventing the internet very early on. And the idea of having programmable packet filters, which is really the foundation of BPF, is very, very old. It's actually older than I am. eBPF is kind of a next-generation form of that. And eBPF was there before Cilium. When we started thinking about Cilium, we specifically extended eBPF further to expand some of its capabilities, in particular in the visibility area and also in the networking area. And then, based on that foundation, we built Cilium. So I think it's fair to say that, clearly, Cilium does not exist without eBPF, but it's also fair to say that a lot of the recent evolution of eBPF has happened because of Cilium as well. So it's going both ways.

Eric Anderson: I just want to clarify that point you made at the beginning. I think you were saying that there's a bunch of networking tech that you and others wanted to push into people's Linux deployments, but it would take a long time, years, for them to get through.
Are you saying get through approvals, or just to get baked into the kernel and then for people to pick up into their applications? And you're saying by creating eBPF, you can make changes sooner to impact users quicker.

Thomas Graf: Exactly. At the core level, that's what eBPF unlocks. Prior to eBPF, you had two choices. Either you went all the way and changed the Linux kernel source code, convinced Linus and others that your change is important enough to make its way into the Linux kernel, and then, once it was actually in there, waited many years until that Linux kernel version was used by your users and your customers.

Thomas Graf: A kernel is not something that you update all the time, it's something that is supposed to be very stable. So you're not updating it very frequently. Your second option has always been [inaudible 00:04:45], which is a way to dynamically load additional kernel code. That's very risky, though: an insecure or malicious kernel module can easily crash the kernel. So many users, many customers, are not willing to load arbitrary kernel modules. eBPF strikes that perfect balance. It allows you to dynamically extend the Linux kernel, but in a completely secure manner, because eBPF programs are verified and the kernel will reject programs which are not safe to run. It's this balance that gives you the flexibility and the dynamic extension capability while being secure. Very similar to containers, eBPF programs run in a sandboxed environment, and this powerful combination unlocks a completely new set of tooling, with Cilium as one example here.

Eric Anderson: So, eBPF came about, and how soon did you realize that the Cilium project was something you wanted to do or something of value that needed to be pursued?

Thomas Graf: I think the moment came when containers started to really kick off. I was around for the entire virtualization era.
Also, on the networking side of that, SDN and the programmability that SDN brought were really interesting. It moved a lot of functionality that had existed in networking hardware before onto Linux servers. When containers came around and it became clear that containers will change how we deploy applications forever, it also became obvious that what we saw with SDN, what we saw with projects like Open vSwitch, was only the starting point. It is a great step, a giant leap forward compared to networking hardware, but it's actually a micro step if you look at what's possible with pure software. And that was the starting point of Cilium. That's when we basically sat in a room and asked ourselves: if we have all the power of eBPF, how would we do container networking? How would we do container network security? How would we do it if we have all of this power? And that led to the first commit of Cilium.

Eric Anderson: And take me to that day. Who were you working with? You mentioned co-creators, and how did you come to know each other?

Thomas Graf: It was a group of four people, almost five years ago. Back then we were working at Cisco. We were trying to figure out what the next thing is, and eBPF... We clearly knew that we wanted to do something with eBPF. Most of the people I was working with I had worked with before Cisco as well, and we had been good friends. And that's when we sat together and figured out what to do. After about two days, the core concepts of Cilium were clear and we started writing the code.

Eric Anderson: And after a decade in the Linux kind of open-source project and code base, I imagine it was exciting or maybe even odd to be on your own, having your own governance.

Thomas Graf: The change was not quite as abrupt as it might appear. In the beginning, we spent probably half the time, half of our energy, actually extending the eBPF runtime further.
So we actually continued to spend a lot of time inside of the Linux kernel to expand it further. And then as the capabilities of the eBPF runtime grew, more of our efforts shifted over and Cilium became public. There was more interest around Cilium, and that also drew more of our attention to Cilium and to users of Cilium. These days we're still doing a lot of kernel development, we're still moving the eBPF runtime forward as well, so it's not like we have completely lost sight of that, but more and more, we can just leverage the power of eBPF.

Eric Anderson: So, how do you find your... I imagine at some point you're trying to find your first users of Cilium and eBPF, and there's maybe some evangelism required. How do you find these people?

Thomas Graf: That's an interesting point. So, we went in a very extreme direction in the beginning. We basically sat down and asked how the most forward-looking container environment would look: IPv6 only, massive scale, very tight security, least privilege everywhere. We put together an initial kind of slide deck around it, did the minimal work required to get Cilium there and have a proof of concept, and then went to conferences and started talking about it. And very quickly we got feedback like: this is amazing, this is the future, let me be part of this. And this is how we got our first contributors. That was almost five years ago, and if we look at how people deploy clusters today, it's exactly like this.

Thomas Graf: This is exactly what modern container environments look like. We're seeing users breaking through the boundaries of IPv4 and going to IPv6. We're seeing security constraints getting tighter and so on.
So a lot of what we did on the whiteboard back then has turned out to be accurate. Not all of it, but that's one of the best ways to get traction around an open source project: go quite extreme, paint a future, prove that you can do it, and then get feedback based on that.

Eric Anderson: You talk like these have been features you've wanted to put in Linux for a long time, and this project is five years old, but in some ways it feels like this is all very new. I don't think people were talking about eBPF at conferences until just a couple of years ago. Notably, Netflix, they have Brendan, I think his name is. Netflix has done quite a good job of getting people excited about eBPF. Is that fair? Or did I just miss the boat and there was a lot of talk about this before?

Thomas Graf: No, and I think there are always multiple stages. You have very early stages where a handful of people see the true power of something. And there's probably a lot of forward thinking involved at that point too. Can we imagine what's possible? And we clearly saw it at that point. And then there is a next phase. In that second phase we saw companies like Facebook, Netflix, and Cloudflare really leverage eBPF for their own usage. You would be surprised for how long Facebook has been using eBPF in production for pretty much all network traffic in and out of their data centers.

Thomas Graf: Or for how long Cloudflare has been using eBPF for traffic management, for DDoS mitigation, and so on. For users with very concrete and high requirements, eBPF has actually been in use for several years, like three, four, five years. In the most recent years, eBPF has been getting really mainstream. We're seeing even more common use cases, maybe not the extreme-scale use cases, maybe not millions of packets per second. We see those use cases leverage eBPF as well.
For example, the fantastic visibility that eBPF gives you, or the fantastic network security properties, and so on. So we're moving on from these extreme use cases, Facebook, Google, Netflix, Cloudflare and so on, and we're getting more mainstream.

Eric Anderson: This is something you've been... The principles behind this, you've believed in for a long time, and it sounds like everybody's moving in this direction. Have you been surprised by anything as eBPF and Cilium have gone mainstream, use cases you didn't consider or situations in which you've had to adjust your expectations of how people will adopt this?

Thomas Graf: I think absolutely, with almost everything. I think most often we overestimate how quickly something new gets adopted. So initially there was definitely a time where it was like, why is this not catching on quicker? Why is not everybody jumping on board? And now we're in a phase where it's like, we can't get enough validation, like Google recently announcing that it is putting eBPF and Cilium into GKE as their dataplane. So we eventually got there, but... Obviously it can never be fast enough, and it always takes longer than you hope.

Thomas Graf: At the same time, we haven't really fundamentally changed anything that we're doing. We definitely took one step back, but that was expected. In the beginning we were IPv6 only, with very extreme focus points. And then obviously we took it back a little bit, but that was always planned. So we didn't fundamentally change anything and we didn't change any big decisions. Some external factors played a role. For example, when we started Cilium, Kubernetes was not really mainstream yet; Docker was dominant. So Docker was also our primary use case. And then with the shift from Docker to Kubernetes, Cilium had to adjust to that as well.

Eric Anderson: How has the contributor base evolved? I think at the beginning you said there were four of you, several at Cisco.
Are those still the four core contributors today?

Thomas Graf: Most of them are actually still contributors. We obviously expanded heavily. About one year into the Cilium project, we actually founded a company around it, Isovalent. Several employees obviously became contributors, and then more and more customers and users started to contribute as well. These days, I think we're just above 200 people who have contributed to Cilium, some of them very regularly: cloud providers, large users. These days it's normal for a Cilium release to receive more than a third of its new features from non-core contributors, for example.

Eric Anderson: Which is quite impressive. I think a lot of open source is just a single developer with a lot of people filing bugs and requesting features. That's exciting.

Thomas Graf: It definitely takes a lot of hard work to build up an open-source project that is healthy. That doesn't just happen. It takes a lot of dedication, a lot of hard work, and a bit of luck as well. And a lot of early investment into users and contributors that provide feedback. We're very fortunate that we have a healthy user base and a healthy contributor base, who are, first of all, signaling us in the right direction, like this is good, this is bad, this is what we want, this is what we don't want. And also people who are very passionate about it and put a lot of energy and hours into Cilium. Whether it's code, testing, documentation, or getting started guides, all of that plays a huge role in making an open-source project successful.

Eric Anderson: Any launch milestones? Was there a certain day on which you felt like you launched Cilium, or a certain version for which there was a lot of preparation to show it to the world? It sounds like it was more organic kind of growth over time.
Thomas Graf: So we were actually more traditional. I think in more recent years it's become more common that open-source projects get developed in private, and then there's this big launch day and you kind of open source it. And then even some open-source projects will start counting from that day on. I think [inaudible] is a good example of this. It was almost production ready when it was open sourced. We were open source from day one. The first commit we did with Cilium was public; from the very beginning it was public. So we had very organic growth. We had several extremely good moments for the project. We had a big DockerCon talk early on that was very helpful, that definitely showcased Cilium. KubeCon has always been fantastic for us, lots of users, lots of great feedback. And then more recently the Google announcement. So there are definitely milestones where the project is brought forward and makes a big leap, but the actual growth of the overall community has always been organic.

Eric Anderson: Is it ever tricky? It sounds like you were co-developing or extending eBPF at the same time you were doing Cilium, and it wasn't always clear where a certain feature would go, or is that... Were there ever debates about what goes into eBPF and what goes into Cilium?

Thomas Graf: There are always debates. At the same time, the big advantage of eBPF is that it is very general purpose. eBPF is not networking specific. It's being used for application profiling, tracing, visibility, security, LSMs, container syscall filtering. So improving the eBPF runtime is similar to improving a programming language. It will benefit everybody. Which means there's less discussion around what should go in and what should not go in. So all the discussion in terms of use cases and what we should do or not do, all of that has been more on the Cilium side than on the eBPF side, where obviously we have a lot more control.
So we could put something in, ask users to run it, and get feedback on that. It's been extremely helpful that we depended a lot less on convincing a larger set of contributors inside of the Linux kernel to think along the same lines as we do. That has definitely helped unlock a completely new feature velocity that does not depend on finding open-source consensus all of the time.

Eric Anderson: That's helpful. And maybe the same for Isovalent and Cilium. I don't know how the Isovalent product works. Are there also debates about what goes into the Isovalent versus the Cilium code base?

Thomas Graf: I think it's a good question. And I think it's almost a philosophical question. It gets back to open source: how do you commercialize, how do you monetize open source? I think what's helpful in this context is to be honest with your users, and from the start define what the scope of the open source project is, and then don't hold back within that scope. From that perspective, we always said from the beginning, and this is still very true today, that Cilium is completely open source. So it covers networking, network security, load balancing, and all of that is open source. You can consume it fully, it's complete, we're not holding back scale or a particular feature or something like that. You can make a bet on using Cilium in an open-source context, and it is complete. We won't all of a sudden decide not to open-source a feature in that particular context, so you can be successful.

Thomas Graf: And then Isovalent, the company, offers additional SecOps tools on top of that, which from a use case perspective are quite different from what Cilium offers. And I think this clear split allows the open source project to stay healthy and avoids conflicts where people interested in Cilium and people contributing to Cilium all of a sudden think, Isovalent is making money off this, maybe I should stop contributing.
So this split, this separation of use cases, is essential, and you will see that other open source projects that are successful at this, at monetizing while keeping the open source project very healthy, are doing it in exactly the same way.

Eric Anderson: Very good. And then in my research before the show, I saw Hubble. Is Hubble part of Cilium, or kind of a sister project that is baked in?

Thomas Graf: It is. It's not baked in, but it's kind of a sister project. [inaudible 00:19:21] networking, network security, and load balancing; Hubble is the observability side of it. So Hubble takes network observability, metrics, flow logs, service dependency graphs, all of that, to the next level with the powers of eBPF. So it's the same magic below, and Hubble requires Cilium to run, but Hubble has a very concrete focus on observability. So it's not what's called a CNI; it's an observability tool on top. It has a UI, and it has APIs to get flow logs and so on. So it's an extension on top of Cilium, and it's an independent open source project. From that perspective, it does leverage eBPF and uses the same foundational technology.

Eric Anderson: Maybe you could talk about the state of the project today and we can shift gears to what plans are for the future.

Thomas Graf: I think we're in a very interesting spot right now, and we're seeing a rapid increase in production use of Cilium. We've obviously had large-scale users for multiple years, but in the beginning, most Cilium users had specific reasons to use Cilium, and it was usually around scalability or some extreme use cases. In the last couple of months, we saw a very sharp uptick in more mainstream users who are just like, Cilium is awesome, eBPF is awesome, I'm just going to use this for networking, even though maybe there is not a massive scale requirement or a massive security requirement.
Thomas Graf: So we are seeing an incredibly steep adoption rate, which is fantastic for us. We're seeing our Slack channel explode. I think all of that is awesome. We see cloud providers starting to use us, put us in. We are having interesting conversations with everybody. Definitely very, very exciting. Overall it's still early days, I think, for Kubernetes in general, but it's definitely exciting. The next couple of years will be very exciting. We're getting drawn and pulled in various directions, and we get to choose what we want to focus on.

Eric Anderson: As you look back on that period in which you were ahead of your time, I don't know if that's the right way to say it, but you knew that this is where the market was headed, but it wasn't maybe moving as fast as you would have hoped or expected. What kept you believing and working, and I guess... How would you advise other open source founders who are finding themselves in a similar boat?

Thomas Graf: I think I would split a bit the technology side and the co-founder side of a company. From a technology perspective, there was never any concern about whether this is going to happen. There was never any doubt, but it's not always quite clear when the actual urgency arises on the wider mainstream side for a particular technology to get adopted very widely. But there was never any doubt that eBPF and Cilium will eventually take over more use cases and become very dominant. And then there's the co-founder side, where you are much more exposed to market readiness: is anybody actually buying what you're building, and so on, which is obviously very different. There's a totally different timeline attached to that, and very different pressure points. In general, what I always tell myself is: listen to your gut, in particular if you have a track record that is more or less right.
Just because something doesn't happen right away doesn't mean that you're completely off. And that usually leads to good long-term thinking as well. At least that's working for me.

Eric Anderson: And what should we be looking forward to, or what do you look forward to with the project in the coming year?

Thomas Graf: It's very exciting. I think eBPF as a technology will get used wider and wider. I think we are taking on more and more use cases for Cilium. I think the biggest step that we're making with the next release is to expand outside of Kubernetes. So far, we have been very specifically focused on Kubernetes users and clusters. With the next release, we are allowing you to integrate virtual machines, bare-metal machines, and so on. So we're expanding into what we would call legacy, even though VMs are obviously not legacy at all, but from our very forward-looking perspective, we're starting to cover more and more of this more traditional legacy market.

Thomas Graf: And I think even more important: multi-cloud, hybrid cloud, and edge clusters are very interesting for us. We have extremely promising technologies for these new use cases that come up with providing clusters close to the user to provide better latency, and then somehow connecting those edge clusters together. All of those use cases are what we have built into our foundational tech from the very beginning. We're now seeing demand for what we have built, and it's a matter of just connecting our tech with those use cases. So those are definitely two major areas that we will continue to invest in. It's super exciting.

Eric Anderson: Maybe to help us all know where to expect Cilium to be used, you could describe what typically brings people to use Cilium and come to love it. What situations?

Thomas Graf: There are multiple reasons. I would start from the bottom: if you want visibility, if you want to know what's going on at the networking layer, you can run Cilium at any scale you want.
You will always get the visibility, whether it's for troubleshooting, whether it's for audit, whatever. If you have visibility requirements, eBPF gives you massively better and more visibility, and Cilium can offer that. Then there's the need to run a secure, sensitive environment.

Thomas Graf: Cilium can obviously implement just standard network policy, for example, but can then go beyond that. It connects very well with things like service mesh, for example, [inaudible 00:25:02] and so on. So that combination can get you to a whole new level of cloud-native security. We have users who are interested in transparent encryption and use Cilium solely because of that. And then going into the more advanced use cases: large scale, low latency, multi-cluster, for example, multi-cloud, where I want to have multiple clusters across multiple cloud providers, or I'm running a multi-region strategy, whether it's running a data store across multiple regions. All of those are use cases where Cilium is doing really, really well.

Eric Anderson: Definitely some things are going really well. The project just feels like it's everywhere of late. And I think that the GKE announcement was validating and exciting. And I keep seeing more of that happening. As others are hearing about it as well, any pointers you would give them for getting involved or following the project?

Thomas Graf: Absolutely. I think the best starting point is cilium.io, or you can just search for Cilium in your search engine, and cilium.io should be the first search result as well. That will have guides, documentation, and links to our Slack channel, where you can join and ask questions. There's also ebpf.io, which is the new community page for eBPF, the core technology that Cilium is based on, again with a Slack link, documentation, getting started guides, and tutorials. If you want to get involved in eBPF and so on, both of them are great starting points.
And then if you want to get your hands on Cilium as quickly as possible, the documentation has getting started guides where you can get a sandbox environment, whether it's minikube, kind, or K3s, basically getting your own Kubernetes cluster with Cilium up in two or three minutes, and you can start playing around.

Eric Anderson: Just as we wrap up here, I had to ask you. I remember when I first read about eBPF, I think Brendan Gregg had drawn an analogy, like eBPF is to the Linux kernel something like what JavaScript is to HTML or to web code. And I think he acknowledged immediately that it was flawed on many levels, but it actually was helpful for me in getting the idea that this is a language and it runs in a VM-like environment. How do you feel about this analogy?

Thomas Graf: I think it fits very well, if you look at it from the perspective of where HTML was 20 years ago, what type of websites we looked at, and how technology in your browser enabled the disruption of applications like Word or big word processing applications. Why did that happen? JavaScript played a huge role. Instead of just focusing on JavaScript, I would say programmability, making the browser programmable, made a huge difference. It is no longer required to ship new browser versions all the time. If you remember way back, certain websites would not render correctly if you were using an older browser version. JavaScript and programmability solved almost all of that. eBPF is very similar: it makes the Linux kernel programmable as well. And JavaScript and eBPF share a lot of common ideas about what they unlock.

Thomas Graf: I wouldn't go much deeper in the comparison. I think there's obviously also...
Not everything is perfect with JavaScript, but if you look at the effects of programmability that JavaScript unlocked, I think we will see, or we are seeing, something very similar with the Linux kernel, where all of a sudden you're no longer constrained by what the Linux kernel can do. You can extend and program the Linux kernel, and it's in the hands of an application like Cilium to define where the boundaries of the operating system are. And that's very, very exciting. That was not possible before.

Eric Anderson: Awesome. Well, Thomas, thank you for coming on the show and thank you for your decade-long kind of effort to bring better networking and visibility to the Linux kernel. It's changing how people are building applications, and I think we're all the better for it.

Thomas Graf: Yeah. Thanks a lot for having me on the show, Eric.

Eric Anderson: Have a good day.

Thomas Graf: You too. Bye, Eric.

Eric Anderson: You can find today's show notes and past episodes at contributor.fyi. Until next time, I'm Eric Anderson and this has been Contributor.