Philip O'Toole: Successful software, I like to say, has really nothing to do with computers, it's actually all to do with people.

Eric Anderson: This is Contributor, a podcast telling the stories behind the best open source projects, and the communities that make them. I'm Eric Anderson. I am excited to have Philip O'Toole with us today. He's the creator of rqlite, or, how would you say it, Philip? I've heard it pronounced a couple of ways.

Philip O'Toole: I often try to call it rqlite because I think most people say SQLite, but it's awkward, and most people, including myself, usually just say rqlite.

Eric Anderson: So either is fine, but great to meet you. Thank you for coming on the show.

Philip O'Toole: No, happy to be here. It's always a pleasure to talk about the database. I really enjoy talking about technical topics, so happy to do it.

Eric Anderson: Tell us briefly what rqlite is before we get too far.

Philip O'Toole: Yeah, so I think rqlite is best described by what it really tries to do, and it tries to be a highly reliable, easy-to-use, distributed relational database. So what it actually is, is a distributed relational database built on two core technologies. One is Raft, the Raft consensus protocol, which is at the heart of many distributed systems, and the other piece it's built on is SQLite, which they like to say is the world's most widely deployed database engine. So almost 10 years ago, I took both of those technologies and put them together for fun to see if it would work, and it did, and we've ended up with a database, or I've ended up with a database, that I think is really easy to use, really high quality, and works well for some particular use cases.

Eric Anderson: So SQLite is already a database-

Philip O'Toole: Sure.

Eric Anderson: ... But you've made it a distributed or ... what's the term? What's the adjective you'd use to describe this variant of SQLite?
Philip O'Toole: So SQLite is an in-process database, it's a library that people link with their programs when they want a high-quality way to model their data in a relational manner and store it. So what I've done is I've taken it and put it in a client-server application, or client-server architecture, and made it into a distributed system. Now, it's interesting you asked about this, Eric, because this is one of the more controversial arguments I have. Because rqlite distributes SQLite fully to every node in its cluster, people will sometimes say that it's a replicated solution, but actually I disagree, it's a distributed database, it just happens to have a full copy of the data on every node. Sometimes when people think of distributed databases, they think it's for sharding or performance. But no, what rqlite does is distribute SQLite to every node such that it is a highly available, reliable, and fault-tolerant system. So rqlite distributes SQLite for fault tolerance and high availability, but not necessarily for performance, like some other distributed databases do. That's why it's considered a distributed database, and it uses a really important distributed consensus algorithm, Raft, right at the center, so that's what it does. Well, there's one last thing I should say, because it's probably the second-biggest misconception: it's not a drop-in replacement for SQLite. What it does is take SQLite and use it as its engine, that's the key difference between it and just native SQLite.

Eric Anderson: We should probably get some use cases out of the way, to motivate what you've done here. You already alluded to the fact that there are some trade-offs, as there would be in any technical decision, for this approach of a distributed database. What do you think people most use rqlite for?
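The in-process model Philip contrasts with rqlite's client-server architecture is easy to see in code. Below is a minimal sketch using Python's standard sqlite3 module; the table and data are invented for illustration. rqlite, by contrast, wraps this same engine behind a client-server HTTP API (its endpoints include /db/execute and /db/query), which is what turns it into a networked, distributed system.

```python
import sqlite3

# SQLite is in-process: the "database server" is just a library linked
# into your program. No network, no daemon -- the file (or here, an
# in-memory database) is the whole database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.commit()

rows = conn.execute("SELECT name FROM users").fetchall()
print(rows)  # [('alice',)]
conn.close()
```

rqlite takes exactly this engine and puts it behind a server process, so clients talk to it over HTTP instead of linking the library into their own program.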
Philip O'Toole: Yeah, so the other big thing that I think it has going for it [inaudible 00:03:43] it's very easy to deploy, and it's relatively lightweight to run. So the key use case I have seen people approach me about, for example a company called Replicated, based in San Francisco, is they want to run lightweight, perhaps management, software, perhaps on a customer's infrastructure, and customers don't want to have to run very heavy software on their systems, because they're paying for the hardware. So I think its key use case is the storage and management of a relatively small amount of data, though with the latest release I've bumped up the storage possibilities, but a small amount of critical data that you absolutely can't lose for the rest of your system, but where you also don't want to invest a huge amount of resources in storing that critical piece of data. So that's why rqlite also focuses on lightweight, easy-to-use operation, so people can just get it up, spin it up, and run it. So I've seen Replicated use it, I've seen other production systems using it on Kubernetes. And the other interesting thing that amuses me, a lot of people who are in the Bitcoin community like to run it, because they're often not super technical, but they want to run every piece of infrastructure themselves. So I have seen a lot of people in my community be very interested in running rqlite for whatever type of application [inaudible 00:05:04] that they're building.

Eric Anderson: Great, we'll set that aside a bit and come back to the new release and how it works. Tell us a little bit about your story, Philip, you've been doing this for a while. What led you to build rqlite? And maybe tell us a little bit about what you're doing outside of rqlite.

Philip O'Toole: Yeah, so I'm currently an engineering manager at Google, where I manage developers for one of GCP's largest systems, the Cloud Logging system, I manage multiple teams there.
But I remain highly technical, and so I still like to write software in my spare time, even as a manager. But the history of rqlite goes back to before I was even a member of InfluxDB, when I got very interested in the Go programming language. And I was at a startup, which is long gone now, which was using SQLite to store some information. Well, I was thinking about trying to make it store information in a clustered manner. And I'd also had experience with the Raft consensus protocol, so at the time I said, "Maybe we could solve this problem by combining the two together," and that kicked off the idea in my head. Now, that startup never actually ended up going down that path, but it was the combination of a need at the time and a startup I was working at: we wanted the simplicity of SQLite, but we also wanted the high availability of something like Raft, and I put the two of them together. And I did this during a time when I was working in many startup spaces in San Francisco. So like many startups, the startups didn't last, but the software that came out of it, that I had started, is still kicking to this day, and I still develop it outside of my day job.

Eric Anderson: So I feel like SQLite is getting a rebirth, a renaissance, there's some enthusiasm around distributed SQLite at the moment, which you've been doing, I don't know if on your own is the right word, but for some time-

Philip O'Toole: Yep.

Eric Anderson: ... And now there's this new enthusiasm. What do you think about that?

Philip O'Toole: It's really interesting, yes, you're absolutely right. Something has happened in the last two years where this database is suddenly getting a lot more attention. I see it being driven by two trends. The first one is a rebellion against complex, hard-to-deploy, really complicated software. And SQLite by its nature is quite simple to deploy, it's very easy to get going. And so I think that is the first reason why SQLite has become popular.
The second reason is clearly, software is being pushed back out to the Edge, after many years of centralization, and I don't just see this in rqlite, I also see it in my own professional life at Google and how we build stuff. The kind of hybrid cloud, hybrid deployments, keeping some of your software at the Edge, that trend is there, and SQLite is a great piece of software to have running at the Edge. So I think those are the two trends that have brought SQLite [inaudible 00:07:46] and have got people interested in then distributing it, building on it, using it in a peer sense. So that's why I see SQLite coming on: a rebellion against complexity, and a real upsurge in interest in running software at the Edge. Now, why people want to run software at the Edge is another question we'll get into, but they're the two reasons I see [inaudible 00:08:05].

Eric Anderson: I went back and looked at some of the blog posts you've written over the years, you've been doing rqlite for some time, then you worked on logging, and I don't know if you were ever professionally working on databases, but you've written a little bit about what you've learned from writing databases. Maybe you can help us understand how writing databases is different than logging systems is different than, I don't know, applications.

Philip O'Toole: Yeah, you're right. So I actually did work professionally on databases for almost two years, because I was a core member of the team at InfluxDB with Paul Dix and his crew. And at the time that database was also written in Go, so that really solidified my Go experience. So writing databases, Eric, is a really fascinating professional job as a programmer. The first thing that's interesting about it is, often as developers, we are introduced to databases in the most boring fashion possible.
There's a textbook that shows somebody creating records for orders in a business, and it's a small little table, and you're like, this is the most boring software I've ever seen. But actually writing software for a database is really interesting, because it's one of the few times as programmers that algorithms really do matter. Most programmers are working on infrastructure software or applications, where the algorithm choice maybe doesn't matter that much, but when you're programming the core of a database, algorithms matter. So it's a really interesting way to become a better programmer. The second reason, and I think this is probably the most interesting thing when it comes to working on databases, is that people who use databases are often frustrated when they get a chance to talk to the designers and programmers of databases. They'll say to the programmer, "Why is this query slow? Or how can I make this query faster?" And the developer will say, "It depends." And users of databases cannot understand why developers have such complicated answers around why databases have the performance they do. And that's because it depends on so many factors: what the queries are like, what the algorithms are, how much data is in your database to start, whether the data is hot or cold. So I think the engineering challenge, and the amount of judgment you have to make as a database developer, is really interesting. The answers are rarely black and white in database development, which is kind of ironic, because what a database has to do is very black and white, right? Store data correctly, never fail, and don't ever corrupt it. But how you get there is as much an art as it is a science, so yeah, programming databases is a fascinating thing, and I tell all the programmers that I mentor here that one thing they should do sometime in their career is try to program or get involved with some database programming, it's a fascinating, fascinating area.
Eric Anderson: So going back to SQLite, Postgres is also having kind of a renaissance, and is that the natural alternative maybe to what you're doing? Are people deciding, do I use a normal database like Postgres, or do I use something specific to my use case like rqlite? And if that is the alternative, then maybe you can go into the trade-offs a bit more as to why I would use rqlite versus something else.

Philip O'Toole: Something heavier. Early in my career I did not use Postgres as much. By far the biggest database I was using in the relational sense was MySQL, and it carried me a long way, basically, as a developer, but also building my own content management systems and running my own blog, it was always MySQL. Postgres was not something that I used a lot until I started to get interested in all its support for things like unstructured documents and JSON and so forth. So I do agree that it is seen as a really interesting database. One of the things I'd like to do with rqlite is to actually put a Postgres-compatible connection on the front, so that people could just drop rqlite in and use that instead of Postgres. The biggest trade-offs, though, come down to things like deployment. Postgres can be brought up really quickly, but I suspect that in any kind of production environment, it's going to take a little while to spin up, to configure, to get right. But I think the biggest thing you'll get with rqlite is that it's going to be much easier and simpler to deploy. Now the downside of that is, it's always possible that rqlite is too simple for your application. So that's the trade-off. Software gets complicated for two reasons, and Postgres may be a little bit like this, you have to be careful. The first is that different people work on it at different times, at different stages of the product, and you don't end up with coherent, joined-up features.
The second reason databases get complicated is because people ask for features, they have different requirements, and they have to be satisfied by the database. So what happens is, there's a lot of stuff that accrues. Of course the final difference between Postgres and rqlite is, I myself, maybe along with an occasional contributor, am the main developer of rqlite, so I can decide what goes into it or not. And if I don't like a feature and I don't think it's coherent, then I can just not build it. Whereas with Postgres, it's a community, there are a lot of different groups driving it, and so sometimes it's hard for them to say no, or say no to each other. And so one of the biggest challenges the Postgres database and community will have is keeping its feature set coherent, whereas it's very easy for me to do because I'm mostly one person. So I think those are the biggest trade-offs: easier deployment with something as simple as rqlite, but it may sometimes be too simple for some people's needs. But interestingly, with Replicated, the company in San Francisco, they replaced their use of Postgres with rqlite, because they wanted to run software at the Edge in customers' environments, and customers didn't like running this 200-megabyte Docker container just to store some configuration data. Replicated needed the reliability of something like Postgres, but their customers were rebelling against the technical resources required by something like Postgres.

Eric Anderson: Okay, yeah, that's helpful. And I mean, you mentioned you're kind of the solo mind behind rqlite. There are other contributors, but you get to control the project. SQLite has a similar mantra, right? I mean, the creators of SQLite also hold a lot of control over the feature development, and there's been a fork, we did an episode maybe a month ago on a fork of SQLite, libSQL, that is trying to be more open to contribution.
What do you think of libSQL, and what do you think of the approach SQLite's taken with regard to open source?

Philip O'Toole: I've spoken to the guys at Turso a couple of times, [inaudible 00:14:41] met them professionally. I think it's a good idea in the sense that they're putting their money where their mouth is, right? Often people say, "Well, I'll just fork a repo if I have issues, I want to see it different." So they've actually put their money where their mouth is, and they're trying to fork it and build a community. From my point of view, the biggest challenge will be, you want the community to be engaged, yet nothing makes software more incoherent than leadership that doesn't have a strong vision for where it's going. Now, that's not to say that the team at Turso don't, but I think the challenge will be keeping it open, yet keeping it focused and coherent. If they were to introduce distributed consensus into the library, it could be very, very interesting for a lot of people. But now it's hugely more complicated. Do they start adding stored procedures into it, right? Something that SQLite doesn't have. Do they start doing different kinds of constraint functionality in the database? It's a good problem to have, because if they add all that stuff, it's possibly because people are asking for it and it has become successful. But while, yes, the original SQLite is, I don't want to say closed, but highly controlled, the result has been very high quality software that people can rely on, that is rock solid, and people know what they're getting. So I think there's a trade-off. Look, the thing is, Eric, successful software, I like to say, has really nothing to do with computers, it's actually all to do with people. And so whether libSQL will be successful or not actually depends on how they manage the flow of features, and manage the flow of contributions and the people who join in on the library.
Like SQLite has shown that you don't need a huge community to be successful, you can be open source, but you don't need a huge community to be successful. So a huge community and being more open doesn't guarantee success, but it does open up the possibility that a key feature that is missing from SQLite now has a more open path to get in. So we shall see, the market will tell us, GitHub stars will tell us.

Eric Anderson: And maybe one strategy, I guess, you're kind of alluding to, is that if they attracted a community of like-minded people who all wanted to take SQLite in a specific direction, then they will likely be able to satisfy those people with a set of features that's coherent, and the contributions that those people come up with will be reinforced. But the problem would be if everyone didn't have a similar idea of what they wanted.

Philip O'Toole: If you don't have everybody exactly on the same path, or with the same idea, you will have tensions in the project. And I think managing that is the real trick. I mean, something like the Linux kernel, it's not just a technical miracle, it's actually a social miracle, that such an enormous piece of software has managed to stay together, be coherent, be widely used. It's actually a miracle. So in that sense, we'll see if they can replicate it like that too. I have been reluctant to pull in ... and, as an aside, an implementation detail: I have been reluctant to, and have not, pulled anything but vanilla SQLite source into rqlite, because one of my values, which is important to me, is that it's very easy for me to say rqlite is running vanilla SQLite, full stop. No patches, I don't change it, I don't tweak it. For example, there's another similar project called dqlite. It's rqlite, but instead of R, it's D, and it's a library.
It came after rqlite, and it's actually very useful for people who don't want all the application side of things, but they had to patch SQLite a little bit, and now you've got all the differences, all the problems of managing the differences between that and the vanilla source code. So I am aware that for rqlite to be able to say it only uses vanilla SQLite can give a certain comfort to people, who'll be like, well, we know SQLite is solid, and this is running literally the same code, because Go code can link against C, which is great. So there are advantages to staying on the vanilla track too.

Eric Anderson: Before we move off your approach to open source, you've also written about how, over the several years you've been working on rqlite, you've kind of worked in fits and starts, bursts, if you will. That resonated with me in my own work, that we think of projects as linear, but they're really not linear.

Philip O'Toole: So I would say there are three main reasons that it has been in fits and starts. I mean, one of them is personal, right? Things happen in your life, you get married, you have a family, you get busy. My wife might not have been too happy if I was hacking away on rqlite the night before we got married or something. So you have to stop, take two weeks out for your honeymoon, right? You [inaudible 00:19:18] some of it is personal, but the main reason it's been in fits and starts is, a feature or a re-architecture will occur to me that's so compelling that I have to do it. I suddenly have an insight into how a large amount of the system can actually be removed, because there is a way I can use the Raft library so much better, and I will not be able to resist going in and making that change. For example, with the most recent release, I realized how I could make it support much bigger data sets, and the technical change was so compelling to me, and I could see there being so much value, that I would spend weeks at a time programming many hours a day to get that out.
The second reason that development sometimes takes a burst of energy is the project will suddenly get a lot of exposure in some channel, and somebody will suggest something so compelling, that is completely coherent with where I think the database should go, that I'll dive on that too. So I think the main two reasons are an internally generated insight into how the system can be so much better, or an externally generated insight or product idea where I'm like, "Yes, this is exactly what this database can do." And so to me, the biggest driver of productivity has been passion to do something, but by its nature it isn't there all the time, it comes and goes. And sometimes your body and your mind, as any programmer will tell you, will say, "Okay, I can't program on this for a few weeks." And sometimes it may feel like you'll never program on the project again. But it has always come back for me, that drive to go in and make [inaudible 00:21:04].

Eric Anderson: So maybe this is a good time to talk about the new release, you had one of these moments recently, maybe it was outside inspiration on how you could approach a change to SQLite, tell us what's just launched.

Philip O'Toole: Yeah, so actually to go back to something you asked earlier on, you were saying, what has been interesting about programming databases? One thing that has been interesting is, the people who pay for databases are at the edge of the use cases. In other words, sometimes you will hear programmers say something like, "99% of the use cases are going to be fine on this database. It's just that 1% that will have that edge case. Listen, let's not worry about it, the return on investment isn't there." But the thing is, it is that 1% that pays all the money for databases, right? They have an enormous dataset that they can't query anywhere else, so that's actually where the money is.
Not that it's all about revenue, but it's the tail of database use cases that is actually often the most important and most valuable for people. To that end, I had a user at a company in the States that uses the database, and they want to load it with very large data sets, 6, 7, 8, 9, 10 gigabytes. And I was like, rqlite is not really aimed at that use case, which is a way developers sometimes say, I haven't a clue how to make it work. That's really what developers are sometimes saying: I don't know how to make it work. But then, I used to work with another programmer, one of the best programmers I ever worked with, Ben Johnson, you may have heard of him, who created Litestream, he was also part of the InfluxDB core team. And Ben came up with a really novel and elegant way to back up SQLite databases, and I have to give him credit, and I give him credit in the blog posts, I realized that the same approach could be applied internally to rqlite and would suddenly unlock its ability to support large datasets. The reason I say that is, the implementation of rqlite requires a periodic update of the database, and as the database gets larger, that update would take longer and longer and longer. Until Ben had this insight about how to back up SQLite databases in an incremental fashion very quickly, and I suddenly realized I could apply his approach to rqlite. So now I see no intrinsic limit to the size of data that rqlite could support, short of, of course, the hardware that the end users throw at it. So that's the big release with 8.0, it has large data set support, so people can store much larger data sets, but still in an easy to operate [inaudible 00:23:34] manner. So we'll see how it goes.

Eric Anderson: So let's go into maybe this feature a little bit more, I think I appreciated more how this whole thing worked, Raft and SQLite, by understanding this change.
And maybe that's not, maybe, Philip, you feel like I'm still missing quite a bit, but you mentioned that the way this works is that you create copies, duplicates, replications of the instance of the database on various nodes, and you did this, it sounded like, historically, by making full snapshots, and you said periodically. How frequently were you doing that?

Philip O'Toole: Let's take a little step back for a start, because it is a core part of the Raft system. So I think maybe, Eric, you'll find this interesting. So Raft in a sense is really simple. What it is, is a way of saying, I will write a log, a series of changes that happen to a system, and I will make sure that that log is the same on every node. It'll be like, Eric changed this record, Eric changed this record, Eric changed this record. But if you think about it, what's to stop that series just growing to infinity, as people use your database all the time? So Raft has a technique called truncation, where it deletes entries from your log, to stop it growing without bound. Does that make sense so far? But as a result, you need to take a snapshot of whatever database the system is updating every so often, and keep that. So now you've got a snapshot of your database, plus any other updates that have yet to be applied. And it was that snapshotting process that has to happen every time you want to truncate the log so that it doesn't grow without bound. That was the one that rqlite, for a long time, simply did by taking an entire copy of the database and giving it to Raft. It's kind of an implementation detail, but basically what Raft does every so often is ask you, the rest of your application, I need a copy of your state. And what rqlite used to do is say, okay, here's an entire copy of the database, knock yourself out.
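The log-truncation-plus-snapshot cycle Philip walks through can be sketched as a toy model. This is not rqlite's or any Raft library's actual code; the class and its state machine (a plain dict) are invented purely to illustrate the mechanism:

```python
class ToyRaftLog:
    """Toy model of a Raft log with snapshot-based truncation."""

    def __init__(self):
        self.log = []       # ordered record of every change
        self.snapshot = {}  # last full copy of the state
        self.state = {}     # the live state machine (here, just a dict)

    def apply(self, key, value):
        # Every change is appended to the log, then applied to the state.
        self.log.append((key, value))
        self.state[key] = value

    def truncate(self):
        # Without this, the log grows without bound. Periodically the
        # application hands over a snapshot of its state, and the log
        # entries the snapshot already covers are deleted.
        self.snapshot = dict(self.state)
        self.log.clear()

    def restore(self):
        # Recovery = last snapshot + replay of entries logged since.
        state = dict(self.snapshot)
        for key, value in self.log:
            state[key] = value
        return state


raft = ToyRaftLog()
raft.apply("x", 1)
raft.apply("y", 2)
raft.truncate()          # snapshot captures {'x': 1, 'y': 2}; log emptied
raft.apply("x", 3)       # only one entry logged since the snapshot
print(raft.restore())    # {'x': 3, 'y': 2}
```

Recovery is the snapshot plus a replay of whatever was logged since. The pain point Philip describes is that, for rqlite, the snapshot step used to mean copying the entire SQLite database.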
But if you started to get to two and three and four gigabyte databases, that's kind of getting big, particularly if the delta since the last time you gave it a full copy of the state was quite small, right? You'd be handing over this huge amount of data when maybe only 1% had changed, so that's the problem, I hope that makes sense. And what Ben's insight allowed me to do is, basically, if you put SQLite in this write-ahead logging mode, the deltas are now stored in a separate file from the main database, and what 8.0 does is simply hand that little delta file to the Raft system every time it asks for the full state, I hope that makes sense. That way there's much less data being handed from my system to the Raft system, and then Raft can just deal with it, that was the key [inaudible 00:26:23].

Eric Anderson: Super, and did Ben work with you on this, or you had just learned this from Ben and holed up and implemented it?

Philip O'Toole: So Ben and I are old colleagues, we used to work together at InfluxDB on the core system, so I've always admired his programming, and when he released Litestream, he released a lot of technical details. Ben is the kind of engineer that I really appreciate, because he's very generous with his knowledge and his time, and so he didn't just release Litestream, he also explained how he built it. And as I thought about what he had done, I realized I could apply it to my system, and he certainly had some helpful tips along the way, but like all great ideas, his idea was actually quite simple, it's only simple in hindsight. So once I understood what he was doing, it was relatively easy to replicate in [inaudible 00:27:09] rqlite.
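The "separate delta file" Philip mentions is observable directly with stock SQLite. In write-ahead logging (WAL) mode, recent commits land in a `-wal` file alongside the main database file until they are checkpointed back in. A minimal sketch with Python's standard sqlite3 module (file names are illustrative):

```python
import os
import sqlite3
import tempfile

# WAL mode needs a real file; it isn't available for in-memory databases.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path)

# Switch the database into write-ahead logging mode.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

conn.execute("CREATE TABLE kv (k TEXT, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('a', '1')")
conn.commit()

# The commits above live in demo.db-wal until SQLite checkpoints them
# into the main file. Handing Raft just this delta file is far cheaper
# than copying the whole (possibly multi-gigabyte) database.
wal_exists = os.path.exists(db_path + "-wal")
print(wal_exists)  # True
conn.close()
```

Shipping the `-wal` file is, in spirit, what Litestream does for backups and what rqlite 8.0 now does when Raft asks for state.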
Eric Anderson: So at some point I wanted to, and this is as good a time as any, compare what you're doing to others. I mentioned there's kind of a lot of enthusiasm around SQLite at the moment, and you've already talked about dqlite in passing, I don't know if you have more you wanted to say on that, and you've brought up Litestream, and then we've also talked about the folks at Turso, and they also have a distributed SQLite. So there are four approaches, if you include rqlite, dqlite, Litestream, and maybe LiteFS could be a variant as well. Philip, walk us through the options available to somebody today, and what you think of each of these.

Philip O'Toole: Yeah, sure. So let's compare. dqlite and rqlite are very similar conceptually, but they have one big difference: dqlite is a library that you need to integrate with your own application, so it requires programming, whereas rqlite is a full RDBMS, right, a relational database management system, that you just deploy and operate the way you want. So the biggest difference is the level of abstraction that is presented to your application. Now those systems are in one bucket, I think, and there may be some technical corrections here in the future, but I think they're in one bucket. All the other systems that you mentioned, LiteFS, some of the work that Turso is doing, and Litestream, are in a different bucket. And the biggest difference is, dqlite and rqlite make very strong guarantees about what state your data is in at any one time. So when rqlite responds to your write and says, yeah, I got it, I'm done, you can be absolutely sure as a programmer that your data is safely replicated on every node that your cluster is running, and if you lose nodes, you're not going to lose data, so it gives you very strong guarantees.
Litestream, for example, and I don't want to overstate it, it's a small difference in practice, but Litestream has a small window where, when it accepts data, it may not be fully replicated out to the cloud yet. That window can be made very small, and like I said, I don't want to overstate it, and normally for many applications it doesn't matter, but there is a small difference there, where you could have a write, and then there could be a crash, and you could lose that data. But what you get in return is way more write performance, so it's a trade-off. LiteFS I don't know too well, but it is also replicating SQLite data, and it's doing it at the file system level, so it's very transparent to programmers of SQLite. I'm not quite sure what guarantees it provides around the data, I suspect there are some trade-offs that you can make as a user of LiteFS. And then the Turso-developed system, I believe, is more of a peer-to-peer system, which puts it in a very different category altogether, as far as I know, but I could be wrong there, I'm sure they'll correct me. But I think the biggest difference between the two buckets is, rqlite promises very strong guarantees about the state of your data once it has been accepted. This is the sort of thing that banks do, banks can never say, "Oh, we crashed, we lost some of your account details." So rqlite offers you very strong guarantees, where some of the other systems are offering weaker guarantees, but then they offer way more write performance in exchange. Now the interesting thing, Eric, is it has come full circle. I actually added a mode to rqlite earlier this year, or late last year, that allows you to make the same trade-offs as well. If you're prepared to tolerate the small risk of data loss, you can put rqlite into a mode where its write performance increases by about a hundred times. But fundamentally, I think those are the two biggest differences.
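The two guarantee levels Philip contrasts, acknowledge-after-replication versus acknowledge-before, can be sketched abstractly. This toy model is illustrative only: real systems replicate over a network and fail in messier ways, and here the "nodes" are just Python lists:

```python
class ToyCluster:
    """Toy contrast of synchronous vs. asynchronous replication."""

    def __init__(self, node_count=3):
        self.nodes = [[] for _ in range(node_count)]
        self.pending = []  # acknowledged but not-yet-replicated writes

    def write_sync(self, record):
        # Strong guarantee: replicate to every node BEFORE acknowledging.
        # Slower, but an acknowledged write survives losing any one node.
        for node in self.nodes:
            node.append(record)
        return "ack"

    def write_async(self, record):
        # Weaker guarantee: acknowledge first, replicate in the background.
        # Much faster, but a crash in this window can lose the write.
        self.pending.append(record)
        return "ack"

    def flush(self):
        # Background replication of any pending writes.
        for record in self.pending:
            for node in self.nodes:
                node.append(record)
        self.pending.clear()


cluster = ToyCluster()
cluster.write_sync("safe")
cluster.write_async("fast")
on_all_nodes = all("safe" in node for node in cluster.nodes)
at_risk = len(cluster.pending)  # "fast" is not yet on any node
print(on_all_nodes, at_risk)  # True 1
```

In these terms, rqlite's default path behaves like write_sync, while Litestream's window, and the faster rqlite mode Philip describes, resemble write_async: the acknowledgement comes before replication completes, in exchange for much higher write throughput.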
How important is it that your data, once it has been written, has been written across the cluster and won't be lost under any circumstances, short of [inaudible 00:31:01]? It depends on your application, right? That's why I said one of the key use cases is a small piece of data that you think is really important. Eric Anderson: You mentioned that Turso is a peer-to-peer system, which I assume implies that's different from the others. The others are more of a leader-follower setup, there's kind of a master, is that ... Philip O'Toole: I may be wrong. Really what I'm trying to get at is, having looked at their docs, they allow writes and distribution of data at the Edge, and that makes me think that they don't require information having to go to a central system all the time. But I may be wrong, I don't know every detail about their application. I will say one of the biggest things that they push is, you can use their system to distribute data out to the Edge. rqlite can do that too, but it's not really built specifically to push data out to the Edge. Their system is very much about using [inaudible 00:31:52] SQL to build their cloud on top of, which can push data out to the Edge. It's an Edge-oriented solution, but I don't consider rqlite specifically for the Edge, so I don't want to overstate that I actually know what its architecture is, but I will say they advertise very different use cases than I do for my system. I imagine rqlite running as a key piece of infrastructure in someone's centrally hosted system. Like, you are running a fleet of workers somewhere in a data center that is processing photographs, and you have a small central piece that needs to coordinate what those workers are doing, and it can't go down; that's where rqlite would be used to store information. Whereas when you look at the Turso offering, it's very much about using their system to get data out to the Edge in a manner that feels like SQLite at the Edge.
I hope that makes sense, I'm hand-waving a little bit, but the application use cases are very different, and I suspect that feeds into how it's built. Eric Anderson: It resonates with what you said at the beginning, the two trends driving SQLite enthusiasm today were rebelling against how we deploy software, which rqlite does a great job of, and then this idea that software is being pushed to the Edge, which maybe is something the Turso team, it sounds like, has been more focused on. Philip O'Toole: And you can use rqlite, it has a mode where you can deploy these things at the Edge. Eric Anderson: Yeah, to the extent you've looked into it, tell us about the Edge enthusiasm. I mean, some people have gotten excited about local-first software, the idea that I could have kind of a database on my machine, which is a common use case for SQLite, but that synchronizes with the distributed database to survive persistent network failure or something. What else is kind of the Edge use case as far as you see? Philip O'Toole: It's funny, Eric, if I had been more commercially minded when I first built rqlite, I might be in the Edge space myself. One of the biggest requests I got when I first started building rqlite many years ago was exactly that use case. People kept coming to me saying, I want to deploy a SQLite database at the Edge, I want it to synchronize from a central service, but I want people to be able to write their updates at the Edge, and consolidate changes when their device joins the network again. I had one person approach me saying they wanted to run rqlite on half the phones in AT&T's network or something like that, and I was like, "That's not going to work." But because I was so interested in building the technology, I just ignored the market demand, so it's interesting to see the use case was even there years ago. I think that this comes from, I mean, to me it's pretty obvious where the demand is coming from.
The proliferation of software on devices at the Edge, which is the cell phone, and the cost to a provider, an application developer, of their application not working is enormous, in terms of potential revenue, impression of the application and so forth. I think application developers have quickly realized that they want their applications at the Edge to work, even if they're down back at the DC. To me, that is what's driving it, it's wanting to have a user experience and a reliability experience, so the end users don't [inaudible 00:35:03]. It just always seems to work. That is, I think, what is driving it. There's another thing that can drive it too, service providers see a possibility to save some money, right? If they can have some of the processing on the Edge, sometimes on the end user's own infrastructure, it can save them a bit of processing as well. But I think the move to the Edge has been driven, in my opinion, by a desire to provide a really good experience to end users, who will leave your app quickly if the app relies on some central service that, when it goes down, means you can't use it, right? So I think that is where the drive is coming from, software at the Edge. The other thing that's happening with software at the Edge, I see this in another group I actually manage here in Google, this is more of an enterprise application, I'm not sure if SQLite figures into this, but there has been a huge drive over the last five years in terms of regulation, geographic location of data and so forth. And so for some applications, if you can promise that the data doesn't leave the Edge, and stays inside the customer's premises, or stays inside the customer's home country, that can sometimes help you win business too.
So I think that's the other reason why sometimes software is going to the Edge; security and compliance is a big deal, though SQLite has less presence in enterprise software in that sense, but I think security and compliance is another. Eric Anderson: So we're kind of winding down here, but Philip, I wanted to make sure if there's anything you wanted to cover, we include it. And then I have two other topics we could discuss. I'll tee them up. One is logging, you've spent a lot of time at Influx and currently Google doing logging. I came across one of your blog posts about tailing log files, like why are people still doing this? And it made me wonder, what does the future of logging hold? Have we kind of solved all the problems, or are there interesting things on the horizon around log management? Philip O'Toole: I think the most interesting thing that is happening in logging is actually what is happening in the observability space more generally. So you're right, Eric, I've been building observability software now for almost 10 years actually. I started at a startup called [inaudible 00:37:10], which was a great experience, which was subsequently acquired by SolarWinds; I did that in San Francisco. And then I went to InfluxDB, which is time series, and now I'm at GCP managing teams that build its large-scale logging systems. So actually the most interesting trend that is finally happening in the observability business is that database technology and analytics technology have finally become good enough that we don't need a different type of database for logs, and another type of database for time series, and another type of database for, perhaps, traces. The industry has finally got to a point technically where building unified systems and products for all the telemetry that technical infrastructure is emitting is finally happening. We see this in our own groups here, we see this externally.
So I think the biggest thing that's going to happen to logs and logging is that they're going to become merged with traces and with monitoring in truly unified products. If you had asked me four or five years ago, I gave a talk at [inaudible 00:38:10] about this, I was very skeptical that it would ever unify in this sense. But I think that's the biggest thing that's happening, unified storage layers or unified analytics layers that aren't just for log data, but also for time series data, and for traces, so that the industry is finally approaching the holy grail. I'm not sure if that's kind of what you're asking, but I see that as actually a bigger trend that is probably going to overwhelm logging in particular, and that is finally unified solutions for managing [inaudible 00:38:40]. Eric Anderson: I see that as well, and it does answer my question, which has two parts: one, that we've identified like a single way to record an event? And then two, that we have a way of storing it that is optimized for all the various types of queries you might want to read? Philip O'Toole: In my opinion, it's a little bit of the former, but mostly the latter. Database technology is getting good enough to be able to store and query the different types of data, in what appears to be a single system to the outside world. Yes, so I think it's mostly the latter. I think there is a bit of standards advancement, with things like OpenTelemetry trying to bring a unified mindset to this data, but I think it's also technical advancement. The demand has been there for a long time. I have built many time series databases on MySQL; they would eventually fall over. I have built many log storage systems in relational databases; they would eventually fall over. I have built systems that stored time series in Elasticsearch; it would get too expensive.
But finally, I think there's a technical advancement happening that is making it much easier. The demand was always there, people didn't want to run multiple databases, but they had no choice; finally it's happening. Eric Anderson: And then the last topic, you also wrote a while back about the economics of open source, this was in 2015, I think it was. And boy, the venture community has just poured money into open source since then. And you've also maybe had the opportunity, or I'm sure you've considered at times, should I go commercialize rqlite? And apparently you've chosen not to. What are your thoughts on the economics of open source today? Philip O'Toole: I think most of what I wrote in the article, and it's still on my blog, is true. There is this remarkable phenomenon whereby all this open source software was being developed, and yet just given away. The software industry has become so powerful, and yet it gives away its product, some of its most important products, for free; it's actually a remarkable effect. And the reason is services, it was the move to services that allowed this thing to happen, because people will pay for services, but they often hate paying for software. But the other thing that happened is services allowed software developers' software to be publicly visible to a huge audience. And you started to even see this, you sometimes had almost superstar developers who became very well known. So I think a lot of what I wrote there is still true. I think the model of building open source software to build credibility, and then deploying it as a service, is very common now. So I'd say the economics haven't really changed since the time I wrote that article. But I think the economics of software are about to change enormously with what we're seeing from LLMs. I don't think the software industry knows what's coming; as developers, I think we have no idea.
And I say that as someone who's generally pretty skeptical of new technology and wants to see how it plays out, but Eric, I think there has been a profound shift in how software is going to be built over the next five years. I use some of these technologies for developing rqlite, and how I interact with computers has changed profoundly in the last year. So I think the economics of software development are about to take this huge shift, in private as well as public open source software, that I think is about to ... so maybe I should write another blog post in five years. I'll probably have an LLM write it for me, because I think the economics are about to change after having remained static for 10 years. Eric Anderson: Awesome, Philip. Well, let me speak on behalf of the software community, thank you for rqlite and your efforts there, and I really appreciated the conversation today. Anything you wanted to add? Philip O'Toole: No. Well, thanks a lot, Eric, it's been great to chat, I'm always happy to chat about this stuff, and good luck in your own ventures too. Thanks very much. Eric Anderson: You can subscribe to the podcast and check out our community Slack and newsletter at contributor.fyi. If you like the show, please leave a rating and review on Apple Podcasts, Spotify, or wherever you get your podcasts. Until next time, I'm Eric Anderson, and this has been Contributor.