00:05: Anna Rose: Welcome to Zero Knowledge. I'm your host, Anna Rose. In this podcast, we will be exploring the latest in Zero Knowledge research and the decentralized web, as well as new paradigms that promise to change the way we interact and transact online. 00:27: This week, Tarun and I chat with Sreeram Kannan, Associate Professor at University of Washington, where he runs the UW Blockchain Lab. In this conversation, we look at how information theory and blockchain intersect and talk about Sreeram's early work on P2P mobile networks, mapping Bitcoin security, and how this evolved into work on consensus, fair sequencing, and more. We couldn't get through everything in this episode, so we hope to have Sreeram join us again in the future. 00:54: Before we start in, I want to let you know about a section on the zeroknowledge.fm website that might be interesting for you if you're looking to jump into ZK professionally. It's called the ZK Jobs Board, and there you can find jobs posted by some of the top teams working in ZK. Find the project or team you want to work with next over on the ZK Jobs Board. I also want to make sure that the upcoming ZK Hack Mini is on your radar. Last fall we did a 7-week event with puzzle hacking and ZK tool workshops. Now we bring it back for a mini version of this event. It's set to last two weeks and starts on March 1st, 2022. If you missed the first edition, or if you attended and just wanna join again, be sure to sign up today for the newsletter. You can also already sign up for the March 1st kickoff event. There you can get updates on our partners and schedule. Speaking of partners, I'm very excited that our puzzle building partner for this edition is Polygon. We are very much looking forward to once again bringing the ZK community together around ZK Hack Mini. We hope to see you there. So now I wanna hand over the mic to Tanya, the podcast producer, to tell us a little bit about this week's sponsor. 02:00: Tanya: Today's episode is sponsored by Electric Coin Company, the inventors of Zcash and Halo cryptography. They're building the next generation of zero knowledge tech. With Network Upgrade 5 happening this April, Zcash moves to the Halo zero knowledge proving system, removing the trusted setup and becoming shielded by default. They believe privacy is essential to delivering on the promise of a thriving and equitable Web3. Visit electriccoin.co to learn more about the relationship between privacy, self sovereignty and creative freedom. That's electriccoin.co. So thank you again, Electric Coin Company. Now here's Anna's interview with Sreeram. 02:42: Anna Rose: So in today's episode, we're here with Sreeram Kannan, Associate Professor at University of Washington, where he runs the UW Blockchain Lab. Welcome to the show, Sreeram. 02:53: Sreeram Kannan: Thanks, Anna. 02:53: Anna Rose: I actually wanted to first start this episode with a question to Tarun. Tarun really brought this episode together and introduced me to Sreeram. So Tarun, can you share with us what we have planned for today's episode? 03:04: Tarun Chitra: Yeah, so one interesting thing I think about the cryptocurrency and blockchain space is that, a little bit like AI, it's able to draw talent and ideas from a lot of different backgrounds. And one interesting sort of view from an academic lens is that cryptocurrencies really started mainly with cryptographers and then sort of distributed systems people. But over time, it sort of branched out into other academic fields.
And personally, I came from sort of this mixed background as well. When I entered the space, I kind of observed, hey, there are all these cryptographers who are really bad at probability. There's got to be something that someone who knows what a measure is can do. And that's actually what nerd-sniped me into this space. And around the same time, I started observing Sreeram and his advisor and some other folks at different institutions starting to write papers that were a lot more rigorous from the probability standpoint than had been seen in the space prior. 04:15: So I think the idea that you can take tools from other fields and apply them to subsets of problems that are actually really interesting and widely applicable is the main theme that we'll go for. And I think hopefully with this episode, you'll get to see the variety of work that Sreeram has done. And also, if you're say someone who's not in the crypto space, but you studied math or something, or you kind of know these other fields, you'll get an idea of like, hey, there's actually a ton of open problems here where you can apply tools from other parts of the world. And I think Sreeram's background and history will give you kind of an idea of how certain tools in applied math and probability can be applied to a really large variety of fields if you take this very generalist lens instead of being a super specialist. 05:12: Anna Rose: So Sreeram, I now really want to hear a little bit about your background and maybe what nerd-sniped you to jump into this. 05:21: Sreeram Kannan: Absolutely. Thanks, Anna. Thanks, Tarun, for the kind introduction. So I got into blockchain four years back. But the story of my interest in this field of peer-to-peer networks actually dates way back to my PhD thesis. My PhD thesis was actually thinking about how to construct peer-to-peer wireless networks. What are the fundamental limits? Today, when we look at blockchains like Bitcoin or Ethereum and so on, one of the big questions we look at is what is the maximum throughput that you can achieve with these kinds of peer-to-peer networks? What is the latency that we can achieve with these networks? These were exactly the same type of questions we studied, but in the context of peer-to-peer wireless networks. They used to be called ad hoc networks at that time. 06:11: And there was a massive interest in figuring out, can we build wireless networks in the absence of any infrastructure? No base stations, no Wi-Fi access points. Can we just create self-organizing wireless networks, peer-to-peer, where they just talk to each other and then organize themselves into a network? And the kind of math problems we were interested in are the fundamental limits of such networks. What's the maximum throughput capacity that you can achieve? What is the minimal latency? And in my PhD thesis, my advisor, Professor Pramod Viswanath at Illinois Urbana-Champaign, and I worked on some of these problems, which were basically the theoretical limits of peer-to-peer networks, particularly peer-to-peer wireless networks. 07:01: And so our interest in this started already at that point. However, what happened was peer-to-peer wireless networks did not take off.
It's because the massive deployment of wireless infrastructure, including access points, base stations, and optic fiber, obviated the need for constructing peer-to-peer networks. And at the end of my thesis, even though we solved some open math problems inside these fields, the real-world application was lacking. So I was sitting around rethinking what we should do, now that we'd solved this math problem. It all looks pretty amazing, but in reality, nobody's actually interested in deploying a peer-to-peer wireless network. 07:51: Anna Rose: So you had the solution without the problem. 07:54: Sreeram Kannan: That's right. We definitely had a theoretical problem, which is, how would you go about constructing it? But there was really nobody who was actually particularly interested. Indeed, Qualcomm at that time had a big project called FlashLinq, which was trying to build peer-to-peer wireless networks and which it shelved right around the time I finished my PhD thesis. My advisor used to work with them and so on. So we were thinking that could become a thing, but it didn't become a thing, because of the massive deployment of infrastructure. So it gives me a lot of satisfaction to see Helium today, which has basically brought that idea back into wireless peer-to-peer networks. I think the paradigm shift that Helium brought in is that instead of building just a peer-to-peer network, they built a peer-to-pool-to-peer network. 08:46: And the pool is basically... I think this is the kind of fundamental transformation that we're seeing in decentralized exchanges, for example. Order book exchanges are peer-to-peer, right? A peer trades with another peer. But fundamentally, the new paradigm we're seeing in AMM DEXs and so on is peer-to-pool-to-peer. 09:05: Anna Rose: Cool. 09:06: Sreeram Kannan: So that's exactly the same paradigm shift I think Helium brought in. They said, well, it is infrastructure-based networks, but why should only monolithic companies build this infrastructure? Can we coordinate regular people to put up infrastructure, put in the right incentives, and actually create these pools from which we can start providing service? So I feel like the world has come full circle, starting from peer-to-peer wireless networks back in 2008. I think from 2002 to 2012, there was a lot of academic interest in that, and then it died down. And now, for example, my advisor, Pramod Viswanath, is actually studying how to build these kinds of peer-to-pool wireless networks. What are the right incentives? What are the mathematical primitives that need to go in? Again, a lot of interesting random processes and analysis showing up there. 10:03: Anna Rose: Very cool. You're talking a little bit about this sort of lull in the P2P research, or P2P interest. Would you put that firmly in that P2P winter that we've kind of heard talked about? 10:15: Sreeram Kannan: Absolutely. 10:15: Anna Rose: So, this is 2008, would you say? 2010 when it started to just die out? Okay. 10:21: Sreeram Kannan: Yeah. So, 2008 I started my PhD and I finished my PhD around 2011, 2012, and the interest was just waning. Interest in P2P wireline networks had already waned by 2008, but we were thinking that because wireless infrastructure was not built out, at least all over the world, there was still a possibility that peer-to-peer wireless networks could take off. But that didn't happen.
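A quick aside on the peer-to-pool-to-peer idea mentioned above: below is a minimal Python sketch of a constant-product AMM pool, where each trader interacts with a shared pool rather than a direct counterparty. The class, names, and numbers are illustrative assumptions for this episode, not any particular DEX's implementation.

```python
# A minimal constant-product (x*y = k) pool: the simplest AMM design.
class ConstantProductPool:
    def __init__(self, reserve_x: float, reserve_y: float):
        self.reserve_x = reserve_x
        self.reserve_y = reserve_y

    def swap_x_for_y(self, dx: float) -> float:
        """Trade dx of asset X against the pool; the product x*y stays constant."""
        k = self.reserve_x * self.reserve_y
        new_x = self.reserve_x + dx
        dy = self.reserve_y - k / new_x  # amount of Y the trader receives
        self.reserve_x = new_x
        self.reserve_y -= dy
        return dy

# Two "peers" never trade with each other directly -- each trades with the pool.
pool = ConstantProductPool(1000.0, 1000.0)
print(pool.swap_x_for_y(10.0))  # ~9.90 Y out for 10 X in
```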
10:50: Tarun Chitra: I think one of the most important things to realize from listening to this, and it took me a long time to understand this, is that a PhD is really like a venture bet. It's like investing in something and then waiting for five years and then seeing if the thing is still alive after five years. 11:07: Anna Rose: Wow. 11:08: Tarun Chitra: And choose wisely. 11:11: Sreeram Kannan: Absolutely. I agree. 11:12: Anna Rose: Sounds like, Sreeram, yours has worked out. At least the field has come back. A lot of the work that you did, I guess, becomes relevant again. So let's hear about that move, maybe into blockchain for you. What was it? Where did that happen? 11:27: Sreeram Kannan: So after peer-to-peer wireless... At that time, it was 2011, and I hadn't had any exposure to blockchain yet. I was looking around and seeing what are some of the most important things happening and potentially how we could bring this understanding of information theory and networks and graphs into those fields. So just for our viewers who may not be familiar with information theory, one of the most celebrated icons in computer science is Alan Turing. I would say equally important was Claude Shannon, who fundamentally pioneered this view that all information can be represented digitally without any basic loss of performance. It was not at all obvious during the time that he pioneered these ideas, back around 1945 to '48. At that time, communication was all analog. Radio stations used to use frequency modulation and things like that, which were analog methods. 12:33: And he pioneered the idea that almost all information can be represented by bits. He pioneered the idea that there is a way to communicate bits without error. He pioneered the idea that you can bring very sophisticated tools from mathematics to actually solve some of the real engineering problems in communication, compression, information processing. And I think it has laid the groundwork on which much of our modern society is built, and definitely more so information technology. So coming from that field, the inspiration that we had was, can we bring these ideas into other emerging important fields? And one of the things that we took a bet on, again, like Tarun said, another venture bet, was computational biology. 13:19: So we spent several years, five years, working on computational genomics, trying to understand DNA, RNA sequencing, things like that. And it was in January 2018 that my former advisor, Pramod, called me and said, hey, do you know about Bitcoin? I said, yeah, I've heard of it. Why are you asking? And he said, no, there is this thing called Bitcoin, it's super interesting, and you know the two basic problems in Bitcoin: one is Bitcoin has low throughput, it can only process a few transactions per second. And it has very high latency; it takes several hours to confirm a transaction with high reliability. He said, all our life, or at least during my PhD, we spent all our time thinking about throughput and latency in peer-to-peer networks. And the question was, can we redesign Bitcoin to actually be very high throughput and very low latency? And this immediately caught my attention. And so that was really the technical entry point for me into the area. 14:31: Anna Rose: And what was the work that came out of that problem? Like, did you end up solving that? Is there something that we can point to, a paper or anything that describes it?
14:42: Sreeram Kannan: Yeah, absolutely. I'll explain what we ended up doing. So before that, I think I was going through a more philosophical problem in my head. Again, like Tarun alluded to, these are like venture bets. And to take a venture bet, to get into any area and do something useful, it's gonna take at least three to five years. So that's the basic minimum compounding required to get any kind of results. So I was considering, wow, we're working on all this cool genomics stuff, should we give that up to work on blockchain? Is this equally important? One of my reasons for getting into genomics was all these crazy things happening in synthetic biology. We're going to be able to engineer genomes, re-engineer our cells; it seemed like a massive paradigm shift. And I wanted to be a part of it, a small part, but a part of it. And I was wondering whether blockchains are anywhere near the same scale of interest. And I wasn't convinced for a few months. I was dawdling around looking through it, and then it clicked one day. And here's what 'it clicked' meant for me. 15:49: One of my favorite authors is Yuval Noah Harari, with his famous book, Sapiens. And there he makes this thesis that humans have taken over this planet. He summarizes humans' evolutionary advantage in one line: humans took over the planet because we cooperate flexibly in large numbers. 16:11: Anna Rose: And we believe in myth. We're able to believe in myths. Right? 16:14: Sreeram Kannan: Absolutely. And shared myths help us cooperate flexibly in large numbers, but the fundamental primitive he lays out is that this ability to cooperate flexibly in large numbers is what has led us to this evolutionary advantage. As I started chewing on that, it became obvious that any mechanism which helps humans cooperate flexibly in large numbers more efficiently will have a very basic fundamental evolutionary advantage for us as a species, and is worth investing deeply in whether Bitcoin crashes or not. Until I got to that conviction, I could not jump all in on this new venture bet. 16:57: Anna Rose: Yeah. Wow. Because that is such a shift. Like when you were talking about genomics, that's mapping genes and this deep, deep biology, super important medical knowledge, research into how we're built and how it works. But you made a shift, and you actually could see that even... I mean, when you think of blockchain, sometimes you do think of the speculative nature of it. And it's a little bit of a shame if you move from medical to like, oh, we're going to make money over here. But I think the way you just put that is really cool. This is this idea of the organization of society. And you can't even have that deep medical research if you can't organize society correctly, so. 17:41: Sreeram Kannan: Absolutely. I think just like the internet built information superhighways, blockchain is building these cooperation superhighways. 17:51: Anna Rose: Cool. And I guess is this what led you to look at consensus? Because that's like the deepest cooperation side of things. 18:01: Sreeram Kannan: On the one side, yes. On the other side, our distinguishing advantage is that we've actually thought deeply about how to organize peer-to-peer networks. How do you get information across these networks at high throughput, low latency? People coming from distributed systems already come with a deep understanding of the consensus problem. We did not. We came with a deep understanding of the network problems.
How do you get maximum throughput? How do you get low latency? We also came with a very deep understanding of the random processes that Tarun highlighted, which played a massive role in information theory and all these wireless networks. Because one of the most important things you're dealing with in a wireless network is noise. 18:46: Anna Rose: Noise, yeah. 18:46: Sreeram Kannan: And noise is random, and you have to understand the randomness quite carefully. So the first thing that struck me is Bitcoin mining is random. Okay, you mine a block; who gets to mine a block? There is some randomness. And if you view the Bitcoin blockchain as a chain, or even potentially a tree, because there are forks, because two people can mine blocks near-simultaneously, it sounded like, hey, that's a random growing tree. Can we bring a lens of probability theory and information theory to actually sharply understand this problem? Once we understand why these things are actually secure, then we can maybe expand it to ask, how do we get both security and high throughput? Security and low latency. So that was the direction from which we came in. 19:39: And I'd like to give a shout out to both my advisors, Pramod Viswanath at Illinois Urbana-Champaign, who got me into this area, and Professor David Tse at Stanford, who was my postdoctoral advisor. The three of us got into this in 2018, and we used to meet every day for several hours because we couldn't make head or tail of what was going on here. And we dropped almost everything else to just study this area like beginners. 20:11: Anna Rose: That sounds so cool. One of the questions I asked before was, is there work that came out of this first problem? So, yeah, what's it called? 20:20: Sreeram Kannan: So the paper is called Prism. The idea that we were actually trying to pursue was how do you get low latency as well as high throughput with Bitcoin? So, delving a little bit into what we did there. First, to understand why Bitcoin takes such a long latency to confirm: blocks are made every 10 minutes, right? And just because a block is inside the Bitcoin blockchain does not mean you should accept any transaction that went into that block. Why? Because if there was an adversary, the adversary could potentially mine a competing block at the same depth and take over the chain, and your transaction may get double spent on the other chain. So the basic fact of Bitcoin is: the closer a block is to the tip of the blockchain, the more likely it is to get reverted. The more deeply embedded a block is in the blockchain, the less likely it is to get reverted. 21:28: And what this implies is that if you want to confirm a transaction at a certain level of reliability, then you need to wait for a certain amount of time. This is all there in Satoshi Nakamoto's original Bitcoin white paper. And we went through all of this and found where the latency is fundamentally accruing. To think of this in probabilistic terms: let's say Tarun is an attacker, and Tarun has 20% of the mining power and the rest of the Bitcoin miners have 80%. Now, if you just ask what is the probability that Tarun will get the next block first, that's like 20%, because it's a race between 80% mining power and 20% mining power.
But if you ask what is the probability that Tarun will get the next 10 blocks first, as opposed to the rest of the network, then the probability is much smaller, because the other guys are winning the race every time and they're compounding on their wins over and over. This is what's called the law of large numbers, which basically says if you average enough, then you will always see that the 80% wins over the 20%. 22:46: So that's the basic reason you wait for a long enough time in order to confirm a block in Bitcoin. And the question we asked is, can we get this averaging, this law of large numbers, to work in a completely different way which does not require latency? Here I wanna highlight that there are many other methods people have thought about to reduce latency in a proof-of-work system, and one of the simplest is to take the last few proof-of-work miners, form a committee, and let this committee vote on blocks. Because once you have a committee, you just send the thing to the committee, they all sign, and then you're done. But to us, this felt highly unsatisfactory. And the reason it's highly unsatisfactory is that one of the really beautiful things about Bitcoin is that everybody's power is evanescent. The power is transient and vanishes immediately. 23:41: So for example, when I'm trying to make a block, I'm racing, racing, racing to make a block. I don't know when I'm gonna make the block, and then once I've made the block, I have no further special power in the network. Everybody's power is evanescent. So what that means is, even on a short time scale, nobody has special powers. Everybody's competing fairly all the time. But if you elect the miners from the last thousand blocks to go ahead and form a committee and then let them run a blockchain or just sign blocks, now you have a vested powerful intermediary, this committee, which can just collude and take down the network or do whatever. So mathematically, we think of this as a dynamic adversary, an adversary that can corrupt people after knowing what their identities are. And a dynamic adversary cannot corrupt Bitcoin. It's really magical in that sense. In the sense that you have power proportional to your actual computational power, and you don't get any further power in the network. And so this was a very pivotal property for us when we thought about how to get low latency without giving up this absolutely fundamental property of Bitcoin. 24:56: Tarun Chitra: One thing I always appreciated about Sreeram's research is, I also at one point was doing computational biology, and thinking about things in terms of probabilistic processes generating graphs, and graphical models and generative models, is kind of the way that I spent probably my first five years of doing research. And I think one interesting thing is that in crypto, it's actually really hard sometimes to see that those things map as cleanly. Things in ML and AI are almost always stated directly in terms of graph processes and things that look like these types of well-studied statistical objects. But I think a lot of the hard part in the cryptocurrency industry is, A, first, filtering through the bullshit, because there's a lot of nonsense; B, filtering through the wrong math. So like the Bitcoin paper: their Poisson distribution assumption was wrong.
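For reference, the race intuition above is exactly the calculation in the Bitcoin white paper. Here is a minimal Python sketch of it, using Nakamoto's Poisson approximation (the same approximation Tarun critiques next); the function name and the depths printed are just illustrative.

```python
import math

# Nakamoto's white-paper estimate: an attacker with hash-power fraction q,
# starting z blocks behind, catches up from a given deficit d with probability
# (q/p)^d; the attacker's progress while the honest chain mines z blocks is
# approximated as Poisson with mean z*q/p.
def catch_up_probability(q: float, z: int) -> float:
    p = 1.0 - q
    if q >= p:
        return 1.0  # a majority attacker eventually wins the race
    lam = z * q / p
    prob = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        prob -= poisson * (1.0 - (q / p) ** (z - k))
    return prob

# The 20% attacker from the example: waiting more confirmations drives the
# success probability down -- the law of large numbers at work.
for z in (1, 3, 6, 10):
    print(z, catch_up_probability(0.2, z))
```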
And a lot of what Sreeram... I'd say there are three lines of work that have really tried to fix all of the really shitty math in the original Bitcoin paper. 25:59: And don't get me wrong, clearly it worked and whatever, but the truth should be told that the math is wrong. A lot of the assumptions made don't work. And this kind of Poisson expectation assumes some extra smoothness on the kind of state space that you're looking at. But there are three lines of work. The first was the paper that kind of nerd-sniped people like me, the Bitcoin Backbone Protocol, which is a kind of overly complicated paper, but it did get the right probabilistic idea of how to analyze these probabilistic blockchains and get the kind of CAP-theorem-like guarantees from traditional databases that people wanted. The second was sort of the Elaine Shi and Rafael Pass improvements on that, simplifying it. 26:49: And then the third was really, I feel like, Sreeram's work, which said, hey, instead of trying to analyze specific consensus protocols, what if we took the space of all consensus protocols and then asked, what properties do you need to make things work? And I think a lot of the kind of famous cryptography professors who made blockchains kind of ignored this problem when they made their chains. They really focused on, here is a single consensus protocol and here's why it works. And I think the beauty of the way Sreeram and collaborators have thought of it is they went the other way around. They're like, what is the set of all of these things and how do we distill them down? And I think talking a little bit about how you got into that, how you guys flipped the problem in that way, going kind of top down versus bottom up, would be awesome. 27:36: Anna Rose: But right before you do that, what's the name of the paper you just mentioned? Sreeram Kannan: There are two papers. One actually does the low latency stuff; that's called Prism: Deconstructing the Blockchain to Approach Physical Limits. The other one is where we actually look at the fundamental properties of Bitcoin and a much larger family of consensus protocols; that's called Everything is a Race and Nakamoto Always Wins. So... 28:02: Anna Rose: Okay. Everything is a Race and Nakamoto Always Wins. 28:08: Sreeram Kannan: The name was chosen by our collaborator David Tse, who just has this enormous respect, of course, for the deep math stuff, but also for somebody like Nakamoto, who had an intuitive sense of how this thing actually works, even without formalizing it in the same rigorous ways. The number of ways in which the intuitions were spot on, while the math was far lacking, was actually shocking to us. You had to have a very deep intuitive understanding of all the ways in which things can go wrong, and he had a mechanism to fix almost all of them. I'll highlight a couple of them as we go on. But to reinforce what Tarun said, when we got into this area, we were looking around at all kinds of papers, and they were all looking at very narrow formulations. One of the things that we like in information theory, really Claude Shannon's school: what Shannon did is, instead of saying, hey, I'm going to communicate using this method, and then let me just try to analyze it, which is what everybody was doing at that time, he asks, what is the set of all possible communication algorithms, and can we find the absolute best among all of them?
29:28: What does that even mean, right? It seems like a simply mind-boggling thing to say that there's a space of all possible communication algorithms, and then ask what is the absolute best among that family. So this is really information theoretic thinking on blockchains. We just brought in the same fundamental question: what is the space of all consensus algorithms? And what are the absolute limits on what you can do if you optimize across all of them? What would be the best one? Can you actually find something which is close to the best one? And this is really this paper, Everything is a Race. And David went on and brought in some really strong probabilists, Amir Dembo and Ofer Zeitouni, who we had the privilege of working with, absolutely fantastic probabilists. And great students: David's student Nusret, and Xuechao, who was Pramod's student, formed a big group that ended up working on this problem. 30:27: Anna Rose: Cool. 30:28: Sreeram Kannan: The basic thing, just to summarize what kinds of problems need to be thought about: when I stated the simple heuristic about how Nakamoto did the calculation, he's saying everybody's mining publicly, 80% of the mining power is doing public mining, and the remaining 20%, the attacker, is mining a private chain and then trying to displace the public chain, because in the blockchain the longest chain wins. So this is the attack that he considered. But when we looked at it, it was not at all clear why this is the only attack. What if I as an attacker try to create blocks on both sides, try to balance the chains? Now there are two chains, both of them roughly equal length all the time, and people just get confused: what is the longest chain? It leaves you in perpetual confusion. Is that possible, or why is it not possible? 31:29: So even modeling what an attacker can do seemed pretty mind-boggling. And this is where, I would say, the Bitcoin Backbone Protocol is an absolutely brilliant paper. It just came in and distilled it out to its very essence. They said an attacker has a few degrees of freedom: they can decide where to mine a block, they can decide when to release that block, they can decide what to put into that block. And all of these can depend on the particular state of the blockchain: if one chain is winning, then you do this; if another chain is winning, then you do that. So it's a very, very complex state space of what an attacker can potentially do. And now you want to prove security against this massive potential that an adversary has. And this is really what was already done in the Bitcoin Backbone Protocol. 32:26: What they showed is that if your mining rate is slow enough, then as long as the adversary does not control more than 50%, the blockchain will be secure. So it's an absolutely fantastic general result, which says that you cannot execute all these other complex attacks; even if you tried, you will not be able to succeed as long as you have less than 50%. And so we used that as our baseline to build our work on, and what we said in this paper is, can we think about this problem even more sharply? Instead of saying that if your mining rate is really slow then you can solve this problem, can you find the maximum mining rate at which you will continue to be secure?
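A rough way to state the kind of sharp threshold this line of work arrives at (simplified here; the exact form in the papers is more refined): with honest mining rate λ_h, adversarial mining rate λ_a, and a network delay bound Δ, the longest chain rule is secure exactly when the worst-case honest chain growth outpaces the private attacker:

```latex
% Simplified security condition for the longest-chain rule
% (delay \Delta slows only the honest chain's effective growth rate):
\[
  \lambda_a \;<\; \frac{\lambda_h}{1 + \Delta\,\lambda_h}
\]
```

As Δ goes to 0, this recovers the familiar sub-50% condition; for a fixed Δ, it pins down a maximum total mining rate, which is exactly the question posed above.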
33:18: Anna Rose: Was it sort of like, were you testing out the parameters of the security of the system? 33:23: Sreeram Kannan: Exactly. 33:24: Anna Rose: Are you just trying to break it? Did you break it? Did you find those limits or? 33:29: Tarun Chitra: So this is theoretical, right? I think one important piece of information here, for listeners who maybe have more of a background in traditional distributed systems, is that traditional distributed systems work has these properties, safety and liveness: hey, if I send a transaction, does it stay in the network? And hey, if I keep sending transactions, can the network keep processing transactions? And one of the kind of fundamental results of the 1990s was that you kind of can't have your cake and eat it too. You sort of can't have safety and liveness perfectly. You have to make some trade-off in terms of how much the network can partition or things like that. 34:17: And blockchain sort of promised this holy grail of, actually you can get all three; it's just that one of the properties you want is going to hold only with, like, 99% probability, and the rest will be maybe deterministic or also high probability. And I think the hard part in that is that adding in this probability thing blows up your state space, to Sreeram's point, to be this infinite dimensional sort of search space, right? You go from this very finite dimensional, I-can-work-with-this-stuff thing to this infinite dimensional thing. But in some sense, the infinite dimensional thing has a lot of constraints on it. It has some kind of decay properties, some properties like, you can't really predict the future beyond a certain amount of time, and all adversaries kind of somewhat have that property. 35:02: And so in some sense, if you can formulate that correctly, you kind of can get around these sort of traditional database problems. And the Bitcoin Backbone Protocol was the first to really show that. It's sort of like rotating your way of framing the problem such that you can write this simply. So I think the context is: that was sort of the 1990s, early 2000s. Bitcoin happened, but then no one gave a good reason for why it worked. And then the Bitcoin Backbone Protocol came out, and then people started being like, I can build better than Bitcoin, right? Like everyone and their mom's consensus algorithm, right? So I think then it became this problem of, who's the taxonomist? And I think that's where our story goes next. 35:53: Anna Rose: So you mentioned it was like a taxonomy move, but isn't it... I guess the question I would have now is, from the work you did, were you also developing a new protocol from that? Did you actually want to build something else? Because what it sounds more like is that you were mapping the space. 36:11: Sreeram Kannan: Yeah. So part of engineering, or at least principled engineering, is understanding and sieving through what is happening. To summarize briefly, we did both. We actually first mapped the space, tried to understand when things are secure, when things break, and then you have these atomic units that you can combine into new protocols that give you different kinds of performance metrics. We did end up...
The protocol Prism, some of our collaborators and people at MIT ended up building it, and actually showed that you can get insane throughputs, for example 60,000 transactions per second, and low latency confirmation within one or two block times. Things like that actually ended up getting built. So it's not just a taxonomy; it's uncovering the principles that underpin the design of these systems. 37:10: We absolutely felt what Tarun said, which is that it seemed like a simple enough thing. You just have a longest chain; why not have a longest graph or longest tree, or make it a directed graph? Why not play with that? The space you were playing with seemed intuitive and simple, but the properties were extraordinarily non-intuitive. I'll give you some examples a little bit later, when we started getting into proof-of-stake and proof-of-space, where some extremely non-intuitive things happened. Where if I tell you the protocol, it seems like it should work, it should be even better, but actually it turns out far worse in terms of security. So there are all these very non-intuitive things. In fact, one of our starting points was this really interesting protocol called the GHOST protocol, by Aviv Zohar and collaborators, which said you can get much higher throughput by, instead of just looking at how many blocks are in your chain, looking at the entire tree and calculating weights and stuff recursively. 38:12: So that protocol claimed to improve the throughput, and in fact was adopted in Ethereum, among other things. And when we did a much more formal analysis, what we found is that the protocol does not actually improve the throughput; it gives you the same throughput as the longest chain. And that would never be obvious without actually going into the security analysis, because there was this balancing attack on GHOST. If you try to improve the throughput by making more blocks simultaneously, what happens is you will not be able to figure out which of the two chains actually got confirmed, because the attacker will keep flipping between the two chains, so you never get any agreement on what the real chain is. 38:59: So there are all these very interesting security things that can happen that many, many people were overlooking. And so that was the taxonomy part: understanding when a protocol is secure, and what the limits of security are. And as you phrased it, can you say, in this parameter regime it is secure, and in this parameter regime it is simply insecure? We call these, in statistical physics and in information theory, sharp thresholds. Sharp thresholds say basically: exactly in this regime, it's all safe, and exactly outside of this regime, it's not at all safe. So don't venture into those regimes. 39:35: Anna Rose: And no gray zone, it sounds like, and this is why it's sharp. 39:37: Sreeram Kannan: No gray zone. Exactly, it's very sharp. That's why it's called a sharp threshold. Absolutely. Because what happens is, as things go on for infinite time, any bad thing that can happen will happen. 39:48: Tarun Chitra: I think one thing that's funny is, the first time I was on the ZK podcast, in 2018 or whatever, I remember talking about phase transitions then, which is the same as this notion of sharp threshold. Different fields will call it different things: combinatorics and other places will say sharp thresholds. Other people will...
In physics, you'll say phase transitions, but it's like ice turning into water. It's a very sharp change in some sort of macroscopic property that averages over all sorts of participants, particles, whatever. I think people didn't have as much of an appreciation for this until maybe 2019, 2020, when people were actually writing these things in code and the testnets would keep falling over and they couldn't explain why. 40:37: Anna Rose: I kind of want to go back to this idea, though, that a lot of different people were looking at these problems. Would it be that the way they defined the security would then define the type of protocol they would develop, but they might not be looking at all of the axes? Like who's to say... 40:55: Sreeram Kannan: Yes, that's exactly correct. 40:57: Anna Rose: But who's to say that, like, why was your... The axes you were looking at, were they broader, better? Like why wouldn't you also have the same kind of blind spots in the work you were doing? I guess no researcher should say yes, but... 41:11: Sreeram Kannan: No, no, absolutely, right? Self-doubt is very good if you're a researcher. And here is one reason, at least. To summarize what other people were looking at: because Satoshi Nakamoto looked at only the private attack, which is, I grow a chain in private and then I try to overtake your public chain, everybody else thought that that was enough. So everybody else was building protocols which were fundamentally secure against that attack. But it turned out that what Satoshi Nakamoto did was uniquely correct, in that that attack was the absolute worst thing that could happen for his protocol, but that's not at all true for most other protocols that people were designing. So that's why the paper is called Everything is a Race and Nakamoto Always Wins, because there's something absolutely stunning there. 42:10: I'll give you another example, which is also a line of work we did, actually pioneered by Aggelos Kiayias and the people who did the Bitcoin Backbone Protocol. So one of the things going on in Bitcoin is variable difficulty, which means the difficulty gets adjusted every 2016 blocks. So people are not looking at which is the longest chain, but which is the most difficult chain, or which chain has the most work accrued. If you look at the mining power in Bitcoin from when it started to now, it has gone up 10 to the 13 times or something, some absolutely insane factor. And even after all of this, the system works and is fully secure, because the system keeps recalibrating the difficulty required to make a block. So blocks are made roughly every 10 minutes, and every 2016 blocks the system has a recalibration. Okay, so this is called difficulty adjustment, and instead of following the longest chain, you follow the most difficult chain, or the most worked chain. Up until this point, it's intuitive. You should follow the most difficult chain; that seems correct. 43:23: Okay. And then it turns out that there is an attack on this system if you just define it like this. For the attack, again, let's bring in Tarun, our attacker. Tarun comes in, and he has 25% mining power, and Tarun wants to overtake the longest chain. So what does he do? He says, I'm going to raise my own difficulty.
How does Tarun raise his own difficulty? By creating fake timestamps, he can artificially show that blocks are arriving too frequently in his chain, and because blocks are arriving too frequently, the algorithm will say, you should increase your difficulty, you should increase your difficulty. And if he keeps increasing his difficulty, his difficulty becomes so much higher that one block Tarun makes can overtake an entire chain that the honest guys are making. So what this does is: there was a law of large numbers which said that by the time the honest guys make 1,000 blocks, since they have 75% mining power, they're going to be way ahead of Tarun, who only has 25% mining power. But now he has claimed back his fair odds. His fair odds were that he should win 25% of the time, and he gets them back by saying that one block he makes, which is a highly stochastic event, can completely take over the chain. So instead of having to make 1,000 blocks, he's just making one block. So he's not operating under the law of large numbers; he's operating in a random regime. 44:59: The law of large numbers is what makes things deterministic, or roughly deterministic. And by raising his difficulty, Tarun is bringing it back into the random regime, where his chance of winning is now 25%. That's completely unacceptable for running a system. But Tarun can't execute this attack on Bitcoin, because Satoshi Nakamoto had an absolutely brilliant fix for it. I don't know if they understood it or not, but there was this fix, which is basically that difficulty will not be raised by arbitrary amounts; it can only be raised 2x or 4x every interval. And that just makes everything completely different. It's one of the most amazing things that I found: if you just went ahead and designed a protocol, you would absolutely not think of anything like this. This is an insane kind of attack, and the fact that Bitcoin has withstood attacks like this speaks a lot to the amount of engineering insight that went into the design of this protocol. 46:02: Anna Rose: So, so far, I feel like a lot of what we've been talking about is very, very proof-of-work focused. But I know that your work also starts to touch on proof-of-stake. Does that change any of this? Like, yeah, I'm just curious about where you evolved into. Almost like, where did you start on the proof-of-stake train too? Because that's a whole other realm of research. 46:25: Sreeram Kannan: That's right. So all of this stuff was proof-of-work, and this was 2018 when we started. And at the end of 2018, there was so much buzz about how proof-of-stake is going to be the real thing, and proof-of-work is going to go away, and all of this stuff. And having studied the longest chain so deeply, our natural impetus was to follow the line of work in proof-of-stake that was based on the longest chain. Much later, again putting on the taxonomist hat, we synthesized the space much more clearly by understanding what the fundamental trade-offs are when you design a proof-of-stake protocol. So I'll start with that, and then we can delve into what the longest chain offers and doesn't offer. 47:14: So fundamentally, there are two different kinds of proof-of-stake protocols. Looking at it practically, you would say that there are longest chain protocols and then there are BFT protocols.
But fundamentally, what we reverse engineered, and other people like Tim Roughgarden have also pointed this out, is that what is actually going on is a fundamental tension, a trade-off, between two classes of algorithms. One class of algorithms offers you availability, or dynamic availability. Dynamic availability means that even when only a small fraction of participants are present, the protocol continues to run and make progress. 48:03: Anna Rose: Would this be when there's 30% active? Or what's the... 48:07: Sreeram Kannan: Yeah, 30% or even 3% active, the protocol needs to continue to make progress. For example, Bitcoin, of course, continues to make progress even if the mining power... Like when the China thing happened, miners moved out of China, and Bitcoin kept running effortlessly, because it's a completely dynamically available protocol. As the mining power increases or decreases, there is no denominator for what the world's mining power is. Nobody knows what to normalize it by. And so it's self-normalizing. It just keeps adjusting itself and moving on. So we call this family of protocols dynamically available protocols. 48:40: Anna Rose: But if that's one side of a spectrum, is it sort of like, are there some PoS chains which are more dynamic? How does that work actually? They have a lower threshold needed in order to remain dynamically available or something? 48:54: Sreeram Kannan: That's right. So it's at the very core of designing a proof-of-stake protocol: do you want to be dynamically available, or do you want to be finalizing? That's the dilemma: either you're dynamically available or you're finalizing. What is a finalizing protocol? You get safety even when the network is asynchronous. When the network latency is really, really high, the system is still safe. Those are called finalizing protocols. Usually the class of asynchronous BFT protocols falls under finalizing, which is Tendermint, which is HotStuff. So I'll give you examples of real world blockchains which took one side of the fork or the other. Aggelos and the others who pioneered the longest chain Bitcoin Backbone Protocol went on to build a whole family of provably secure proof-of-stake protocols called Ouroboros, which underpins much of the rigorous longest chain work. Cardano is one of the biggest ones following that, but Polkadot also uses longest chain protocols. And there are smaller proof-of-stake blockchains which use these longest chain protocols. 50:16: On the other side of the spectrum are protocols which are finalizing. Dapper Labs' Flow blockchain uses HotStuff, which is a finalizing protocol. Tendermint, Cosmos, is a finalizing protocol. And there are others, like Ethereum 2, which want to be both. 50:33: Anna Rose: Okay. Where does Ava fall in that? I just had... 50:36: Sreeram Kannan: Ava is more on the finalizing side. 50:38: Anna Rose: Finalizing side. So it's closer to HotStuff, Flow... 50:42: Sreeram Kannan: That's right. 50:42: Anna Rose: Tendermint. 50:43: Sreeram Kannan: That's right. 50:44: Tarun Chitra: But with some weaker guarantees, right? Like, it takes more liberty with its voting mechanism.
So there's a little more variance in the expected latency, for instance, and stuff like that. But it also means you get to use fewer resources and potentially lower communication complexity. So anyway, maybe that's too much detail. But I feel like they're kind of in the middle. They have some things that are a little bit unfortunate. 51:09: Anna Rose: So I want to just state the spectrum again. Do we say this is dynamically available or finalizing? And they're sort of on either side. I had an interview some time ago that I've cited a few times, with Ittai Abraham, and there he mentioned synchronous and asynchronous. Are those the equivalent terms, or are those different? 51:32: Sreeram Kannan: Yeah, they're related but not equivalent. So synchronous means the network has a guaranteed time to deliver all messages. You're making a very strong assumption on the network: that it will deliver all messages within a certain synchrony bound, which is a latency. Let's say, in 10 seconds, all messages will be delivered. A protocol which makes such an assumption will be called a synchronous protocol. An asynchronous protocol is one that does not make such an assumption. The family of asynchronous BFT protocols would be finalizing, because you cannot have a protocol which is safe under asynchrony and also available, because you don't know whether you've got enough messages to make progress, right? When you say, oh, only 10% of the nodes have signed, I'm still going to go ahead and make progress, you don't know: is it because the remaining 90% were honest nodes who were trying to sign something and their messages got delayed? How are you going to make progress? You need synchrony in order to get dynamic availability. So that's the thing. 52:37: But I think as a fundamental operational property for the end user, it's much better to think about: are we dynamically available, or are we finalizing? And the benefit of dynamic availability is obvious: as users come and go, it's a come-as-you-please kind of model. And the benefit of finality is also there, which is: if somebody does something wrong, you can penalize them, because it's usually based on signatures. You know the committee and its size, they sign something, and if you double sign, then you can be penalized. So the taxonomist in us is basically trying to categorize this space and say, do these protocols fall in this space, do those protocols fall in that space? Is it possible that a single protocol can be both? That was another question, actually. This is another one of those things in academics: David Tse and his students started working on this, and Pramod and I started working on this, simultaneously and independently, and we were both working on exactly the same problem. Can you have a single protocol that is both finalizing and dynamically available? And there are some basic theorems which tell you that it's not possible. It's called the CAP theorem, and I think Tarun alluded to that. The CAP theorem, modeled in this blockchain language, says you can't be both dynamically available and finalizing. 54:02: Anna Rose: I wanted to ask you, since you had just mapped out where certain protocols fall, I had a few others I kind of wanted to ask you about. 54:10: Sreeram Kannan: Yes, please. 54:10: Anna Rose: Where does Solana fall? I never know what it is. Is it Tendermint? Is it... 54:15: Tarun Chitra: No, it's just PBFT.
The proof-of-history thing is kind of like a heuristic, I wouldn't call it... 54:22: Sreeram Kannan: Yeah. So it's basically a finalizing protocol with some protection against this thing called the long range attack. And I'll come to that when we explain some of our work in that space. Basically, if you look at the tree of all protocols, you have proof-of-work and then you have proof-of-stake. In proof-of-stake, you are either dynamically available or finalizing. By the way, you can't have a finalizing proof-of-work protocol, because you never know what quorum size is enough to finalize something; the total amount of mining power in the world is unknown and unbounded. So proof-of-work protocols are all dynamically available. 55:02: On the other side, you have proof-of-stake protocols, and proof-of-stake protocols can be either dynamically available or finalizing. But one difference between proof-of-work and proof-of-stake is that proof-of-work protocols do not have the same level of long range attacks, which are history reversion attacks that are quite possible in proof-of-stake protocols. So that's another meta thing. And Solana has some defense against this kind of long range attack using this proof-of-history. 55:32: Tarun Chitra: But it is heuristic. It's not like using a VDF, where you have a guarantee. It's much better from an engineering standpoint than a VDF, but it is... 55:42: Sreeram Kannan: That's right. 55:44: Tarun Chitra: You can't really prove that it works. It makes a lot of weird sort of assumptions when it's used. 55:50: Anna Rose: Let's now take that step towards answering the question: can it be both dynamically available and finalizing? 56:00: Sreeram Kannan: Actually, it was not our original idea. It was actually Vitalik's original idea to try to figure out if you can get both. And the way he was thinking about it, I don't know if people remember, back in 2018 there was this thing called Casper. Casper was supposed to be something built on top of Ethereum proof-of-work which gives finality. So this is called Casper the finality... 56:23: Anna Rose: FFG. 56:24: Sreeram Kannan: Yeah, the Friendly Finality Gadget. Exactly. And we looked at it and said, man, he's trying to have his cake and eat it too. Is it possible or not possible? Or is there some trade-off? There were not complete proofs for these things. There were hints that this may work, may not work. What's going on? And actually, Casper is kind of built into Ethereum 2.0 now, called Gasper. It's a combination of GHOST and Casper, called Gasper. And I think it's not called Ethereum 2.0 anymore as of yesterday; I think the Ethereum Foundation made an announcement yesterday. 57:01: Anna Rose: They officially made it not... Okay, I have heard before that they were distancing themselves from the name. 57:08: Sreeram Kannan: Yeah. Now I think it's just called the consensus update or consensus layer or something. But on the consensus update, which is basically Gasper, what happened was David Tse found that there are many attacks on Gasper. These were papers they wrote in 2021 and early this year, and they got a lot of attention. And so David's working with the Ethereum Foundation on how to fix several of those things. 57:35: Anna Rose: We actually did an episode on this, I think Tarun, you and I did it, like a year or two ago.
57:42: Tarun Chitra: I think it was like the beginning of the pandemic. 57:44: Anna Rose: Yeah, I think so, and it was with one of the co-authors of Gasper. We did an episode on Gasper. So I'll actually try to dig that up if people are curious, but I hadn't since heard about this work finding the problems. 57:57: Sreeram Kannan: Yeah. So the thing is, Gasper was trying to get both. Basically, the way Ethereum 2.0 works is there's kind of a longest chain, but people eventually vote on the chain enough, and then that becomes final. So it kind of has the features of both longest chain and finalizing. So that's the basic class of things, and the fundamental question we asked is: can you be both finalizing and dynamically available? And the way you can be both is you have a common underlying protocol, but different ways to confirm things inside that protocol. So Anna, you are a very, let's say, conservative person. You want to only make very safe claims. Let's say you're running a bank or whatever. So we have a common underlying protocol, and you'll only accept somebody's payment if it has gone through a very safe rule. Whereas Tarun is running a gaming engine, and he's saying, hey, let's get things moving, man. 58:52: So what Tarun is doing is, it's the same underlying protocol, but he's coming at it and applying a different confirmation rule on the protocol, which keeps the system running live. And the theorems that David showed, and we showed in separate papers... By the way, the end of the story was that David beat us to the race, and we wrote our paper like two weeks later, with Pramod's student Xuechao and with Surya, and we ended up basically bringing more properties to this kind of protocol: basically, can we build a protocol which is both dynamically available and finalizing. The fundamental blockchain is run by the same protocol, but Anna can use one confirmation rule and Tarun can use a different confirmation rule. 59:36: Anna Rose: In the same... but at the same time? Like, can you do that? 59:39: Sreeram Kannan: At the same time. And so your ledger under the safe rule will always be lagging the live rule, because you're only confirming things that have been signed off as very, very safe. 59:51: Anna Rose: Very finalized somehow. 59:53: Sreeram Kannan: Yeah, very finalized. So the way you can think of it is: already in Bitcoin, you can use a confirmation rule which is like 10 blocks deep, and Tarun can use a confirmation rule which is only one block deep. And so it's like that, but much more complex and much more general. So that's the space of things we figured out. And of course, now David is working with the Ethereum Foundation, trying to understand how you can make sure the protocol has both properties, but different confirmation mechanisms. 1:00:19: Anna Rose: What was that work you just mentioned, when you said you released this paper, but later? 1:00:24: Sreeram Kannan: Yeah. So the paper is called Blockchain CAP Theorem Allows User-Dependent Adaptivity and Finality. It's basically user-dependent adaptivity and finality. So whether you're adaptive, which is dynamically available, or finalizing depends on the user, but the protocol itself is the same underlying engine. And actually, other blockchains have also been thinking about this. Polkadot had something called GRANDPA, which is their finality gadget. And so we kind of integrated all of this into a common picture, which is again the taxonomist thing.
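To illustrate the user-dependent confirmation idea in the simple Bitcoin-style form Sreeram just gave, here is a toy Python sketch: one underlying chain, two confirmation rules, where the conservative client plays the finalizing-style role and the fast client the available role. The depths and names are illustrative assumptions; the real protocols are far subtler than a k-deep rule.

```python
# One shared chain; each client applies its own confirmation rule to it.
def confirmed_prefix(chain: list[str], k: int) -> list[str]:
    """Treat blocks at least k deep from the tip as confirmed."""
    return chain[:-k] if k > 0 else list(chain)

chain = [f"block-{i}" for i in range(12)]

fast_ledger = confirmed_prefix(chain, 1)    # Tarun: live, adaptive rule
safe_ledger = confirmed_prefix(chain, 10)   # Anna: conservative, safer rule

# The safe ledger lags the fast one but never conflicts with it:
assert safe_ledger == fast_ledger[:len(safe_ledger)]
print(len(fast_ledger), len(safe_ledger))   # 11 vs 2 confirmed blocks
```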
1:00:58: Tarun Chitra: One thing that's also interesting to note here is that a lot of layer 2s effectively offer this sort of similar promise, if you choose what level of quality of service you want. 1:01:09: Anna Rose: Oh yeah. 1:01:09: Tarun Chitra: And so you probably can generalize this to some of the layer 2s, although of course you need to make different latency assumptions because there's two different chains and sequencers and stuff like that. But you could argue that layer 2s were already trying to go in this direction, but didn't know why they were doing it, other than it was easy engineering-wise. 1:01:27: Anna Rose: I think it was because it was so slow to actually write to the main chain, right? You needed a faster approach. 1:01:33: Tarun Chitra: Well, I guess I meant in terms of their security guarantees. How people design layer 2s based on security guarantees is still quite heuristic. There hasn't been the Bitcoin Backbone Protocol paper for layer 2s. In a lot of ways, a lot of layer 2s are driven by engineering needs and not mathematical soundness needs, which is why I think they've been somewhat hard to get to production so far. 1:01:58: Sreeram Kannan: Absolutely. I think... So this point also answers something Anna raised earlier, which is: what are you not doing? You're saying you're modeling the adversary and all this, but what are you not doing? What are you sweeping under the carpet? I'll tell you exactly what we know we are sweeping under the carpet. We may not know other things. What we do know is, fundamentally, when you try to mathematically model these complex blockchain systems, there are two fundamentally different views. One view says, I'm going to assume that X fraction of nodes are protocol-obeying. You can call them honest, but I think maybe the right thing to call them is altruist. Or you can say obedient, if you want to put it differently. But basically these nodes just simply toe the party line. They're going to follow whatever the protocol founder or whatever team says; they'll just run that particular instance of the node. 1:02:54: And there's a fraction of nodes which are non-protocol-following, and they can deviate arbitrarily. They can do whatever they want. So this is one kind of model. We can think of this as an adversarial model of blockchain. Some group of nodes are adversarial, the remaining are altruist. They're called honest, but basically altruist. They're just doing what they're told. And they're doing it because it's good for the network, they're doing it because they're already happy with the revenue they're getting... Whatever, we don't care. We don't model why some nodes are honest and some nodes are adversarial. We just plop them into those groups and say either you're honest or you're adversarial. Okay? So that's one kind of modeling. 1:03:32: There is a completely different kind of modeling, which says nodes are rational. Okay? Now, when you say nodes are rational, they're trying to maximize some utility, which is maybe how much money they're making. It may factor in some of their real-world reputation. We don't know what the utility function is. So the problem with the second class is it's much, much more difficult to state things very sharply. Everything that you state is under somewhat weak models, because it's very difficult to model the full space of what utilities are and what rationality is.
So what we had done for several years, say 2018 to 2021, is completely focus on the honest-versus-adversary, or altruist-versus-adversary, models. But over the last year or so, I've been thinking a lot about the incentives, which seem to drive a lot of user behavior. That's a fundamental disclaimer. The fundamental disclaimer is: when you see a paper and it says, hey, some nodes are honest, some nodes are adversarial, it's good in some sense, because the adversaries can do arbitrarily bad things and still it's going to work. But the honest nodes have to absolutely follow protocol. 1:04:52: And then you can model it rationally. For example, Emin Gün Sirer and Ittay Eyal, their selfish mining attack is basically a rational deviation. It's not an honest-versus-adversary thing, it's basically, I'm going to maximize my revenue. So when you look at a mathematical model, you have to first calibrate: are the assumptions what you want? Will these assumptions hold? Are there reasons these assumptions would hold or not? And so that's a basic fork in the modeling choice. 1:05:29: Anna Rose: When you talk about the rational though, is there not still sort of the honest node just following orders in that? Do you really believe that all will act purely based on incentives? 1:05:44: Sreeram Kannan: No, no, I'm not at all saying that everybody is incentive-driven. And also incentives have a meta layer which is super hard to model. And the meta layer may be the following. 1:05:55: Anna Rose: Like social. 1:05:55: Sreeram Kannan: In Bitcoin, there is a very easy bribing attack. But if you execute this bribing attack, you may collapse the value of Bitcoin. Measured in Bitcoin, maybe you make more money, but measured in US dollars, you make less money. So there are all these meta things that feed into incentives which are very hard to formally model. So I'm not at all suggesting that this is the absolutely correct modeling or anything like that. But the reality is there are Byzantine nodes who don't care, they just want to take you down... Maybe they're a competing protocol, they're coming and shilling on Twitter that, hey, your protocol is bad and I'm going to take it out. So there are Byzantine nodes, there are altruist nodes, and there are rational nodes. So the fundamental correct overarching mathematical paradigm will be what I'll call a BAR model: Byzantine, altruist, rational, all mixing together in like one big party, and trying to characterize what fraction of Byzantine, what fraction of altruist, what fraction of rational your system can endure. And can you really slug it out? 1:07:01: This was actually our original formulation when David and Pramod and I were trying to formulate the mathematical problem of blockchain. But we said this is way too complex for our heads and the math to handle, so we simplified it down: let's say there are Byzantine nodes, can we solve it? Now let's say there are rational nodes, can we solve it? Then let's combine everything together, can we solve it? And we see this kind of playing out in practice in various ways. For example, one of the main things that Ethereum 2.0, now just called the consensus layer, is thinking through is the cryptoeconomics. And cryptoeconomics to me is just how to make sure it's rationally incentive compatible to follow what people are telling you to follow. And can you make it more and more compatible to actually follow it? Maybe if you attack, you're gonna get slashed.
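As a toy illustration of that BAR split, and of the bribe-versus-slash tension just mentioned, here is a hedged Python sketch. All the numbers and the decision rule are invented for illustration; real rational-node models are far richer than a single expected-value comparison.

```python
# Toy BAR-model sketch (invented numbers, purely illustrative): Byzantine
# nodes always deviate, altruists never do, and rational nodes deviate
# only when an offered bribe exceeds their expected slashing penalty.

from dataclasses import dataclass

@dataclass
class Node:
    kind: str      # "byzantine" | "altruist" | "rational"
    stake: float   # collateral at risk if slashed

def deviates(node: Node, bribe: float, slash_fraction: float,
             detection_prob: float) -> bool:
    if node.kind == "byzantine":
        return True                      # deviates no matter what
    if node.kind == "altruist":
        return False                     # follows protocol unconditionally
    expected_penalty = detection_prob * slash_fraction * node.stake
    return bribe > expected_penalty      # rational: naive expected-value rule

nodes = ([Node("byzantine", 32.0)] * 2 +
         [Node("altruist", 32.0)] * 5 +
         [Node("rational", 32.0)] * 3)

bad = sum(deviates(n, bribe=10.0, slash_fraction=1.0, detection_prob=0.9)
          for n in nodes)
print(f"{bad}/{len(nodes)} nodes deviate")
```

Even this crude model shows the design lever Sreeram describes: raising the slashing penalty or the detection probability shrinks the set of bribes a rational node will accept.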
This is one of the basic primitives that Casper introduced and many others followed suit: the idea of slashing, which is that if you deviate, you're gonna lose your money. 1:08:03: Anna Rose: And that's the economic game there. 1:08:05: Sreeram Kannan: That's the economic game there, exactly. And by the way, even though in theory people are talking about all this, I don't know of any real large-scale protocol that actually does slashing. 1:08:15: Anna Rose: It has happened. I mean, in Cosmos, it definitely happened. I remember that somebody got slashed, but it was usually accidental. That's the problem. It wasn't really malicious. It was like... 1:08:24: Sreeram Kannan: Yeah. Exactly. That's the reason people are worried when they want to bring in slashing: are you getting slashed for a real reason, or is it just some kind of discordance in the version you're running that gets you slashed? So absolutely, that's the risk with building slashing, and so it's going to take time before slashing becomes an automated thing rather than a social consensus thing, where if somebody gets slashed, I can go and un-slash them because they were all just honest nodes out of sync. 1:08:55: Tarun Chitra: Yeah, I mean, so people do a variety of very weird kinds of heuristics to avoid slashing right now. For instance, I think in Solana, they have this thing where they keep track of validator statistics. And I think it's actually centralized. I'm not actually sure how the statistics get collected and disseminated, but there's some notion of what the distribution of validator times is, and then you get penalized based on how far out you are in that distribution. In general, yeah, designing the slashing conditions is really hard. Because at some level, it has to be just as dynamic as the fastest type of attack in your system, but the system is trying to also slow things down. And so it becomes this very multi-timescale optimization problem, which sucks. 1:09:42: Sreeram Kannan: Absolutely. I think we observed this kind of central difficulty in slashing. And the idea is this: slashing is applied for safety failures, when you have two blocks which are simultaneously confirmed. Actually, we have a paper on when you can slash, called BFT Protocol Forensics. It appeared in this year's CCS and was selected as a best paper runner-up. But the core idea here is: when can a protocol have slashing? People normally think that if you have enough signatures and people double sign, you can slash. What we found is that's not absolutely true, because there are scenarios where you can create safety failures, two things getting confirmed, without actually double signing. For example, Algorand is fundamentally not slashable, the way they have designed it. Even though it has signatures and quorums and all this, it's not slashable. So looking deeply into the mechanics of what makes a protocol slashable is another line of work that we kind of embarked upon in this taxonomy of consensus protocols. 1:10:54: Tarun Chitra: So I like to view a lot of this as the natural maturation of this space. If you think about things in 2017, '18, there was a lot of wild west: like, hey, I wrote this paper, I wrote some code as a prototype, like, hey, it's going to work, put hundreds of millions of dollars into it. And I think at that time, everyone was like, the math doesn't matter, who cares, whatever, the engineering matters more.
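For readers who want the baseline: the "double sign" condition discussed above is typically checked along the lines of the sketch below. Names are invented and signature verification is stubbed out; the BFT Protocol Forensics result is precisely that evidence of this shape is not always obtainable, e.g. in Algorand's design.

```python
# Minimal sketch of the textbook equivocation check behind most slashing
# designs: two validly signed votes from the same validator at the same
# height, for conflicting blocks. (Signature verification is assumed to
# have happened upstream; and as discussed above, this check alone does
# NOT make every protocol slashable.)

from dataclasses import dataclass

@dataclass(frozen=True)
class Vote:
    validator: str
    height: int
    block_hash: str
    signature: str  # assumed already verified upstream

def equivocation_evidence(a: Vote, b: Vote) -> bool:
    """True if (a, b) proves the validator double-signed at one height."""
    return (a.validator == b.validator
            and a.height == b.height
            and a.block_hash != b.block_hash)

v1 = Vote("val_7", height=100, block_hash="0xaaa", signature="sig1")
v2 = Vote("val_7", height=100, block_hash="0xbbb", signature="sig2")
assert equivocation_evidence(v1, v2)  # submit both votes as slashing proof
```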
And there's always this interplay in this industry, I think, in general, where the engineering can oftentimes get ahead of the theory, but then the engineering hits a roadblock, as we're seeing with layer 2s, where I actually think the theory is going to be very important for ensuring that they can actually get to production, unlike a lot of layer 1s. There's some roadblock, and then it takes a while for people from other fields to provide theory. Some of the people mentioned earlier, like Kiayias... I can never say the last name correctly. 1:11:49: Sreeram Kannan: Kiayias. 1:11:49: Tarun Chitra: Kiayias, thank you. And Elaine Shi and others, they were kind of working in related fields, but tangentially, and then brought in new thoughts. And I think, in general, each wave sees this improvement in theory. I think the moral of the story is there's kind of this meta-game between the engineers and the more mathematical researchers in the space, competing over whose design actually works better in prod. 1:12:21: Sreeram Kannan: Yeah. So just one thing there: when we look at security versus performance, performance can be determined by engineering. You just run things and check. You know if Solana is running 120,000 transactions per second or not by just running it. But you don't know if Solana is secure, or if Ethereum is secure, by running it. Because the space of all possible attacks is simply too mind-boggling for people to even realize. This was the basic thing in cryptography: you cannot just say that your signature scheme or zero knowledge scheme is secure because nobody has attacked it yet. Instead, you rely on very basic axiomatic assumptions, saying, okay, calculating the discrete logarithm is hard, therefore this scheme is secure. This principle somehow did not yet fully manifest itself in the blockchain world, because protocols seemed like engineering things. They are engineering in one dimension, but the dimension that they are engineering is just performance. 1:13:27: You know, to prove that one thing is more performant than the other is not that useful. I mean, it's useful to some extent to understand the taxonomy and principles, but if I'm running it and I'm actually getting better performance, nobody cares about your theory in some sense. But to prove that something is secure, you cannot just run it and show... What did you do? Did your friends attack it? Did your enemies attack it? Did you put a hundred billion dollars on it and let it sit for 10 years? Did you move the entire Wall Street into your application and see if it's secure? It's not possible, right? It's simply not possible to do any of this. You have to have mathematical methods to think through security, in a way that you do not necessarily need for performance. So when people come in from an applied systems view and say, hey, I built this system, it's great engineering, we can absolutely give them credence for building high-performance systems, but whether their system is secure has to be sieved through a theoretical lens. And here I wanna give a big shout-out to Aggelos, Elaine Shi, Rafael Pass, all of these people who brought in this lens, and in particular, this was a very important... 1:14:44: Tarun Chitra: They were the cowboy academics. You know, at that time, a lot of their peers were like, you're going into the scam thing. 1:14:49: Sreeram Kannan: Yeah, yeah, exactly.
And they stood there and wrote paper after paper after paper, sieving through the core principles in this. So it's absolutely amazing. 1:15:02: Anna Rose: As you're analyzing a lot of this work, I mean, I think a question that inevitably will come up is what happens when you think about MEV, like this other vertical, this other... I mean, maybe it was taken into account and I missed it, but I feel like it hasn't been fully brought into any of your modeling, and obviously it's a very big topic right now. So what are you thinking around that? What work are you doing? 1:15:28: Sreeram Kannan: Absolutely. So when we started seeing this whole MEV thing explode... I remember the first time, I think Ari Juels was giving a talk on Flash Boys 2.0, how these things were happening and how they blogged about it, and more people started just running front-running and arbitrage bots, all of this stuff. And we started thinking that there must be a fundamental mathematical problem underpinning why MEV is happening. And one way to think about MEV is it's really an incentive thing, right? If you're a miner, you are obviously going to prioritize transactions that pay you more fees and put them ahead of transactions that don't pay you as much. So that leads to this whole idea that you can front-run transactions by just increasing the fee that you're paying the miner. 1:16:18: Okay, this is an incentive-driven way of thinking about it. And of course we were much more familiar with, or deeply embedded in, the honest-and-adversarial way of thinking about it. And the adversarial way of characterizing this problem is the following: if you are an adversary and you are mining a block, you can unilaterally determine the order of transactions in this block. If you're just running the geth node, you're going to order transactions in a certain way. But if you're running some other version of the node, you could order transactions in the way that you want. So that's an adversarial behavior. And the point is anybody can mine blocks, and an adversary can mine blocks, and their mined blocks will have a very different order. 1:17:01: So then we went back to the basics of distributed systems. And we see that there are really two pillars of distributed systems security. One is called safety, which says that a confirmed transaction should not be reverted. Another is liveness, which is that an honest transaction should eventually get confirmed. But what we are looking at here is fundamentally a third dimension, and the dimension is: how is the order of transactions in the final ledger related to the order of transactions entering the network? So it's a concordance property between transactions flooding into the network and transactions entering into the ledger. People had thought about it a little, but no deep work had gone into creating systems which give strong guarantees on this concordance between the arrival order and the finalized order. 1:17:59: Anna Rose: And would you say like high concordance? Or I don't know if I'm saying that word right, but if it's high in that dimension, does that mean that it's more similar? So there's less kind of funny business happening in the MEV camp? 1:18:11: Sreeram Kannan: Exactly, exactly.
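One simple way to quantify that concordance, purely as an illustration and not a metric taken from their papers, is to count how many transaction pairs got flipped between arrival order and ledger order (essentially the Kendall tau distance):

```python
# Illustrative concordance metric between the order transactions arrived
# and the order they ended up in the ledger: count pairwise inversions.
# Zero inversions = perfect agreement between the two orders.

from itertools import combinations

def inversions(arrival_order: list[str], ledger_order: list[str]) -> int:
    pos = {tx: i for i, tx in enumerate(ledger_order)}
    return sum(1 for a, b in combinations(arrival_order, 2)
               if pos[a] > pos[b])  # pair that arrived a-then-b got flipped

arrived = ["tx1", "tx2", "tx3", "tx4"]
ledger  = ["tx3", "tx1", "tx2", "tx4"]   # tx3 jumped the queue (front-run)
print(inversions(arrived, ledger))       # -> 2 flipped pairs
```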
So in fact, this is one of the big properties that a centralized exchange offers today. The New York Stock Exchange will take enormous efforts to make sure that transactions are processed in the order in which they arrived, going so far as synchronizing their servers across different cities to millisecond resolution and so on. Okay, so this ordering, fair ordering, is also a property mandated by something like the SEC, because you wanna make sure that the big guy who can pay for ordering does not outrun the small guys who cannot. And we were seeing ripples of this in things like Robinhood doing order flow, things like that. 1:18:59: Anna Rose: I'm just realizing the early, early stock market probably had the MEV bot equivalent, which was the person who ran the note from one place to another. 1:19:08: Tarun Chitra: It kind of all still happens. Having worked in HFT, it's the same thing. It's just that it's more competitive on a hardware level than it was before. 1:19:19: Anna Rose: I'm picturing when orders were written on paper and run across the room. I just want to have a vision of what an MEV bot would look like if it were corporeal, and I'm picturing... that's why I'm saying early, early, like pre-digital. But, yeah, I guess it's something that's always been there. 1:19:38: Sreeram Kannan: It's always been there, but once an order arrives at the exchange, there is no unfair ordering. But to arrive at the exchange, you can always kind of try to get things close to the exchange or whatever. But coming back to the blockchain, I think the guarantees offered by today's blockchains are far worse than the guarantees offered by a centralized exchange. Because you can front-run just by putting in a higher transaction fee, and that's just like a zero bar on what is needed to do front-running. 1:20:07: Anna Rose: Totally. 1:20:08: Sreeram Kannan: So what happens in this world is, if you're doing large trades, you can put in a higher transaction fee. If you're doing small trades, you cannot put in a higher transaction fee, so you're obviously going to get front-run. And to me, this goes against the spirit of decentralized finance. What are we decentralizing if we cannot offer at least some protections for people whose orders are small? Okay, so what we are saying is, can we create a system which does fair ordering? And fair ordering means: if a transaction A arrived in the system earlier than transaction B, and you have to define what that means and so on, then it should appear ahead. And in fact, our work shows that on both types of protocol, longest chain protocols as well as finalizing protocols like BFT, you can actually modify the protocols, with very little overhead, to still have fair ordering. So that's the line of work. 1:21:16: Tarun Chitra: Although one caveat here still is, in the Themis result, you still have some level of trust in this aggregator, where it's somewhat permissioned, right? 1:21:28: Sreeram Kannan: Oh, yeah. So just to give a quick overview of what that is. Normally in these BFT protocols, there's a leader, and the leader makes a block and sends it to everybody. Here, the leader is not supposed to make a block unilaterally. Instead, the leader has to collect everybody's observed orders, because the order arrivals are different at different nodes; it's a distributed system.
So what the leader has to do is collect all the different people's orders of the different transactions and then come up with an aggregate order, and that's what they have to put on the chain. And if they did something wrong, it's a slashable offense, because anybody can challenge them, and if they can't justify the order, they can get slashed. So the point that Tarun is mentioning is, of course, it's still dependent on one node, but if the node does do something wrong, they can get slashed. And what you can do is increase the scale of slashing by saying that not just one node needs to sign on it, but 10 nodes need to sign on it. So 10x the collateral can be slashed. If you want the entire network to be slashable, you can say that everybody needs to sign on it. So there's a kind of natural trade-off between complexity, which is how many people need to sign on stuff, versus the guarantees that you're deriving out of it, or the slashing that you want to derive out of it. So that's the finalizing protocol work, called Themis. 1:22:51: Anna Rose: Themis, that was what I was looking for. What's the name of it? Themis. 1:22:55: Sreeram Kannan: That's called Themis. And Themis builds on top of some of the work done earlier by Ari Juels and a student, Mahimna Kelkar. My student Soubhik and I worked with them to actually come up with this protocol called Themis, which is way more efficient. They'd already come up with the basic principles of what fair ordering means; that protocol was called Aequitas. Aequitas is a BFT protocol which offers some kind of fair ordering. What we did is both improve the scale of fair ordering and reduce the complexity back to what it would have been without adding fair ordering. So that's one line of work. The other line of work is on longest chain protocols. Another thing happening in things like Ethereum 2.0 is short reorgs, where miners can kind of reorg the last few blocks. And our protocol there is called Hammurabi, going back to ancient Babylon. 1:23:53: Tarun Chitra: I mean, you have Themis. You have Babylon, Themis and Hammurabi. You're just like, you're really going... 1:23:59: Anna Rose: Archaeology protocols. 1:24:03: Tarun Chitra: Yeah. It's kind of ironic to name the newer protocols after older things. You're going backwards in time to make future things. 1:24:14: Sreeram Kannan: No, but I think some of these things actually organized and created law and order and organization in society at that time, and potentially newer protocols can do the same. 1:24:25: Tarun Chitra: That's an alpha leak. I don't think anyone ever explained to me your choice of paper names until this moment. 1:24:33: Sreeram Kannan: Absolutely. So we also have longest chain protocols where you do not have a miner make the ordering unilaterally. The ordering comes out of an emergent understanding, or a consensus among many people presenting their views of how the order should be, and then that gets synthesized into the finalized ordering. So that's the line of work we have in this direction. 1:25:00: Anna Rose: All of this stuff... In an interview I did recently, I was actually talking to a team building the bots that exploit MEV, and I'm wondering, here you talk mainly about the miners and the work for the miners, but do you actually remove the ability of these bots to use strategies to actually get something? Maybe it's not all the attacks, but are there still some attacks that get through?
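To make the leader's aggregation step described above concrete, here is a toy "median rank" rule in Python. This rule is invented purely for illustration; the actual Aequitas/Themis constructions are far more careful, handling Byzantine reporters, Condorcet cycles, and batch-order fairness.

```python
# Toy sketch of a leader aggregating nodes' local orderings into one
# order, by sorting on the median position each transaction holds across
# the reported views. (Assumes every view contains the same transactions;
# not the actual Aequitas/Themis construction.)

from statistics import median

def aggregate_order(local_orders: list[list[str]]) -> list[str]:
    txs = local_orders[0]
    rank = {tx: median(order.index(tx) for order in local_orders)
            for tx in txs}
    return sorted(txs, key=lambda tx: rank[tx])  # sort by median position

# Three nodes saw slightly different arrival orders:
views = [["a", "b", "c"],
         ["a", "c", "b"],
         ["b", "a", "c"]]
print(aggregate_order(views))  # -> ['a', 'b', 'c']
```

Because the aggregate order is a deterministic function of the collected views, anyone holding the same views can recompute it, which is what makes a dishonest aggregation challengeable and hence slashable.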
1:25:28: Sreeram Kannan: Absolutely. I think the scope of attacks is something... You can broadly categorize front-running into targeted front-running, which is, I want to sandwich your transaction, Anna, and extract some MEV from it; that I would call targeted front-running. And then there's non-targeted front-running. Imagine the price fall of Bitcoin like two, three days back. What I could do is say, I don't want to particularly front-run anybody, I just want my transaction to go before other people's transactions, because I know the price is crashing and I just want to get out. So there's a class of methods which can defend against targeted front-running, basically by saying that I obfuscate, or just send a commitment of what I'm going to send, and then reveal it later. 1:26:20: Anna Rose: This is like the threshold decryption type of thing. 1:26:23: Sreeram Kannan: Yeah, this is exactly threshold encryption. Absolutely. So that'll solve targeted front-running, because I don't know your transaction to target it. But still there is a massive scope of non-targeted front-running, which is: prices are crashing, I want to take all the price advantage, I'm a big trader, I'm going to put up a big transaction fee and go ahead of all the other small traders, even though they actually sold ahead of me. This is basically like creating an insider mechanism for order flow, which is what people don't like about Robinhood and Citadel and all these guys coming together. But what we feel is that the protocol should offer the absolute strongest guarantees on timing. On top of it, when you build your DeFi applications, you could absolutely do things like not responding to each transaction separately, but batching them up, because you don't want... Just because Tarun sent an order early, I don't want to give him an advantage. I'm going to batch things at a slower resolution. In fact, this has been suggested over and over for things like the stock exchange, and you could absolutely build it. Once you have a fair-ordered system, you could build these batching things at a higher layer. But if you have a non-fair-ordered system, you can never make whole the people who got screwed in the unfair ordering. 1:27:48: Anna Rose: So, Sreeram, I know that you have a lot of work, a lot of papers, a lot of various works happening, and we probably won't be able to get everything into this episode. We're really at time at this point, but I'd love to hear what are you working on right now? Like, after all of this work behind you, what are the focuses? Maybe you can highlight a few things and maybe we can bring you back some time to go deeper on these. 1:28:15: Sreeram Kannan: Absolutely. So one thing is, when we talked about slashing, how do you build a system... Even more basically, proof-of-stake on the one hand has certain kinds of properties, and proof-of-work has other kinds of properties; can you build a system which is better than both? So this is a question that we've been ruminating over, and one particular direction it's culminated in is a protocol called Babylon, which lets you derive security from Bitcoin to protect your proof-of-stake network against long range attacks.
So suppose you want to build a Cosmos chain, but you want to also be resilient to long range attacks. And let's say your own token market cap is only like a hundred million dollars, but you don't want your security to be bounded by your own token market cap. Instead, you want to borrow security from a security powerhouse, Bitcoin. How can you do that? So that is a theoretical project we did called Babylon, and David Tse, who is a professor at Stanford and my former advisor, has taken it up and is actually building a project around it. 1:29:17: Anna Rose: Another archaeology project, just to note? 1:29:21: Sreeram Kannan: No, actually building it. An engineering project, actually; they're trying to borrow Bitcoin security. On the other hand, I myself have been on leave over the last eight months building a company called Layer Labs. And Layer Labs is trying to solve a problem which goes fundamentally to the root of governance in blockchains. If you look at blockchains today, like Ethereum or any of the other blockchains, how does innovation get in and move the practical blockchain industry? It's very difficult for innovation to find a place in existing blockchains, because blockchains upgrade through unanimous or near-unanimous agreement, which is a very rare thing in society. What do we all as a society agree on unanimously? Almost nothing. But blockchains have to have unanimous or near-unanimous agreement to get new innovation in. 1:30:15: And this has bugged me a lot. Why? Because we come up with these cool algorithms, and we don't necessarily want to start companies or anything. We just want to have other people adopt them. If Bitcoin were a company, absolutely they would have adopted ideas like Prism inside it. But Bitcoin is not a company, and that is to its credit, because it's a decentralized network. So how do we build decentralized networks that do not have unanimous dynamics? We need to build decentralized networks which have opt-in dynamics. For example, can we take the Ethereum network and say, oh, 20% of Ethereum stakers like my new protocol or my new idea, and they can provide service on it? What does that mean? How do you build systems like that? I'm just trying to solve the problem that we faced as innovators in this space again and again. When you come up with an innovation, you have to build your own trust network. Let's say Solana comes up with a new virtual machine. Why do they have to build a new network? Why not just run the virtual machine on Ethereum? Because there is no way to run a new virtual machine on Ethereum. So we face this again and again and again, and I think this is massively slowing down the pace of innovation in blockchain, because new innovation needs to build new trust. By decoupling innovation and trust, if you can get Ethereum to have opt-in dynamics, if you can have Solana have opt-in dynamics on new innovations, you completely change the equations of adoption. So that's what we're building in Layer Labs. 1:31:45: Anna Rose: It also sounds like you're almost rethinking the role of the agents that actually do a lot of this kind of consensus building. Like, the validators would potentially change, or the miners, I guess, in some cases. Like, would they dramatically be a different beast in this? 1:32:01: Sreeram Kannan: I think so. I think we have to give credence to the free will of these validators.
So today you say, oh, everybody uses the Ethereum virtual machine or whatever this consensus protocol is, because it's universally accepted by all of us, and that's good. But why not let each of these Ethereum stakers also have the free will to say: I am putting my ETH at double jeopardy, not only providing the base service, but also running, like, a new EVM 3.0 or a Solana Sealevel virtual machine on top of Ethereum. What does that mean? So basically we're addressing the question of what agency is in blockchains. Validators are not passive, where they just run the thing you tell them to run; instead, they have active agency in deciding their own risk-reward trade-offs, and potentially earning additional yield by actually providing many more useful services. 1:32:58: Anna Rose: Crazy. You also start to create almost subsets in the chain itself, almost sub-communities, sub-applications. This is very interesting. 1:33:07: Sreeram Kannan: This also goes to something that Tarun asked about L2s. How do you build L2s which have strong value alignment and the full security of L1s is something I think we can address. And we'll hopefully talk about all this at some... 1:33:20: Anna Rose: Yeah, so I think we're going to have to invite you back, maybe in the next few months, to continue this conversation. We've covered amazing ground so far, I think, in this episode, and a lot of your background and what led you to a lot of this work. And yeah, I guess you've kind of just teased the audience a little bit here with what we can talk about next. 1:33:39: Sreeram Kannan: Absolutely. I'd love to come back. I really enjoyed the chat. Thanks, Tarun and Anna, for bringing me here. 1:33:45: Anna Rose: Yeah. 1:33:46: Tarun Chitra: Yeah, I mean, in a lot of ways... There are a lot of engineers who listen to this podcast who find ideas or things to implement from listening to this, and maybe aren't so deep in the theory. So I think sometimes it's really good to give some exposition of what's been happening, especially since there haven't been conferences in two years, or really many live conferences. So I feel like it's always good to hear about these types of things. 1:34:19: Sreeram Kannan: Absolutely. 1:34:19: Anna Rose: Yeah. Thank you so much for coming on the show. And I want to say a big thank you to the podcast producer, Tanya, the podcast editor, Henrik, and to our listeners: thanks for listening.