Anna (00:07): Welcome to Zero Knowledge, a podcast where we talk about the latest in zero knowledge research and the decentralized web. The show is hosted by me, Anna. Fredrik (00:17): And me, Fredrik. Anna (00:17): In this week's episode, we learn about Oasis with Vishwanath Raman, who is the privacy architect at Oasis Labs. But before we start, I want to say thank you to this week's sponsor, Trail of Bits. Trail of Bits recently published a new guide for building secure contracts with their Crytic offering. Crytic is a SaaS-based GitHub application created by Trail of Bits to continuously assure your Ethereum smart contracts are safe and functional. It reports build status on every commit and runs a suite of security analyses, so you get immediate security feedback. Check out their guide, which I've added in the show notes, for tips on how to build security into your dApps from the start, as well as how to use the Trail of Bits suite of tools for automated vulnerability detection. So thank you again, Trail of Bits. And now here's our episode on Oasis. Anna (01:19): So this week we have Vishwanath Raman, who is a privacy architect at Oasis Labs, with us. Welcome to the show, Vishwa. Vishwanath Raman (01:26): Well, thank you very much. It's a pleasure to be here. Anna (01:28): So I think it would be interesting for us to understand a little bit about you before we start on this interview. What were you working on before you joined Oasis? Vishwanath Raman (01:38): Oh, I started off in 1995 working at Silicon Graphics. I think that is one of the things that I'm really proud of. We, uh, rendered SGI paperless, which is way before we had Google, which is way before we had all of the things that we are now used to, uh, in the way in which we interact with the internet. So this was one where we built a complete, uh, you know, PKI-based system. We had a backend, uh, set of services that could process these electronic folders, as we called them.
So these would be electronic forms: when people submitted any kind of requisition, for instance, it could be for purchases or for leaves or whatever, it would all go into our system, which would then figure out, based on the content of those folders, how they had to be routed through the organization. Vishwanath Raman (02:22): So that was what we did way back in '95 using C++ STL. And in fact, back in the day, it used to be the case that, uh, you'd be able to use Perl for the front end, and Larry Wall, the inventor of Perl, was within hollering distance from where I worked. So if you had questions, you would go directly to Larry. So that was me back then. I worked at SGI for a year and a half or so, and then moved to the design automation space, where I worked on formal verification techniques for a while. This was for nine years or so at Synopsys. And then subsequent to that, I decided to go back to school. I went to UC Santa Cruz, got my PhD in computer science, and then loved academia so much that I decided to stay on at CMU as a postdoc for two or three years. And then after that, Dawn reached out; she had started this company called Insight, which was a mobile security company, and she wanted me to join them. So that is what I did. This was way back in 2013. And so that's... Anna (03:25): Pre-blockchain, right? This is not a blockchain project. Vishwanath Raman (03:29): This is pre-blockchain. And so I worked with her there, mostly using static analysis techniques to see what kinds of vulnerabilities exist in applications, Android as well as iOS. But Android was, and continues to be, much easier, not just to find vulnerabilities in, but also to analyze, right, because it's bytecode. Uh, yeah. Uh, so that was what I did there, subsequent to which I went to, uh, a container security company. So I've been in security and privacy ever since 2013 or so. So I went to a microservices, uh, company.
This was for about three and a half years. I was working on machine learning techniques that could be used primarily for anomaly detection. Then Dawn started Oasis. And for me, it was just a matter of time before I joined Dawn again on her next adventure. So that's exactly what happened. So I joined Dawn. This was 2018, when the company was about three months old. Anna (04:26): I see. So that was actually a follow-up question that I had. You're the first person from Oasis that we've had on the show. Um, but we've heard about your project for quite a while. I think actually Noah spoke at an earlier ZeroKnowledge Summit, so I have some reference through that. But I'm curious to hear, like, how did Oasis start, and when did it start? Vishwanath Raman (04:49): Um, it started in 2018, and this came out of Dawn Song's lab in Berkeley, because Dawn, as you know, has been in security and privacy for the last, I think, 20-plus years. And this was a project that was incubated pretty much at Dawn's lab. At that point, if you really think about it, what was prevalent in terms of the limitations of technology for blockchain platforms was scalability, because you had blockchain 1.0. I mean, it's still that way, but I think things are changing for sure. We, as well as other projects, have different, interesting solutions for scalability. We have ours, of course, which we are hoping to use to address the problem. But more importantly, you had blockchain 1.0, and then you had 2.0 at that point, which was Ethereum. What we decided was not just to handle scalability: if you architect a system from the ground up so that you not only address scale but also privacy and confidentiality, then that would make a significant difference to the whole ecosystem. Vishwanath Raman (05:53): And that was the genesis of Oasis.
And given Dawn's background, given her expertise, I think it was a really natural thing for her to work on not just a blockchain platform, you know, which provides integrity, but also one that provides confidentiality and privacy. So that was the way Oasis started. Anna (06:08): What is Oasis actually? Because I've heard about a lot of projects and initiatives and research that's come out of Oasis, but, like, what would you call the project today? Vishwanath Raman (06:20): Sure. I would call it a layer one blockchain platform. New Speaker (06:23): Okay. Vishwanath Raman (06:24): A pri... privacy-first layer one blockchain platform is probably the best way to describe what it is and does. And if you think about it, Oasis is not just the blockchain platform, but also all of the tooling and the technologies that enable people to take advantage of the confidentiality and privacy guarantees that the platform provides. So what we have chosen to do is not just spin out a blockchain platform, but also to make it easy for us to attract developers who are not just blockchain developers but also enterprise developers. So right from the very outset, the goal was not just to enable the sorts of transactions that you would find in blockchain networks that are prevalent; even today these are mostly token transfers and things of that nature. Vishwanath Raman (07:05): What we really wanted to do was to enable demanding applications, for instance, applications that do machine learning training, or those that can be used for predictive analytics, or, um, you know, that do differential privacy on top of a secret database, for instance. So things of that nature. We didn't want to be limited, given the fact that we wanted to attract developers who are not just blockchain-savvy, but also those that belong to enterprises. The goal from the very outset has been to make it as easy as possible for devs of any ilk to be able to use the platform.
And that, I think, has been the overarching concern for the platform. But to answer your question, yeah, that would probably be the best way to describe it: it's a privacy-first blockchain platform with the tooling that makes it easy for devs. Yeah. So... Fredrik (07:50): I first came in contact with Oasis probably in 2018, or maybe early 2019, because back then some of it was built on parity-ethereum, or what was parity-ethereum back then. Then we talked at ZeroKnowledge Summit, and it feels to me like it's gone through a number of pivots and a number of changes in architecture. Um, if you just look at the project today, how is that built? Like, did you start building a layer one from scratch, or is it still based on Ethereum? What is the actual tech stack? Vishwanath Raman (08:23): Interesting. Yes. I mean, it was always a layer one blockchain platform, for sure. At that point when we started, at least, we were using Parity, as you say, and for consensus we were using the Tendermint open-source code. But in order to handle the problem of scale, from the very beginning we have taken this approach where you can have any runtime that interacts with the consensus layer. There's a clean separation of concerns between the consensus and execution layers, and execution can happen through, for instance, an Ethereum-compatible runtime; the execution environment could be a wasm runtime that's based on Parity. It doesn't matter at all. The consensus layer is completely oblivious as to what the runtime capabilities are. Vishwanath Raman (09:10): And it's not the case that all runtimes might provide confidentiality. So we are completely open to enabling different runtimes.
And the runtime that we run as a part of Oasis is a confidential runtime; we have a reference architecture and an implementation of a confidential runtime. That is a wasm runtime, which is again Parity-based. And that's the thing that anyone can run. I mean, if anybody wants to run a wasm-based confidential runtime, they can do so using our open source. Um, of course. New Speaker (09:39): When you say it's Parity-based, do you mean, like, is it Substrate? Vishwanath Raman (09:44): Oh no, no, no, because we started with parity-ethereum, even before Substrate happened. I remember the conversations that we had with the Parity team at that point when Substrate was emerging. And I think at that point Parity was clearly more interested in Substrate than in what had been done before. And so there was that move even within Parity. So no, it doesn't use Substrate, to answer your question. Fredrik (10:07): Okay. It's interesting, because you've sort of architected it the same way, right, where you're kind of saying we're splitting these concerns apart: the layer one, the platform in which this runtime runs, doesn't matter all that much, and what's important is this runtime. Vishwanath Raman (10:27): And also, the contract between the runtime and the consensus layer is something that the consensus layer dictates, meaning that there is a well-defined API, and the consensus layer is responsible for committee selection: not just the validator committee selection, but also the compute committee selection. And this is a randomized process that's used to select these, right? And then also token transfers, the native token transfers that need to happen for transaction fees, and so on. All of that is handled by the consensus layer. And more importantly, what we are enabling at the runtime level is verifiable computing.
And what that means is there should be a way by which the consensus layer can verify that the compute being done by a particular runtime meets, you know, the consensus standard. So it meets, for example, some definition of lack of discrepancies before it actually gets committed to the blockchain. Vishwanath Raman (11:13): And that gives us tremendous flexibility, really. And in fact, that's the way in which we hope to attack the problem of scale, because, you know, of course there's sharding and so on and so forth, but more importantly, we want to be completely agnostic. As I was saying earlier, there is a contract between runtimes and the consensus layer, and there should be a contract, which is completely decentralized, between the runtimes and the people that want to use those runtimes. And we don't want to be prescriptive when it comes to the way in which runtimes satisfy those constituents. Those could be enterprises; they could be developers. If they are blockchain developers, then they are used to a certain set of tooling, and there's nothing that stops anyone from running a runtime that provides those services, those sets of tools. Vishwanath Raman (11:56): It could be a Solidity runtime, for instance; it doesn't matter at all. And then when it comes to enterprises, because from the beginning it's been important for us to satisfy these two groups, and when it comes to enterprise workloads, many times you need permissioned execution environments, and then they may want to participate in the consensus layer. They might want to participate in a public blockchain to some extent: there may be some data that makes it into the blockchain, but then the rest of it is something that might be maintained locally within those runtimes.
So we want to be completely flexible for enterprises: they can choose to participate in a public ledger to the extent they want, whereas a large part of the workload could run within their environments. And then the only thing that we would do is, before anything gets committed to the blockchain, the whole mechanism of verifiable computing comes into the picture. We make sure that we do this process of discrepancy detection, which I can get into a little bit, and which enables us to run much smaller compute committees. Anna (12:53): When you say verifiable computing, can you maybe define what that means? Because I feel like we have versions of verifiable something mentioned on this show, but it might be referencing something else. I'm just curious what that means, exactly. Vishwanath Raman (13:08): It depends. In fact, one of the things that I should say is that when you're talking about verifiable computing, what is it that you're verifying, right? It could be that you're verifying integrity, or it could be that you're verifying integrity and confidentiality. So for instance, if it happens to be just an integrity verification, then you would need to make sure that some threshold or some bar is met for there to be integrity in the computation that is being done by the execution environment. For instance, if it's 3f+1 consensus, then we want to make sure that f+1 compute nodes agree on the result of a computation before it can be committed to the ledger, right? Which means that you are actually doing discrepancy detection so that you can run with much smaller committees, right? You can run with f+1 committee sizes, for some definition of f, and then you can figure it out. If all of them agree, then you can simply commit, but if there is a discrepancy, then you can invoke the full-blown consensus mechanism.
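As an illustration, the optimistic f+1 discrepancy-detection step described here could be sketched roughly as follows. This is a minimal sketch under stated assumptions; the function name and structure are hypothetical, not Oasis's actual implementation:

```python
def discrepancy_detect(commit_hashes):
    """Optimistic check: given result hashes reported by an f+1 compute
    committee (signature verification assumed done elsewhere), accept
    only if every node reports the same result hash.

    Returns the agreed hash, or None to signal that the expensive
    correction step (full-blown consensus) must be invoked.
    """
    if len(set(commit_hashes)) == 1:
        return commit_hashes[0]  # all f+1 nodes agree: safe to commit
    return None  # discrepancy detected: fall back to full consensus

# Usage with a hypothetical committee of f+1 = 3 nodes:
assert discrepancy_detect(["abc", "abc", "abc"]) == "abc"  # commit
assert discrepancy_detect(["abc", "abc", "xyz"]) is None   # escalate
```

The design point is that the common case touches only f+1 nodes; the 3f+1 machinery runs only when the cheap check fails.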
So verifiable computing is the step that is done before commit, where you're verifying the results of the computation along two different axes: integrity, or integrity and confidentiality. Anna (14:12): Is this something that exists in a lot of different systems, or is this quite unique to Oasis? Vishwanath Raman (14:18): As far as I know, it's unique to Oasis. You don't see this happen anywhere else. Fredrik (14:22): Even in any normal blockchain? I mean, you have verifiable computing in the sense that, like, even in proof of work, the miner verifies the transactions and, you know, commits the work. And I think the more apt comparison is in proof of stake, where you have a set of validators saying, yes, this is correct, we're going to commit this to the chain. Yeah. Um, but then, I mean, a zkSNARK is also verifiable computing, right? Where you compute something and then you submit a proof, and you verify that computation by running the verifier on the proof. So that's verifiable computing. But I actually want to dig in a little bit to the protocol of these committees that you're talking about, because this sounds similar to a couple of other projects; Dfinity is kind of going down this path too, and, like, a couple of others. Right. But I want to pull it back into what we talked about with scaling, because you mentioned you're also trying to approach the scaling problem. Right. But is it then from the angle of saying the miners, validators, whatever, don't have to recompute everything, that they can just verify... Vishwanath Raman (15:39): Correct. Fredrik (15:40): ...or that you have this other committee structure? Vishwanath Raman (15:44): It could be the former, where they can just commit, because of the fact that the runtimes are providing the capability for verifiable computing. Fredrik (15:52): Yeah.
So you're essentially removing this requirement that every full node has to recompute everything all the time. Vishwanath Raman (15:57): Yeah. So here is one of the things that I would say: the discrepancy detection that I was mentioning earlier is almost like detecting errors, and then full-blown consensus is a correction step. For the first part, the optimistic case, discrepancy detection enables us to run with much smaller compute committees. And so that's the place where you're actually detecting problems. If there are no problems, you don't have a problem committing; but if there are problems, then you want to go through full-blown consensus. So you can think of this as a two-step process. The first is the optimistic case, where you're detecting, and the next is correcting, right? If there is a problem during the detection phase, you're actually correcting using full-blown consensus. Anna (16:36): Does this land somewhere on the, and maybe this is out of context, but on the, like, synchronous, asynchronous consensus scale? Is that what you're describing, or is it something completely different? I'm sort of thinking back to that episode we did with Ittai, where we were looking at, like, BFT and where that fell. And I'm just wondering if that's the part that you're talking about, or is this action you're talking about happening before? Vishwanath Raman (17:01): I think before consensus. Okay, so if you think about it, there's the interface between the consensus layer and the ParaTimes. We call them ParaTimes... ParaTimes, the parallel runtimes, right? These multiple execution environments. At the interface between the consensus layer and the ParaTime is this verification step; the discrepancy detection happens over there. And then you have these signed commits that come from each one of the compute nodes.
And that is what is actually being checked. If you notice that all of the hashes that have been computed, signed by each of the compute nodes that you picked randomly to begin with, are the same, then you are prepared to accept them. If not, you want to go through full-blown consensus. I mean, we are not really interfering with the consensus step; we engage with it only when we find that there are discrepancies. Fredrik (17:47): When you talk about machine learning and more complicated compute steps, right, are you able to tolerate non-determinism in this game? Vishwanath Raman (17:59): Yeah. So that's a very interesting question, actually, because if you think about it, any kind of replicated compute mechanism cannot be non-deterministic, because if it is, then it's going to be a problem for you to find no discrepancies. Which is why, typically, when you run something, as we say, on-chain, which means you have these smart contracts and replicated compute, multiple compute nodes processing exactly the same transaction, we would need to control execution to the extent that there are no network accesses and, of course, no access to a local file system. And you also want to make sure that, you know, any randomness is pre-generated, something that is fed in rather than being generated within these programs, so that there is no non-determinism. But then, to be able to enable workloads such as machine learning training, or any of these demanding applications, the way we think of this is enabling off-chain compute. Vishwanath Raman (18:51): Right? So, you know, you can have a cluster of nodes; for instance, with GCP (Google Cloud Platform), we have the flexibility of using confidential VMs, right?
I mean, so for confidentiality, this is one of those things that I should have mentioned: for confidentiality, there are a number of privacy-protecting technologies. Let's call it secure computing, right? For secure computing, you can use FHE, you can use HE, or you can use Trusted Execution Environment-based confidentiality. And when it comes to trusted execution environments, you have Intel SGX, and you also have AMD SEV as the underlying technologies. And with AMD SEV on GCP, we have a VM-level encryption capability. That then makes it possible for us to pretty much run Docker instances if we wanted to. So you can have fully self-contained binaries, which is usually hard to achieve with machine learning, because it's hard to make TensorFlow sit inside a single binary. Vishwanath Raman (19:39): So what you could do in these cases is actually build a Docker image that has all of its dependencies baked in, and you can create an instance, and that then runs inside of a confidential VM, which gives you the properties that TEEs do, right? In the sense that you have attestation, and then you have encryption, which is used for the entire memory: all of the pages that have been allocated for a given process, for the enclave program running inside of it, are going to be encrypted. And so given those two, you have confidentiality that you can provide for workloads that are not restricted anymore. So you can be completely language-agnostic. It's no longer the case that you're restricting developers to write using Rust or using Solidity; now they can use anything, Vishwanath Raman (20:20): you know, anything they want, like Python if they wanted to, right? And so what we do is, the Oasis runtime provides these capabilities where you can actually engage with off-chain compute nodes to run these demanding applications.
And then the protocol is the one that enables persisting what you want to persist on the blockchain, and the entire key mediation; through attestation, you are able to verify that what's running is exactly what is expected. All of that happens as a consequence of the protocol. So the protocol is being used to verify attestation proofs and then, you know, hand keys to enclave programs so that they can access the data that they need to do whatever they need to. So this makes it even more flexible, if you think about it. Vishwanath Raman (21:07): So, as I was mentioning, given that the overarching concern is to enable developers, whether they be blockchain developers or enterprise developers (I don't want to keep repeating myself), it makes it easier for us to enable them. For instance, if there was an existing application stack that's been written, which is cloud native, running in the cloud, it makes it easier now for us to tackle confidentiality for that entire stack. I'm not saying that we are there yet, but it makes that sort of thing possible. Yeah. Fredrik (21:35): I didn't know that GCP had that, actually. That's cool. I have to dig into that further. Because, like, a big problem with TEEs is actually just deploying them; getting a TEE to run is really hard. Even though all Intel CPUs ship with SGX, most of them are, like, BIOS-blocked, so you can't actually enable it. On the specific CPUs where you can enable it, you also need, like, a motherboard that supports it and the right BIOS for that. And then it's a whole stack, all the way up through the compiler, to actually make anything work on it. Vishwanath Raman (22:13): Exactly. And think about it: it's so painful from a tooling perspective, Intel's SGX SDK. The fact that they came up with this technology, right.
I mean, it was fantastic, an incredible piece of technology, but the tooling is difficult. I mean, if it hadn't been for Fortanix writing the Fortanix EDP, it would be so much harder for you to build enclave programs. Yeah. With EDP, it's possible for you to take Rust, build programs in Rust, enclave them, and get secure outputs from them. So I think you're absolutely right, and it's been very hard for us to use TEEs from Intel, exactly for these reasons. But now there's this support on GCP; this announcement from GCP, by the way, is very fresh. It's a couple of months old. Yeah. Vishwanath Raman (22:53): It's not something that's been around for a while. And one thing that you should remember is that the TEE technology backing the support on GCP is based on AMD SEV, which does not provide attestation capabilities yet. So it gives you encrypted memory, but it doesn't give you attestation. And that is something that is expected to land by the end of the year, when AMD ships SEV-SNP, which is Secure Nested Paging. And once that happens, I think we'd have the full capabilities: the full gamut of what we can do with Intel SGX becomes possible using confidential VMs from Google. Fredrik (23:26): I want to go back a little bit to the protocol, because I still don't really understand the role of the committees. I mean, as you just described it, you can imagine a smart contract; that smart contract kicks off an off-chain worker job, and then that worker comes back with, okay, Fredrik (23:46): here's, you know, my execution, here's, in an ideal case, the attestation, and then, like, here's something for you to verify. Uh, and then the miners in the system, whatever they are, can look at that, they can verify the computation, they can say, this is legit, let's include it. Right. And so, uh, where does this, um, smaller group of validators or runners come in?
Vishwanath Raman (24:11): Oh, I see. Okay. So that would be, oh, not for the off-chain case, because, as I was mentioning earlier, we can also provide smart contract support, in which case you have the full gamut of replicated compute, where your smart contracts are deployed exactly the same way as you would deploy them using, say, for example, an Ethereum-compatible platform. So nothing changes, in the sense that this is just a capability that we provide as a part of the Oasis runtime. Vishwanath Raman (24:33): But then, what I should also mention is that this is different from the reference confidential runtime, which is based on Parity, that we make available for anyone to run, right? So this is just expanding the footprint, so to speak. I mean, blockchain developers that are used to their existing tooling can continue to use, for example, what we have, which is an Ethereum-compatible wasm runtime; that's the point at which all of these various things manifest. But what we have done for off-chain, and again, as far as the platform is concerned, it's not prescriptive, is just one of the things that we build. So we have two different runtimes: this runtime that is Ethereum-compatible wasm, which we just talked about, and the other happens to be the one I was talking about, which provides this off-chain capability. So nothing stops people from using any kind of technology. And if you think about it, you could even have an FHE-based runtime that is dedicated to permissioned machine learning training, you know, because that seems to be the application that's getting a lot of traction in the FHE community, at least. So that could be the one that gives you those capabilities, or it could be one specifically using ZKPs. Yeah. So that's the flexibility, which I think is interesting. I, yeah, I hope I answered that question of yours.
Anna (25:43): I want to, I mean, so you just mentioned, you know, you're using TEEs, or TEEs are, like, at the core of the kind of architecture that you've created. But we did do an episode on TEEs about a year and a half ago, I believe, and we've also heard a lot of criticism from the community that... Vishwanath Raman (26:02): Side-channels... Anna (26:02): ...revolves around them. And so how do you address that? Like, were you not concerned about those same issues? Vishwanath Raman (26:12): Yeah. So that's a very interesting question. In fact, we get asked that question quite often, but as you can imagine, there are multiple answers to this. One is, if you look at just TEE technologies, we are not wedded to Intel SGX. As I was mentioning earlier, we are also looking at AMD SEV, and Professor Song, with collaborators at MIT, is working on project Keystone, which is, like, an open-source trusted execution environment project. So there are multiple TEE technologies that are emerging, and we are not wedded to any given secure computing technique. So for instance, if FHE becomes more usable, let me put it that way. I mean, it's not that it doesn't give you the guarantees you seek; it probably gives you better guarantees. But there are still issues with handling your data at the scales that you would like. Vishwanath Raman (26:56): And so we are constantly watching out for improvements that are happening in that space, and if it catches up, we'd be completely open to using that as well. But for us, given that the overarching objective is to enable these demanding applications to run, we need near-native performance, and given that, TEEs are the ones that give us the best of both worlds. I mean, you can still tweak your security parameters to the extent possible, which will make it much harder for attackers to compromise these systems. But then that's always a battle.
If you think about it, vulnerability is always going to be there; it's not something that you can eliminate. There will be vulnerabilities in any system that you build, and the battle is always trying to make sure that you make breaking a system that much harder, in terms of compute, in terms of everything else that people need to bring to bear in order to break it. Vishwanath Raman (27:41): And when it comes to side-channels, there are side channels that can happen at the level of the TEE, and there are also side channels that can happen at the application layer. And for the application layer, we have solutions. You have oblivious RAM; there are various ways by which applications can at least address the problem of side channels. But you're absolutely right that when it comes to the native TEE technology, there have been attacks that have been discovered. It might take a while to discover these things, but once they've been discovered, it's much easier to replicate them; that is almost always the case in security. So that's what I would say. The guarantees that are provided are substantial, and it's an evolutionary process. We are where we are; we are providing the capabilities and confidentiality. And the best part of it is this: you know, you're now making it so decentralized that anyone that wants to submit a transaction can choose if they want to submit a transaction to a runtime that is TEE-based or FHE-based. That itself is the power that I think we bring to the table, right? Vishwanath Raman (28:37): Because now you're really empowering an individual. It depends on their perception, the trust model that they have in mind, that needs to be addressed. And if the security model, the one that is provided by your runtime, addresses their threats, then I think you're good, right? If it doesn't, then they are not going to be using that runtime; they'll use something else.
And for us, then, it behooves us to make sure that we provide a platform that enables anyone to run a capable runtime that would be appealing to developers, wherever they come from. Anna (29:08): You mentioned TEEs, which we understand in the realm of privacy technology, but how else does Oasis approach privacy? You're the privacy architect, so I'm very curious to hear what other kinds of privacy... I love that you laugh at your title. It must be a new one. Vishwanath Raman (29:32): So that is really interesting, because for me, and in fact for others in Oasis, the way we think about security is that there are three different axes, right? You have integrity, you have confidentiality, and you have privacy. And confidentiality and privacy are different things. TEE-based technologies give you confidentiality; FHE gives you confidentiality. But privacy, in our mind at least, is a little different. Let me explain what I mean by that. I think it's best to explain with an example. Let us say that you have an organization that has a database of employees, and one column is their salaries, and you want to provide an average query that can be used to compute the average salary of the employees in the organization. Now, if I run this query once before a new hire joins the company and once after, and if I know the number of people in the company, I can then determine exactly what the salary of the new hire is. Vishwanath Raman (30:25): And so you might run this computation with full confidentiality. You might use TEEs, you might use FHE, you can use whatever you want; it doesn't matter at all. But the point is that the result of the computation in this case is actually leaking sensitive information. So that is where, in the way we look at it, privacy comes in. It's an application-level construct if you think about it.
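The averaging attack in this example can be worked through numerically. The salaries and headcount below are made up for illustration:

```python
# Two exact average-salary queries, run before and after one new hire
# joins, pin down that hire's salary exactly.
def average(salaries):
    return sum(salaries) / len(salaries)

before = [90_000, 120_000, 75_000]   # salaries before the hire (made up)
after = before + [104_000]           # one new hire joins

avg_before, avg_after = average(before), average(after)
n = len(before)

# Anyone who knows both averages and the headcount recovers the new salary:
leaked = avg_after * (n + 1) - avg_before * n
print(leaked)  # 104000.0
```

Both queries returned only aggregates, yet their difference reveals an individual's record, which is exactly the gap between confidentiality and privacy being described.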
So you want to make sure that you have, of course, confidentiality at rest, confidentiality in transit, and confidentiality during compute. But then you also want to make sure that the result of the computation is not just confidential, but also privacy preserving, in the sense of the example that I just gave you. For that, there are multiple techniques. For instance, there's differential privacy, which is something that we have some expertise in. Anna (31:11): Differential privacy. What is that? What does that mean? Vishwanath Raman (31:16): You can think of it this way. Differential privacy is a technique that gives you the ability to jiggle your result a little bit. You add noise to the result in a way such that it will be privacy preserving. So in effect, with the presence or absence of a given record, a single record... these are actually called neighboring databases, databases that differ in a single record. In the presence or absence of one record, the result should still be very similar. Okay. So that's what you would want. And there are two different places where differential privacy is necessary. One is statistical queries, like the average query that I was mentioning earlier. Which means, if you have a sensitive database and you want to apply statistical queries to that database, you want to make sure that the results are differentially private. Vishwanath Raman (32:03): And the way that's typically done, the way we do it at least, is by query rewriting. Which means you have a SQL query that's rewritten so that you make it intrinsically private. The noise mechanism is actually added to the query. Which means now, once the query is rewritten, it can just run against your database. It doesn't matter what the database is.
As long as the database supports SQL and a bunch of mathematical functions, you can run this query just like the non-private query, and you will get a result that is differentially private. What's typically done, in fact, is that you use a Laplace distribution, and you parameterize the distribution based on the sensitivity of the query and on a privacy budget. Vishwanath Raman (32:46): And then that determines what the spread of the distribution is going to be. So when you sample from there, you get either more noise or less noise, based on one tweakable parameter and another that is intrinsic to the query. So that is for a statistical query. But then Dawn, in fact, through her research, has shown that machine learning models have this habit, in certain cases, of memorizing the inputs fed into training. So with some study, you can actually use the model itself to glean sensitive information that was used in its training. Yes. Anna (33:20): I see. Because it will sort of repeat patterns, and it will give back something that it has learned. Vishwanath Raman (33:26): Yeah, right. In machine translation, for example, with language models that translate from one language to the other based on a corpus, you might even have situations where sensitive information has been ingested during training and has become part of the model. So models can leak sensitive information. So for that, again, differential privacy is used. You do something similar, in the sense that you jiggle the parameters of the model so that you cannot reconstruct the inputs that fed into training it. Which means differential privacy is a privacy modality that applies not only to statistical queries but also to machine learning training.
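The Laplace mechanism for statistical queries described above can be sketched in a few lines, applied directly to the result rather than via SQL rewriting. The value bounds and the epsilon below are illustrative assumptions, not Oasis parameters:

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) by inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_average(values, lo, hi, epsilon):
    """Differentially private average of values assumed to lie in [lo, hi]."""
    n = len(values)
    clipped = [min(max(v, lo), hi) for v in values]  # enforce the assumed bounds
    sensitivity = (hi - lo) / n                      # max effect of one record
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon)

salaries = [90_000, 120_000, 75_000, 110_000]
print(dp_average(salaries, lo=0, hi=200_000, epsilon=0.5))  # noisy estimate of 98750.0
```

Here the sensitivity is the "intrinsic" parameter (how much one record can move the average) and epsilon is the tweakable privacy budget: smaller epsilon means a wider Laplace spread and noisier answers.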
And the important thing for machine learning training is that you might do federated learning, which can itself be thought of as a privacy modality. By that, what I mean is that part of the compute at least goes to the data, right? The data is not going to a central server where it's being aggregated before you apply your training algorithm. Instead, you are actually computing gradients locally, where the data is located, and that is what is averaged and then used to update the model remotely. But the model that gets updated can still leak sensitive information. That's the problem. So no matter what you use to train the model, you need differential privacy to make sure that the result does not leak any sensitive information. Anna (34:35): Interesting. So, sorry, I interrupted you with the differential privacy question, because I actually had that on my list of questions. But are there other privacy concepts that matter in the Oasis system that you're thinking about? Vishwanath Raman (34:50): Oh, we're thinking about them, but not yet. ZKPs for sure. I mean, there is a lot of work, and this is the right place to say it, but I'm no expert in ZKPs. I'm not an expert in homomorphic encryption either. But from what I understand, yes, it has a place. What is really fascinating for me, where I think zero knowledge techniques are applicable, would be in decentralized ID, for example, in credential keeping. For instance, in a web of trust, the government might give me a passport, and I might summarize that in some way, and that is information that I own as an individual. That's mine. And the department of motor vehicles, my local department of motor vehicles, might give me a driver's license that I also want to hold as a credential.
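The "jiggle the parameters" step for training can be sketched as the core of DP-SGD-style updates: clip each per-example (or per-worker) gradient to a norm bound, average, then add noise calibrated to that bound. This is a generic illustration, not Oasis's implementation; the clip norm and noise multiplier are arbitrary choices:

```python
import math
import random

def private_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1):
    """Clip each gradient to clip_norm, average them, then add Gaussian noise."""
    clipped = []
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])
    n = len(clipped)
    averaged = [sum(col) / n for col in zip(*clipped)]
    sigma = noise_multiplier * clip_norm / n  # noise scaled to the clip bound
    return [x + random.gauss(0.0, sigma) for x in averaged]

# Two workers' local gradients (made-up numbers), as in federated averaging:
local_grads = [[0.5, -1.2, 3.0], [-0.4, 0.9, -0.1]]
print(private_gradient(local_grads))
```

Clipping bounds any single participant's influence on the update, and the added noise masks what remains, which is why the aggregated model can no longer be used to reconstruct an individual's training data.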
And now, when you want to check certain things on this credential, then I can see that techniques such as zero knowledge become exceedingly valuable. So we are not oblivious to that, and we want to bring it in at some point, for sure, but not at this time, because we've been busy with other things. I can see applicability for that in many places. Fredrik (35:56): The concept of differential privacy is fascinating. I have to read up more on that. I can totally see how it works for statistical examples, because it's an average; you don't really care what the exact number is, if it's fudged a little bit, whatever. But in a machine learning model, it feels like you'd be destroying the model by introducing noise. The whole point of how you're building up the model, and why you're doing it, is to get to these exact numbers, this exact matrix. And then if you're like, oh, let's just fuck it up a little bit, then how does that work? Vishwanath Raman (36:30): In fact, there is a technique known as approximate minima perturbation. You should take a look at some of these. I can also send you some links separately via email. Anna (36:37): Actually, we should share it in our show notes. If you send us anything personally, share it with our listeners too. Vishwanath Raman (36:42): Absolutely. Fredrik (36:45): Talking a little bit more about Oasis and what you've built, you mentioned that there's a mainnet coming up. I'm also curious to hear your thoughts on this; you've already touched on it. It shouldn't really matter where you're coming from or what kind of developer you are, and it seems like you're trying to appeal to an enterprise use case as well. How do you see the breakdown between public blockchains, private blockchains, permissioned ones or unpermissioned ones, and all of these?
Vishwanath Raman (37:12): This I can cite from examples we have from working with some of our enterprise partners. Some of them clearly want to participate in a public blockchain, but I wouldn't say that that is universally true for many others, because cloud services today provide ledger technologies. For instance, in AWS you'll find there is QLDB, which gives you similar guarantees. Of course, I wouldn't say that it's as robust as a completely decentralized ledger; it's not, because AWS has to own your master keys. But be that as it may, what we see is that it's a spectrum. There are some enterprises that are completely fine with it; it depends on the use case. For example, we are working with an automotive company that has this grand plan, eventually, of doing consent-driven data capture from the dashboard. Vishwanath Raman (37:59): Which means you can, in the dash, participate in a campaign that this company runs, where they want to collect a specific piece of information, like how you're using a particular vehicle feature. And then, once you accept it, they'll collect that information. And they want transparency for the journey of the information that is being collected, in which case they might want to have at least some part of the transaction captured on a public ledger so that it becomes auditable and transparent. And for other cases, like for instance in healthcare, we find that healthcare is one of those very cagey industries; there is a tremendous amount of data siloing that happens in healthcare. They do hold their data very closely, and participating in a public ledger is something that is not encouraged in many groups within healthcare. Vishwanath Raman (38:46): But then what we see is that there is an emerging trend. I mean, with COVID, things did change.
And we think that this might be a great opportunity for there to be a disruptive change in healthcare, because I can give you various examples. Care continuum: you have IoT devices at home, and you want to make sure that you capture data from these devices. And then you want to provide this information to your doctors, or even to programs that can do some sort of pre-diagnosis and might even obviate the need to see a doctor, because there is a way of self-medicating and so on and so forth. So there are a lot of opportunities that have opened up as a consequence in healthcare. But maybe I'm deviating a little bit from your question. What I would say is that it depends on the use case. If there is consumer data involved, even in a D2C use case where the business is actually providing value for the consumer, then I think participating in a public ledger becomes even more important. So it's not necessarily the case that everything participates in a public ledger; they can pick and choose. There are certain things that will, and others may not. It doesn't matter. Fredrik (39:49): So what is Oasis's mainnet, and what does it actually set out to do? I'm particularly curious, since you talked about these parallel runtimes, or ParaTimes: which ParaTimes will be there at launch? Vishwanath Raman (40:07): Sure. Yeah. We expect to have a couple at least. One is going to be a Wasm runtime which is Ethereum-compatible, run by Second State, which we expect to land by mainnet. And then we also have an Oasis runtime. Our runtime is purpose-built for these ecosystems of data providers and data consumers, because Oasis also wants to use our platform for a responsible data economy, or responsible data use, which is the way we envisage this playing out: individuals own their data.
They can tokenize it, and then this tokenized data can participate. So data is a non-fungible asset, like real estate or cars, if you think about it, but it hasn't been thought of that way so far. We would really like to make that change. Vishwanath Raman (40:52): We would like to enable that post-mainnet, of course, where people can tokenize their data and participate in data markets. But to come back to your question, the Oasis runtime that will be there at mainnet launch is the one that enables these ecosystems of data providers and data consumers. Data providers, as a persona, have certain requirements. They want to upload data sets. They want to set policies that can be programmatically verified. The policies could be things like: only this corporation can analyze my data, and I expect the results to be in this format. For example, interpretations of the polygenic scores of your genome; that's in fact a use case that we are working on right now. So things of that nature. The Oasis runtime is the one that enables these ecosystems, to make it easier for enterprises and developers to engage with the platform. Fredrik (41:41): And what validators exist on this network? Fredrik (41:45): Right. Well, is it proof of stake, and can anyone participate as a validator, or do you need special hardware to do anything? Vishwanath Raman (41:51): No special hardware. Anyone can participate. It's proof of stake, yes, absolutely. In fact, I think that was one of the things that would also have raised concern, right? If they all needed to have SGX-enabled boxes, if they needed to have some kind of trusted execution environment... no, that's not really the case. Right now, pretty much the validators that are participating on other platforms are also participating on Oasis.
We have over 400 validators and node operators at this point. I mean, I don't know the exact numbers; it's one of those things I have to take a look at. If you want me to, I can pull up a slide and give you those numbers, but I don't know them off the top of my head. Anna (42:29): So just going back to the ParaTimes and the sidechains, and I know you may have actually mentioned this earlier in the episode, but when you talk about it, for some reason I still don't understand exactly how they're stuck together. How do they link into each other? You talked about this mainnet runtime, but are they connected, or are they running separately and not connected? Vishwanath Raman (42:49): Separately. Anna (42:49): Oh, they are. Okay. So they're different instances and there's no interoperability? Vishwanath Raman (42:54): The ParaTimes are independent entities at this point. On mainnet, they're going to be separate. Communication between ParaTimes is not expected, but that can change. For instance, there could be people that run multiple ParaTimes that have a mechanism by which they can communicate with each other. We have also been looking at how we can enable that, for instance, in the DeFi space. We have a lot of DeFi partners, and we are trying to bring them on board with their existing codebases, on runtimes that would be able to support their code, in which case it would be nice for us to have a way to communicate with those instances from, say, the Oasis runtime. So there are a lot of things that can be enabled by that, especially when it comes to data markets and tokenization and so on and so forth, because I do expect that these things will start to play. But on mainnet it's not going to be there, though it's something that we have thought of. Anna (43:44): I just wonder, like, would one of these ParaTimes be useful as a side chain to another system then? Vishwanath Raman (43:50): It could be.
Anna (43:50): Like, if there was a way to connect those back? Vishwanath Raman (43:53): Yeah. That could be, yeah, for sure. I would expect that to be the case. Anna (43:57): But I guess that could be, like, spinning up a brand new blockchain. So how would the security of those actually function? Vishwanath Raman (44:06): Oh, it wouldn't be a separate blockchain. It won't be a separate blockchain. So the thing is, ParaTimes can have their own state management, meaning that they can have their own internal state that they manage. They have their own storage nodes, they have their own key management if they happen to be confidential, and they can also have compute, all of the various pieces. But when it comes to committing anything to the ledger, there is a single consensus layer, and it has to go through that. There are no separate, multiple consensus layers participating in sidechains. No, it's not that at all; the architecture is very different in that sense. Anna (44:37): So it would still go through the same validator set? Vishwanath Raman (44:40): Exactly. Fredrik (44:41): I think maybe the simplified analogy is thinking of each runtime as being like a smart contract, but a smart contract that can have its own execution logic and its own state. You're not sharing a global state between all the contracts. Vishwanath Raman (44:55): Exactly. Yeah. Anna (44:57): Cool. So there was an announcement that came out quite recently about the CryptoSafe platform. Yes. Can you tell us something about this and how it relates to everything we've just talked about? Vishwanath Raman (45:10): Absolutely. CryptoSafe was something that came out of a discussion that we had with Binance, maybe a couple of months ago. So it's interesting. They have a unique problem.
I mean, you have exchanges that are processing so many crypto transactions: volumes of transactions, sources, destinations, and so on and so forth. And there is a lot of research going on in identifying malicious wallet addresses, for instance, or in creating fingerprints of endpoints that may be participating in fraudulent transactions. Now there is a lot of fraud. In fact, you'll notice that even for ransomware attacks these days, crypto seems to be the way by which they expect to be paid. And given all of this, I think the need of the hour is for the exchanges to pool their threat intelligence, because each of these exchanges (I'm not saying this is true for all exchanges, but there are many) has its own threat intelligence team. Vishwanath Raman (46:01): And these threat intelligence teams are continuously figuring out what would be, say, malicious wallet addresses, or what would be interesting fingerprints or signatures that identify fraudulent behavior. And if there is a way by which they can share that such that they're not required to trust each other, and not required to even trust Oasis, then I think it becomes an incentive-compatible mechanism for all of the participants. And let me tell you what I mean by that. You can also add a little bit of monetary compensation. For instance, an exchange could build a reputation; that's one of the things the platform can enable, right? Because if an exchange has a very strong threat intelligence team, then clearly their threat intelligence is going to be more valuable for the others to react to, to prevent malicious transactions. Vishwanath Raman (46:46): And so there could also be a compensation structure worked out, by which exchanges get compensated based on the value that they bring to the platform.
But suffice it to say that, to begin with, what we want to do is enable exchanges to upload their threat intelligence, and this is going to be maintained confidentially on Oasis, where Oasis does not have the ability to interpret the threat intelligence. So we have come up with a mechanism. Of course, there are multi-party computation mechanisms you could use: private set intersection, right? You can use private set intersection cardinality. There are many, many ways by which you could achieve something similar, but that also requires more compute capability from each of these exchanges. They need to have dedicated resources to be able to participate in some kind of an MPC mechanism to achieve this. Vishwanath Raman (47:31): And so what we came up with was a system that would enable them to share this data where they are not required to trust each other, and not required to trust Oasis. And we can provide this capability by which they can check any fixed bit-width field against this threat intelligence. So CryptoSafe enables exchanges to come together to share their threat intelligence, where they're not required to trust each other, nor are they required to trust Oasis. And Oasis manages the data that we get from them, to run queries that they can then use to prevent fraudulent transactions. That, in a nutshell, is what CryptoSafe is. Anna (48:07): But what does that really have to do with... I guess the privacy part here is that the data itself is kept private and can still be queried? Vishwanath Raman (48:15): Oh, it's amazing. Vishwanath Raman (48:16): It's defense in depth, if you think about it, right? Because the platform provides confidentiality. As I was mentioning earlier, we are enabling these ecosystems where data providers can be completely assured of the fact that their data is encrypted and they are the ones that own it. Right?
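As a rough picture of the pooled threat-intelligence idea, one can imagine exchanges contributing keyed fingerprints of malicious wallet addresses and querying membership without publishing their raw lists. This is a deliberately simplified sketch, not CryptoSafe's actual mechanism (which relies on confidential compute rather than a shared key), and the key and addresses below are hypothetical:

```python
import hashlib

# Hypothetical shared key; in a TEE-based design, key material would be
# held inside the enclave rather than distributed to participants.
SHARED_KEY = b"demo-key"

def fingerprint(address: str) -> str:
    """Keyed hash of a wallet address, so raw addresses are never pooled."""
    return hashlib.sha256(SHARED_KEY + address.encode()).hexdigest()

class ThreatPool:
    def __init__(self):
        self._fingerprints = set()

    def contribute(self, addresses):
        """An exchange uploads fingerprints of addresses it has flagged."""
        self._fingerprints.update(fingerprint(a) for a in addresses)

    def is_flagged(self, address: str) -> bool:
        """Check a single address against the pooled intelligence."""
        return fingerprint(address) in self._fingerprints

pool = ThreatPool()
pool.contribute(["0xBadWallet1", "0xBadWallet2"])   # exchange A's intel
pool.contribute(["0xBadWallet3"])                   # exchange B's intel
print(pool.is_flagged("0xBadWallet2"))  # True
print(pool.is_flagged("0xGoodWallet"))  # False
```

The sketch shows the query shape (a fixed-width field tested against pooled data); what the real system adds is that neither the pool operator nor the other exchanges can read the contributed intelligence or the queries.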
So that is one piece of the puzzle. But not just that: it's defense in depth, because you have confidential runtimes, you have data encrypted per exchange, which is then maintained with all the transactions persisted confidentially on the ledger. And only the exchanges can actually see the transactions that they have submitted, or their delegated authorities: they might delegate access to other identities that can then check the transactions for a given exchange. Right. And if you think about it, what is really powerful is this: threat intelligence by itself is not the most sensitive thing. Vishwanath Raman (49:07): I mean, you can argue, right, you obviously don't want it to fall into the hands of malicious participants, because then they know exactly the kind of intelligence that you are picking up about them. But more importantly, there is a lot of business intelligence in the queries being run by the exchanges, because a query from an exchange is information about an existing customer, and more often than not that customer is not malicious. And so you need to make sure that there is complete end-to-end confidentiality, and the platform is capable of giving you complete end-to-end confidentiality for your transactions, because you are not just encrypting data at rest, you're also encrypting the payload. You have key management that is responsible for generating keys for your encrypted payload, for program state, for the data that the program acts upon. And all of these are then processed inside of these secure enclaves, to give you a level of guarantees that has so far been impossible to achieve, I think. Fredrik (49:56): This problem is an interesting one.
It's actually, like, I think I've actually seen this in a textbook. It's this game-theoretic problem where you have a bunch of competitors who would all be better off if they shared the data, right? If there's a comprehensive threat intelligence set, they would all benefit. But if only one participates, then they're helping their competitors, so you shouldn't participate if you're the only one. And so, if it's a repeated game, they'll find the solution where they should all share and all be better off. Vishwanath Raman (50:32): Yeah, it's quite fascinating actually. Anna (50:35): To close out the interview, I'm curious what stage Oasis is at. You are in testnet, from what I understood, at this stage. So, yeah, where are you at, and how close are you to launching? Vishwanath Raman (50:49): Um, we are pretty close to launching. We have the candidate mainnet that is actually being tested at this point. So I would say that we are pretty close. Uh, hah hah. Anna (50:59): They can't say more. I mean, just as a side note, this may air in, like, two weeks, so it may already have happened by then. But then, what's next in privacy? What are the future things that Oasis is thinking about on that front? Vishwanath Raman (51:21): Yeah, so we have a whole bunch of use cases we're working on. So, enabling all of those. If you think about it, for me and for the company as a whole, it's the operationalization of all of the things that we've built so far. We just want to make sure that all of these things scale and become useful, right? And that there are people out there building applications that become successful with it. And if you look at the kinds of applications that people are building right now on the platform, they're fascinating. Just enabling them is going to be very satisfying, honestly.
Like, for instance, there is a genomics company working with us, where they're providing these interpretations of polygenic scores that I mentioned earlier. Then there are other companies, like for instance one that wants to enable masking technologies. Vishwanath Raman (52:03): Like, you can create a video. If you're a reporter, for instance, working from a place that is extremely dangerous, then you can create a video and use their AI technology to mask your face out, and that is then released to the public so that people can't really point out who it is. And then there is this other company that wants to, I mean, it's like a Black Mirror episode, where they want to pretty much make your memory searchable. Anna (52:27): Whoa. Vishwanath Raman (52:27): It's just fascinating, you know, if you think about it, because it's interactions such as this, where the things that we say can be captured, and from there you can do speech-to-text translation, you can do topic modeling, you can do various transforms on that data. And then you can figure out what would be interesting memory capsules that you want to capture from the conversation, and make that searchable. That's what they want to do. And care continuum is one of those things that I was mentioning earlier. In fact, the other thing that I want to say is that there is a use case that came to us recently in Australia, because of COVID, where people that have mental disabilities, or that have the need to seek mental health care, needed to be matched with providers that would offer pro bono services. But then how do you enable that? Right? Because there's credentialing; you need to be able to verify that. A lot of sensitive information is being shared with the person. How do you enable all of these things?
I mean, they are going to be maintaining all of this, so confidentiality becomes important. Vishwanath Raman (53:24): User privacy is so important; this matters for healthcare, for sure. Right? So there are so many use cases. To answer your question, I think it would be more about enabling the most fascinating use cases that have not been possible before, using the technology that we have built so far, and ensuring that it scales and they are successful. We just want to make sure that they are successful. Anna (53:42): Cool. Fredrik (53:44): Alright. It's been a fascinating conversation, and thanks so much for joining the show. Vishwanath Raman (53:48): Thank you so much. It's been an absolute pleasure. Thank you both, Fredrik and Anna. Anna (53:54): Sure. And good luck with the upcoming mainnet launch, sometime in the future, or present if this airs right when it does happen. Alright, so thanks so much, and to our listeners, thanks for listening. Fredrik (54:14): Thanks for listening.