Ben: Hello, and welcome to PodRocket. Today, I'm here with Jorge Sancha, who is the co-founder and CEO of Tinybird. How are you, Jorge?
Jorge Sancha: Hey, I'm great. Thank you. Thank you for having me.
Ben: Yeah, of course. So I'm excited to learn about Tinybird today and what you're building. And a lot of exciting things happening, I know, recently, just unveiling your Series A yesterday, I think.
Jorge Sancha: Just yesterday.
Ben: Yeah. Well, big congrats on that. So, yeah, maybe you could give us a quick overview of what Tinybird is.
Jorge Sancha: Sure. So Tinybird is a platform and, at the same time, a set of tools that allow developers to build data products over real-time data without having to worry about scale. So you can start throwing data at Tinybird, build some pipes, some SQL, and turn the result of those queries and those pipes into API endpoints that you can start exploiting right away from your applications.
Ben: Got it. So I have some sort of data, I'm constantly sending it into Tinybird, and what kind of data format might that be? Would it be any shape of data, any protocol, any format, or what is the API for ingesting data into Tinybird?
Jorge Sancha: Yeah, you can ingest a number of formats. CSVs, JSON, Parquet. But the actual source of the data can be varied. It can be streaming data, like coming from Kafka, for instance, or coming from Kinesis. That's usually the case for event-type data, things that are happening, and that are append-only. Like facts, let's say. And then you can have dimensional data also in Tinybird that you bring maybe from a data warehouse, like a products table, or a companies table, or a warehouses table, or something like that. That you can then use to join in real time and incrementally with the data that's coming in a streaming fashion, and then transform it and enrich it such that you can then exploit it much more easily than you would otherwise.
Ben: Got it. And you mentioned writing SQL queries.
So explain where SQL comes into play. I've ingested some data into it. Is it then transformed so that I can write SQL against it? Or how does that exactly work?
Jorge Sancha: Yeah, because we're very focused on building APIs that you can quickly whip up and then start integrating into your product, if you think about what an API normally entails, when you're building your own APIs, you have some code that will run a SQL query of some kind against your data store. If you're using a relational database, it will be SQL or something similar, or some other query language if you're not using a SQL database. But you're writing queries against the database, and then you'll be doing other things. Like you'll be ensuring that there's security for those calls, and you'll be ensuring that it's scalable, and you'll be doing a number of things in order to put those APIs in production. We take care of all that.
Jorge Sancha: We want the developers just to think about the SQL part. What is it you want to extract from the database in order to expose it to your application, and then switch it on as an API, and the result of those queries will be the output of your APIs, let's say. And you can easily add parameters. You have a templating language that you can apply on top of the SQL so that you can pass in parameters, you can pass in conditions, and you can define your own errors, and those kinds of things. But we take care of the scale, we take care of the security so that, as a developer, if you know SQL, you can build data products that scale to whatever you're trying to scale to without being a data expert, or a data engineer, or having an army of data engineers behind you in order to do your job.
Ben: And I guess, under the hood, what is the underlying data storage layer that you use? Have you built on top of BigQuery, or some sort of existing tool, or is it all custom? What does that look like?
Jorge Sancha: We use an OLAP database, an open source database called ClickHouse.
ClickHouse is a super powerful database that's designed to scale both horizontally and vertically. We were blown away when we started using it a few years ago, and we became experts at using it.
Jorge Sancha: But the thing with ClickHouse, we always say it's a bit like a Formula One car. If you've been driving for many years, and you have a team of expert mechanics and so on, then you can make it run at 300 kilometers per hour. And, essentially, what we're trying to do is enable any developer to drive the Formula One car. So any driver, let's say, you at least have to know SQL, but if you know SQL, then you can build APIs that will scale and will take advantage of that performance and that scalability that ClickHouse brings without having to worry about what's happening in the background, whether you're using replication, or how to replace data when you've made a mistake, or what to do when something fails, and so on.
Ben: Got it. And, yeah, I've definitely been hearing a lot about ClickHouse recently. I'm not super familiar with how it works. So I'm curious, when you look at just native ClickHouse, how do you query it? I assume it doesn't support SQL, so what does their query language look like?
Jorge Sancha: Yeah, no, it supports SQL.
Ben: Oh, it does. Okay.
Jorge Sancha: Yeah, it does. It's not 100% standard, but it's almost there. And then it has a number of added functionalities and functions that are incredibly powerful as well to do statistical calculations and so on. But it's a columnar database. It's designed to do aggregations and do calculations over columns of data. So it's really good at inserting data very fast, and it's really good at querying data very fast, and running these analytical queries.
Not so good at keeping one individual record changing all the time or doing updates basically, or deletes, and we cover for that with functionality we've built on top such that you can do those things for certain use cases without having to worry about what you would have to do under the covers, let's say.
Ben: Got it. And on the ingestion side, when you're typically ingesting data into ClickHouse ... Let's say I'm not using Tinybird. I'm just using ClickHouse. I have the types of data you mentioned, streaming data, JSON, CSVs. What would typically be the process for getting that data ingested into ClickHouse, and help me understand what Tinybird might do on top of native ClickHouse to help with that.
Jorge Sancha: Makes sense. So ClickHouse has a set of different functionalities to ingest data, so you can ingest CSVs, or you can ingest JSON, or it has an engine to ingest Kafka data and things like that. With Tinybird, we simplify that such that you can do that from the browser. Just click, click, click, and you can start ingesting data right away, and building those endpoints. But we also bring observability to the mix, because the reality of ingestion is that normally it goes wrong. Like if you have CSVs, generally you'll have different formattings. And CSV is not a standard. It's just a set of guidelines really. And then people interpret them in different ways. So we've invested a lot of time in ensuring that if you have problems ingesting, instead of stopping your processing, we have very good, let's say, reporting capabilities in terms of telling you, "Hey, this went wrong." But the data that was correct continues to be ingested, and the rows that had problems are here, and you can fix them, and then add them to the rest.
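Editor's aside: the behavior Jorge describes here (keep ingesting the good rows, set the bad ones aside with a reason so they can be fixed and re-added) can be illustrated with a minimal Python sketch. This is not Tinybird's implementation. The names `SCHEMA`, `IngestReport`, and `ingest_csv`, the three-column schema, and the sample data are all hypothetical and exist only for this illustration.

```python
import csv
import io
from dataclasses import dataclass, field

# Hypothetical schema for illustration: column name -> converter.
# Each converter raises ValueError when the value cannot be parsed.
SCHEMA = {"user_id": int, "event": str, "amount": float}


@dataclass
class IngestReport:
    """Result of one ingestion run."""
    accepted: list = field(default_factory=list)     # parsed records
    quarantined: list = field(default_factory=list)  # (line_number, raw_row, reason)


def ingest_csv(text, schema=SCHEMA):
    """Parse CSV text, quarantining bad rows instead of aborting the run."""
    report = IngestReport()
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:  # empty input: nothing to ingest
        return report
    for row in reader:
        line_number = reader.line_num  # source line of the row just read
        if not row:  # skip blank lines
            continue
        if len(row) != len(header):
            reason = f"expected {len(header)} fields, got {len(row)}"
            report.quarantined.append((line_number, row, reason))
            continue
        try:
            record = {name: schema[name](value) for name, value in zip(header, row)}
        except (KeyError, ValueError) as exc:
            reason = f"{type(exc).__name__}: {exc}"
            report.quarantined.append((line_number, row, reason))
            continue
        report.accepted.append(record)
    return report


# Made-up sample: line 3 has a non-numeric user_id, line 4 is missing a field.
SAMPLE = """user_id,event,amount
1,purchase,9.99
two,purchase,5.00
3,refund
4,purchase,12.50
"""

report = ingest_csv(SAMPLE)
```

With this sample, the two well-formed rows end up in `report.accepted`, while the rows on lines 3 and 4 land in `report.quarantined` along with the reason each one failed, so they could be corrected and submitted again without re-ingesting the rest.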
Jorge Sancha: So we built functionality around ingestion such that you don't have to go deep into the logs to figure out exactly what happened, but actually you can just quickly solve it and understand what's going on.
Jorge Sancha: So we were basically building on top of the things that ClickHouse already has, and replacing some of the things. For instance, we have our own Kafka ingestion. We don't use ClickHouse's own Kafka engine because it was limiting us in some of the things we wanted to do, such as not having to define a schema. For instance, when you start ingesting, in Tinybird, the schema gets defined automatically based on the data that's coming in. Whereas in ClickHouse, you'll have to define that schema, and then iterate, and so on. So all of that iteration that is required when you're building a data product, we make that iteration easy for anyone regardless of how knowledgeable they are about ClickHouse.
Ben: And then on the API side, help me understand a bit more about what that looks like. Are you automatically generating an API schema based on, as you said before, automatically understanding the schema of the data, putting that into ClickHouse? Is that what you then use to build API endpoints that I can build a frontend quickly on top of?
Jorge Sancha: More or less, yes. Essentially, in Tinybird, you have the concept of data sources and you have the concept of pipes. Data sources are how we abstract sources of data that could be coming from different places. It could be a Kafka topic, and you want to join it with a CSV that you ingest every hour, for instance, or some other data stream that you want to join with. And then with pipes, you can materialize data on ingestion. So as data comes in, you can be transforming it and materializing it, but you can also create API endpoints. So these pipes, they're inspired by Jupyter Notebooks.
I don't know if you're familiar with them, or Python notebooks, such that you can write a simple query and then use another node to query the result of that previous query. Or you can do joins between nodes. So you can incrementally solve the queries that you're trying to write rather than having to build these massive, not very performant queries. You can break that up in chunks and work incrementally.
Jorge Sancha: And then once you're happy with a result, that result, which would be a table of some kind, or a columnar structure of some kind, that's what you expose as an API. And you can control the results using parameters and so on, as I was saying earlier.
Ben: Got it. How about API security, and error checking, and all the annoying parts of building an API? Do you handle those as well?
Jorge Sancha: Yes. So we have token-based security such that you can do things like ... Well, when you create an API endpoint, you can decide to just have one token that gives you access to that particular endpoint, or you can create tokens that give you access to a number of endpoints. And then you can do row-level security as well.
Jorge Sancha: For instance, this is very typical when you're building user-facing analytics for your product. If you're trying to build charts that show your users how they're using the product, you can create a token that says "This token is only good for company ID equals A" or whatever. And then that will always pre-filter the data such that the same API can work for multiple companies, let's say. So that's how we handle security, and we have APIs to handle those tokens such that you can refresh them. Every time a user logs in or creates an account, you can create a token for them. Those kinds of things such that you can manage your tokens easily as part of your normal process, let's say.
Ben: Got it. So taking a step back, I'm curious. ClickHouse is open source or mostly open source, I know.
Is Tinybird, or parts of it, open source? Is any of it open source, or is it all going to be more of a software-as-a-service platform?
Jorge Sancha: So we are a software-as-a-service platform. We are contributing heavily to ClickHouse. We have our own ClickHouse team, and we've been helping with everything from performance work on certain functions, to improving logging for materialized views, to implementing a number of improvements that obviously we're interested in and that benefit the community as well. And then we have a number of things that we are thinking about open sourcing that we're using internally.
Jorge Sancha: One thing that we don't like doing is open sourcing for the sake of open sourcing. As in, if it's a project that we think is interesting, then we want to pay attention to it, and we want to put it out there, and support it really well, and really be on top of pull requests and issues that people bring. Because in the past, we've put ... Not in Tinybird, in previous companies and projects, you open source because, hey, yeah, of course. Let's do this. It makes sense. And then you have a lot of frustrated people because you're not paying as much attention to it as you should.
Jorge Sancha: So there's a couple of things. For instance, we have an amazing graphical tool to explore time series data that we want to open source. But, again, until we feel, hey, we have enough bandwidth and this has enough substance for us to support it well, we'll probably keep it closed source, and then contribute when we're ready to support that well.
Ben: Yeah, that makes a lot of sense. And people often underestimate the burden of open sourcing something and doing it well in terms of managing the community, and responding to issues and pull requests, and marketing the open source project. And so it definitely makes sense to only do it if there's a strong need to do so, or it's particularly valuable for the community.
Jorge Sancha: Exactly.
Ben: I'm curious.
So I know with ClickHouse, they came out a few years ago, and then more recently, there's been a company started with a lot of the original builders of ClickHouse. And they're now launching their own software-as-a-service Cloud offering of ClickHouse. So I think that's going to have some of the things that make it easier to develop, like automatic scaling, and it's hosted so you don't need to worry about a lot of the more difficult parts. So I'm curious, will that be competition for Tinybird at all? I know there's more you do in terms of building the APIs on top of the data that I don't think ClickHouse really covers, but would you see that as competition, or as more complementary?
Jorge Sancha: Look, of course there'll be competition in terms of certain use cases, but I think we're targeting very different people in the sense that there's always going to be users for databases that want to have full control of every single parameter of a database, and full control over the whole life cycle of it. But in our view, and based on everything we've been discussing with our customers and users and so on, really the key interest people have, based on everything that's going on, and because they want to move fast and they want to compete and so on, really what people care about is systems that are fast, that are reliable, and that give you amazing time to market. And that's what we're focused on: enabling developers to do that without really needing to understand what's going on under the hood.
Jorge Sancha: So ClickHouse, those guys are amazing, and we have a great relationship with them, and we talk a lot to them. But we see it as very complementary.
They're, of course, going to be very appealing to people that are already using ClickHouse, or looking to move from some other OLAP database to ClickHouse, whereas we are appealing to developers at companies of all sizes who don't want to worry about, or are not interested in, being the ones in charge of hosting their infrastructure and so on, but want to do it in a serverless way, and scale without having to make the decisions about how many CPUs do I need, or how much memory, or how do I migrate this table. We want to provide rails for those developers to move quickly, let's say.
Ben: So tell me about some of the use cases you're seeing for Tinybird. Maybe any particularly interesting customers, or types of data, or just general use cases that you find interesting.
Jorge Sancha: Yeah. We are blown away. The use cases we see today versus the ones we were imagining initially are very different. And I think that's due to the power of real time. Real time is easy to discount when you already have your ETLs running, and maybe you have a one-hour lag from the moment some transaction or something happens on your website or application until you see the result. And a lot of people tell us, no, we don't really need real time, but actually what we are seeing is that once you realize you can do things in real time, it fundamentally can change how you operate your business because it enables you to work in a different way.
Jorge Sancha: You start thinking about it in a different way. You start thinking about it on a more reactive and a more opportunistic basis. Like, hey, if I know right now if my campaign is functioning or not as I expected, I can react to it immediately. I can even automate all those insights and react to changes automatically. Whereas, before, analytics has always been used to understand the past. Now it's much more that you can actually affect the experience of your users in real time, which is pretty powerful.
Jorge Sancha: So we see a lot of use cases from user-facing analytics, which is the one that repeats the most, because we can take events either through Kafka or directly through our own high-frequency HTTP ingestion endpoint, and immediately expose those results as charts so that customers can see in real time how their users are experiencing their applications and services and so on.
Jorge Sancha: Vercel, for instance. I don't know if you're familiar with Vercel, but they're a great company, a platform as a service for building React applications and Next.js applications, and scaling them. So very similar to what we do, but for code rather than data. These guys, for instance, power Vercel Analytics with Tinybird such that their users can see in real time how their applications are being experienced.
Jorge Sancha: So that's a very typical use case for us, but we have others like real-time personalization, for instance. I was sort of alluding to this. We take data that is being generated through events, and use that to affect the next step of the user journey, for instance. So based on what they're doing, but also based on what other similar users are doing. So a typical use case would be, in e-commerce, sorting the product grid based on what people are looking at, or sorting the pictures of a given product based on what people click more. Those kinds of things. It's really powerful and it's really trivial to build with something like Tinybird.
Jorge Sancha: And then we see things around automation as well, security, anomaly detection. Operational analytics as well, like operational dashboards for entire companies that want to have real-time analytics. And so those are the types of things that we are seeing a lot.
Ben: And so you just raised your Series A, and congrats again on that.
Jorge Sancha: Thank you.
Ben: 37 million is what was announced.
Jorge Sancha: That's right.
Ben: Can you tell me a bit about what the plans are over the next year or two for the funding in particular?
Jorge Sancha: Well, there are three pillars to it, I think. One of them is the product of course, and improving and taking our vision forward in terms of how we believe that real-time data products at scale should be built. And that includes investing a lot in the user and the developer experience, such that it is really a pleasure to build things, and scale them up, and iterate them, and so on. We invest a lot in that, but we just have amazing ideas of things that now you have to do that you won't even have to worry about in the future.
Jorge Sancha: Then there's a lot as well around connectivity, because one of the key blockers for any data platform is you have to get people to get data into your platform in order for them to do something useful with it. So we want to remove all the friction there such that there are connectors to everything under the sun in terms of where your data can be and what data you want to bring into Tinybird to exploit it in real time.
Jorge Sancha: We can run in any Cloud now, but by default, we run in Google Cloud at the moment, and we want to go multi-Cloud such that you can choose where you want to run Tinybird. Whether it is Azure or DigitalOcean or AWS or whatever.
Jorge Sancha: And then, in general, in order to do that, a lot of the money goes to hiring people, and hiring really good people, to do that. And I'll take this opportunity to say that we are hiring heavily and aggressively both in Europe and in the US.
Jorge Sancha: And then the last big thing is our US expansion. All the founders are from Spain. We started out in Spain. I'm now gradually moving to the US and moving the family and everything, and headquarters now are in the US, and our go-to-market team is led from the US. And so our US expansion is also another big focus for us.
Ben: Where did you choose to move in the US? I'm curious.
Jorge Sancha: I'm going to New York.
Ben: Nice.
Jorge Sancha: I've already been going back and forth. Moving a whole family and everything, I have two kids, is not an easy thing. So it's, unfortunately, not something that I can do from one day to the next. So I'm in the process there. But I love New York, and we think New York is practical for us in many senses. The time zone difference for working with the team here is very important. Also direct flights to Madrid, where a big chunk of the team is, are also important. And New York is, in many ways, the center of the world, and every time I walk around Manhattan, I have a feeling like there's 15 potential customers in every building at least. So it's a great place to find customers, and it's a great place to find talent. It's a hub for data companies as well. A lot of great data companies are there, like Datadog, for instance. And so, yeah, it just feels like the right place for us.
Ben: So, Jorge, thanks so much for joining today. It's been great learning about Tinybird and hearing about your exciting plans for the future. For folks out there who are interested in using the product or learning more, what's the best way to do so?
Jorge Sancha: The best way to do so is to go to tinybird.co, and to sign up, and check out our docs and our guides to get started. There's a nice onboarding guide when you sign up for the first time. This is just the right moment because, interestingly, although we've always been focused on developers, and our mission from the beginning was to enable developers to build data products over data at any scale, actually, we landed big customers early on, and that forced our hand to focus on stability, on performance, on multi-region and things like that, but not so much on the self-service experience. And now, finally, very recently, we came out of private beta, and it's now open for everyone. So anyone can sign up and start for free.
Jorge Sancha: As long as you are under a thousand requests a day, you can do it for free. And then, yeah, there's usage-based pricing, so you can pay as you grow if you build stuff with Tinybird.
Ben: Great. Well, thanks again. It's been great to have you on the podcast.
Jorge Sancha: Thank you very much. Thank you very much for having me and, yeah, I hope to come back in a year or so and tell you a lot more about Tinybird and our progress.
Ben: Well, we'd be happy to have you again.
Speaker 3: Thanks for listening to PodRocket. You can find us @podrocketpod on Twitter, and don't forget to subscribe, rate, and review on Apple Podcasts. Thanks.