Ben: Hello and welcome to PodRocket. Today, I'm here with Ustin Zarubin, who is the Co-Founder and CEO of Batch. How are you, Ustin? Ustin Zarubin: I'm good. Thank you for having me. Ben: So super excited to learn about what you're building. Maybe you could give us a quick overview of Batch, what it does, and how it helps developers. And the one caveat I want to give you ahead of time is that our audience skews a lot towards frontend engineers. So I checked out Batch and it seems super cool, but maybe when you're explaining it, try to take the 10,000-foot view for folks who maybe don't have as much experience with backend systems and distributed systems. Ustin Zarubin: Of course. So the idea of Batch is basically a data observability platform for messaging systems. That could include data streams like Kafka, or just your generic queues like RabbitMQ or Sidekiq, and those kinds of, I don't know, IoT technologies. And what Batch does is it takes popular concepts of event sourcing, so something like data replays, and implements them in a more generic way so other companies could utilize it and integrate with it easily. And replays are actually a perfect parallel to LogRocket: LogRocket records sessions, and you can see how a person went through a webpage, how they interacted with it, and the data that happened. What Batch does is it hooks into your data stream and is able to create an audit log of all the data that passes through, but also manage a schema of that data for you automatically, and keep promoting it to a new version. So whenever your service or database or some backend store needs to be re-synced or reinstated to a different version, maybe six months in the past, you could just replay that data back and you'll be crystal clear. Ben: Got it.
So let's unpack that a bit, and maybe first it's worth talking about event sourcing. For anyone out there, if you haven't read it, there's a really great article by Martin Fowler I read a number of years ago. I think it's like the definitive article on the topic. So it's a great read, but in case people haven't read the article, what is event sourcing? Tell me about that architecture and how it works. Ustin Zarubin: Yeah. Event sourcing is basically having your events be the source of truth. And what that means is you're not making decisions based on the state of a database, you're making decisions based on the event store itself and all the events that are propagated throughout your system. There are a lot of other related patterns, like CQRS, which separates reads and writes, that people may be familiar with. And the funny thing is you mentioned Martin Fowler; we actually are a YC company, and during our YC application we said, here's what event sourcing is, and we linked to Martin Fowler's post. So basically, how we always looked at it is that event sourcing is allowing your system to be completely asynchronous. And it is a paradigm where you just have consumers and you have publishers, and that's how all of your services communicate. There's no database in between, and that's actually how Batch is set up internally. Ben: And so this architecture, it's not dissimilar, I guess, to how Redux works. So that's a very familiar concept to folks who have built modern front ends, where you have your state, and then your state changes through actions that are like events, and then you have reducers that take an action and modify state. Ustin Zarubin: Yeah. It's funny you say that because, even though most of my time is now in the backend, I've spent probably like five years of my career in React land, when React was, I think, version 0.11 point something, so very early on.
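[Editor's note: the consumer/publisher pattern described above can be sketched in a few lines of TypeScript. This is a generic illustration of event sourcing with hypothetical event and state types, not Batch's actual API.]

```typescript
// Event sourcing in miniature: the event log is the source of truth,
// and current state is derived by folding over the events.
type BankEvent =
  | { type: "Deposited"; amount: number }
  | { type: "Withdrew"; amount: number };

// A reducer, exactly as in Redux: (state, event) -> new state.
function apply(balance: number, event: BankEvent): number {
  switch (event.type) {
    case "Deposited":
      return balance + event.amount;
    case "Withdrew":
      return balance - event.amount;
  }
}

// State is never stored directly; it is always replayed from the log.
const eventLog: BankEvent[] = [
  { type: "Deposited", amount: 100 },
  { type: "Withdrew", amount: 30 },
  { type: "Deposited", amount: 5 },
];

const balance = eventLog.reduce(apply, 0);
console.log(balance); // 75
```

Because the log, not the database, is authoritative, replaying the same events always reconstructs the same state, which is what makes the replay features discussed later possible.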
And I remember when Redux came out, I was having a conversation with my friend, an engineer, recently, and I was like, yeah, I wish there was a way that we could implement event sourcing into Redux as some kind of middleware. That way you could funnel all your events and replay them through the backend as well. And that's actually been a topic of conversation for us for a while. We haven't found a way to do it cleanly, but we're trying. Ben: You might be able to use LogRocket. We don't exactly offer that functionality, but this is a blast from the past: when we first started LogRocket, the first functionality we built was Redux logging, and everything else came later. So that idea of capturing every action so that when you have a problem, you can figure out what went wrong, that was what got us started. But okay, jumping back to Batch. So this event sourcing architecture, on the backend it sounds like the tools companies typically use are Kafka or Rabbit or some of those tools. Is it accurate to say Batch records all the events that are going through your event sourcing system, and then when there's a problem, it allows you to replay them and recreate the state of your system? Correct me, or explain if I'm wrong there. Ustin Zarubin: That's actually very accurate. And one of the benefits of Batch is that there are very few replay systems out there. I think Amazon has one, and maybe some other big cloud providers do for their pub/sub networks. Basically, they're all time-based, which means you have to select a date range of some kind, and you replay from one date to the other, to and from dates basically. And with Batch, you can actually search for very particular events, or the data that is within events.
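[Editor's note: the distinction Ustin draws between time-based and search-based replay can be sketched as below. The types and functions are hypothetical illustrations; Batch's real query interface is not described in the episode.]

```typescript
// A stored event with a timestamp and an arbitrary payload.
interface StoredEvent {
  ts: number; // epoch milliseconds
  payload: Record<string, unknown>;
}

// Time-based replay: everything between two dates, which can mean
// pushing millions of events back into the bus at once.
function replayByTime(log: StoredEvent[], from: number, to: number): StoredEvent[] {
  return log.filter((e) => e.ts >= from && e.ts <= to);
}

// Search-based replay: only events whose payload matches a key/value
// predicate, so you replay a small, targeted subset.
function replayByKey(log: StoredEvent[], key: string, value: unknown): StoredEvent[] {
  return log.filter((e) => e.payload[key] === value);
}

const storedLog: StoredEvent[] = [
  { ts: 1, payload: { userId: "a1", action: "login" } },
  { ts: 2, payload: { userId: "b2", action: "login" } },
  { ts: 3, payload: { userId: "a1", action: "purchase" } },
];

console.log(replayByKey(storedLog, "userId", "a1").length); // 2
```

The key-based filter is what lets you replay just one user's events instead of, as Ustin puts it, DDoSing yourself with a month of traffic.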
So basically you can search key values of those events and be able to replay only a very small subset of them, versus just saying, "Hey, I need everything from last month." You don't want to replay a billion events into your system and basically DDoS yourself, in a way, if you don't do it carefully. Ustin Zarubin: So that's very accurate. And I think the other thing is that events evolve and they change schemas, and that's a really important thing that we thought about, because if you are only replaying events with the old schema, you can essentially break the system. So what we do is we normalize all the schemas for you into the most updated schema that could include everything, and that's what we actually send back to you. Ben: So do you have to define some kind of migration script that can take an old event and transform it into a new event, or is that partially automated by your system? Ustin Zarubin: It's actually completely automated by our system, because migrating schemas is a huge pain point for a lot of companies. If I have all my data in something like, I don't know, Snowflake, and then I want to go back and query it as an engineer or a data scientist, what ends up happening is chances are the schema has changed, and now you need to either migrate that data or figure out how to query it some other way. So we learned that that was a pain point for a lot of companies, and we just tried to solve for it by automating it completely. Ben: And I know there are a bunch of different event formats folks use. Back to our Redux example, where it's JSON or JavaScript objects, what are some of the event formats you see people using on backend systems nowadays, aside from JSON? And is there any limit to what Batch supports, or how does that work? Ustin Zarubin: We definitely see a lot of JSON, but there's a huge push now into Protobuf. And Protobuf is basically an encoded, it's like, how should I?
Hold on, let me figure out how to explain this better. Protobuf basically allows you to manage the schema of a payload for an API or an event bus. So for instance, in JSON, if your schema changes, that can happen and you can do it very easily, but it could break the consumers of that API. And Protobuf prevents you from having that, because you can only add, or, you can do whatever you want, but in order to do it properly, you should only add and deprecate, so that your schema continues to evolve. Because there is an index there, and it's like columns in a database: if you drop one, bad things are going to happen. So you technically don't want to drop it, you just want to add and deprecate certain columns, and you still retain the API contracts to previous versions of those consumers. Ustin Zarubin: So we see that actually a lot more, and we've seen a much bigger push into Protobuf from a lot of companies. We ourselves are big fans of Protobuf. I was on the founding team of Community, and that's where I learned a lot about it. We use the [inaudible 00:09:42] Protobuf, and now we're a Go shop at Batch, so we use Go and Protobuf and gRPC as well. And honestly, I will say, I love it, but if you want to get something out fast, gRPC and Protobuf do slow things down a little bit, especially if you're not familiar with that stack. Ben: And taking, I guess, a step back: your system is, in some ways, a backup system, as I understand it, something that you rely on in case something goes wrong so you can reconstruct the state of your database, or do debugging. Are there any other use cases besides those two areas? Ustin Zarubin: I think it goes back to the schemas that we were talking about: it's maintaining and automatically adjusting your schemas as they go. So the replay is, we say it's a replay, but it's more of a relay.
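[Editor's note: the "add and deprecate, never drop" rule Ustin describes maps to Protobuf's field numbers, which act like the database columns he mentions. A sketch of a safely evolved schema, with a hypothetical message type rather than anything from Batch, might look like this:]

```protobuf
syntax = "proto3";

message OrderPlaced {
  string order_id = 1;

  // An old field that consumers should stop reading, but which is kept
  // so payloads written against earlier schema versions still decode.
  string legacy_sku = 2 [deprecated = true];

  int64 amount_cents = 3;

  // A truly removed field has its number and name reserved so the
  // index can never be reused with a different type, which is the
  // "dropped column" failure mode described above.
  reserved 4;
  reserved "customer_name";

  // New fields are only ever appended with fresh numbers.
  string customer_id = 5;
}
```

Because old consumers identify fields by number, a payload produced under any earlier version of this message still decodes correctly, which is how the API contract to previous consumers is retained.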
So it becomes somewhat, I don't want to use the term data pipeline because that's not what we do, but it could function as a data extraction that then relays that data somewhere else. But debuggability and observability are some of our biggest features, and that's where we get the most comments from our customer base. Because they'll have events come in after they've deployed a system to production that are all of a sudden incorrect, and they can see that, so they can roll back the changes and fix it and not have an outage. Ben: And speaking a bit about customers, I'm curious what people do today, before they buy Batch. How do people solve this problem? You mentioned Amazon, and I'm guessing all the cloud providers have some basic functionality here, but what do you typically see people doing? Ustin Zarubin: Building it in house. So it's always the build versus buy argument that we cater to. And I mean, that's what we did at all my previous companies: we always built the system in house. In fact, we were talking to a really big customer, I don't want to mention any names, but you would know them, and basically they were like, "Hey, where were you guys last year? We actually implemented this in house last year for a security team." And it was verbatim a very similar architecture. And what we've come to realize is, like, we're SaaS, and actually one of our integrations is our open source tool called Plumber. So we're trying to add more and more of that Batch functionality into Plumber, and we're currently building an Electron app. So I'm back to writing a lot of frontend actually, and TypeScript. And that has been what we're now pitching to customers: don't build this, just try it out, quickly deploy this, and please give us feedback. Ben: And is cost a concern? I imagine people can have huge amounts of data flowing through these event sourcing systems.
And if you're essentially backing up a copy of every event, I imagine it's a lot of data storage. So is cost a concern at all, and if so, how do you address that issue? Ustin Zarubin: So we have two ways that we solve for that. We have something like short-term and long-term storage, hot storage and cold storage. Our hot storage is basically data sitting in a very large elastic cache, and we maintain that for you. But if you wanted to do something else with cold storage, you can choose any platform's blob store, and we will automatically encode the data into Parquet for you and maintain that Parquet schema. And Parquet is just a very fancy data encoding/compression format. So we're trying to keep your costs low on your infrastructure as well. Ben: And do you typically see customers utilize that cold storage? Or can you get a lot of value from Batch by just maintaining some window of 30 days, or however many days of events, and then just throwing away the data past that point, given you probably don't need it if you haven't had a problem come up within 30 days? Ustin Zarubin: It's funny you ask that, because I used to think that way too, that hot storage was the thing that was very important. What I've realized with most of our customers is they actually want it for their archiving. And the reason is because, let's say some security event occurs at your company. You don't only want to see what happened five minutes ago, you want to see what happened five minutes ago, one hour ago, one day ago, and you want to try to piece a pattern together to investigate how this could have occurred, or why it occurred. And so most people actually want to be able to retain very long sets of data and be able to search through them. That's also functionality that we enable. In fact, our latest customer actually only wanted it for the archiving feature.
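[Editor's note: the hot/cold split described above could be sketched roughly as follows. This is a hypothetical retention helper, with the Parquet encoding step elided; it is not Batch's actual implementation.]

```typescript
// Recent events stay in hot storage for fast search; older events are
// flushed to a cold blob store (where a system like Batch would encode
// them as Parquet). Events are represented by their timestamps only.
interface Tiered {
  hot: number[];  // timestamps kept in hot storage
  cold: number[]; // timestamps destined for the archive
}

const DAY_MS = 24 * 60 * 60 * 1000;

function tier(eventTimestamps: number[], now: number, hotWindowDays: number): Tiered {
  const cutoff = now - hotWindowDays * DAY_MS;
  return {
    hot: eventTimestamps.filter((ts) => ts >= cutoff),
    cold: eventTimestamps.filter((ts) => ts < cutoff),
  };
}

// A 1-day-old event stays hot; a 50-day-old event is archived.
const now = 100 * DAY_MS;
const result = tier([99 * DAY_MS, 50 * DAY_MS], now, 30);
```

The archiving use case Ustin describes is why the cold tier matters: a security investigation queries the archive, not just the hot window.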
Ben: And if the main use case is archiving, what is the value of Batch versus more traditional logs? Like, if you're not going to be replaying events, and you don't necessarily need the ability to transform or do the migrations from the schemas, how does a system like Batch compare to just traditional Splunk or Datadog logging? Ustin Zarubin: I guess, because these aren't logs, these are actually system data, which is the events that have flowed between all of your microservices. So it's more raw and more, how should I say? Fruitful, to be able to know the exact piece of data that occurred and how it was transformed. Because logs are just print statements inside of your microservices somewhere, or inside of your backend, and you could put whatever you want there; some are clear, some are not. And this is a very exact piece of data, and if you search for that one ID, you can actually see how that one event has transformed over the course of a year inside of your system, basically allowing you to see how your system has changed. Ben: Got it. So the classic example for Redux is a to-do list. Give me an example here of what would be a piece of data, and how seeing the changes over time would be valuable. Ustin Zarubin: Sure. I think the easiest example is maybe security. I don't know how to frame this example correctly, but let's say someone logged into an account and then logged into a whole bunch of other accounts in the course of 10 minutes. Chances are that's suspicious: why would you need to log in to 10 different things all at the same time? And you want to be able to track that person's ID or email across multiple systems, or across multiple events, and craft a little composite view around what actually happened.
So like Redux, where you can see how all the states changed and you can roll back or move forward, you want to see the same thing to understand how this person maybe attacked your system, or whether this was actually a harmless coincidence. Does that make sense? I hope that answers it. Ben: Yeah, definitely. So essentially, the fact that you have a more structured understanding of the events in the system makes it much more powerful than just a text-based logging system, in terms of debugging or reconstructing how things changed over time. Ustin Zarubin: Right. Because, let's say you have a database and your billing service, for instance. Going back to the Redux example, the reason we can go back and forward is because you're actually emitting that physical event or removing it, you're pushing and popping it. And it's the same thing here. I could deploy a microservice and backfill its data store with only that one hour of data, to see how it's going to function. Ben: If I think about the Redux example, I have my Redux state, and it changes over time as the reducers take actions and change that state. And if I just had the last hour's worth of actions, I couldn't necessarily recreate the state, because there was some original state and then it was transformed by those actions. So is there a concept of snapshotting of state, or how does that work? Ustin Zarubin: I think it's called bookmarking in the event sourcing world, and yes, we are actually working on that feature in particular. But if you're truly an event sourced system, which is our ideal customer, the way you obtain that cache, or that foundational state, is through the event bus. So by us replaying those events back into your event bus, technically you should have all the original state, because the whole point of event sourcing is that in a way you're stateless; you're just a consumer or a publisher.
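[Editor's note: Ben's concern, that a window of actions alone can't rebuild state without the starting point, is exactly what snapshotting (or "bookmarking") addresses. A minimal sketch with hypothetical types, not Batch's feature as implemented:]

```typescript
type Action = { type: "add"; value: number };

// The same fold as any Redux reducer.
function applyAction(state: number, action: Action): number {
  return state + action.value;
}

// A snapshot captures state as of some position in the log, so replay
// only needs the actions recorded after that position.
interface Snapshot {
  upTo: number; // index into the action log
  state: number;
}

function restore(snapshot: Snapshot, log: Action[]): number {
  return log.slice(snapshot.upTo).reduce(applyAction, snapshot.state);
}

const actionLog: Action[] = [
  { type: "add", value: 10 },
  { type: "add", value: 20 },
  { type: "add", value: 5 },
];

// Snapshot taken after the first two actions (state = 30); replaying
// from it yields the same result as replaying the full log from zero.
const snap: Snapshot = { upTo: 2, state: 30 };
console.log(restore(snap, actionLog)); // 35
```

In a fully event-sourced system the snapshot is optional, since the whole log can always be replayed from the start; the bookmark is an optimization, and an accommodation for systems that aren't purely event sourced.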
So that would be our ideal customer, but we've learned that not everyone's like that, not everyone wants to build an event sourced architecture, so we accommodate by trying to implement these bookmarking features. Ben: So that's a good segue. You mentioned that's something on your roadmap; what does the next year look like for the product? Ustin Zarubin: As I mentioned earlier, we're going more toward this open source, open core pattern, and the vision is that we're trying to recreate Batch inside of an Electron app and inside of Plumber, which is now going to be a thing that you can deploy. The managing of schemas, that's now going to be open source, along with alerting and archiving. So basically, just more and more features tagging on to event sourcing. And we're exploring many different avenues right now, like whether there are uses for this in blockchain, or whether we should maybe do data flow tests, where you see how an event went through all your systems and check that it got transformed how you wanted it to. So we're definitely actively exploring a lot of areas right now. And I wish I could share more, but we're just trying to be a little stealthy on our open source at the moment. But our launch is coming probably around September. Ben: Awesome. And who do you see as primary competitors? We talked a bit about how you're quite a bit different from traditional logging tools. How about the distributed tracing tools, do those types of products bump elbows at all with what you're building? Or if not, who would you say are potential competitors? Ustin Zarubin: That's a really good question, because, maybe I shouldn't say this, but we haven't found anyone that's directly competing. I think we have a lot of features similar to features of different companies, but that's not our core product, it's more just a side effect of what we're building.
But if I had to pick, I wouldn't say Confluent, but Confluent does do a lot of this observability on top of event buses. The thing is, though, that they are Kafka-based, whereas we speak all kinds of different protocols. I think we support over 10 different messaging technologies, and we keep adding more and more as we go; we actually added KubeMQ this week. And that's what sets us apart. Ben: And I'm curious to learn a bit more about the go-to-market side. So are you now in beta, or have you fully launched, and how are users finding you? Ustin Zarubin: We're fully launched. Our go-to-market has been, let's cash in every single friendship token we've ever had to cash in as founders. And most of the time we're trying to look for people that are using tech like Kafka, like Rabbit. And honestly, I think it's a very traditional sell: you do your outreach, you figure out who is actually going to use this, you figure out what problems they're having and how this solves their problems, and you go with that. But lately our go-to-market has been us working on open source, because we've done basically zero marketing on Plumber, which is our open source tool, and it's getting close to a thousand stars and a thousand downloads every month. So it's basically growing and growing and growing, and we want to support that community. So that's probably what our go-to-market strategy is going to look like in the future. Ben: You recently did Y Combinator. I'm curious to hear a bit about how your experience was there, and is that something you'd recommend for other infrastructure and developer tools companies? Ustin Zarubin: A hundred percent. So my YC batch was the summer '20 batch, and it was the first completely remote batch, because during the winter '20 batch COVID happened, unfortunately, and places started to close, so that one was, I think, half and half, and we were the first fully remote.
So I recommend YC to everyone, especially their application process, because it really helps you think about the business, what you want to build, and your grand vision of the future. And having that all on paper actually makes it more real than having it all in your head. It's like good documentation for engineers: you've got to get it presented, you've got to make it accessible so that other people can learn from it, and that's how it is with YC. Ustin Zarubin: The batch itself, obviously I have nothing to compare it to, because people always ask me, "Well, what's it like versus an in-person batch?" I'm like, "Well, I don't know, I was never in-person." But I thought it was very well done. I actually got to spend more time with our partners at YC than I think I would have in-person. So it's highly recommended to anyone that is even interested, no matter what stage of company you're at. Ben: Do they still ask on the application the question, I think it was, tell us about a time you successfully hacked a non-computer system? Ustin Zarubin: Yes, I think they do. Ben: Do you remember your answer? Just curious. Ustin Zarubin: Yes. So as I mentioned, I think before we started recording, I'm originally from Russia, and my wife is too. I met her here, and she had a lot of immigration problems. She, I don't know, got stopped randomly at an airport, and basically they were just giving her a really hard time about everything, and made sure that she basically couldn't even adjust or extend or change her status. And I was a really broke student at the time, I was studying physics, and I didn't have money to hire a lawyer, but I was like, this is the love of my life, I've got to figure something out.
And I ended up reading a whole bunch of immigration law and, I wouldn't say figuring out a loophole, but figuring out a way to get her to stay here legally. I filed all my paperwork, I wrote all my letters, I reached out to all the immigration people that I needed to, and I ended up getting them to grant her status to stay. Ustin Zarubin: And then when I ended up getting my citizenship, I could also get her a green card. So basically I hacked the immigration system, as weird as that sounds; that was my own personal hack. Ben: Wow, I mean, that's amazing. I'm so glad it worked out, and certainly that knowledge of the immigration system will come in handy as you grow the team. Hiring folks who don't have citizenship always requires way too much understanding of immigration. So going back to the beginning, I'm curious, you touched briefly on your background, you said you worked at Community; what inspired you to found Batch? Ustin Zarubin: I realized that at every single company, I was building an event sourced system in some way, shape, or form, in some places way more broad than others, way more detailed than others. And I didn't actually know it was called event sourcing for the longest time, until I was at Community and someone linked me a Martin Fowler article, and I was like, huh, this is what I've been building, and this is a system I believe in. And I always wanted to make a very generalized solution, and that's actually what inspired me to apply to YC. I actually linked to both of those Martin Fowler articles inside my application, explaining what event sourcing even is, because I think it's actually hard to explain. Ben: Awesome. Well, Ustin, thank you so much for joining us today. This has been fantastic, to learn about you and to learn about Batch.
We'll definitely put a link to Batch in the episode description, and we'll also put a link to the Martin Fowler article that we kept talking about. So folks out there, it's definitely a great read if you're interested in learning about event sourcing. And finally, is Batch hiring? Or if folks are interested in contributing on an open source basis, are either of those routes available? Ustin Zarubin: Yeah, definitely look us up: Batch Corp/Plumber is our GitHub handle, and we would love to get a lot of help and feedback around our open source tooling. Ben: Awesome. Thanks so much for joining us. Ustin Zarubin: Thank you so much. Ben: Thanks for listening to PodRocket. Find us @PodRocketpod on Twitter, or you could always email me, even though that's not a popular option; it's Brian@logrocket.