Dain Sundstrom:
The fact it's open source is not an accident. Building a database takes, I don't know, five to 10 years. And I don't want to work on something for five years, and then have some corporate effort change, and then your five years of code just goes in the trashcan.

Eric Anderson:
This is Contributor, a podcast telling the stories behind the best open source projects and the communities that make them. I'm Eric Anderson.

Eric Anderson:
We are live today with the majority of the creators of Presto. I have with me, Martin Traverso, Dain Sundstrom and David Phillips. Great to have you on the show with us today. These three represent three of the four original creators of the Presto project that came out of Facebook and we're going to dive into how that came about.

Eric Anderson:
And this is also the first time we've had multiple people on the show at the same time, so this will be a bit of an adventure for us. Maybe so that everyone can become familiar with your voice, the three of you could each name and introduce yourself.

Martin Traverso:
I can start. This is Martin. As Eric mentioned, one of the creators of Presto. Been working with Dain and David on Presto for the last almost eight years now. And before that, we worked at a couple of companies together. We met back in 2009 when we were at Ning, working together.

Dain Sundstrom:
And this is Dain Sundstrom, also one of the creators of Presto, really happy to be here.

David Phillips:
And this is David Phillips, also one of the creators of Presto.

Eric Anderson:
Great. And I want to start out with, you have this amazing story. All of you have worked together through so many companies. Maybe you can talk about how each of you first met. I feel like that's kind of the beginning of Presto in some degree.

Martin Traverso:
Right. So, back in 2009, I was working at Ning. Dain and David joined Ning. We worked together. I mean, at least I overlapped with them for a couple months when we were there. Dain and David continued to work at Ning for a while. I went to work at another company.

Martin Traverso:
When they left, they came over and joined the company I was working at and we worked together for about two years there, got to know each other really well. And then, at some point, we decided to leave that company. We basically got hired by Facebook.

Martin Traverso:
And since we got along together so well and we were interested in the same things, we said, "Let's go do something together. Let's work on the same team together." So, we kind of managed that after we joined Facebook.

Martin Traverso:
And we all had the idea of doing something around databases, and data analytics and warehousing at some point, so that was kind of the great opportunity to start something when we were at Facebook.

Eric Anderson:
How many people were at Ning? Did the three or four of you kind of find each other and instantly click? I'm sure there were a lot of other people to become friends with at Ning.

Dain Sundstrom:
So, Ning was not really a huge company when we joined. It grew a bit after that, but engineering was 35 maybe. And I mean, Martin interviewed me and gave me probably the hardest tech interview I've ever had. And he's very, very knowledgeable about Java and computing, and that impressed me and I think I did pretty well. And we've been working together pretty much since, I think.

Eric Anderson:
Awesome. And David, how did you first meet the group?

David Phillips:
Same. We were working together at Ning. I didn't really interact too much with Martin. He was the chief architect at the time and I just joined as an engineer.

David Phillips:
Probably the biggest thing I interacted with him on was he gave an overview of the entire platform as he was leaving to explain how everything worked. And I think it was a two day presentation, and he printed up a graph of how all the components fit together on 50 pieces of paper and taped it onto a wall, so that was kind of funny.

Eric Anderson:
Martin, you leave quite an impression on people, it sounds like. I don't even know what a two day presentation means and a 50 paper sheet. They have computers now, you can put things on a screen. I'm just kidding. That's fantastic. Okay.

Eric Anderson:
So, Martin impressed everybody and kind of has these engineers gravitating towards him, and then you left Ning as a group or individually and somehow made your way to Proofpoint, right?

Martin Traverso:
I left first, April 2009. And Dain and David continued to work at Ning for a while. David, maybe you can jump in there. Dain and David, talk a bit about that.

Dain Sundstrom:
Yeah. David and I worked together on a bunch of projects at Ning. The biggest one was we actually built a data warehouse using Netezza, which David had used at a bunch of other companies. It's an excellent product.

Dain Sundstrom:
So, we worked on a bunch of stuff. And then, when I left, Martin recruited me to Proofpoint, and then I recruited David and we all ended up there together, again, working on leading the architecture of new Proofpoint applications.

Eric Anderson:
But, not necessarily databases at Proofpoint, correct?

Dain Sundstrom:
Not so much.

Eric Anderson:
Got it. So, the band's back together and you're not necessarily working on databases, just anything. And at some point, you tire of Proofpoint and head to Facebook?

Martin Traverso:
Right. We knew people at Facebook at the time. The guy running engineering at Facebook, Jay Parikh, was my manager at Ning, so I knew him pretty well. We had a bunch of people we worked with at Ning that had gone to Facebook, so they were all trying to recruit us. So, it was kind of the right moment to join the company.

Martin Traverso:
So, I mean, it was kind of a good timing, good opportunity and it was very convincing at the time, as they told us, there were so many projects, so many interesting things to do at Facebook. And it certainly was. The company was growing significantly. There were a lot of technical challenges.

Martin Traverso:
In particular, one of the things they talked about was some of the needs they had in the data infrastructure organization and this is something that, I mean, Dain and David had worked with data warehouses in the past. They had a lot of experience with that. Dain and David, maybe you can talk a little about how you were before that, you had always been talking about how you wanted to build something yourselves.

David Phillips:
So, I've been involved in data warehousing for probably 15 years, always been working in the internet space. In the internet space, you tend to have a lot of data that you need to store and analyze.

David Phillips:
And back in the mid 2000s, there weren't a lot of options. Hadoop didn't exist. There were commercial systems like Netezza that Dain mentioned, and Netezza was actually pretty awesome. It was like a distributed database built on top of Postgres that came as an appliance, so it was really like a single rack. They would just drop it into your data center, plug it in and it would just work.

David Phillips:
And that always kind of set the bar for how easy a product should be to use and how quickly you can get started with it.

Dain Sundstrom:
I think also we used Hadoop, and MapReduce and custom stuff in the early days of Hadoop. And in comparison to something like Netezza or the other commercial products, it was really frustrating, really hard to work with, slow and slow for no reason. You play with it and you're like, "This thing could be orders of magnitude faster if someone just paid attention to it."

Dain Sundstrom:
So, our experience with that made us want to go and work on some of these tools, but we didn't really have the opportunity until we were at Facebook. And Facebook, at the time, was relatively small. We joined just before the IPO. And so, there were opportunities where you could just go and form a team and do something really significant. Nowadays, it's kind of everything that needs to be done is being done by someone. So, it was kind of a different day.

Eric Anderson:
Yeah. There was a lot of white space, green field, whatever you want to call it, where you could tackle any projects. And you kind of self-selected, opted-in, kind of came up with this idea on your own, it sounds like. Was there a need at Facebook? What was your first use case?

Martin Traverso:
Yeah. There was certainly a need. I mean, this was, as I said, kind of the right opportunity at the right time. It was the opportunity to do something that we had been wanting to do at a company that actually needed exactly that.

Martin Traverso:
At the time, Facebook was using Hive for most of the data analytics. Hive came out of Facebook actually. They created it in 2008. They open sourced it. They made it an Apache project and it was heavily used for all the data transformation, data analytics.

Martin Traverso:
People were using it for interactive analytics too, or they tried to use it for interactive analytics. Basically, they would run a query, and then maybe wait an hour or two hours for the results to come back. Which as Dain was saying, that seemed ridiculous. We thought that could be done much, much faster and we set out to do that. We said, "We can do something to run this."

Martin Traverso:
There was a system that came out of a hackathon at Facebook that attempted to do something like that. The system wasn't being maintained. It wasn't scaling beyond limits it had, and the architecture wasn't amenable to making it scale beyond what it needed to scale and to be able to add the features that needed to be added.

Martin Traverso:
So, we said, "Let's look at it with fresh eyes," and we started doing something from the ground up, so that's how Presto was born basically.

Dain Sundstrom:
I was going to say, we were pretty lucky in that we showed up right when this other system was falling over and dying. The existing system wasn't easy to support and the people that supported it had left. And so, there was like this... Someone had already shown something can work in this space.

Dain Sundstrom:
We showed up, we had the credibility, but then we also had the right people, like Martin has a lot of experience in languages and distributed computing. I also have experience in distributed computing, along with David.

Dain Sundstrom:
We're also all Java experts and the whole Hadoop ecosystem's in Java, so switching to another language would have slowed everything down in the beginning. And we just had really good people. I have a lot of experience in open source. So, we just got, I think, really lucky. Right time, right problem. The needs, the opportunity.

David Phillips:
And Facebook's always had the philosophy of building it themselves and owning the entire engineering stack. They build their own servers. They build their networking equipment. They have a bunch of people working on the Linux kernel.

David Phillips:
So, at most companies, the idea of building a database would be crazy. But, at Facebook, that was kind of par for the course.

Eric Anderson:
And was it clear this would be open source from the outset? Or how was that decision reasoned?

Martin Traverso:
It was to us. We started the project, then when we were talking to Jay Parikh, we said, "Hey, we want to make this open source," so he was on board with that. That was around the time when Facebook was working on Open Compute and he was seeing that Open Compute, if you look at it, it ended up disrupting the hardware industry and we wanted to do the same thing for the analytics industry.

Martin Traverso:
So, he was on board with that. It's something that we wanted to do from the beginning, make it open source, because we had worked with open source projects and we believed that the most successful projects are those that are open source.

Martin Traverso:
Getting other people and other companies involved in the project makes for a healthier project, in that you end up not just building something that satisfies the needs of one company, but of everyone else, and in turn, you end up benefiting from that.

Martin Traverso:
So, that was something we kind of said from the beginning, "We're going to do it this way. We're going to make it open source." If you go look at the history of the project, the first commit was on GitHub. So, we used GitHub. We used all the tools we would eventually use when we open sourced it. It took us a year to open source it, but that was kind of the idea from the beginning.

Dain Sundstrom:
The fact it's open source is not an accident. We looked at this project and were like, building a database takes, I don't know, five to 10 years and none of us... Well, especially, I'll speak for myself. I don't want to work on something for five years, and then have some corporate effort change, and then your five years of code just goes in the trashcan. I've seen that way too many times.

Dain Sundstrom:
And in addition to wanting to get input from outside people and wanting to get more help, we wanted to make something that was going to have longevity. Our initial model was we want to build Postgres, but for analytics, and have it be open and free, and have lots of people involved in it and go in that sort of direction of a really big project.

Dain Sundstrom:
And from day one, we very carefully designed the project. We did everything on GitHub, every issue's on GitHub, the pull requests are on GitHub. All the reviews are public, which is pretty different than how a lot of companies do open source, like Facebook does open source today.

Dain Sundstrom:
We did everything publicly and we insisted everyone on the team do everything publicly, which is a pretty big change. But, then it makes the project more open and brings in people. And they don't feel like you have a special place because this group of people at one company founded the project.

Dain Sundstrom:
They're not treated special. Everyone's code goes through the same process and you can see it because it's all in public. So, we designed it so that it was this big open thing, and that everyone could see it and feel like they're an equal member.

Eric Anderson:
That's amazing. Tell us about that year of closed source too. I mean, I guess you're on GitHub, you're using the open source tools, but if I understand right, you hadn't made this public yet for the first year. Was there a milestone you reached before you felt comfortable doing so?

Martin Traverso:
Well, I mean, we started the project in August 2012. It took us about six months to go to production. I think it was beginning of 2013, the first time we opened it up for internal users.

Martin Traverso:
So, of course, we weren't going to open source it before then because we had nothing to open source really. It was just a bunch of prototype code and trying things out. We actually thought that, I mean, we wanted to open source something that was going to be useful for people.

Martin Traverso:
So, having something that is kind of vaporware wasn't ideal either, so we kind of spent that entire year building Presto, making it work for internal use cases. And when it was proven that it was actually useful, there was something behind it, there was a future to it, then we said, "It's time to open source it."

Martin Traverso:
We also had to work on cleaning up documentation, a bunch of other things, to make it more appealing to people that weren't immediately familiar with the code base.

Dain Sundstrom:
Yeah. I was going to say that you get to a point where you're just like, "I'd like to open source this," but you don't just turn around and open up the repo. And maybe David can speak to some of the work. I know he spent a really long time on making the thing work well for the open source community.

David Phillips:
Yeah. So, what we did in the first year was get it working really well at Facebook. And that system that we talked about that we had to replace, there were about 20,000 reports currently using that old system that we had to migrate onto Presto.

David Phillips:
And as a consequence, we actually implemented a ton of features in Presto that those reports needed. And so, having those concrete use cases gave us specific things to work on, instead of just kind of adding random features.

David Phillips:
And then, we knew that a distributed system is really hard to install and get running. And especially if you look at the Hadoop stuff. Hadoop, especially back in those days, was crazy complicated to install, and could take you days and you'd have to be doing a bunch of searching and reading documents.

David Phillips:
So, we said, "People have five minutes that they're going to spend on this, so how do we make sure that people are successful on their first try?" And so, I wrote documentation on how to install it.

David Phillips:
And as I was documenting it, I would find things that were just really hard to explain or just seemed weird, and I said, "Go fix the code, so that I don't have to document this step." So, it was probably at least a month of iterating on writing the documentation, figuring out what parts sucked, fixing those, fixing the documentation.

David Phillips:
And at the end of it, we had a system that worked really well. And surprisingly, I don't think we had any questions about how to install it in the first few weeks. People actually could download it and get it running, which was a pretty big change.

Dain Sundstrom:
I was going to say, the measure of success is when people don't ask you about that and they start asking about the next level.

Dain Sundstrom:
But, the other measure of success was that there are a bunch of competing products in the space and one of the most common responses we got was, "Hey, I wanted to use this other product, but I gave up because I couldn't get it installed after four days, over a weekend," or "Actually, I came in on a Saturday and in an hour, I had your stuff up and running in our production environment."

Dain Sundstrom:
And so, one of the biggest things you can do as an open source project is make the thing dirt simple to install, just remove everything that's complex as much as possible.

Eric Anderson:
That's a great story. But, there needs to be a name like the documentation driven development or something. Right?

Dain Sundstrom:
Yeah. I found that, over time, as I've tried to write documentation, it tells you what's broken in your system. When you get to the second page of explaining how something very simple works, you're like, "No one's ever going to understand this." And you start, as an engineer, going like, "Well, I could rewrite this in about 30 minutes and it's going to take me about four hours to document it, so I'm going to rewrite it."

David Phillips:
Yeah. It's actually really frustrating to write documentation as an engineer because you kind of got to focus on just getting the documentation done, but then you'll find all these things that you actually want to go and fix.

Eric Anderson:
Yeah. I imagine it'd just be full of to do's as you write the docs. What's an open source launch like at Facebook? Does the marketing team get involved or do you just flip a bit and email your friends?

Martin Traverso:
It depends on which project. I mean, we had a lot of things ready because we already had the code on GitHub. We already had the processes and the mechanics of how you interact with the code base, how you do a pull request and all that.

Martin Traverso:
Some projects go through a process of trying to move from an internal repository to GitHub, switching tools, switching the way people do things and think about doing things. So, we didn't have to go through any of that.

Martin Traverso:
So, for us, it was just, everything that David was talking about, a lot of cleaning things up, making sure that the experience was smooth. We built a website, and then it was just clicking the button on GitHub and saying it's open now.

Martin Traverso:
And then, of course, there was a conference, the Data @Scale conference, back in, I think it was, 2013, when they held it for the first time, so we made our presentation there. We wrote a blog post and so on.

Martin Traverso:
But, it was kind of grassroots aside from that. We made it open, and then some people saw it. They started engaging with the project and it kind of grew from there.

Dain Sundstrom:
I think you're leaving out part of the story, Martin. I mean, we've all been around for a long time, so we spent a lot of time going and talking to everyone else we knew around us. So, we talked to all of our friends, went and talked to their companies, figured out if this was something that was going to work for them, tried to get people to be engaged in the project.

Dain Sundstrom:
So, we went and personally recruited companies like Airbnb, and Netflix, and LinkedIn and kind of all these companies, to get them involved in the early days of the project because we wanted to bootstrap actually having a real community, so it didn't just turn out to be five people at Facebook hacking away.

David Phillips:
And we actually had these companies beta test the software, so that when we did launch, the problems that they had found had been fixed. And so, the first time experience of people wasn't the first time anyone had ever used it externally.

Dain Sundstrom:
Yeah, because Facebook's environment is really custom and doesn't really reflect what a lot of normal companies would have, like the number of servers. I mean, Hadoop is forked, and Hive is forked and kind of everything is forked. So, you want to make sure it worked in real environments, that it actually worked in cloud environments and things like that.

Eric Anderson:
Any surprises as you bring in all these new use cases, new needs, new people? Or maybe it's a testament to your planning, or at least Facebook's diversity of use cases, that the product more or less was what you needed?

Martin Traverso:
Well, I think that one of the things that was a bit surprising, I mean, when we wrote the first versions, there were a couple things that we wanted to do. One was make it open source, but we also had to make it work with internal Facebook infrastructure.

Martin Traverso:
At that point, Facebook was running a custom version of Hive. Even though Hive came from Facebook and was open source, eventually Facebook forked it back in. So, they had customizations. They had their own version of HDFS. And there were a bunch of other systems that we needed to be able to integrate with for all the monitoring, and collecting metrics and all that stuff.

Martin Traverso:
So, we said, "We need to make sure that Presto works for Facebook, but we also want to make it open source, so how do we do that?" And we kind of realized at some point that we could separate the engine, the core query search engine from the storage layer, and we put it behind a plugin interface.

Martin Traverso:
And that was kind of out of necessity. It was like, well, we need to be able to have Presto run on top of Facebook Hive and HDFS, but also work with open source Hive and HDFS. So, we did that by having plugins that could be swapped out.

Martin Traverso:
So, that was kind of the motivation for that. But, very quickly after that, especially after we open sourced it, we started seeing people using that for integrating with other backends, like with databases and other systems. That was something that we didn't really plan ahead of time.

Martin Traverso:
But, it became one of the pillars of Presto. One of the things that people look to when they think about Presto, and think about using Presto, is the ability to connect to different data sources, bring all the data from those sources together and run queries across all data sources at the same time.

Eric Anderson:
And I imagine that, at some point, there's more than just the group here on the call. Are you able to hire folks at Facebook or do you find a bunch of contributions from outside? Who else is building Presto at this time?

Martin Traverso:
Our team was four people for the first year, then we got a couple more people. We were at six or seven people for about two and a half years or so. And then, the team started growing after that.

Martin Traverso:
I mean, in terms of people that worked on the core of Presto at Facebook, it was always a small set of people. There were a lot of projects related to Presto and we had to get more people involved at that level. But, outside of Facebook, there were many companies that got involved from the beginning.

Martin Traverso:
For example, there's a company called Treasure Data. They eventually got acquired by ARM a couple of years ago. They were one of the first contributors. They took the code, and then within a week, we already started seeing contributions from them and they've been involved since.

Martin Traverso:
And then, over the years, we've seen a lot of companies. Some companies came and went, in terms of being involved and contributing, but there are some companies that have been contributing since then. I mean, we have companies like Netflix and LinkedIn, Qubole, Treasure Data, all those companies are involved in the project and contributing to this day.

Martin Traverso:
And then, over the last year and a half, we've seen a lot of new companies start showing up in the community. For example, Salesforce, they started using Presto recently, I mean, that we know of. And they are super involved and heavily involved in the community now, making a bunch of contributions around security and improvements in that area.

Eric Anderson:
Let's talk a bit about governance. It sounds like from the beginning, you planned on this being open source, and you laid a foundation to be ready to share code and collaborate. But, other projects went to the Apache Foundation, sometimes Facebook kept projects kind of self-governed. How did you as a group reason about governance and where'd you end up?

Dain Sundstrom:
So, I have a lot of experience working in different open source projects. From the beginning, we had a model that was very much modeled after the early days of Apache. I call it the pre-PMC days.

Dain Sundstrom:
We literally had a policy that there are no private lists. Everything gets talked about in public. You strive for a consensus. If you can't reach a consensus, you figure out what you can agree on and you move forward.

Dain Sundstrom:
And that's actually the core of the way the project's run to this day: everyone gets together and works together as much as possible. And where you disagree, figure out where you can move forward.

Dain Sundstrom:
And I think that works really well for, I'll say, smaller projects, if you have less than 20 people working on stuff all the time. I'm talking about people working together all the time, not people that just show up, put in a few contributions and leave. The core people, everyone can know each other, they build a relationship, and that works really well. And that's pretty much how we've led it since.

Martin Traverso:
One of the things that we do believe in is that it's a community made of individuals that are passionate about open source, passionate about making something that will stand the test of time basically. And they can do something that other people can use and can be successful with.

Martin Traverso:
So, this is not about companies being involved in the project. Of course, companies employ the people that are involved in the project, but this is about people that are really passionate about working on an open source project and, as such, it's a project that's based on merit.

Martin Traverso:
If you get involved in the project, you get to know the people. You get to work with the people and you build trust with the people, and then you'll be able to influence it more.

Dain Sundstrom:
One of the things we work really hard on is to make sure that no one's treated special. That's really frustrating to new people that show up and there's no reason for it.

Dain Sundstrom:
If I show up and have an idea, people know who I am and it comes with a bit of credibility, but if people disagree with me, I'm actually obligated to work out the disagreements. I don't just get to put my code in.

Dain Sundstrom:
And I lose arguments all the time around what's going to go in and it's not the most efficient way to run a project, but I think it leads to a really healthy community in the long run.

Eric Anderson:
That's great to hear. And you mentioned some pillar companies, Treasure Data and others, do you feel like this is still kind of a Facebook project or is it as much kind of a community run project at this point?

Martin Traverso:
No. At this point, it's clearly a community run project. I mean, if you look at contributions over the last few years, they come from a very diverse set of people. I mean, it is true that, today, some of them work at Starburst because we ended up joining Starburst as some of the biggest contributors.

Martin Traverso:
But, there are people from many, many companies that are involved these days and they're active every day on the project. So, the ideas, the direction of the project, it comes from a group of people that work for different companies.

Dain Sundstrom:
I would say that the sign that I look at is that there are a ton of things going on in Presto that I don't understand, and I'm not following and they're moving along just happily and self-managing.

Dain Sundstrom:
And to me, that's the big sign is when you're not involved in everything. Actually the point where you're like, "Hey, Martin, what the heck is project X?" And he's like, "Oh, that's something from this other group of people and they're interested in moving Presto in this other direction." It's like, wow, that's really cool.

Eric Anderson:
Yeah. Martin points to the 50 printed out sheets on the wall and is like, "Oh, that's over on this one." So, let's talk about the future. Where do you go from here? I mean, both on a personal level, you've worked together for a long time, maybe someone decided to try something new, or as a project, you've now built this thing to last, do you continue to spend time on it? And where do you see the project headed?

Martin Traverso:
I mean, I plan to keep working on Presto for, I don't know, another 10-20 years. We started eight years ago, but, I mean, if you think about it, it's still scratching the surface, in terms of what Presto's capable of doing and what we need to do over the next few years. So, I mean, it's something that I feel very, very passionate about, so I have no plans on doing anything else.

Dain Sundstrom:
Yeah. I'm the same way. I can literally go, and look and be like, "I got five years of work." At least, that I can lay out myself. And even with lots of people helping, there is still tons and tons of stuff to be implemented, and new things to work on and new features to be putting into the system. I think it's super useful today, but there's so much more possible.

David Phillips:
Yeah. This is the type of project that we look at Postgres as the inspiration. Postgres started in the eighties, it became a SQL system in the nineties, and it's still in active use and active development today. We say we want Presto to have the same kind of history.

Eric Anderson:
Yeah. That's awesome. Anything we haven't covered that we should cover?

Dain Sundstrom:
Any good stories we're missing, Martin? David?

Martin Traverso:
I mean, there are a couple of stories from the first six months of Presto, if we want to talk about that.

Eric Anderson:
We can throw them in if we missed them. Let's do it.

Martin Traverso:
Okay. So, as I was saying earlier, we started Presto in August 2012, and it took us six months to go to production. In those six months, we ended up rewriting the code three times. I mean, we wrote something. We realized this is not the right way. We threw it out and started again.

Martin Traverso:
We looked at a bunch of literature at the time, a bunch of research papers. But, I mean, if you look at research papers, they only ever tell you part of the story. You get some ideas for inspiration, but if you try to copy them or do exactly what they say, you always have things that are missing.

Martin Traverso:
So, it was a learning experience at the time. We had never built a database before or a query engine before. We had experience with distributed systems, with languages, et cetera. But, we never had put all those pieces together, so it took us some time to kind of get going with that.

Martin Traverso:
Anyway, so before we went to production, we were trying out different versions. In one of the meetings with our manager, we wanted to show him what we had done so far. So, we had Presto running on one rack of machines. It's a single cluster. It was, I don't know, maybe 20 nodes, or 30 nodes or something like that.

Martin Traverso:
So, we say, "Let's do a quick demo," so we load some data. We show him a couple queries. It was super fast. It came back within a few seconds, queries that would take, I don't know, half an hour or something.

Martin Traverso:
So, we finished our meeting and we went back to our desks to continue working. And we went to log into the Presto cluster and we couldn't log in. It's like we couldn't even ping the machines. Something happened with the cluster, so we're like, "What's going on?"

Martin Traverso:
So, we ended up talking to a bunch of people in the networking team. And at some point, someone told us, "Oh, you own that cluster? Yeah, we had to shut it down. It caused a brownout on the network that affected ads, and delivery and a bunch of other problems." So, it was like, wow, we run a simple query and we cause a brownout across the Facebook infrastructure.

Martin Traverso:
So, what happened was Presto was so fast, it was able to pull data across the network faster than the computers could process it at the time. It was basically a shift in how the systems at Facebook worked at the time. Hadoop was very inefficient, so we could never saturate the network because it was CPU bound.

Martin Traverso:
Presto, on the other hand, was able to saturate the network with the available CPU. And what ended up happening is that because of the architecture of the network at the time, we ended up hitting some edge cases of how their network operated and ended up causing a brownout.

Martin Traverso:
So, that was an interesting and eye-opening experience. And of course, we had to adjust the way we deployed Presto to avoid that kind of problem. And then, over time, the network architecture of Facebook evolved to not have that problem at all.

Dain Sundstrom:
I was dealing with the network not being fast enough for the next six years. We were working on new network architecture when I left Facebook because Presto is the single biggest user of network or can use more network than everything else at Facebook.

Dain Sundstrom:
And that just comes from very careful design of the system to be really efficient at computation. And the thing you learn is when you make one thing more efficient, something else is going to be the problem. It was a pretty big eye-opening thing for us.

Eric Anderson:
Yeah. I'm sure it says something about your efficiency and also the network safeguards within Facebook for noisy neighbor workloads.

Dain Sundstrom:
Oh, absolutely. The design of the network in those days was extremely bad. That was before they had started to roll out their fabric networks. It was super bad.

Dain Sundstrom:
Actually, one more item for that last story is an interesting thing I found while working at Facebook on those systems was that you can work on making your code more efficient, but at some point, it doesn't matter anymore because what was actually happening was that as I made the system more efficient, I just idled more, and more and more CPU. And since I'm paying for the machines anyways, just to have them powered on, I'm actually wasting money because I'm spending my time on the wrong problems.

Dain Sundstrom:
So, there actually is this interesting point you reach where for the architecture you have, you can reach a maximum efficiency. It's like I need to process data, and the data's as compressed as possible and I get it to the system, and then I process it as efficiently as possible. And anything beyond that is actually just a waste of effort, which I had never thought of as a software engineer.

Eric Anderson:
When your burn down list just becomes unimportant anymore, it's like it doesn't matter if I fix these bugs.

Dain Sundstrom:
Yeah. If I make it 10% more efficient, then I have 10% more CPU sitting there idle, so I'm going to go work on doing something else. So, it's a really interesting problem.
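The effect described here can be sketched as a small model. This is only an illustration, not anything taken from Presto or Facebook: the function names and all the rates are hypothetical, and it assumes end-to-end throughput is simply capped by the slowest resource.

```python
def bottleneck(capacities):
    """Return (name, rate) of the slowest resource, which caps throughput."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


def idle_fraction(capacities, resource):
    """Fraction of `resource` left unused once the bottleneck caps throughput."""
    _, rate = bottleneck(capacities)
    return 1.0 - rate / capacities[resource]


# Hypothetical processing rates in GB/s (made-up numbers for illustration).
cpu_bound = {"cpu": 1.0, "network": 10.0}       # inefficient engine
network_bound = {"cpu": 20.0, "network": 10.0}  # efficient engine

print(bottleneck(cpu_bound))      # the CPU is the limit
print(bottleneck(network_bound))  # the network is the limit

# Making the CPU side 10% faster moves no more data; it only idles more CPU.
faster_cpu = dict(network_bound, cpu=network_bound["cpu"] * 1.1)
print(idle_fraction(network_bound, "cpu"))
print(idle_fraction(faster_cpu, "cpu"))
```

In this toy model, once the network is the limit, further CPU optimization leaves throughput unchanged and only raises the idle fraction, which is the point at which effort is better spent elsewhere.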

David Phillips:
And you run into that in other places. If you make your file compression better, well, now you have more data to process, so you end up needing more CPU. Or if you get rid of cold data on your disk, now the data that's remaining is more hot.

David Phillips:
So, if you are actually limited on how fast you can read the working set of data from the disk, well, the cold data actually isn't doing anything for you at all, so it actually doesn't matter. So, when you get rid of it and you put more hot data on there, you've just created a problem for yourself.

Dain Sundstrom:
Yeah. You become IO bound instead of disk bandwidth bound. And we found this every other year, we'd run into a different version of this same story.

Eric Anderson:
Great. Let me ask you, as kind of a final question, if you have anything you want to tell the community? If somebody's just hearing about Presto, either for the first time or they're interested in the project, but never quite been involved, is there a place for them and how do they find that place? Or any other recommendation you give new users to Presto?

Martin Traverso:
Yeah, certainly. So, we have a website, prestosql.io. So, you can get documentation, you can see, you can download the software. There are links to different resources. For example, how to get involved if you want to develop on Presto or if you want to join, we have a Slack for the project. There are links to join there.

Martin Traverso:
And if you want to get involved, Slack is a perfect place to be because that's where we have almost 2,500 people involved right now, with about 300-400 people active every week. So, it's a very, very active channel. If you're running Presto and you have questions, you can certainly get help from the community there.

Martin Traverso:
If you want to get involved and are looking for things to contribute, that's also a good place to ask, and you will get guidance from all the people that are developing, who can help you get your bearings.

Dain Sundstrom:
And Presto is a worldwide phenomenon, so there's someone on Slack 24 hours a day because there's people all over the world who work on Presto. So, wherever you are, there's other Presto people, probably at least in your same country, if not in the same city as you, and they're all there on Slack.

Eric Anderson:
That's wonderful. And Martin, Dain, David, thank you for joining us today and for all your input here. One moral of the story is that at the next job interview I have, I may be working with that person for the next decade. You never know.

Eric Anderson:
And also you got to be nice to your coworkers because you might be with them for a decade. It sounds like we could probably do this show again in 10 years and you all plan to be doing the same thing on the same project with the same people.

Dain Sundstrom:
I hope so.

Martin Traverso:
Absolutely.

Eric Anderson:
Thank you very much.

Martin Traverso:
Thanks for having us.

Dain Sundstrom:
Thanks for having us.

David Phillips:
Thank you.

Eric Anderson:
You can find today's show notes and past episodes at contributor.fyi. Until next time, I'm Eric Anderson and this has been Contributor.