Ben: Hello and welcome to PodRocket. I'm Ben and today I'm here with Mark Erikson, who is rejoining PodRocket. We had Mark on the podcast a few months ago, it could have been even a year ago, it was a long time. But he's a senior front-end engineer at Replay and also a Redux maintainer. You also may know him as the guy on Twitter with the Simpsons avatar so it's great to see that Mark is a real person and not just a Simpsons character. But welcome back, Mark.

Mark Erikson: Thanks. I have actually been mistaken as a bot more than a few times.

Ben: Yeah. Well, when we last spoke we talked a lot about Replay and I really enjoyed that conversation. For anyone out there who did not get a chance to listen to the Replay conversation, we'll put a link to that previous episode in this episode's description because if you're interested in just understanding how websites work and what Replay does and how Replay works and we talked all about non-determinism and cool stuff like that. So we'll put a link to that episode because it was a fun conversation. But today we're going to be talking mostly about a recent talk you gave looking at some of the fundamentals of React and how rendering works and stuff like that. So could you give us a quick overview of what was the background for this talk, why did you write it and then we can dive into some of the content?

Mark Erikson: Sure. So the talk itself is based on a blog post that I wrote I believe two years ago entitled A Mostly Complete Guide to React Rendering Behavior. And I spend a lot of my time answering questions about React, and Redux, pretty much anywhere there's a text box on the internet and most of my blog posts are inspired by seeing the same questions pop up over and over. And usually when I find myself repeating the same answer a few times, that generally means it's about time to try to put that into a blog post so that I can just reference the blog post as an answer.
And in this case, the React Docs tell you a lot about how to use React, but they don't necessarily tell you a lot about how it works. And there's a lot of fundamental aspects of React's behavior that people don't seem to understand. And what I was seeing was that a lot of this information was either scattered in different places or it was something that you could infer based on a few key bits of information, but there wasn't a single resource that put all this information together in one place. It's funny because I remember it was a Friday night and my plan for the evening was just to sit down on the couch and play games. And I'd been tossing around some bits and pieces of this idea for a few days, I remember tweeting, "I feel like I really should write a blog post about this." A couple people actually responded and said, "Yeah, go for it." I ended up cranking out 5,000 words on a Friday evening and the post grew to about 8,500 words in bits and pieces over the next year and a half as I thought of a few different sections to add. And then I went back and expanded it within the last couple months to cover React 18 and some updates. So I think the post now totals about 11,000 words. And so when I was trying to think of conference talks to submit to React Advanced, I tossed in a few ideas, and doing a talk that was basically a condensed version of the blog post was what I settled on.

Ben: Got it. And before we dive into some of the concepts, why do you think it's important for folks to know more of what's going on under the hood in React? If I just want to build an application, let's just say I don't have a huge interest in contributing to React or going under the hood. What are some reasons why it's important to know this stuff when I'm working with React as a developer?

Mark Erikson: So let me answer that in kind of a roundabout way. Some of the first JavaScript web app stuff that I did was using an older framework called Backbone.
And Backbone was very small, the actual source code of the library was one file, 2,500 lines. And one of the things I liked about it at the time was that I could stick a breakpoint in one of my own event handlers, keep clicking next line, next line, next line, and actually step into the middle of the Backbone library and especially the code that ran its event trigger function and then keep on stepping and pop back out the other side in one of the event callbacks. And it was nice because there was no magic going on inside of Backbone, you could read the code, see what was going on, step in, step out. And the community eventually found all the various limitations of Backbone, we moved on to React and virtual DOM and templates and all this other stuff.

React on the other hand, the internal implementation is massive and it is really complex. And even just looking at the actual source code, there's a lot of concepts in there that are pretty hard to follow. The good news is that you don't need to know 98% of that to be able to use React effectively because React gives you a very strong mental model for how to write your application. If I look at my UI and something's wrong on screen, there's basically one of two things that are wrong. Either my data was wrong and the logic that was using it is correct or the logic that is outputting the UI is wrong. And if the data is wrong, then I trace backwards to see, well, where did that data come from in the first place? Is it component state, props? Was it fetched from Ajax? Whatever. You work your way back up the tree. And so we use the catchphrase, "UI is a function of state," and it is a very powerful mental model. This does start to get more complex when you start tossing in component life cycles and the dreaded useEffect Hook and all that other stuff. But for the most part you don't need to know how React works internally and how all this stuff is implemented to be able to write applications that use React.
The flip side of that is that you do need to understand the externally observable behavior in order to use it effectively. For example, React supplies tools like React.memo, which we'll probably talk about a little more later. And you need to know, well, okay, here's this tool, what problem is it supposed to solve? When and why would I use this? How does the app behave with and without that tool? And so understanding the visible external behaviors is important in order to use the tools that React does provide to us.

Ben: So with that, let's dig in a bit. And so I have a list of topics that we pulled from, I think, the talk and the blog post and we can talk about some with an emphasis on why they're important and why someone building a React app should have a general sense for how these things work under the hood. So we can start with render and commit phases. What are the render and commit phases and what does rendering even mean in React?

Mark Erikson: So when React first came out, it popularized a buzzword, the phrase was virtual DOM. And unlike the other frameworks and tools at the time, React does not just immediately apply changes to the actual DOM nodes that are in the page, React components return descriptions of what they want the UI to actually look like. And these descriptions we write them normally as JSX syntax, your angle bracket looking HTML inside of our JavaScript. But at compile time, all those get transformed into just plain JavaScript objects and loosely put, they end up looking like: type, whatever the thing is you want to render, a component or the string name of an HTML tag like div. Props, an object full of the data that's going to get passed down. And then an array of the children that you said should go inside of that. And so these ultimately form a big tree of parent, child, grandchild and React is going to compare the current tree that your components have asked for versus the previous tree.
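Illustrative sketch (not part of the conversation): the element objects described here can be modeled in a few lines of TypeScript. The `h` helper and the `UIElement` type are hypothetical names made up for this example, and the shape is deliberately simplified; real React elements carry additional fields and keep their children inside `props`.

```typescript
// Simplified model of the plain objects that JSX compiles to. This mirrors
// the "type / props / children" description only; it is not React's real
// element shape.
type ElementType = string | ((props: Record<string, unknown>) => UIElement);

interface UIElement {
  type: ElementType;
  props: Record<string, unknown>;
  children: Array<UIElement | string>;
}

// Hypothetical helper standing in for what the JSX compiler emits.
function h(
  type: ElementType,
  props: Record<string, unknown> = {},
  ...children: Array<UIElement | string>
): UIElement {
  return { type, props, children };
}

// Roughly what <div className="greeting"><span>Hello</span></div> describes.
const tree = h("div", { className: "greeting" }, h("span", {}, "Hello"));
```

Nesting calls to `h` produces the parent, child, grandchild tree that gets compared against the previous tree.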
And it's going to figure out, okay, what's changed? Have you actually asked for anything different at all? And then if there are any differences to apply, then it's actually going to apply just those changes to the actual DOM. And so this work of asking the components what you want it to look like and figuring out what did change is known as the render phase. And then the process of actually applying the specific targeted updates to the DOM is known as the commit phase.

Ben: Got it. And so I guess the original design decision to even use that virtual DOM, is that because the browsers are not smart enough to do this diffing before doing expensive operations? Or are there other reasons as well that the React team originally went for that virtual DOM approach?

Mark Erikson: There's a bunch of reasons conceptually. One is that historically, updating the DOM has been slow. Now it depends on which operations you're doing and in what order, but for example, there are some DOM operations where even reading a value forces the browser to stop, recalculate all the way out so everything's up to date and then hand back, for example, the client width of a specific DOM node. And so part of it is, how can we figure out the minimum operations that we need to do and try to do them in a fairly efficient order? Another is that conceptually if you think back to the jQuery days, say I want to toggle a modal, you might call $("#myModal").hide() or .show(), but if you start basing the decision of whether to turn it on or off on the idea of, is that DOM node already visible? You need to start taking into account, well, if it was visible, now hide it and if it was hidden, now show it. And then you start having to manually track, well, what was it before? What should it be now? What are all the possible combinations of changes we might need to make? And that's where things become really complex.
So if you're using the DOM as your source of truth, you start having to manage a lot more code and a lot more state yourself. So part of React's idea was if we just hand wave all that and stick it in the background and we can simply say, "Given this data, here's what we want the UI to look like," I as a developer end up writing a lot less code and we can just let React worry about, well, is this the first time? Create all those DOM nodes from scratch. Did I change the text in one list item? Then literally just change the text content of that DOM node. Did I rearrange or move or delete some items? I don't know, let React worry about it. And so the idea was that by making it very simple and just saying, here's what it should be now, React handles all the details for us.

Ben: And let's now talk about queuing of rendering. So yeah, this is maybe not as well known a topic, that folks can think about how they want to queue up rendering and what's the logic behind that. And so maybe you could give a little context there.

Mark Erikson: So React has always required that you be explicit to say, "Here's when React should start to update the UI." And that always requires calling some form of a setState function. Now the different functions available to us have changed over time. When we had class components, you called a method called this.setState. Now that we have hooks, there's the setter functions from the useState Hook and the dispatch function from the useReducer Hook, these all eventually combine into the same internal mechanism, they're just different ways to get to the same behavior inside of React. So other frameworks like, for example, Vue uses an observable based approach where you have a data object, its fields are being tracked automatically by the framework and when you write code that changes one of those fields, the framework says, "Aha, this field changed.
Let me see what other pieces of the UI care about that piece of data and automatically go redraw them." And React in a sense wants you to be explicit about, okay, now we're going to start to re-render and here's a new piece of data that will be used during the render process. So every render always starts with setState being called somewhere.

Ben: Got it. So I guess that's typically how you render. Is there anything more to think about in terms of queuing? If I know I'm going to be doing a bunch of rendering, how does that concept of queuing work exactly?

Mark Erikson: So this behavior has changed over time and this is where we start to get into some of the nuances of the actual internal details and observable behavior. React has always had this notion of being able to batch multiple queued updates into a single render pass. And up through React 17, that really only happened by default inside of React event handlers, like an onClick function. And this is because right before React would call your event handler, it would set an internal flag that says, "Hey, if setState gets called, don't re-render immediately, instead just add any updates that happen in this event handler into a list. And then after the event handler is done, then we'll start a render pass and any of the functions that flag themselves as, 'Hey, we've got some setState happening,' we'll check all of those during the one render, see if they need to update and keep going down from there." And that also meant that if you tried to apply state updates outside of an event handler, which could have been in a setTimeout or after an await command, those were outside of the event loop tick, React was no longer trying to check if it should batch updates together and each of those separate setState calls would end up causing a separate render pass immediately and synchronously.
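Illustrative sketch (not part of the conversation): the React 17 behavior described here can be imitated with a toy model. Nothing below is React's actual implementation; `isBatching`, `runInEventHandler`, `setCount`, and `renderPass` are hypothetical names chosen to mirror the "internal flag" idea.

```typescript
// Toy model of batching, not React's implementation: while `isBatching` is
// set, updates are only queued, and a single render pass runs afterwards.
let isBatching = false;
let pendingUpdates: Array<(count: number) => number> = [];
let count = 0;
let renderPasses = 0;

function renderPass(): void {
  for (const update of pendingUpdates) count = update(count);
  pendingUpdates = [];
  renderPasses += 1;
}

function setCount(update: (count: number) => number): void {
  pendingUpdates.push(update);
  // Outside a batch, every call triggers its own synchronous render pass
  // (the pre-React 18 behavior described for timeouts and code after an await).
  if (!isBatching) renderPass();
}

// Stand-in for how an event handler such as onClick gets wrapped.
function runInEventHandler(handler: () => void): void {
  isBatching = true;
  try {
    handler();
  } finally {
    isBatching = false;
  }
  if (pendingUpdates.length > 0) renderPass();
}

runInEventHandler(() => {
  setCount((c) => c + 1);
  setCount((c) => c + 1);
});
const passesAfterHandler = renderPasses; // both updates shared one pass

setCount((c) => c + 1);
setCount((c) => c + 1);
const passesAfterUnbatched = renderPasses; // each call rendered on its own
```

In this model the two updates inside the handler cost one render pass, while the two calls made outside it cost one pass each.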
So one of the big changes in React 18 is that they've flipped the behavior and React will now always batch any updates that happen within one event loop tick. So even if you have an await and then two setStates after that, even though you're no longer in an event handler or anything else, React will say, "Okay, these two happened one after the other, let's just wait till the end of the event loop, see if any other components want to start something and then render all these things together."

Ben: Got it. So I guess smarter behavior that makes it a bit easier in the latest version of React. And so less to think about as a developer, which is always nice. Let's talk about some of the dos and don'ts when rendering and especially the concept of purity versus having side effects. Maybe just explain what does pure mean in the context of rendering? What are examples of side effects and how to be thinking about those?

Mark Erikson: So React has always been at least partly inspired by functional programming concepts and the terms pure and side effect both come from that functional programming paradigm. So a side effect loosely put is anything that affects the state of the world outside of this one function. Very strictly speaking, logging to the console is a side effect, it affected something outside of this one function that's running right now. Mutating a variable that was defined outside of a function is a side effect. Making an Ajax request is a side effect. Calling a random value or even using Date.now is a side effect. A pure function is something that only depends on the arguments that were passed in and returns a result and every time you call it with the same inputs, it always produces the same result every time. It doesn't affect anything outside of itself. If you've ever worked with reducer functions in Redux or with the useReducer Hook, those are also intended to be pure functions with no side effects.
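Illustrative sketch (not part of the conversation): the distinction between pure functions and side effects, in plain TypeScript. All of the function and type names here are invented for the example.

```typescript
// Pure: the result depends only on the arguments, and nothing outside changes.
function addTax(price: number, rate: number): number {
  return price * (1 + rate);
}

// Impure: mutates a variable defined outside the function (a side effect).
let callCount = 0;
function addTaxAndCount(price: number, rate: number): number {
  callCount += 1;
  return price * (1 + rate);
}

// Also impure: the output depends on something other than the arguments,
// so the same input can produce different results on different calls.
function stamp(label: string): string {
  return `${label}@${Date.now()}`;
}

// A reducer in the useReducer / Redux style: pure, and it returns a new
// state object instead of modifying the one it was given.
type CounterState = { value: number };
type CounterAction = { type: "increment" } | { type: "reset" };

function counterReducer(state: CounterState, action: CounterAction): CounterState {
  if (action.type === "increment") {
    return { value: state.value + 1 };
  }
  return { value: 0 };
}
```

Calling `counterReducer` twice with the same state and action gives equal results and leaves the input state untouched, which is the property the reducer discussion relies on.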
So in principle, React components and their rendering behavior is intended to be pure and not have any side effects. It's just state and props in, JSX, UI elements out, don't do anything else. Now there are some caveats to this. So on the one hand your rendering logic should not make Ajax requests, although there's a double caveat to that I'll get to in a second. In theory, you shouldn't queue state updates, you definitely shouldn't be mutating existing data. It is okay if I were to create an object while rendering and then immediately mutate it because no one else is ever going to see that one variable. So as long as it stays hidden inside the component's rendering logic, you can mutate all you want as long as it wasn't created outside the component. It is okay to throw errors while rendering a React component because React basically has the equivalent of a giant try-catch around it. And it is technically possible to do some lazy initialization of values like with a useRef Hook. Where we start to get into some caveats. So React does technically support calling the setState function while rendering if it is a function component and you do it conditionally. So maybe say, if a prop changed, call setState. And React literally has a loop. If you check the actual React source code, it says, "Call the function component and then if state was set while rendering, loop back right around and try it again up to 50 times." And if you do it 50 times in a row, they assume there's an infinite loop and it gives up and throws an error. But it is actually legal to do that. And then the other one that I still haven't wrapped my head around is some of the new React async rendering RFCs, where the React team is talking about actually letting you initiate Ajax requests while rendering. Now there is still some of this discussion about, okay, what are the rules around that? How much can you get away with? How consistent and predictable is that behavior? 
That's where it starts to get really complicated and I still don't understand a lot of that myself.

Ben: Yeah, I'm curious, what's the use case for that? Does the RFC talk about potential problems that solves?

Mark Erikson: So a lot of it has to do with the relatively new React feature called Suspense. And as a very loose way to put it, Suspense is also a little bit like putting a try-catch around a portion of your Component Tree, except that instead of catching errors it's more like catching requests to fetch data so that you can put a loading spinner around that part of your Component Tree. And just in the same way that any piece of code within a try-catch block could throw an error and you know it'll get trapped at this one spot, you can think of Suspense as being similar for data fetching. If any component within that subtree says, "Wait, I need some data first, don't render me yet," putting the Suspense boundary around that will catch any of those requests, React will file that away for later. And when this promise resolves, React's like, "Oh well, now I should go back and actually re-render these components because now they have their data." So then the question is, well okay, what is the mechanism for telling React that these components are waiting for some data? So far it's been an undocumented technique where you actually literally throw a promise in your rendering method just as if you were throwing an actual error object. But they've got an RFC up for a new special hook that is literally just the word use. And the proposal is that if you call this use hook and pass in a promise, that will be the trigger for the, "Hey, I need to wait for data," mechanism to kick in.

Ben: Got it. Interesting. And Suspense, that is currently available. Which version of React is Suspense tied to?

Mark Erikson: So technically the Suspense component has existed publicly since React version 16.6 I think because it's also used for lazy loading components as well.
So if you use the React.lazy method to wrap a component, basically wrap it in a promise, and you put that inside the Suspense, then React will fetch the component, say, "Oh hey, I have it now." And then actually render it. The bigger, more hand wavy Suspense for data fetching is a mythical thing that has been promised to us for years. Okay, joke not intended, but it works pretty well. And so there's apparently two big pieces that the React team has been waiting to try to nail down the APIs for. One is this new use hook, which will be the officially sanctioned method to say, "I have data, please come back to me when this is ready." And then the other piece is some kind of a built-in cache so that you can consistently tie the fact that we have data to a component without having to re-fetch it every time we come through. And the React team has not yet put up the RFC for what that cache API will look like, although a prototype version of it is in the code right now.

Ben: So let's talk a bit about fibers. Give us a quick overview or reminder, what are fibers and how do they fit into the context of this discussion, particularly around rendering?

Mark Erikson: So it might help to define a couple other terms real fast. So when we're rendering and we write these JSX tags and they get converted into those plain objects, those objects are known as elements. So the type, props, children objects. And then a component is either a class that extends React.Component, or more commonly today it's just a function that accepts props and then returns JSX elements. In a very real way, both class components and function components are facades, they are not actually the real data. React actually stores everything about the Component Tree in a set of internal data structures. And this data structure, the individual objects are called a fiber.
Now these fiber objects were first used starting with React 16 and if you do a search for the phrase, "React Fiber," you'll see some talks and discussions because React completely rewrote the internals going from React 15 to 16. And so they changed the process of how they looped through and checked all the components, but also they now started tracking the data for each component in this object type called a fiber. And so a fiber object stores literally what is the type of component that we're going to render? What are its existing props, incoming props? What are the queued state updates, the existing state? What's the parent, the child, the siblings? And a bunch of different internal pieces of metadata. So when we talk about the Component Tree, these internal fiber objects are really how React stores the information about the current and the work in progress component tree. And whenever your component calls the useState Hook or tries to access this.state, that data is really coming from the associated fiber object for that exact instance of the component.

Ben: So let's talk about rendering by default. Could you give us an explainer for that?

Mark Erikson: Sure. So I see so much confusion about this on a daily basis. A lot of people think, "React only renders my component when the props change," or, "I should wrap every component in React.memo to optimize things," or whatever. And I think this is one of the biggest points of confusion. So React's default behavior is that when you call setState in one component and then React actually starts the render pass, it skips down to that component, renders it, looks at the children and then by default it's just going to recurse all the way down the tree of children from that one parent component. So when you call setState high near the top of the tree, every component inside that subtree will render by default.
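Illustrative sketch (not part of the conversation): the default "recurse all the way down" behavior, modeled on a made-up component tree. `ComponentNode`, `node`, `find`, and `renderFrom` are hypothetical helpers, and the tree itself is invented; this is a simulation of the described behavior, not of React's internals.

```typescript
// Toy model of the default behavior: rendering a component recurses through
// its whole subtree, whether or not any child's props changed.
interface ComponentNode {
  name: string;
  children: ComponentNode[];
}

function node(name: string, ...children: ComponentNode[]): ComponentNode {
  return { name, children };
}

// Hypothetical tree: App > (Header, Main > (List > Item, Sidebar))
const app = node(
  "App",
  node("Header"),
  node("Main", node("List", node("Item")), node("Sidebar")),
);

function find(root: ComponentNode, name: string): ComponentNode | undefined {
  if (root.name === name) return root;
  for (const child of root.children) {
    const match = find(child, name);
    if (match) return match;
  }
  return undefined;
}

// Returns the names of every component that renders in this pass.
function renderFrom(start: ComponentNode, rendered: string[] = []): string[] {
  rendered.push(start.name);
  for (const child of start.children) renderFrom(child, rendered);
  return rendered;
}

// A state update in Main renders Main and everything below it,
// but neither its parent App nor its sibling Header.
const main = find(app, "Main");
const renderedNames = main ? renderFrom(main) : [];
```

Starting the pass at `App` instead would list every component in the tree, which is why a state update near the root is the expensive case.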
Now again, rendering does not automatically imply that there are changes to make, but React has to call all those components and ask, well, what do you want it to look like right now? And so a component can render, even though it got the same props as last time or there's been no state change inside of it, just because one of its ancestors rendered. And so that right there is just such a big point of confusion, it's just like, "By default, let's work our way all the way down and just see if this had any ripple effects that affect any other components." And so getting that locked into your head as the key part of the mental model, I think, would really benefit a lot of people.

Ben: And why is that the default, I guess? Because you would think, "Oh, React would actually be smart enough to say, okay, if a component's a function that takes in props and if those props have not changed, the component shouldn't render." But is it because people might have impure components that have these side effects that we don't want and so you do have to re-render it because you don't know that there might be side effects under the hood?

Mark Erikson: Yeah, there're multiple factors here. One is that certainly in the early days of React when we were still working away from Backbone and Angular and maybe your React app was a mixture of React and some other framework, it was much more common to have external mutable sources of data that might have changed and React wouldn't know about it. And so number one, by automatically double checking all the components inside of us, we figure out if any of those had changed because now the output's different. On top of that, going back to the early conceptual approaches to React, some of the early talks said, "Think of it as if React literally replaces the entire UI every time there's an update." Now in practice that would be a horrible idea because that's blowing away tons of DOM nodes, you lose focus and inputs and all this other stuff.
But conceptually we can think of, start from scratch every time and then figure out what changed. This also does go back to, because React is using a virtual DOM as opposed to an observable or template based approach, it has to ask all these components, what changed? As opposed to something that does a more targeted direct dependency based update. So I mean part of the answer is that this is literally just how React works as a principle.

Ben: And when you're designing components, is there anything you should be thinking about, knowing that this component may be in a case where its ancestor may be re-rendering a lot, but you don't always want that component to re-render because it's maybe an expensive component to render?

Mark Erikson: I mean, the React team's general advice is, don't worry about re-renders by default, just write code that works and then if you're seeing performance issues, then start to measure and use the React DevTools profiler to see which components render and how long they're taking. And then use optimization tools like React.memo to target specific components that you know are either expensive themselves or that subtree is fairly expensive. Anyway, it's almost like choke points, pick the important spots and then memoize those to get the best bang for your buck.

Ben: Got it. Makes sense. And then let's talk about async rendering.

Mark Erikson: So this is the other really common thing that I see pop up on seemingly a daily basis and it's especially become more of an issue since the switch to function components and hooks. So with class components we had all the problems with needing to understand the "this" keyword in JavaScript and when do I bind my functions? And "state is not a field of undefined" and all those other lovely errors we used to deal with. The good news with function components is we don't have to worry about the "this" keyword anymore. The bad news is now we have to understand closures, which in some ways is even worse.
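Illustrative sketch (not part of the conversation): the closure issue raised here can be reproduced without React. In this toy model, `renderCounter`, `setCount`, and `storedCount` are hypothetical stand-ins; each call to `renderCounter` plays the role of one render of a function component, and a real setter would also queue a re-render.

```typescript
// Toy model of a function component, without React: each "render" is a
// function call that creates fresh variables and fresh closures.
let storedCount = 0; // stands in for the state kept on the component's behalf

function setCount(next: number): void {
  storedCount = next; // a real setter would also queue a re-render
}

function renderCounter() {
  const count = storedCount; // the snapshot for this particular render

  function handleClick(): number {
    setCount(count + 1);
    // `count` is still this render's value here; the updated value only
    // exists in the closure created by the next render.
    return count;
  }

  return { count, handleClick };
}

const firstRender = renderCounter();
const seenInHandler = firstRender.handleClick(); // the old value, not the new one
const secondRender = renderCounter(); // a new snapshot that sees the update
```

The handler from the first render keeps reporting the value it captured; only the second call to `renderCounter` observes the update.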
So something I see people try to do all the time is they have an event handler like a handleClick and they call setState and then they try to log the new value on the next line. Now this is especially common with beginners because it's almost like they're not sure that React actually updated and so they want to see the new value right after they tried to set it. It's like, "Did it actually work?" But it won't work and that's for two reasons. One is because of this queuing, batching mechanism, React literally has not tried to render yet. And the other problem is that because these are closures and they capture data that's available in the scope right now, a click handler in a function component is like a snapshot, it can only see the data and the variables that existed at the time the component rendered and this function was created. And when you queue up a render, that's going to make another render sometime in the future, which will have a different set of variables and a different bunch of closures and functions. That's a totally different snapshot and the code in this function now can't see the future data. So in that sense, trying to log the new value right after you tried to do a setState is not going to work. And this is a very difficult mental model for people to try to wrap their heads around. So if you can get that idea that, what I have access to inside these handlers is a snapshot and it's never going to reflect the changes that I try to make in the future, I think that can help.

Ben: And finally, why don't we talk about immutability?

Mark Erikson: So this also goes back to functional programming inspirations. We talked earlier about functions being pure and not having side effects and that means you can't mutate existing data. And so the only other way to create changes is to make copies of original data, then you can mutate the copies because they're fresh and use and return those.
So React has always been like a functional programming lite set of approaches and you've always been supposed to handle updates immutably, but in fact with class components you could actually get away with mutating just because of the internal implementation details. Well, with the useState and useReducer Hooks, React has put its foot down and said, "No, you really need to do these immutably." And as both optimizations and intentional aspects of the implementation, both useState and useReducer will check to see, is the new value you're passing in triple equals identical to the last value that we already had saved? And if it is, we're just going to bail out and not actually render anything at all. Now, strictly speaking from an implementation detail, because React tries to apply these changes while rendering, if you do a setState, it will actually start to render the component. It has to run the useState Hook and everything else to see if the data would have changed. But after it gets done, if it says, "Well, that was actually a setState with the same value, you know what? Let's just throw this away and pretend it didn't happen." So it is a somewhat common mistake for people to mutate data and pass in the same object or array reference and then get confused and say, "Well, my UI didn't change at all." And that's because you effectively told React, "Well, this is the same thing, there's no need to change." So understanding what immutability is and that you should always do updates immutably is very important. And fortunately the new React Beta Docs are much more clear about this. There's a couple pages specifically devoted to talking about, here's how you do updates for objects and here's how you do updates for arrays and they give some examples. And they both also talk about a very nifty library called Immer, which makes it simpler to write immutable updates with code that looks like you're actually mutating data.
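Illustrative sketch (not part of the conversation): why a mutated value passed back to a setter appears to do nothing. The `setTodos` function below is a toy stand-in for a state setter, and `Todo`, `todos`, and `renderCount` are invented for the example; the only detail taken from React is that its hooks compare the old and new values with `Object.is`.

```typescript
// Toy version of a state setter that bails out when the new value is the
// same reference as the old one. Everything else here is a simplification.
type Todo = { id: number; text: string; done: boolean };

let todos: Todo[] = [{ id: 1, text: "Write post", done: false }];
let renderCount = 0;

function setTodos(next: Todo[]): void {
  if (Object.is(todos, next)) return; // same reference: treated as no change
  todos = next;
  renderCount += 1;
}

// Mutating in place and passing the same array back: no render happens,
// even though the contents of the array did change.
const firstTodo = todos[0];
if (firstTodo) firstTodo.done = true;
setTodos(todos);
const rendersAfterMutation = renderCount;

// Immutable update: a new array, with a new object for the changed item.
setTodos(
  todos.map((todo) => (todo.id === 1 ? { ...todo, text: "Edit post" } : todo)),
);
const rendersAfterCopy = renderCount;
```

Only the second call, which hands over a new array reference, counts as a change in this model.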
We use Immer in Redux Toolkit as well, it's a great library. So understanding some of these functional programming principles really starts to help your understanding of how you should use React.

Ben: Well, Mark, it's been great having you on the podcast again, as always, I enjoyed the conversation. And we will put a link to the talk and the blog post in the episode description, as I mentioned, we're also going to put a link to... I think you've been on the podcast now twice, I misspoke at the beginning. So two previous episodes, both super interesting conversations. So yeah, thanks again for joining us.

Mark Erikson: Sure. Happy to.