If you are like most engineers, you probably prefer using open source. You may even prefer using tools like ELK and Grafana for observability due to their ease of use and built-in community. But what if you could use these same tools without sacrificing enterprise-grade scale, support, and security? With logz.io, you get the best of both worlds: a fully managed service that offers complete cloud observability on one unified platform, with log management and Cloud SIEM based on ELK and infrastructure monitoring based on Grafana. The open source you love at the scale you need. Sign up today for a 14-day free trial at logz.io/GTC, and for your chance to receive your free logz.io t-shirt. JOHN: Welcome to Greater Than Code Episode 174. This is a panel-only episode. I'm John Sawers and I'm here with Chante Thurmond. CHANTE: Hey, everyone. Happy to be back. I'm going to introduce my great friend here, Rein Henrichs. Hello, Rein. REIN: Hi. Thank you. This is a sort of special episode where we don't have a guest, but we do have a topic for discussion. And the topic is resilience. So, I want to start by asking each of you what does resilience mean for you? That could be a definition of the word or stories or examples, whatever comes up when I ask you the question. Who wants to go first? JOHN: I'll go. For me, what always comes to mind with that word is like a reed or a flexible tree that can bend with the wind or the environment, so it doesn't resist being influenced by the environment. It doesn't break under the force of the thing. It just bends and then returns to its natural orientation afterwards. There's a way that things happening in the world affect you, but then you're able to return to your homeostasis readily. REIN: Chante, go. CHANTE: Ooh, that was a good one. For me, with resilience, I think of the words tenacity and grit, and being able to cope with or withstand something that you didn't foresee. It doesn't break you. It perhaps makes you stronger. 
And I'm just going to introduce this early on: whenever I think of the word resilience, I often tend to think about the book Antifragile. REIN: I heard two different things there. One is sometimes called resilience as rebound. So, rebound from some trauma and the ability to return to a previous natural state or equilibrium. And the other is this idea that resilience is the ability to withstand unforeseen perturbations. Sometimes resilience is conflated with robustness, which is the ability to withstand foreseen perturbations. So, when you build a robust software system, you design it to deal with expected inputs in a way that doesn't break. But resilience I think is more about unforeseen surprises. David Woods has this paper called Four Concepts for Resilience. And the four definitions of resilience he mentions are first, resilience as rebound; second, resilience as robustness; third is resilience as the opposite of brittleness, which is sort of like antifragile, Chante, which is that rather than the failure mode being this sudden and catastrophic thing, there is this graceful extensibility, is the term of art, where the thing continues to work, just not quite as well, as it's experiencing that event. It doesn't just shatter. And then the fourth is the idea that these complex systems can sustain the ability to adapt to future surprises for long periods of time. So, it's not just responding to today's surprise or tomorrow's surprise, it's longer term viability. So, I got two sort of folk definitions of resilience. They're just what resilience means to you, and that's totally valid, but it's interesting to me that I think he has captured there a lot of what people mean. They are varieties of what people mean when they say resilience. JOHN: I like seeing that sort of enumerated out as the different aspects of it because I think Chante and I each picked up one aspect of that, and it's nice to see a full list. REIN: Yeah. 
Chante, when you mentioned unforeseen or surprise, I think you hit on one of the more important facets, for me personally. CHANTE: Yeah. I'm looking up David Woods now and those are four interesting components of resilience. And so, I'm wondering if the definition, like the word can be flexible and applicable to any of those four things. Or do you feel like all of them have to be there in order for it to be the true depiction of resilience? REIN: So, this is sort of an attempt to characterize popular ways that the term has been used. Woods himself really focuses on three and four. He actually thinks that two is sort of an anti-pattern or he doesn't like conflating robustness with resilience. Basically the difference for him is that robustness is about known knowns and resilience is about unknown unknowns. So, robustness is about you build a thing to withstand the expected problems that it's going to encounter. You build Mars rovers because you know what Mars is like. So, it's robust trying to traverse Mars. Resilience is about how you handle surprise, how you handle things you couldn't predict, for him. CHANTE: David Woods, is he a software engineer? REIN: No, he is a researcher and academic. He sort of pioneered the discipline of resilience engineering as an academic field. CHANTE: That's quite interesting. You know what? One of the thoughts before we had this conversation was, for example, should we assume that resilient people create and build resilient code? REIN: Well, Woods would say that resilience in socio-technical systems comes from people. It's people and our ability to make decisions that provide these systems with their adaptive capacity. JOHN: Yeah, that's interesting. And I wonder if you could personally be not terribly resilient but still be able to think about a system in such a way that you could make it resilient. REIN: Yeah. We contain multitudes. We can be personally resilient or not in all sorts of ways. 
It's not a single static quantity that represents our capacity to adapt in every situation. Some of us are more resilient when it comes to dealing with software failure or when it comes to personal relationships, let's say. JOHN: One of the things I love about this conversation is how applicable most of the discussion is both to software or socio-technical systems and to individual biological systems like people. I don't think this is blanket applicable, but I would imagine that most of the things that we are about to talk about are going to apply equally to a person managing their own internal state and their own life as to a person managing a software system, or operating within it, rather. CHANTE: Yeah, ditto. I would agree. I think it's a fascinating topic to dive into. This is one of those topics that has been on my mind for quite some time. There was a period in my life, even before I had children, where I wanted to study resiliency. And basically I was wondering what made me so resilient and how I might be able to teach leaders how to do that and why it matters, and kind of fell down the rabbit hole of that whole positive psychology framework, and there is so much there. Bottom line was that resilience is one of these sort of high-performance indicators and skills that we do need from leaders of the new school. And if I'm applying this to leadership and thinking about organizations and how they might consider this, I think there's an old sort of belief that resilience was silent, that resilience was to put your head down and keep going and not to complain, more like that grittiness, versus a tenacity and a flexible, bendy form of resiliency, and it could show up in so many different ways. I'm really interested in just having more conversations that involve resilience as we build the next few versions of technology and get into this fourth industrial revolution. REIN: It's interesting that you mentioned high-performance. 
Are you familiar with the High Performance Organization framework? CHANTE: A little bit. I've done research on it but I can't recall it right now, so please if you are... REIN: Basically, HPO was an attempt to study how high-performing organizations worked and what could be learned that sort of generalizes what was discovered about specific cases. It sort of grew up as an alternative to Taylorism, command and control by the numbers management, and it covers things like organizational design and team composition and leadership and organizational strategy, innovation, like the whole gamut. One of the things I find, for example, is that the teams that are the highest performing tend to operate semi-autonomously. They set their own schedules, they manage their own quality, they solve their own problems, and so on. So, teams in high-performing organizations tend to be self-directed. High-performing organizations tend to invest heavily in their workforce in learning and development and growth and such. So, it's basically a whole set of observations and then practices around building high-performance organizations. CHANTE: Yes, in that sense, I am familiar with that framework. And there's so much, I think, here in this particular -- if you just Google HPO or high-performance organization, and high performance and self-actualization, oh my gosh, there's so much literature out there. And it doesn't have to be scary, lots of books on it. I feel like we're kind of getting into this -- lately I've seen a lot of folks who are really into trying to hack their way into this high-performance lifestyle and mindset. And I've come to the conclusion that there's more than one way to get there and no two paths to high performance are alike. There's so many resources available in this age where data and information are free flowing and you can learn so many things about yourself and about the organization, the work that you're doing. 
There's really, in my opinion, no excuse why people shouldn't be aiming to have this as like the ultimate goal. JOHN: Are you speaking specifically as far as organizational design or more as like personal development? CHANTE: Both. Because my interest in it is so many of us are spending so much of our time at work or on our way to work that really at the end of the day, it's in an organization's best interest to be focusing on this skill. And because whether you have people who are high-performers at work or not, the goal is you would want everyone to become a high-performer, or that you could optimize everyone's life. And everyone's optimization doesn't look the same, because some of us are born with able bodies and some of us are not, or some of us have more privilege than others. So, I think if we just strive for this optimization and what I would say, this resiliency, that while we won't all be at the same place, we certainly would have greater lived experiences when we're together and especially at work and our communal spaces. REIN: For me, this isn't just about the business value of resilience. For me, it's also an ethical issue. I think that high-performance organizations or resilient organizations or whatever you want to call it, also reduce the harm, the suffering of the people that work there compared to other organizations. So, I think we have an obligation to build these organizations for the people who work in them. CHANTE: I agree with you. I think that's a great point. I wasn't able to articulate that but I think that's actually where my interest is at. Every business is in the business of basically employing people. We have some folks who are trying to have machines replace people, that's a different story, Amazon. But I think at the end of the day, we can't run from the fact that people make up companies and communities. 
And so, that being said, it should be everyone's aspiration to sort of get people to this high-performance circumstance where, like I said, it doesn't have to be that everyone's high-performance resilience looks the same. But if that can be a shared unified goal, for example, when we're thinking about public and community health, or like when we're thinking about sustainable communities, we don't really hear conversations around resilience. We hear things about sustainability, but I wonder if we swapped in the word resilience, would our lived experiences change? JOHN: I'm curious as to how changing that word out would change sort of the manifested impact of like a neighborhood organization that's trying to improve the quality of life in a neighborhood if they changed it from sustainability to resiliency. How would that change what they do even if the end goal is largely the same thing? REIN: I love that question, and I'm waiting for Chante. CHANTE: I feel like if we swapped out the word sustainability for resilience, for example, that we would see more of us striving for or being okay with it and maybe embodying this sort of mindset that we can change and adapt, especially when we have unforeseen things. When I speak of the word sustainability, I think of homeostasis. I think of getting to a place where we're all kind of on the same page and being like, "All right, let's maintain this." I feel like sustainability also goes with maintenance. I don’t know. Maybe I'm wrong and perhaps truly it's that sustainability would encompass resilience. I don’t know, because that’s probably what a complex and adaptive system would actually require to get bigger and better: you have to have these folks. You cannot have homeostasis at all times for the ecosystem to improve. REIN: Yeah, this may be a sort of uncharitable characterization of how people who work on sustainability think. 
But just if you look at the word, it's more about maintaining a status quo than it is about adapting to change. For me, the difference in terminology for resilience is about acknowledging that there is no such thing as a static organization. Change is the constant. And so, for an organization to be even sustainable, it has to be able to respond to change. And since it can't predict every change, it has to be resilient. So, I think sustainability is impossible without resilience. I imagine a lot of people who work on sustainable orgs would probably agree with that. I don’t want to put words in people's mouths, but I think they have a pretty robust understanding of sustainability. CHANTE: Yeah, I think so. I do think though there's some room for us to maybe welcome this resiliency or this resilient mindset just in everyday lives. I don’t know that I hear it a lot. I feel like sometimes it feels like a shiny thing that people are like, "Yeah, that’s great. I'll be resilient when I have more money or I'll be resilient when my situation or my circumstances get better." And I'm noticing this because often my job allows me to have these sorts of private conversations with people, when I'm coaching and whatnot, about maybe why they want to change jobs or why they don’t like work or why they are having a hard time asking for raises or maybe why they should go back to school. All these sort of different things. It's a mindset that I'm noticing and there's sort of a lack of this kind of positive twist to it, which I think being able to withstand challenges or things that you didn’t foresee is a good thing. And I have to coach people through that all the time. REIN: One thing that’s tempting, I think, when we talk about resilience is talking about how to create resilience. I think it's really important to acknowledge the resilience that’s already present in basically every system. We've all worked at startups, right? CHANTE: Yes. JOHN: Basically. 
REIN: So, the fact that you're working at that startup means that startup has already demonstrated incredible resilience compared to the 90% of startups that don’t exist anymore, right? JOHN: Yeah. REIN: I think we need to start by acknowledging how resilient people in organizations already are just to continue to exist day to day. And so, it's really about building on that foundation. CHANTE: Agree, totally. Reminding people that like even to have survived and to be born into the world requires resiliency. REIN: For me, it's really about discovering or uncovering those sources of resilience that already exist and learning how to grow and nurture them. JOHN: That’s a really good point. Because one of the things that occurred to me as you were saying this is that the place you start at when you try to approach resilience or building that in yourself, for example, is usually important. If you’ve been beaten down and your life has gone from horrible thing to horrible thing, building that resilience is a lot harder, I think. I mean, you’ve already got some and I think recognizing it is one of the great ways to start building it, by realizing what's there and realizing you do have some because you’ve made it this far. But compare that against someone who's doing relatively well and hasn’t endured a lot of trauma: if they're trying to build resilience, they're coming at it from a completely different angle. And I would imagine the approaches in those two different situations are vastly different. CHANTE: True. There was a psychologist, I wish I could think of her name. When I was studying this in my master's degree, one of the things she said was that for folks who experience a lot of trauma, one of the greatest privileges is just literally space and time. She said, "Have you ever noticed that these children who are in highly traumatizing environments, they don’t get a break ever. 
And sometimes, those are the most tenacious children, yet they don’t even recognize that they're in the middle of chaos and trauma. But there's this great privilege and power in giving people space and time in order to actually acknowledge and to feel that they have survived something, to say that that was actually an act of resilience." So, it was quite interesting. JOHN: It's a really powerful frame, especially if many, many terrible things have happened, to phrase it like, "I've made it this far. I clearly have some ability to survive and something to build upon." Rather than saying, "I'm starting from zero and now I have to figure out how to become resilient." CHANTE: Yes. JOHN: Like you were saying, Rein, the same applies to an organization. If it's still in existence, it's obviously developed some sort of resilience to the environment that it's being built in. And pulling out those abilities to highlight them and build upon them seems like a great technique for enhancing and perpetuating that resilience. REIN: A metaphor, I think, that is useful here is adaptive capacity, which is our capacity to adapt to unforeseen circumstances. You can sort of think of it like a fuel tank. And every time you have to be adaptive, every time you have to demonstrate resilience, you deplete some of that fuel. Actually, the spoons metaphor is very similar to this. And so, we all have some adaptive capacity. One of the things that can make a larger organization resilient is the ability of individual units of adaptive capacity, which is basically people, to share that adaptive capacity with others, which is basically helping people. JOHN: That’s like adding extra support, like people adding extra support within the organization is one way to do that. REIN: Yes. One of the hidden things that happen throughout organizations all the time is that people are sharing adaptive capacity with others. People asking for extra QA help to get this thing moved through the sprint. 
All sorts of things happening all the time, most of which are invisible. Without which, the organization effectively couldn’t function. JOHN: And this actually brings up something that occurred to me right at the beginning of the episode, where the two pillars that you were talking about were, like, being able to rebound from a trauma and then being able to continuously adapt, that adaptive capacity. So, I think those two together are pretty interesting because if you think of just the first one, like think about a human, they can technically rebound from a trauma but they may rebound in a different shape than they were when they started, and then they will continue operating in that shape without extra work or healing or whatever. And so, that can change the way they operate. And the same thing happens with an organization. If some assault comes from the outside and then the organization builds that scar tissue and that 'this is the only way we can survive' kind of attitude, that warps their behavior and reduces their adaptive capacity. Thinking about those two together as being required, I think, is really interesting. REIN: One of the really interesting things, and this is for me something that Antifragile sort of gets wrong, is that brittleness and resilience are actually not mutually exclusive. So, for example, bone is quite brittle. It breaks. It doesn’t really bend. But bone is also incredibly resilient. If you create the right conditions, it will heal itself. And in fact, it can become stronger than before. JOHN: There was a talk at REdeploy from a bone doctor. Bone doctor, oh my god, that’s the worst flaig name. [Laughs] REIN: [Laughs] That was Dr. Richard Cook. He's actually an anesthesiologist. But he's also one of the sort of pioneers of resilience engineering. It's a great talk. It's on YouTube, you should go find it. CHANTE: I agree with both of you. I think that those two pillars there are quite interesting. 
And I think you're right to point out that we don’t want to misunderstand or basically attribute this anti-resiliency to being the same as antifragile. So, thank you for pointing that out. REIN: For me, what's really interesting about this is that I want to convince organizations I work with, my employer, for example, that they should become more resilient because it will help the people. It will reduce harm to the employees there. That’s my motivation. But I can make a very strong business case for this as well. So, I would rather not have to do that but pragmatically speaking, I will do that. CHANTE: I'm right there with you. I feel the same way and have similar aspirations in terms of my personal and professional life. And I would love to see all organizations be striving towards this. I want to see all communities and all individuals doing this. I really believe that this sort of resiliency, like I said, is one of these skills that we often are not really cognizant of or consciously teaching even younger people. But I would love to see us teach this to kids in Pre-K. It's really, really important. Just like [inaudible] agency, for example, is super important because the more you can recognize resiliency and get comfortable with it, the more you can cultivate it and make it better. By the time [inaudible] from five years old being introduced to this sort of simple thing and you evolve to a 25-year-old or 50-year-old, wow! What a world of difference that could make for somebody. REIN: There's a sense for me in which this is somewhat similar to diversity and inclusion. Chante, you're the expert here, so tell me if I'm speaking out of turn. But I see a lot of D&I directors and consultants, they focus on making a really good business case for D&I. It improves performance, this and that, there's some great statistics out there. And for me, it’s an ethical issue. End of story. 
But what I get from talking to many of these people is that it's an ethical issue for them too, but they have to do what works. They have to tell the story that reaches the people who have the authority to hire them and give them money to do their jobs to make people's lives better. CHANTE: It's a good callout, too, to say it's like the diversity and inclusion conversation. I would even argue that resiliency is more important than inclusion. And the reason why I say that is because, for example, as a person who suffered trauma or is having issues of psychological safety at work and feeling like, "Hey, I might be triggered or my trauma and my anxiety might be compounded by the fact that I'm operating in a really crappy work environment." Let's say that I'm triggered and somebody's calling me a derogatory term. First and foremost, we can't quite get to the conversation about inclusion until that person is calm, and that would require some resiliency, some soft skill there that like, "Listen, I need to have composure and this thing that happened to me does not necessarily define me. This is how somebody might think of me and define me, but I am not that thing." So, I think resiliency is a skill that I would put more [inaudible]. REIN: These are complex systems and they're intertwined. One of the things that happens is that diversity improves resilience because you have more different ways of looking at problems, you have more different ways of being creative to come up with solutions. And so, variety improves resilience. But the other thing is that if people are spending their adaptive capacity dealing with being discriminated against or excluded in the workplace, they're not going to have it for dealing with the problems of the workplace, with dealing with the business problems they're trying to solve. So, this has to be holistic. CHANTE: Yeah, you just nailed it. While we're talking about the workplace, think about a family. 
You have these different folks and people in the family make up a unit, and there's a trauma or there's a thing that happened. You can't really get to the heart of issues about addressing these things, or acknowledging that this is a thing. REIN: Arguably, the focus on the atomic family rather than the community has made communities less resilient because it's reduced variety. Families are generally pretty homogenous. Communities could be much more rich and diverse. CHANTE: My family is not homogenous. REIN: I have no idea how to pronounce that. CHANTE: [Laughs] This is such a good, interesting conversation. I am curious, though, if we think about resiliency specifically within software development and code. I feel like there's some, for example, maybe in my circle of folks who I hang with, who are working like in the human resources or organizational development, learning and development, this is more of a common concept we might discuss. But I'm curious if like this is something that you feel is out there and being discussed amongst folks in the software development industry or profession. REIN: I can say that resilience engineering, that sort of more formal discipline, has a bit of an academic flavor to it. We're talking about David Woods, Richard Cook. John Allspaw who is the CTO of Etsy is also part of this. Resilience engineering originated in fields like medicine and industries like aviation. But a lot of folks in resilience engineering are fascinated by technology because the rate of change, the rate at which these systems experience unforeseen events, is so much greater in tech than it is in these other domains. And so basically, resilience plays out so much faster. It's basically just a much better place. If you want to study resilience, you want to find a system that is changing as rapidly as possible. And tech is that. 
So, a lot of people in the resilience engineering community are looking at tech as a sort of test bed for these ideas because there's unique characteristics. And we get to benefit from that. JOHN: That’s really interesting. Studying resilience in, for example, the Brookings Institution would probably take a lifetime because trying to gather data to see how it handles 50 or 100 years of change, it would be [chuckles] -- REIN: Exactly. In the 19th century, you would have a lot of bridge failures but a bridge fails once in its lifetime. That’s how that works. But software can fail millions of times a day. CHANTE: [Laughs] JOHN: It's our superpower. REIN: I mean, that’s not an exaggeration. JOHN: Oh, not at all. REIN: The challenge of resilience in software is first of all -- the argument that I would make is that software per se, software itself cannot be resilient. It can only be robust, because it only does what we’ve programmed it to do. Resilience comes from how humans operate software. JOHN: And when it's expanded into the socio-technical system from just the technical system. REIN: Resilience in software comes from how humans respond to incidents and make decisions to make changes on the fly. The software itself can be made more robust but it can't be made more resilient, because resilience requires responding to unknown unknowns. And that’s not something software can do until the glorious AI future which will never occur. [Laughter] JOHN: The software is there and runs as it runs, and then it doesn’t change, it doesn’t adapt except for some very limited cases until humans do things. REIN: The software has to be incredibly robust to run at all. But then layered on top of this, there needs to be resilience in the ways we operate and maintain the software. CHANTE: Yeah, this is huge. JOHN: I just thought of a parallel again between people and software. 
So, there's a book called Rejection Proof about some guy who just decided that he was going to try and get rejected like 5,000 times in a year. So, he would just ask random people for things and they would say, "I'm not going to give you $1," or whatever. And so, he got very, very comfortable with being rejected as a way of building resilience around that sort of thing. And I'm thinking it's almost the same thing as the Chaos Monkey and the chaos engineering at Netflix, where they build failure into the system so that the system constantly has to deal with that and is built with that in mind from the start. I feel like it's the same thing. REIN: It's very similar. Actually, Casey Rosenthal and Nora Jones who wrote the Chaos Engineering book have both spoken at REdeploy and are both pretty heavily involved in resilience engineering. Actually, Casey's talk at REdeploy 2019 is great and I want you to go watch it. JOHN: Awesome. CHANTE: I'm definitely going to go read up on this and find out more. REIN: You're right. There is heavy overlap in those domains, and that’s one of the ways that I think resilience engineering has been able to make [inaudible] into software is that people in software have already been thinking along those lines and have really latched on to resilience engineering as, "Oh, someone else had already thought out all this stuff and I can just go learn what they’ve already learned. Cool." JOHN: It's already feeding back into the system and making it all resilient. CHANTE: It makes me also wonder what price we'd pay if we don’t pay attention to resiliency and prioritize it as we move into, like I said, this next fourth industrial revolution. What price do we pay as humanity if we don’t prioritize this? Because it's going to take not only this evolution for humans to exist and to get better, but it's certainly required within our software and technology as, I think, a pillar or foundational sort of value and belief. 
So, what's the price that we pay by not prioritizing it? REIN: To put a fine point on it, existential risk. These organizations will cease to exist. JOHN: And lives too. If your healthcare software is not resilient, people die. Autopilot not resilient, people die. The software that’s [inaudible] the world is becoming that central thing where the one bug causes everyone to get ten times their dose of whatever medication -- like, the real-world impacts are now magnified so colossally. CHANTE: For all those listening, it's real. I believe that. I feel like it's got to be talked about more. I would love to see this happening. I would love for this to be the buzzword of the century. While we say that, I'm really happy that we're having this conversation. It's definitely giving me more questions than answers. And I'm looking forward to basically, the minute we depart from this conversation, I'll probably dive in more because there's so much that I don’t know. We don’t know what we don’t know, so I'm making myself more resilient here. But I'm looking forward to just kind of seeing, like I said, this conversation pop up more frequently and become a center or a pillar of people's organizational values and aspirations, and certainly some underpinnings of the software and development that we have going on in the tech world. JOHN: This is well-timed for me as well. I think that I'm really starting to think about my technical organization in different ways and this resilience feeds into that. It's also a great word to hang a lot of concepts on top of as a way of communicating what I want to get everyone else on board with. So, this is particularly useful for me. REIN: The last sort of major point that I would really like to get across, and this is sort of the theme of Richard Cook's talk about bone, is that resilience -- what we can do to make systems more resilient is focus on creating the conditions in which the naturally occurring resilience in people can manifest itself. 
So for example, when a doctor sets a bone and puts it in a cast, they are not healing the bone. They are putting the bone into alignment and in a position where the body can heal the bone. So, a lot of "creating resilience" is really about creating conditions in which people can perform at their best. And so, that’s why high-performance is relevant. And it's about creating conditions where people aren't artificially restricted, which goes to inclusion and other such things. So for me, resilience is about creating the conditions that maximize everyone's potential. CHANTE: I couldn’t have said it better. That’s it. That’s the definition. Wow! I'm going to have to borrow that from you, Rein. I love that. I want to have a conversation about this again. REIN: Yes, that’s why resilience is socialist to me. I know that sounded like a joke. It was not a joke. JOHN: No, I can take it that way. CHANTE: [Laughs]