Dana: [00:00:00] At one point he realized something very fundamental and remarkable, which is that if you switch the fathers and sons, and you plot the sons' heights as the independent variable and the fathers' heights as the dependent variable, you get the same thing. You get the same fuzzy thing and you get the same correlation. And so correlation is something that is completely independent of causation.
Harpreet: [00:00:36] What's up, everybody? Welcome to The Artists of Data Science podcast, the only self-development podcast for data scientists. You're going to learn from and be inspired by the people, ideas, and conversations that will encourage creativity and innovation in yourself so that you can do the same for others. I also host open office hours; you can register to attend by going to Bitly.com/adsoh. I look forward to seeing you all there. Let's ride this beat out into another awesome episode, and don't forget to subscribe to the show and leave a five-star review. Our guest today is a mathematician turned science writer who has written for a variety of popular science magazines over the last two decades. He earned a PhD in mathematics from Princeton University and taught math for a combined 13 years at Duke University and Kenyon College. But through it all, he never felt like teaching was his true calling, so he returned to his childhood love, which was writing. He has since gone through the Science Communication Program at UC Santa Cruz and has written books such as The Big Splat, or How Our Moon Came to Be and The Universe in Zero Words, and coauthored The Book of Why alongside Judea Pearl.
Harpreet: [00:02:05] His writing has been included in New Scientist, Scientific American, and Discover magazines, and when he isn't writing, you can find him volunteering at his local animal shelter or playing chess. So please help me in welcoming our guest today, a man whose job it is to get free lessons from the smartest people in the world and then write about them: Dr. Dana Mackenzie. Dana, thank you so much for taking time out of your schedule to come on to the show today. I really appreciate having you here.
Dana: [00:02:35] Thank you so much, Harp. This is a great opportunity.
Harpreet: [00:02:38] I feel in many ways we kind of have a similar job, because I get free lessons from smart people like you. I get to read books and talk to people about their books. I don't necessarily write about them, even though I'd love to up my writing skill, but we get to talk about them. Today I'm really excited to talk to you about some of the work that you and Dr. Pearl did for The Book of Why. But before we get into that, let's learn a little bit more about you. Talk to us a bit about where you grew up and what it was like there.
Dana: [00:03:05] Sure. Well, I was born in Nashville, Tennessee, and grew up in the South, and people are always a little surprised to hear that because I don't think I have any shred of a Southern accent anymore. But I lived in a bunch of places when I was growing up.
I also lived in Indiana and Virginia, then went to school in Massachusetts and college in Pennsylvania, and then graduate school at Princeton, as you mentioned. So I kind of lived all around the East Coast, the eastern half of the country, for my childhood. You mentioned this big transition in my life that occurred in the mid-1990s, when I was about thirty-seven or so. I changed careers, and there was also a change of location: I moved out to California, which was a place I never thought I would live in. But I've now lived here for close to a quarter of a century, longer than I've been anywhere else. So now I really feel like a Californian, and I love being in Santa Cruz, which is home to lots of creative and amazing people.
Harpreet: [00:04:10] Yeah, I absolutely love Santa Cruz, being a native Californian myself. I spent the first quarter-plus century of my life there. It's one hell of a place, man. Santa Cruz is one of my favorite places, for sure. So as a kid you loved writing, but then you ended up studying math at the highest levels. Was that something that you foresaw happening? Were you always into math? Was it a choice between math and writing? How did this play out?
Dana: [00:04:36] Well, that's a great question, and a lot has to do with my parents, who encouraged me in all spheres of academics. But particularly my father was very interested in mathematics, and he would do things like give us math puzzles.
Dana: [00:04:52] And, you know, I didn't actually need a lot of persuasion, because I just fell in love with math from the beginning. One story I like to tell is that when I was in second grade, my father one day asked me how many ways there are for my class to line up to go to lunch. There were twenty-two kids in the class, and so I thought, oh, well, twenty-two ways. He kind of laughed. And then my sister, who is actually younger but a little bit smarter than me, said, oh no, forty-four. And then my father explained to us about factorials and the fact that there's this incredibly gigantic number of ways: 22 times 21 times 20 times 19 and so forth. I thought this was incredibly wonderful. I mean, these huge numbers just made me so excited. So I decided I was going to work out twenty-two factorial. Now the trouble was, I didn't yet know how to multiply numbers longer than one digit, because in second grade we'd only learned the times tables for one-digit numbers. So basically, I could multiply one times two times three up to nine, but after that I was stuck. But I knew that multiplication was the same as repeated addition. So once I got to multiplying by 10, I just listed the number 10 times and added them up. Then I had to multiply by 11, so I listed that 11 times and added them up. You can imagine how long this took, and I'm sure I made zillions of mistakes. But finally, by the last day of school, I actually got an answer, which was probably wrong, and sad to say, I don't even have it written down anywhere.
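The repeated-addition approach described here can be sketched in a few lines of Python. This is only an illustration of the idea; the function name `factorial_by_repeated_addition` is hypothetical and does not come from the conversation.

```python
import math


def factorial_by_repeated_addition(n: int) -> int:
    """Compute n! using only addition, never multiplication."""
    result = 1
    for k in range(2, n + 1):
        # "Multiply" the running total by k by writing it down k times
        # and adding the copies together.
        total = 0
        for _ in range(k):
            total += result
        result = total
    return result


lineups = factorial_by_repeated_addition(22)
# Cross-check against the standard library's factorial.
assert lineups == math.factorial(22)
print(f"Ways for 22 children to line up: {lineups:,}")
```

The answer is on the order of 10^21, which gives a sense of why the hand calculation took most of a school year.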
But to me, this is just sort of an indication of how weird I was from an early age. I just loved things that were mathematical, and all my father had to do was tell me what a factorial was, and then I was off, you know? So I did always love mathematics. And I think the writing came maybe a little more from my mother; she encouraged me on that. In fact, when I was as young as five years old, I would write stories and she would type them up for me in a little book. I still have a couple of little books of stories that I wrote, so I like to tell people my first published book, written at age six, was called The Littlest Inchworm. I think that I just grew up in a way that encouraged learning in so many different ways, and so I loved all my subjects. I loved math, I loved writing, and it was really hard for me to decide in school what I wanted to do, because I liked all the courses. But I think this love of mathematics did win out, partly because there was a clear career path. I could see myself going to graduate school in mathematics and becoming a teacher, and it sounded like something I'd like to do. I also sort of fell into it by the path of least resistance. I did great in all my math courses as a math major and got so much encouragement from my teachers.
Dana: [00:07:59] So, you know, in some ways, in college you don't know where you're going yet in life, and so you just do what your teachers tell you to do and what other students, people you admire, do. So math just seemed like the natural thing for me to do, whereas writing was something I did for fun. It wasn't really a career; I didn't see it that way. And it's too bad. One thing I'd love to do is tell anyone listening to this podcast who's thinking of writing that it is a wonderful career path. It's a great way to go if you feel like you're a generalist, if you're interested in everything. Our academic system doesn't like that. You're forced to specialize: you declare a major, and then you go to graduate school and you get even more narrowly focused. Basically, to get a Ph.D., you need to become the world's expert on one little thing that maybe nobody else cares about. Everything in our educational system is saying specialize, specialize, specialize. But if you feel like you're a generalist at heart, being a writer is the way to be a generalist. What I do is get ideas for stories and interview the people involved, maybe several people. And I just have to learn really quickly what's going on in this subject that maybe I've never studied before. I need to be able to pick it up quickly and then write something at a level that ordinary people like me will understand. So if you're a generalist, being a writer, a journalist, is just the greatest thing. I'm so glad I found it, but it took a long time to find it.
Harpreet: [00:09:35] Yeah, that's really awesome.
And I consider myself to be a generalist. That's not to say that I'm against specialization, but I'm definitely pro-generalization. I think that's the way you should be: you should just be interested in a wide variety of things. And I feel like I kind of get to exercise that generalist muscle with the podcast: talking to people, having to quickly read through their book and absorb the main concepts, and then talking to them about that. But I don't know, man. Writing is something I could see myself doing, but I find it so difficult to do. I don't know if that's because I just can't think clearly enough to get the idea across simply, or what it is. But if anybody listening out there wants to develop and flex this writing muscle, do you have any tips for them on how they can develop and cultivate this skill?
Dana: [00:10:23] That's a good question. Certainly one way to learn to write is to write. There's just no way around that: the more you do it, the better you get at it. I did go, as you mentioned, through a formal science communication program at UC Santa Cruz when I decided to change careers. It's a one-year program, very innovative, the first of its kind and still the best of its kind. And it's amazing: if you look at the world of science journalists, probably a good quarter to a third of the science journalists out there came through the Santa Cruz program. It's really, really incredible.
So a lot of what I learned formally, I learned from that program. We learned skills like how to pitch a story. We started the fall semester with news writing: how do you write a newspaper-type article? And that's kind of a funny thing, because I don't particularly enjoy writing in that style. These articles are very condensed and very superficial, which is hard to do when you're writing about science. There are many things in science that just need a little more explanation. So most people who go through the science communication program don't go into newspaper writing, but it's a great place to start, because that's what drives a lot of journalism and media in our country even today. Even though we're now in a brand-new world where we have podcasting and everything else, it's still the newspapers that kind of pick up the stories first. And of course TV is huge, but that's a world I haven't gotten into very much. Anyway, I think it's really good to start with newspapers, because you learn how to distill a story to its essence. You learn how to pick something up really quickly. I did an internship at a newspaper. I would be assigned a story in the morning and have to file it in the evening, so you don't have much time.
Dana: [00:12:29] And the great thing about that is, for the rest of your life as a journalist, you can never say this is too quick for me, because you have that experience of turning a story around in one day. So if they ask me to do it in a week, yeah, that's tough, but I've done that before. I can hack it, you know? So I think the program taught me stuff like that, which you can't really put in a book; you have to experience it. Just doing it takes the fear out of it. I think that's a big aspect of succeeding as a writer.
Harpreet: [00:13:07] Yeah, definitely. I'll have to check that out, because writing more is something that I do see myself doing. Hopefully they have an online component to that program, because I'd love to check that out.
Dana: [00:13:20] I think what you're doing is incredibly important, though, because the world is moving away from print and toward video and audio and multimedia. So I envy you. I'm kind of a dinosaur, a little bit, because I grew up in a world where there was only print and only writing. That's what's natural to me; it's what I'm good at. But if I were growing up today, I would certainly want to learn some of what you're doing. And I think the UC Santa Cruz program actually has a fair amount of training now in video and audio journalism as well.
Harpreet: [00:13:57] That's cool. I mean, it's cool, the fact that we can even write, right? Evolution has been very, very kind to us.
Dana: [00:14:03] Yes, yes.
Harpreet: [00:14:04] It has given us a number of different abilities, to the point where we can engineer our lives. But as you guys mention very early on in The Book of Why, it hasn't given those same endowments to, say, our chimpanzee cousins. So what is this computational, cognitive faculty that humans suddenly acquired that our chimpanzee cousins did not?
Dana: [00:14:27] Yeah, it's a wonderful question. So our book is about causation. And what Judea argues is that humans are unique among all the species we know of in our ability to perceive the universe causally. There's a lot wrapped up in what that means, but one thing is that we can, for example, make tools, and tools are an expression of our ability to modify our environment. The fact that we can envision doing something, envision making a change in order to achieve a certain outcome, is something that very, very few other species know how to do. And there are some that can. There's some really interesting work on tool usage among crows, which is kind of amazing. Crows in New Caledonia feed on termites, and they need to fish the termites out of the branches that they live in. They will actually take a twig and bend it so that they can reach in there and fish out the termites.
Dana: [00:15:35] So this is arguably a form of tool usage, and there's been a little bit of debate: how much does this translate to some understanding of causality? But in the context of our book, this is what we call level two or rung two causal understanding: being able to visualize an intervention and then carry that out. Then an even more interesting stage of causal reasoning, which we think is unique to humans, is counterfactual reasoning. In other words, imagining worlds or universes that don't even exist. This is the realm of imagination, and it's something that's unique to humans. It's something that we develop at a fairly early age, and I believe that we develop this through play. When kids are playing, think of what they're doing. They're playing at being a princess or playing at being this or the other. They're thinking, what if? What if I were a princess? What would it be like? And you act that out and you see that some things are different, and stuff like that. I think this play is incredibly important for human development, and it puts us in this frame of mind of being able to envision other worlds. As far as we know, no other species has this ability. It's what enables us to invent things, and a lot of our moral understanding of the universe comes about this way as well, because we do something and we regret it, and then we say, I shouldn't have done that.
If I'd only hit the brakes, I wouldn't have hit that animal and killed it, or something like that. And so our ability to envision a world where something different happened, where we acted differently, is a hugely important part of our humanity and of our understanding of the universe.
Harpreet: [00:17:27] Yes, the concept of counterfactuals is quite interesting. Thank you so much for talking to us about that. So why does it seem, at least to me, that statisticians haven't been able to wrap their heads around this concept?
Dana: [00:17:43] Yeah, it's something really surprising. I sort of imagine that for the lay reader of our book, one of the strangest things, one of the hardest things to persuade them of, may be the fact that science has been so blind to causation, when it's something that's so natural to us. It's something that we acquire, as I said, in our earliest years of infancy. We're constantly experimenting, playing. When your baby drops the plate on the floor, he's not being disobedient. He's trying to understand how plates and falling and floors and gravity work, you know? So this understanding of causation is something that we develop at an early age. We understand that if we drop the plate, it'll hit the floor and break. Why is it that scientists have abandoned this language? Why is it so hard for them to understand something that's so easy, that's so fundamentally human? That's a very interesting story, and it's one that we tell in the book.
So it really goes back to the very early days of the science of statistics, when Francis Galton, who was in some ways the inventor of statistics, was trying to get at questions like inheritance. Unfortunately, he was a eugenicist, so he was very interested in the inheritance of intelligence, and not even just intelligence, but talent or excellence in some broad sense. To what extent is this inherited in families? He grew up in a family that had some very famous people in it, Charles Darwin, etc. He obviously would have loved to be able to prove that smart people have smart children, or that accomplished people have accomplished children. So he actually collected data for many years on people and their offspring. Intelligence is hard to measure, so he would look at simpler things like height. And what he discovered, not surprisingly, is that tall people tend to have taller children, but it's not perfect. When you plot the heights of fathers and sons on a graph, you'll see a big fuzzy blob: some tall fathers have really tall sons, but some don't. It's not a perfect one-to-one correspondence. And so he came up with the idea of association or correlation, which was a word that he invented. He actually called it "co-relation," with a hyphen, and then eventually it got turned into this word "correlation." So he realized that there's a correlation between fathers' heights and sons' heights.
Dana: [00:20:35] But at one point he realized something very fundamental and remarkable, which is that if you switch the fathers and sons, and you plot the sons' heights as the independent variable and the fathers' heights as the dependent variable, you get the same thing. You get the same fuzzy thing and you get the same correlation. And so correlation is something that is completely independent of causation. This was an amazing insight for him: you can, as he called it, regress fathers' heights on sons' heights, or vice versa. In those early stages he had also invented the word "regression," which is still used in statistics, although with a different meaning. He saw that he had this sort of blob of data, and he realized that the sons of tall fathers are taller than average, but not as tall as their fathers in general. He called this regression to the mean. He thought that there was some actual physical process going on here, that the genes of outstanding people were somehow getting watered down, and so in successive generations you regress toward the mean. That's why he used the word "regression." It's a very valuable word, which is an interesting story in its own right. But then when he did this little thought experiment of switching the fathers and sons, he realized that the tall sons have shorter fathers.
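The two observations above, that the correlation is the same whichever way you plot it and that regression to the mean runs in both directions, can be illustrated with simulated data. The sketch below assumes made-up heights (mean 69 inches, standard deviation 2.5, underlying correlation 0.5), not Galton's actual measurements, and the helper names `pearson_r` and `slope` are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def slope(xs, ys):
    """Least-squares slope when regressing ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Simulated (not real) heights in inches; all three constants are assumptions.
MEAN, SD, RHO = 69.0, 2.5, 0.5
rng = random.Random(0)
fathers = [rng.gauss(MEAN, SD) for _ in range(5000)]
sons = [
    MEAN + RHO * (f - MEAN) + rng.gauss(0, SD * math.sqrt(1 - RHO**2))
    for f in fathers
]

r_father_son = pearson_r(fathers, sons)
r_son_father = pearson_r(sons, fathers)  # same data, axes swapped
slope_son_on_father = slope(fathers, sons)
slope_father_on_son = slope(sons, fathers)

print(f"correlation (fathers, sons): {r_father_son:.3f}")
print(f"correlation (sons, fathers): {r_son_father:.3f}")
print(f"slope, sons regressed on fathers: {slope_son_on_father:.3f}")
print(f"slope, fathers regressed on sons: {slope_father_on_son:.3f}")
```

Both slopes come out below 1. Sons of tall fathers are on average less extreme than their fathers, and fathers of tall sons are on average less extreme than their sons, so the shrinkage cannot be a one-directional physical process.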
Dana: [00:22:15] And so which way is it going? Are fathers having shorter sons, or are sons having shorter fathers? And he realized there's just no causation. There's no physical thing going on that's making the sons shorter or that's making the fathers shorter, because there's no way that the height of the sons could be causing the height of the fathers. So regression or association or correlation, whatever word you want to use, is a non-causal concept. And once he had this insight, he then said, well, we have to banish causality, forget about causation. Correlation is really what science is all about, and it's really what statistics is all about. He was encouraged in this belief by Karl Pearson, one of the other co-founders of statistics, who was, if anything, even more zealously anti-causation than Galton was. Between the two of them, they put together the mathematical foundations of statistics as a subject, which we still use today, and they did it very well. But in the process, they expunged this concept of causation. They said, you're not allowed to talk about causation. And they viewed causation as only meaning a deterministic form of causation, like in Galileo or in Newton, where there's an absolutely precise relationship, like the orbit of a planet, or dropping an apple and having it land. So they took this very, very limited version of what causation is, and they said, OK, but in
public health applications or in biological applications, that never happens. Fathers' heights never precisely determine sons' heights. And so this idea of causation is just not relevant to the subject we're talking about. It took many, many years to get over this restriction, this taboo on causation. I'd love to talk more about that, but I'd like to give you a chance to ask a question.
Harpreet: [00:24:38] Yeah, and it seems like ever since then, people have been touting the slogan "correlation does not imply causation."
Dana: [00:24:44] Yes. Yeah, right. So you learn that in Statistics 101; every Statistics 101 book says correlation is not causation. And they forget to tell you what causation is. They never tell you that. So that's what Judea is all about. He's saying: I agree, correlation is not causation, but causation is too important to throw out. It's part of how we perceive the universe and how we begin life. From age two or three, we understand the universe in terms of causes and effects, and we shouldn't be throwing this out. And so it's his genius to realize, OK, look, there's something here that's worth preserving. How can we talk about it? What are the mathematical laws that govern causation? That's what you'll read about in our book.
Harpreet: [00:25:31] Yeah, I guess I still don't get why these guys were so against causation. I don't understand why it became so taboo. Do you think you can help me understand that? Because it's still so unclear to me. Is causation just something that's not scientific, just because we see something happen? I don't get it.
Dana: [00:25:53] Yeah, well, you know, in fairness, causation has been a philosophical issue for millennia. Aristotle and the ancient Greeks were talking about causation, trying to figure it out, and it is a slippery concept. It's hard to define causation. As a mathematician, I am comfortable with that. It's something that maybe non-mathematicians don't realize, but in mathematics, if you go to a geometry book, it'll never define what a point is or what a line is. In Euclidean geometry, these are actually undefined terms, and they are defined, in a sense, through axioms, through describing how they work. Two lines always intersect in a point: that's an axiom. You say they work that way, and you have other axioms, and then from these you can prove new results about them without ever actually defining what a point is or what a line is. We can take a similar approach to causation. And I think this is again Judea's genius, to say: let's get away from these enormous philosophical debates that have gone on for hundreds of years as to what causation is. Let's just describe how it works. We know it's there. We know we can perceive the universe this way. Let's try to figure out rules for reasoning about causation, and rules for taking data and answering a causal question like, if I take aspirin, will it make my headache go away? So I'm not answering your question directly, but I think step one is to acknowledge that it's a difficult question. Causation is a difficult concept.
And I think step two is that the historical approach helps you understand that the statisticians were able to understand this idea of correlation because you can write a formula for it. You can take the data and it's just in the data. And so this Harpreet: [00:27:49] Was actually the fatal Dana: [00:27:51] Mistake of statisticians that's been going on for one hundred years, which is to think Harpreet: [00:27:56] That it's all in the data. Whatever we need to know, [00:28:00] it's in the data. And Judea says no, no. Dana: [00:28:03] Causation is not just in the data. Correlation is the lowest level of causation; it's what we call rung one. Not really causation at all. Harpreet: [00:28:14] And so if you want Dana: [00:28:17] To talk about causation, you've got to accept that there's something that's not in the data. What's not in the Harpreet: [00:28:23] Data is a Dana: [00:28:24] Story or a model for how that data came to be, and we call that a generative model. Harpreet: [00:28:30] So to talk about causation, you need to Dana: [00:28:33] Put that in, and we have lots of examples in our book. I'll be glad to talk about some. Let me just give you one example that I was thinking about the last couple of days. We have a chapter on paradoxes in our book, Harpreet: [00:28:46] Which I think is probably Dana: [00:28:47] A lot of Harpreet: [00:28:47] Fun for readers because Dana: [00:28:49] It's not as hard. It's sort of fun to think about these paradoxes. And one of the paradoxes we write about is the Monty Hall paradox, Harpreet: [00:28:58] And this is the paradox you probably Dana: [00:29:00] Know about from the game show Let's Make a Deal. And in this game, Monty Hall, who is the host, would Harpreet: [00:29:07] Let Dana: [00:29:07] The contestants open a door. There are three doors. Harpreet: [00:29:10] There's a prize hidden Dana: [00:29:11] Behind each one of them.
Two of them are worthless, Harpreet: [00:29:14] Like there's a goat. Dana: [00:29:16] And then the third one is something really cool, like a car. And so you're supposed to open the door and so you open the door, you see a goat. Harpreet: [00:29:24] Darn it. You know, Dana: [00:29:26] I don't know. I'm sorry. I'm getting. Harpreet: [00:29:29] He doesn't open. You pick a door. Dana: [00:29:31] He doesn't open it. Sorry, I messed that up. Yeah. So you pick a door, say door A. He doesn't open it, but he opens one of the other doors, like maybe door C, and behind door C there's a goat. And he says, OK, would you like to switch doors now? And so you've got A or B here. Should I go with A? And, well, this sort of seems 50-50, I mean, it could be either one. And I picked A to begin with, so I'll stick [00:30:00] with that. That's the way most people think. You know, so they really Harpreet: [00:30:05] Figure it's a 50-50 chance, Dana: [00:30:07] But people are stubborn and they like to stick with their original choice, so they'll stick with it. Harpreet: [00:30:12] So what's amazing is that, in fact, Dana: [00:30:15] You can prove Harpreet: [00:30:17] That you should switch. Dana: [00:30:20] So in fact, the chances of getting the right one in your first guess were one out of three and they're still one out of three. Opening that door has not actually changed anything, so it's a one-third chance that you Harpreet: [00:30:32] Had the right door. And it's a two- Dana: [00:30:33] Thirds chance that B is the right one. So this is this marvelous paradox called the Monty Hall paradox. It was written about Harpreet: [00:30:42] In a Dana: [00:30:43] Column called Ask Harpreet: [00:30:44] Marilyn. Marilyn vos Savant had this Dana: [00:30:47] Newspaper column. She bills herself as the smartest person in the world because she had an IQ test of one hundred and ninety or something like that. Who knows? Harpreet: [00:30:56] I don't.
Dana: [00:30:56] I don't have an opinion on that. Anyhow, she had this newspaper column where people would pose puzzles and then she would pose puzzles, and then she would give answers. And this was like the one puzzle Harpreet: [00:31:08] That aroused the Dana: [00:31:09] Greatest furor. Harpreet: [00:31:10] Because she posted this, she wrote Dana: [00:31:12] About it, and then she said, You should switch. You have a two-thirds chance of winning. Harpreet: [00:31:17] And, you know, absolutely, people went Dana: [00:31:20] Bananas over this. And statisticians went bananas. Harpreet: [00:31:23] And they wrote in and said, You idiot, Dana: [00:31:25] You don't know anything about statistics and blah blah blah blah. It turns out she was Harpreet: [00:31:30] Right and the statisticians were wrong. But what was really interesting about this paradox Dana: [00:31:36] Is Harpreet: [00:31:36] That I don't think anyone has explained it in the way Dana: [00:31:39] That Judea has, as a causal paradigm. So the answer really depends on what Harpreet: [00:31:45] Are the rules Dana: [00:31:46] Of the game. And there's a tacit rule here that the producers are following, which is that you make your pick and then they're going to Harpreet: [00:31:54] Open a door and they're not going to Dana: [00:31:55] Open the door with the car because then there would be no drama, there'd [00:32:00] be no suspense. So they always pick a door. They know where the car is. They will always open a door that doesn't have the car. And so you can actually draw a causal diagram. Your choice affects which door they open, because they're not going to open door A. But they're also not going to Harpreet: [00:32:21] Open the door where Dana: [00:32:23] The car is. And so there are two causal arrows heading into this variable of which door got opened.
So there's a causal diagram and you can actually Harpreet: [00:32:34] Use Judea's procedures Dana: [00:32:36] For computing the probability, and you come up with a one-third probability that it's behind your door and two thirds that it's behind door B. You could also imagine a different game where the producers don't actually care about whether it's a good show or not, and they just pick a door at random to open and sometimes it's the door with the car. Harpreet: [00:32:58] So in that world, which I call Dana: [00:33:00] Let's Fake a Deal in the book, Harpreet: [00:33:02] In that world, it doesn't matter. Dana: [00:33:05] So the intuitive idea that it's 50-50 between A and B is actually correct, and there's no benefit gained by switching. So the moral of the story is Harpreet: [00:33:15] That the causal Dana: [00:33:16] Diagram Harpreet: [00:33:16] Matters, the Dana: [00:33:18] Generative process behind the data matters. Harpreet: [00:33:21] And so the fact that the Dana: [00:33:24] Producers know where the car is and pick a door that doesn't have the car behind it, that's one model. Harpreet: [00:33:31] The Let's Fake a Deal Dana: [00:33:32] Model, where they don't know and they just pick a door at random, that's a different causal model. And what you do with the data Harpreet: [00:33:39] Depends, crucially, Dana: [00:33:40] On which model you have. And so if you have Let's Make a Deal, you should Harpreet: [00:33:45] Change. Dana: [00:33:46] You should switch to door B. If you're playing Let's Fake a Deal, it doesn't matter. So the data will never tell you. Same data, different answer. Harpreet: [00:33:56] And to me, that is as great an Dana: [00:33:59] Example [00:34:00] as any Harpreet: [00:34:01] That causal models matter and the generative model matters. Harpreet: [00:34:05] Yeah, absolutely love that chapter in the book, Harpreet: [00:34:07] The Book of Why. Guys, go get this Harpreet: [00:34:09] Book, definitely.
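The contrast Dana draws between the two games can be checked with a small simulation. What follows is only a sketch, not anything taken from the book: the function names `play` and `win_rate`, the door numbering, and the choice to discard rounds in which a random host happens to reveal the car (so that both games condition on "a goat was shown") are all assumptions made for illustration.

```python
import random


def play(host_knows: bool, switch: bool, rng: random.Random):
    """Play one round with doors 0, 1, 2.

    Returns True or False for a win or a loss, or None when a random
    host accidentally reveals the car and the round is discarded.
    """
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    others = [d for d in doors if d != pick]
    if host_knows:
        # "Let's Make a Deal": the host never opens the door with the car.
        opened = rng.choice([d for d in others if d != car])
    else:
        # "Let's Fake a Deal": the host opens one of the other doors at random.
        opened = rng.choice(others)
        if opened == car:
            return None
    if switch:
        # Move to the one door that is neither the pick nor the opened door.
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == car


def win_rate(host_knows: bool, switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of kept rounds that end with the contestant holding the car."""
    rng = random.Random(seed)
    results = [play(host_knows, switch, rng) for _ in range(trials)]
    kept = [r for r in results if r is not None]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    for host_knows in (True, False):
        for switch in (True, False):
            rate = win_rate(host_knows, switch)
            print(f"host_knows={host_knows} switch={switch} win rate={rate:.3f}")
```

Under these assumptions, switching should win about two thirds of the time when the host knows where the car is and about half the time when the host opens a door at random, even though the contestant sees the same thing in both games: a goat behind the opened door.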
This is one of the most highly recommended books from from people that have come on my podcast. But yeah, that that chapter about Paradox is one of my favorite ones. I have to read that one twice back to back because it was a kind of mind bending stuff in there. Harpreet: [00:34:24] That's neat. I'm glad. Really glad to hear that. Harpreet: [00:34:26] Yeah, but fun fact about Monty Hall. He's actually so. I live in the city of Winnipeg. Monty Hall is actually from Winnipeg. He grew up here and Harpreet: [00:34:34] Everything, and Harpreet: [00:34:36] Just not too far from my home. There's a street called Monty Hall Drive. I hope that's that's great. Dana: [00:34:44] Yeah, he did pretty Harpreet: [00:34:45] Well for himself. Yeah, yeah, yeah, definitely. Dana: [00:34:47] I actually read an interview of when we were Harpreet: [00:34:51] Working on this book, and Dana: [00:34:53] Nothing from the interview really got Harpreet: [00:34:54] Into the book, but it was sort of fun to read his take Dana: [00:34:57] On this, and he was just so amused by the whole Monty Hall paradox Harpreet: [00:35:01] Thing, you know? Yeah. So he's had, of course, they didn't Dana: [00:35:06] Put any deep thought into it, you know, they just wanted a good show, you know? But it's just funny, actually, that the wanting to have a good show actually led them into this fascinating philosophical statistical problem. Harpreet: [00:35:21] Though you mentioned it a couple of times, we kind of alluded to it this metaphor for the ladder of causation. Harpreet: [00:35:27] So talk to Harpreet: [00:35:28] Us about what this metaphor is all about and maybe if you can walk us up the rungs Harpreet: [00:35:32] Of the matter of causation? Yeah. So this is, you know, Dana: [00:35:36] As I like to say, it's really the central metaphor in our book. 
So the ladder of causation, basically, there's three Harpreet: [00:35:42] Rungs on the ladder Dana: [00:35:44] That correspond to progressively more sophisticated understanding of causation and actually mathematically distinct questions. Harpreet: [00:35:52] So the bottom Dana: [00:35:53] Level is association, and that's where you just that's the level of statistics and just the level of Data where [00:36:00] you just learn which things are associated Harpreet: [00:36:03] With each other, things. And so Dana: [00:36:05] Like you could, for example, an owl learning to hunt Harpreet: [00:36:09] Learns that at this time Dana: [00:36:11] Of day, the mice come out and he doesn't know why. He doesn't know what's causing them to come out, but he knows that this is a good time to go hunting. So this is a level that many, many animals are at. Harpreet: [00:36:24] And I would Dana: [00:36:24] Say also that machine learning Harpreet: [00:36:26] Is pretty much at at this stage. Dana: [00:36:29] So it's a descriptive level. That's why you see associations and Data. So the second rung Harpreet: [00:36:35] On the ladder of causation Dana: [00:36:37] Is intervention. And so that's where I was talking Harpreet: [00:36:39] A little bit earlier about Dana: [00:36:41] Tool usage. So intervention is when you you change the system, Harpreet: [00:36:47] So you change the the rules Dana: [00:36:49] Under which the data were generated. So good example Harpreet: [00:36:53] Of this is in randomized Dana: [00:36:55] Controlled trials where you. Harpreet: [00:36:59] Will have a control Dana: [00:37:01] Group and you'll have a treatment group Harpreet: [00:37:04] And you decide you intervene Dana: [00:37:06] To to give Harpreet: [00:37:07] The treatment group this drug while the control group Dana: [00:37:11] Will get a placebo. 
And then you see what's what happens differently in the two groups and if you see a difference, then you say, you say, possibly with extreme reluctance if you're a statistician, that the drug actually caused Harpreet: [00:37:24] The control, the treatment Dana: [00:37:27] Group to improve. So being able to talk about interventions changing something about the causal relationships Harpreet: [00:37:35] Is level two of the light of causation. Dana: [00:37:38] As I said before is already pretty unusual. There's not very many species that Harpreet: [00:37:43] We feel Dana: [00:37:44] Can use tools, Harpreet: [00:37:45] And so it's Dana: [00:37:46] Fairly special ability of humans. Then level three is the counterfactual level or the imagining level. And this is where you aren't just envisioning an intervention, you're envisioning a different world. [00:38:00] So you're envisioning something that didn't happen and or that isn't the actual case and you're asking Harpreet: [00:38:07] What would be different Dana: [00:38:09] About that world. And so there you have questions like if I had taken aspirin, would my headache or if I had taken it with the headache have gone away? Or if I hadn't take it, would it not have? So that's a counterfactual question. Harpreet: [00:38:24] And that's a lot Dana: [00:38:25] Of how we learn about the world. And it's it's a great quote. I want to. I read just last year about Counterfactuals, which a guy named Jonathan Harpreet: [00:38:36] Vicens in Nature Dana: [00:38:37] Communications said diagnosis is fundamentally a counterfactual inference task. Every time you make a diagnosis, you're saying, OK, you're saying, if this person Harpreet: [00:38:50] Ok, I think it's tough to switch from talking off the top Dana: [00:38:56] Of my head to reading Harpreet: [00:38:57] Off a card. Dana: [00:38:58] So you have you have a symptom. You want to know what disease caused it. 
So you want to say if this symptom were not present, then I don't I Harpreet: [00:39:10] Wouldn't have the disease. Dana: [00:39:11] But because this is present, then I Harpreet: [00:39:14] Conclude that I have this disease. Dana: [00:39:16] So you're asking, given that I don't have the disease or the symptom? What's the probability that introducing this disease would cause the symptom? So that's what we call probability sufficiency, and it's tough. You need Harpreet: [00:39:30] To have a you need to have Dana: [00:39:32] A formula to follow it. So that is a counterfactual inference. So you're saying, given that I know that that without disease, I wouldn't have the symptom. Harpreet: [00:39:41] What is the probability that introducing Dana: [00:39:43] The disease causes symptoms? So that's probability sufficiency, and that requires you to have data Harpreet: [00:39:50] About a counterfactual Dana: [00:39:51] World where I didn't have the disease, but now I introduce the disease. So, so as this person was saying, diagnosis [00:40:00] is actually a counterfactual inference, and there are other examples we could give. So you I know you're Harpreet: [00:40:06] Going to ask me Dana: [00:40:07] On your list of questions you're going to bring up about Sherlock Holmes. Harpreet: [00:40:11] And again, so what a Dana: [00:40:14] Detective does is he works Harpreet: [00:40:16] From effects to causes. Dana: [00:40:18] And so this is again, this is called this is induction. So it's funny. I was reading about Sherlock Holmes for this chapter, and a lot of people think that Sherlock Holmes was this great detective because he was good at deducing. Harpreet: [00:40:32] That's what I thought as well. Dana: [00:40:33] Yeah, but it's the opposite. What a detective does is Harpreet: [00:40:36] Work from the the effects Dana: [00:40:38] And go backwards to the causes, and that's induction. And so what you and what he would do is is induce the possible causes. 
And then once you have Harpreet: [00:40:47] The list of possible causes you, Dana: [00:40:49] He has the saying that. Yeah, yeah, that's yeah. Harpreet: [00:40:53] That's the one about once you have Dana: [00:40:55] Eliminated the impossible, whatever remains, no matter how improbable is, that is the truth. Harpreet: [00:41:02] So, so here he's Dana: [00:41:04] Inducing what are the possible causes? And then he's saying, OK, well, you deduce that this one could not have this guy could not have been the murderer. And so the other guy, even though it seemed incredibly improbable at the beginning, the other guy must be the murderer. And so, so that's it's really a combination of induction and deduction. Harpreet: [00:41:24] But the really interesting part is Dana: [00:41:28] The induction because that's that's the creative Harpreet: [00:41:31] Part where you're Dana: [00:41:32] Picking out what might have caused this. And then the deduction is something you learn in math, you know? Yeah. So that was a long digression. But what was I trying to say? Yeah. So again, I was saying that right? Harpreet: [00:41:43] So what Sherlock Dana: [00:41:44] Holmes is doing is Harpreet: [00:41:46] Is very fundamentally Dana: [00:41:47] Causal. And in fact, refers to counterfactuals because you're you're looking at hypothetical universe. Harpreet: [00:41:54] And so the ladder of causation, if I remember, right, like. First rung, that's all [00:42:00] about seeing what we can see. Dana: [00:42:03] Yes, seeing and doing right. Yeah. So yeah, so the first two Harpreet: [00:42:08] Rungs, I think Dana: [00:42:10] Really interesting. I spent most of the time in the book Harpreet: [00:42:12] On them, but so so the the Dana: [00:42:15] First rung with association is the realm of seeing if I see this, what do I expect if I see? 
Trying to think of Harpreet: [00:42:24] Some like stock market Dana: [00:42:25] Examples, which, you know, like if I if I mean, stock market people are full of this sort of thing. If we see the support is at a certain level, then Harpreet: [00:42:36] We expect that the price is going Dana: [00:42:37] To go up and stuff Harpreet: [00:42:38] And they'll still wave their Dana: [00:42:41] Hands and all this Data. But that's that's all, just association. And it's not predictive. It's not causation. So I don't buy any of that. So that's that's the seeing level doing is the level where you or you change something. And so one example Harpreet: [00:42:59] Doing in our book is we give an Dana: [00:43:01] Example Harpreet: [00:43:01] You own a grocery store and Dana: [00:43:04] You sell toothpaste and you sell toothbrushes and you ask yourself what will happen to the sales of my toothbrushes if I raise the price of toothpaste? Harpreet: [00:43:15] And so they're intervening, you're Dana: [00:43:17] Changing the price of the toothpaste, so you might go back into your your Data, you have Data over many years about the price of toothpaste and the price of of toothbrushes and how many you sell under each circumstance. And so you might say, OK, well, I don't expect to hurt too much, you know, when toothpaste got more expensive. The sales of toothbrushes didn't change much. Harpreet: [00:43:43] So but the difference is that that Data was Dana: [00:43:47] Conducted under conditions of seeing. So it's observational data you just saw what were the prices in the sales? Harpreet: [00:43:54] But when you arbitrarily when you do something and you change Dana: [00:43:58] The price and your competitors [00:44:00] don't change the price, then that's a different situation. So your previous Harpreet: [00:44:05] Data was maybe Dana: [00:44:06] Collected when the price, everybody's price went up. And so, OK. 
So of course, it didn't hurt your sales, but now you're unilaterally changing the price and so your sales are going to drop. Harpreet: [00:44:18] So that's a case of where Dana: [00:44:21] Changing the rules changes, Harpreet: [00:44:23] The changes, the outcome. Dana: [00:44:25] And that Monty Hall example I gave you shows you why it's so important to know what the rules are and the way you, the way you express what the rules are. But it's by drawing a causal Harpreet: [00:44:35] Diagram and you'll see Dana: [00:44:37] Zillions of causal diagrams in our book. Harpreet: [00:44:39] If you change the causal diagram, one thing Dana: [00:44:41] I like about this example is you could think about another example where instead of running a mom and pop grocery store, Harpreet: [00:44:49] You have like a Dana: [00:44:50] Big chain store and you're really the market Harpreet: [00:44:53] Maker in Dana: [00:44:54] Your city. And so if you're a store like that, you probably could raise the price of toothpaste and not have any bad effect. And in fact, all the other stores are going to copy you. So there's a different causal model where if you're the chain store, you set the prices Harpreet: [00:45:11] Or you you control the Dana: [00:45:12] Market. Whereas if you're the mom and pop store, the market controls you. And so there's an arrow that's pointing in the opposite direction, and you can see that in the causal model. And the causal model tells you how to analyze your data and it tells so you can say, OK with this model, I better be careful. I better not raise the price with that model. Ok, go ahead. Harpreet: [00:45:31] Raise the price. Dana: [00:45:33] So not in the Data, it's in the bottle. Harpreet: [00:45:35] Yeah, I like that a lot. 
Harpreet: [00:45:38] The just the simplicity of these causal diagrams are really, really interesting gets, but some of the the different types of like you talk about the forks and you talk about colliders and Harpreet: [00:45:47] You talk about these Harpreet: [00:45:49] Types of relationships. Definitely. Check this out. Harpreet: [00:45:53] Yeah, it's it's really Dana: [00:45:54] Cool and it's actually almost Harpreet: [00:45:56] It's almost Dana: [00:45:57] Fun. I think it is fun, you know, [00:46:00] because you have these very basic steps Harpreet: [00:46:02] With just three Dana: [00:46:03] Variables and the simple relations, like you said, colliders, forks, mediators is the third one, and these three ingredients occur in every causal diagram and more. Complicated diagrams are built up from these very simple ones. And if you understand them, you're you're on the way to becoming a bona fide causal reasoner, which is cool. That's what we're after. Harpreet: [00:46:28] Yeah. Harpreet: [00:46:29] So talk to us about the gap between rung one and rung two. How is it that we can bridge that gap? How can we get Harpreet: [00:46:36] From one to, yeah, Dana: [00:46:38] Right? So in the context of traditional statistics, the way it's done is with with the randomized controlled trial. So and that's I alluded to that earlier, you talked about having a control group and a treatment group. And it's very important for it to be Harpreet: [00:46:54] Randomized so that the so that there's the people Dana: [00:46:57] Who get the treatment are picked out of by a random number generator or whatever. Actually, the guy who came up with randomized Harpreet: [00:47:06] Controlled trials, Dana: [00:47:07] Ray Fisher, used to do it with cards. Harpreet: [00:47:10] He actually used playing cards Dana: [00:47:11] To Harpreet: [00:47:12] Randomize who got the treatment Dana: [00:47:13] And who didn't. 
Harpreet: [00:47:15] And the purpose Dana: [00:47:16] Of this randomization is actually to break the causal diagram. Harpreet: [00:47:20] So, so you might suspect that say a person's Dana: [00:47:24] Socioeconomic status will make them more likely to take Harpreet: [00:47:28] The drug or less likely Dana: [00:47:29] Or whatever. And you want to break that chain of Harpreet: [00:47:32] Causation so you Dana: [00:47:33] Assign the drug randomly for the whole point of randomization is to intervene and change the diagram. So under these circumstances? Harpreet: [00:47:42] And then you also do double Dana: [00:47:44] Blinding Harpreet: [00:47:44] Stuff so that no one knows Dana: [00:47:46] Whether they're getting the drug and the the researchers don't know who's getting the drug and everything. There's this whole methodology that's grown Harpreet: [00:47:52] Up around randomized controlled Dana: [00:47:54] Trials. And if you conduct such a trial, then as I said, [00:48:00] the Harpreet: [00:48:00] Medical and statistical Dana: [00:48:02] Establishment will cautiously let you talk about causation, although usually people will hedge their statements even then. So one thing you're never allowed to talk about causation Harpreet: [00:48:14] When you're doing an Dana: [00:48:15] Observational study where you're just looking at C Data instead of do so, that's sort of the the classic statistical point of view. If you do a randomized controlled trial, you can go up to level two. But you and I are saying, Ah, there are other cases where you can you can get to run to. You have to be able to draw a causal model. But if you have the causal model, there's many times you can Harpreet: [00:48:39] Use observational Dana: [00:48:41] Data Harpreet: [00:48:42] To figure out the answer to Dana: [00:48:43] Your due question. 
Harpreet: [00:48:45] And one Dana: [00:48:46] Beautiful example Harpreet: [00:48:47] That Judea discovered in the mid-nineties, kind of the beginning Dana: [00:48:51] Of all of this, was the example involving smoking. So we believe now that smoking causes cancer. Harpreet: [00:48:58] And I have a whole chapter Dana: [00:49:00] About this in the book, which I really am passionate about. So it was very, very hard for medical scientists to agree on the simple statement that smoking causes cancer Harpreet: [00:49:15] Because standard statistical Dana: [00:49:18] Methodology doesn't let you do that. You need to do a randomized controlled trial. But how can you possibly do a randomized controlled trial Harpreet: [00:49:25] On smoking? Dana: [00:49:26] You need to tell someone, for the sake of science, please smoke for 30 years Harpreet: [00:49:31] And probably Dana: [00:49:32] Get cancer from it. This would be extremely unethical, and no one ever even dreamed of doing anything like this. But unfortunately, you know, mainstream Harpreet: [00:49:46] Statistics Dana: [00:49:47] Wouldn't let you talk about causation without such a trial. And the thing they were worried about, and the thing that R. A. Fisher was worried about (R. A. Fisher, by the way, was a pipe smoker, so he had a little bit of a vested [00:50:00] interest in this), Harpreet: [00:50:01] His concern Dana: [00:50:02] Was that there was some Harpreet: [00:50:03] Kind of genetic factor that predisposes Dana: [00:50:06] You to smoking and also predisposes you to cancer. Harpreet: [00:50:10] And this is what statisticians Dana: [00:50:12] Would call a confounder. And, you know, statisticians have had an incredibly hard time figuring out what a confounder is, really. Harpreet: [00:50:19] But anyway, R. A. Fisher's Dana: [00:50:22] Argument was, if you have this common cause, although he would not want to Harpreet: [00:50:27] Say that word, then you couldn't.
Dana: [00:50:31] Just because you have an association between smoking and cancer, you can't say smoking causes cancer, because maybe it's this gene, this smoking gene, that's causing both of them. Harpreet: [00:50:41] So that's called confounding. Dana: [00:50:44] And what do you do about it? Well, one thing you can do Harpreet: [00:50:47] Is if you have a Dana: [00:50:49] Confounder, you can collect data on it and you can control Harpreet: [00:50:53] For that confounder. Dana: [00:50:55] So in the example of socioeconomic Harpreet: [00:50:57] Status, for example, Dana: [00:50:58] If you think that that's a confounder, Harpreet: [00:51:00] You could Dana: [00:51:01] Collect data Harpreet: [00:51:02] On the socioeconomic status Dana: [00:51:05] And then you can reweight the data Harpreet: [00:51:07] So that if you observe that Dana: [00:51:09] The rich people are more likely to take this drug, you can say, OK, well, I'll reweight the number of rich people in my study. I'll weight the people who didn't take the drug a little bit higher because there are not as many of these people, and the people who did take the drug, I weight them lower. So this weighting Harpreet: [00:51:26] Procedure is very standard. It's Dana: [00:51:27] Called controlling for a variable, Harpreet: [00:51:30] A totally normal Dana: [00:51:31] Statistical procedure. But when you do this, the statisticians still won't let you talk about causation. And there's another great quote for you. This is from the Journal of the American Medical Association blog in 2017, Harpreet: [00:51:47] A year before Dana: [00:51:49] Our book came out. If it's a report of an observational study, then all cause and effect language must be replaced. So this is like an official position of the American Medical Association [00:52:00] that you're not allowed to talk about causation Harpreet: [00:52:03] In an article about an Dana: [00:52:04] Observational study.
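The reweighting Dana describes can be illustrated with a toy calculation. This is a sketch under invented assumptions, not an analysis from the book: socioeconomic status `S` affects both who takes the drug `D` and who recovers `R`, all probabilities below are made up, and the names `naive_recovery` and `adjusted_recovery` are hypothetical. Controlling for `S` here means computing the recovery rate within each status group and then averaging those rates with the population weights of `S`.

```python
# Invented toy model: S (status) -> D (drug), S -> R (recovery), D -> R.
P_S = {0: 0.5, 1: 0.5}           # P(S = s)
P_D1_GIVEN_S = {0: 0.2, 1: 0.8}  # P(D = 1 | S = s): rich people take the drug more often


def p_recover(d: int, s: int) -> float:
    """P(R = 1 | D = d, S = s); the drug adds 0.1 to recovery by construction."""
    return 0.3 + 0.1 * d + 0.4 * s


def p_drug(d: int, s: int) -> float:
    """P(D = d | S = s)."""
    p1 = P_D1_GIVEN_S[s]
    return p1 if d == 1 else 1.0 - p1


def naive_recovery(d: int) -> float:
    """P(R = 1 | D = d): what the raw observational comparison would show."""
    num = sum(P_S[s] * p_drug(d, s) * p_recover(d, s) for s in P_S)
    den = sum(P_S[s] * p_drug(d, s) for s in P_S)
    return num / den


def adjusted_recovery(d: int) -> float:
    """Sum over s of P(s) * P(R = 1 | D = d, S = s): the rate after controlling for S."""
    return sum(P_S[s] * p_recover(d, s) for s in P_S)


if __name__ == "__main__":
    print("naive difference:   ", naive_recovery(1) - naive_recovery(0))
    print("adjusted difference:", adjusted_recovery(1) - adjusted_recovery(0))
```

With these made-up numbers the raw comparison gives a difference of about 0.34, while the adjusted comparison recovers the 0.1 that was built into the model. The adjustment is only as good as the assumption that `S` is the sole confounder, which is exactly the point about needing a causal model.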
And this is even if Harpreet: [00:52:06] You've done the controlling Dana: [00:52:08] For confounders and this is what you and I are saying is wrong. If you have Harpreet: [00:52:14] Identified, if you try to Dana: [00:52:15] Causal diagram, Harpreet: [00:52:16] You've identified everything, you Dana: [00:52:18] Could possibly imagine us as a confounder, then you have the right and Harpreet: [00:52:24] I say you Dana: [00:52:24] Have the responsibility to say this is causal. We've controlled for every confounder we can imagine and there's nothing left. This must be causation. And the example I like to cite and do in the book was a study of walking that was done in nineteen ninety eight. So these researchers had a data set of Japanese men in Hawaii. Harpreet: [00:52:47] Some of them walked Dana: [00:52:49] While they had various exercise profiles, but some of them walked more than two miles a day. Others walk less than one mile a day, and they collected data on these two groups. Then they found that the Harpreet: [00:53:00] Death rate among Dana: [00:53:01] The people who walked more than two miles a day was about half the death rate of people who walk less than a mile a day. And they controlled for everything under the sun. They controlled for drinking. They controlled for socioeconomic status. They controlled for overall health, Harpreet: [00:53:17] Which has a very reasonable thing to do. Because you Dana: [00:53:19] Could say, maybe these Harpreet: [00:53:20] People who walk a lot do it because Dana: [00:53:22] They walk, they're healthy and so they can walk a lot. So definitely you need to control for four for the overall Harpreet: [00:53:30] Health and Dana: [00:53:31] All the variables Harpreet: [00:53:32] You can come up with. Dana: [00:53:34] So they controlled for everything under the Sun, and they still had a strong effect that the people who walked a lot still had almost half the death rate of the people who didn't as much. Harpreet: [00:53:47] So. 
But they would not they would not say that what Dana: [00:53:51] They said, they specifically said we cannot say what would be the effect of a program of deliberate exercise. So [00:54:00] to me, this is just it's it's irresponsible. So here you have compelling data Harpreet: [00:54:08] That regular exercise Dana: [00:54:09] Will improve your health and it will reduce your death rate. Harpreet: [00:54:13] And if if I go Dana: [00:54:15] To my doctor, I'm going to ask them, Doctor, if I exercise, Harpreet: [00:54:18] Will it cause me to have Dana: [00:54:20] Better health? Will it reduce my, my risk of heart attack? And I want him to feel Harpreet: [00:54:25] That he can answer that and say yes. Dana: [00:54:28] And instead, Harpreet: [00:54:29] What the the authors of the Dana: [00:54:30] Study are saying. You can't. Harpreet: [00:54:32] You can't say that it's just association. To me, this is extremely, extremely Dana: [00:54:37] Important message Harpreet: [00:54:38] That if you controlled Dana: [00:54:39] For Harpreet: [00:54:39] Everything you could think of, Dana: [00:54:40] Then you should say that subject to this causal model, we believe walking Harpreet: [00:54:45] Will reduce your risk of heart attack Dana: [00:54:48] Because that's the information that patients want. And that gets back to the fact this is the way we organize our understanding of the world. We don't organize. We don't understand what association is, but we understand if I walk, I will not have a heart attack or I will have a lower risk of heart attack. That's a message that we can understand. Harpreet: [00:55:09] So I forget what your question was. I said, Dana: [00:55:13] I tend to go on these things, Harpreet: [00:55:15] But that's that's the Dana: [00:55:16] The most have to Harpreet: [00:55:18] Remember. You're asking about about Dana: [00:55:20] Going from level to level two. And so one way to do it is by controlling for confounders. 
So you need to write down a causal model and identify what all the conceivable confounders are. If you believe you've done a conscientious job, then you have the right, and the responsibility, to say that A causes B: here's the causal effect of walking on the risk of heart attack. Now someone could come along tomorrow and say, "You missed a confounder." OK, if that's true, then we have to admit it. We say, "OK, we didn't see that. The conclusion was subject to this causal model. Now we'll gather [00:56:00] data on this confounder, and we'll tell you whether, if we control for that confounder, we still have this effect." But at least it's honest, you know. That's another great thing about causal diagrams: they force you to be honest. You put down in the causal diagram what you think are the causes and effects. If you miss something, OK, you go back and you redo the analysis, but it's better than not saying anything. So controlling for confounders is one thing. Now, one of the cool things that I wanted to get to with the smoking example is that R. A. Fisher is saying there's this confounder, the smoking gene. And guess what? You can't measure it, because it's the nineteen fifties and we can't sequence the genome. We've never even actually seen a gene. So how can we possibly control for this confounder? Well, you can't. So does that mean you're stuck, that you can't say smoking causes cancer? Well, the answer is no.
Dana: [00:57:08] In fact, one thing you can look for is a variable like tar deposits. We believe that smoking produces tar deposits in the lungs and that those cause cancer. So suppose you believe that causal model, and furthermore you believe that the gene does not cause the tar deposits, which makes sense, because how does a gene cause tar to accumulate in your lungs? OK, you can argue about this, and in fact statisticians did argue about it and pointed this out. But if you can find this intermediate, this mediator, tar deposits, and if that mediator is not affected by the confounder, then you can actually control for the confounder. And Judea derived a mathematical formula which no statistician had ever seen before, and he calls it the front-door formula. [00:58:00] So if you have this intermediate variable, you can control for the hidden confounder, and then you can actually tell if smoking causes cancer. Now, when he came out with this, as I said, there were statisticians who said, "Well, look, tar deposits could be caused by the gene, because it could be the interaction of the genes and the smoking." OK, fine. But the insight here, that no statistician had ever had, was that if you have a variable X that you think causes a variable Y, and you're worried about a confounder, then if you can find an intermediate variable Z that is not affected by the confounder, you can actually get the causation. So this is an example of what I'm talking about, the sort of mathematical rules of causation.
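To make the front-door idea concrete, here is a minimal sketch in Python. It is not taken from the book or the interview: every probability below is invented for illustration, and the model simply assumes what Dana describes, namely that the hidden gene affects smoking and cancer but not tar, and that smoking affects cancer only through tar. Under those assumptions the front-door adjustment is P(y | do(x)) = Σ_z P(z | x) Σ_x' P(y | x', z) P(x'), which uses only quantities that can be measured without ever observing the gene.

```python
from itertools import product

# Hypothetical numbers, invented for illustration only.
P_GENE = {0: 0.5, 1: 0.5}               # U: the hidden "smoking gene"
P_SMOKE_GIVEN_GENE = {0: 0.2, 1: 0.8}   # P(X=1 | U)
P_TAR_GIVEN_SMOKE = {0: 0.1, 1: 0.9}    # P(Z=1 | X); no arrow from gene to tar


def p_cancer(tar, gene):
    """P(Y=1 | Z, U): cancer depends on tar and on the hidden gene."""
    return 0.1 + 0.5 * tar + 0.3 * gene


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def observed_joint():
    """P(x, z, y) with the gene summed out: all an analyst could measure."""
    joint = {}
    for x, z, y in product((0, 1), repeat=3):
        joint[(x, z, y)] = sum(
            P_GENE[u]
            * bern(P_SMOKE_GIVEN_GENE[u], x)
            * bern(P_TAR_GIVEN_SMOKE[x], z)
            * bern(p_cancer(z, u), y)
            for u in (0, 1)
        )
    return joint


def p_xz(joint, x, z):
    return joint[(x, z, 0)] + joint[(x, z, 1)]


def p_x(joint, x):
    return p_xz(joint, x, 0) + p_xz(joint, x, 1)


def naive(joint, x):
    """Plain conditional P(Y=1 | X=x): association, biased by the gene."""
    return (joint[(x, 0, 1)] + joint[(x, 1, 1)]) / p_x(joint, x)


def front_door(joint, x):
    """Front-door estimate of P(Y=1 | do(X=x)) from observed data only."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = p_xz(joint, x, z) / p_x(joint, x)
        inner = sum(
            p_x(joint, x2) * joint[(x2, z, 1)] / p_xz(joint, x2, z)
            for x2 in (0, 1)
        )
        total += p_z_given_x * inner
    return total


def true_do(x):
    """Ground truth, available only because this toy model exposes the gene."""
    return sum(
        bern(P_TAR_GIVEN_SMOKE[x], z)
        * sum(P_GENE[u] * p_cancer(z, u) for u in (0, 1))
        for z in (0, 1)
    )


joint = observed_joint()
for x in (0, 1):
    print(f"smoke={x}  naive={naive(joint, x):.3f}  "
          f"front_door={front_door(joint, x):.3f}  truth={true_do(x):.3f}")
```

In this toy model the naive comparison overstates the effect of smoking, while the front-door estimate matches the interventional truth even though the gene is never used in the calculation. If the gene did influence tar directly, as the critics Dana mentions suggested, the formula would no longer be justified.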
Dana: [00:58:50] It's like we talked about with the rules of geometry. You don't define points or lines, but you set down the axioms, and then you can draw conclusions about them. And similarly, what Judea did with the causal models was the same. He said, "OK, if I can draw this causal model and I have this intermediate variable, I can get a conclusion from it." We don't need to spend thousands of years arguing about what causation is. We just observe: here's the way it works, here's what you can get from it. And it's beautiful. It's absolutely mind-boggling to me. And you want to know why I wanted to work on this book? That's why. It's a person who had this earthshaking insight and, 30 years later, was still having trouble getting it accepted. So that's why I wanted to write the book. And the good news is that it is being accepted. So even since our book came out, I'll read this quote from the American Medical Association. Two years after our book came out, there is a study, or an article, in the Annals of the American Thoracic Society with forty-seven authors. And they said the scientific, mathematical [01:00:00] and theoretical underpinnings of causal inference have evolved sufficiently to permit the everyday use of causal models. So to me, this is like the Earth opening. People are actually accepting causal models now, and it is so exciting to me to see that happening. Harpreet: [01:00:22] Yeah, I'm excited to see this start being taught in grad programs and things like that. I think it's definitely some really interesting stuff.
Dana: [01:00:30] Yeah, that's right. And that's what I wanted to accomplish. You know, he wants the education of grad students to change, because, and he always talks about this, you'll never change the minds of the older people, the traditional statisticians. But if you can teach the younger ones, the ones who are in graduate school, and expose them to these ideas, then within a generation the whole landscape is going to change. Harpreet: [01:00:55] Oh yeah, absolutely. And there's a couple of things that you guys talk about in the book, one of which is called the do operator. And I found that to be interesting. I mean, I was like, oh, it just looks like the conditional probability formula with a "do" mixed into it. So what is the do operator all about? What makes it so revolutionary and special? Dana: [01:01:16] So the do operator is what we're trying to get at when we talk about going from level one to level two. We're trying to see the difference between seeing and doing. The do operator will have a mathematical formula with it that looks just like conditional probability, as you said. You've got the same vertical line and stuff like that. But the do operator involves modifying the diagram. So you have your causal diagram, and you want to see the effect of raising the price of toothpaste. So we're doing "price equals two dollars" instead of one dollar.
But I'm not getting it just from the data. I'm looking at the whole causal [01:02:00] diagram, and when I do "price equals two dollars", I am erasing whatever arrows lead into the price. So if the outside market is normally affecting my price, say I'm the mom-and-pop store, so I follow what Wal-Mart does, then if I unilaterally change the price, I'm breaking that arrow. I'm changing, or mutilating, the diagram, as Judea says. So that's the do operator: erasing all the arrows going into that variable, setting the value of that variable at a certain point, and then working through the rest of the causal diagram to see what the effect is on the outcome variable, which is sales. So you mutilate the diagram, and then you use your data in a prescribed way, using the formulas Judea has developed, and you then figure out what the outcome is in that mutilated model. Harpreet: [01:02:57] Yeah. Definitely check this book out for more information on that. There's the do operator, and there's the whole do-calculus behind it. Dana: [01:03:05] Right, which gives you the math formula. Harpreet: [01:03:06] Yeah, absolutely. Dana: [01:03:09] You know, part of the value of this is just seeing the do operator, having a language for it, because that's what statisticians didn't have. Lacking the language for "do" makes it so hard to talk about anything causal, even randomized controlled trials. Ask a statistician why a randomized controlled trial lets you talk about causes, and they won't be able to explain it.
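The difference between seeing and doing in the toothpaste example can be sketched with a small structural model in Python. Everything numeric here is an assumption made up for illustration (the market price levels, how often the store follows the market, and the sales equation); the only thing taken from the conversation is the structure: the market normally drives the store's price, and `doing` erases that arrow before computing the outcome.

```python
# Hypothetical structural model: market price -> my price -> sales,
# plus market price -> sales (a pricier market sends shoppers my way).
MARKET = {1: 0.5, 2: 0.5}   # P(market price), in dollars
FOLLOW = 0.9                # chance the mom-and-pop store copies the market


def price_dist(market_price):
    """P(my price | market price): the arrow that do() will erase."""
    other = 2 if market_price == 1 else 1
    return {market_price: FOLLOW, other: 1.0 - FOLLOW}


def sales(price, market_price):
    """Expected sales given my price and the market price (assumed form)."""
    return 100 - 20 * price + 30 * market_price


def seeing(price):
    """E[sales | price observed]: condition on the data as it arrived."""
    num = den = 0.0
    for m, p_m in MARKET.items():
        weight = p_m * price_dist(m)[price]
        num += weight * sales(price, m)
        den += weight
    return num / den


def doing(price):
    """E[sales | do(price)]: the market no longer influences my price,
    so the market price keeps its own distribution."""
    return sum(p_m * sales(price, m) for m, p_m in MARKET.items())


for price in (1, 2):
    print(f"price=${price}  seeing={seeing(price):.1f}  doing={doing(price):.1f}")
```

With these made-up numbers, observed sales are higher when the price is two dollars, because a two-dollar price usually signals a two-dollar market. Setting the price to two dollars unilaterally lowers expected sales. The two answers differ only because the arrow into price was removed.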
It's just sort of tradition. So just having that word "do" makes a big difference. And then, of course, as you said, there is in fact the mathematical apparatus behind it. Harpreet: [01:03:44] Yeah. And I spent five years of my career as a biostatistician, primarily working on randomized controlled trials and things of that nature, designing experiments. And I found it interesting that with the do operator and the do-calculus, you [01:04:00] could eliminate the need to do a randomized controlled trial. You only resort to it if you actually have to. Dana: [01:04:08] That's correct. Yes. And I mean, that's earthshaking for a statistician to hear. It's against the entire orthodoxy of the subject. So that's what this book is about. Yes, you can. From observational data, if you have a causal model that you believe, and if it is a model that lets you identify the effect you're trying to identify, then you can talk about what the effect is. So yeah, it's very big. Harpreet: [01:04:45] So we talked a bit about confounders, and I've been confounded about confounders since my days as a biostatistician, partly because not a single one of my textbooks gives a consistent definition, right? Dana: [01:05:00] Yeah, that's right. Again, the statisticians know that confounding is the problem, and yet they can't say what it is.
Dana: [01:05:05] They can't agree on a definition of confounder, whereas, if nothing else, Judea's causal diagrams make it visible. All you need to be able to do is read the diagram, and then you can tell that this is a confounder and that isn't. And that's a nice message too, because since statisticians don't know what a confounder is, they control for everything under the sun. They control for things you don't need to control for, and sometimes they control for things that take away the causal effect. So that's the cost of not having a clear definition of what a confounder is. Harpreet: [01:05:41] And just in case anybody is wondering, when you say "control for", talk to us a bit about what that means, just for the layman. Dana: [01:05:47] That's this mathematical procedure I was talking about, where you reweight the data in order to take account of the fact that you have observational data. You haven't been [01:06:00] randomizing, so you can't control who chooses the treatment and who doesn't. And so maybe people of higher socioeconomic status chose the treatment preferentially. So you have to reweight the data so that you give more weight to the people who are unusual, who aren't represented enough in the study, and less weight to the people who are overrepresented. And that's all classical statistics. We haven't changed any of that. We're just changing the interpretation, the why. Why are you doing that? The statistician won't be able to tell you why, because he doesn't have a word for it. He doesn't have the word "causation". The reason for controlling for a confounder is that you want to get the causation, the causal effect.
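The reweighting Dana describes can be sketched as inverse probability weighting, a standard technique named here as one possible reading of his description rather than as the procedure used in any particular study. The counts below are fabricated for illustration, and the sketch assumes socioeconomic status is the only confounder. Each person is weighted by one over the probability that someone of their status ended up in their treatment group, so the rare cases count for more.

```python
# Made-up observational counts: (status, treated) -> (people, recovered).
# High-status people chose the treatment far more often.
COUNTS = {
    ("high", 1): (80, 64),
    ("high", 0): (20, 14),
    ("low", 1): (20, 10),
    ("low", 0): (80, 32),
}
STATUSES = ("high", "low")


def naive_rate(treated):
    """Recovery rate in a treatment group, ignoring who chose it."""
    people = sum(COUNTS[(s, treated)][0] for s in STATUSES)
    recovered = sum(COUNTS[(s, treated)][1] for s in STATUSES)
    return recovered / people


def weighted_rate(treated):
    """Recovery rate after inverse probability weighting on status."""
    num = den = 0.0
    for s in STATUSES:
        n_group, recovered = COUNTS[(s, treated)]
        n_status = COUNTS[(s, 0)][0] + COUNTS[(s, 1)][0]
        propensity = n_group / n_status   # P(this treatment | status)
        weight = 1.0 / propensity         # unusual cases get more weight
        num += weight * recovered
        den += weight * n_group
    return num / den


naive_effect = naive_rate(1) - naive_rate(0)
adjusted_effect = weighted_rate(1) - weighted_rate(0)
print(f"naive effect    = {naive_effect:.2f}")
print(f"adjusted effect = {adjusted_effect:.2f}")
```

In these invented data the naive comparison suggests a large benefit, most of which comes from high-status people being both more likely to take the treatment and more likely to recover anyway. After reweighting, the remaining difference is the part that can be read causally, subject to the assumed causal model.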
Dana: [01:06:47] That's the only reason for the controlling. Harpreet: [01:06:49] Yeah. And I mean, it wasn't earth-shattering to me reading that in this book, because I'd been doing clinical trials. I'm not a clinical trialist anymore, I was for a while, but it was really interesting to see that framing. Dana: [01:07:01] So, you know, I should probably say that there are caveats. I mean, I am a huge advocate of this. But one thing you can naturally ask is: who determines what the causal model is? What's the right causal model? And what happens if people disagree? And that is a problem. But that's exactly the sort of thing that you should be talking about. If one famous scientist says this is causing that, and another famous scientist says no, it's some other thing, that's the sort of thing you should be talking about. The causal model directs your questions to what's important. And then Judea and I did an interview on another podcast, the Science podcast for Science magazine, the only one we've done jointly, where the interviewer asked this question: how do you determine what's the right model? And Judea gave this [01:08:00] wonderful answer. He said, "By argument." And that was all he said. And I just thought that was so wonderful, because first of all, Judea loves to argue. That's him in a nutshell. And I think it's part of his upbringing. I think it's part of the Jewish tradition to argue about things: what does the Torah mean by this?
Or the other? So I think part of his nature is to love an argument. And it's what science is about. When you're arguing about what the causal effects are, you're arguing about the important thing. And so it's OK. If you can't agree, then maybe you can do experiments, or you can try to gather data and so forth, but you shouldn't paper over that disagreement. What the models are doing is bringing out what the disagreement is, so that you can maybe address it, and then "by argument" actually solves the problem. So it may be that you won't be able to use Judea's do-calculus to answer your question, because you're not agreeing on the model. That's OK. That's what science is about. You've then got to experiment, go to the lab, and see if you can agree on a model. And once you do agree, then you can use the causal calculus. Harpreet: [01:09:23] Speaking of arguments, I feel like Bayes' theorem tends to divide people in a philosophical sense. People have been arguing about that for a while. Talk to us about a couple of big objections that people have about Bayes' theorem. You guys spent a fair amount of time talking about Bayesian stuff in the book, and I liked that. Dana: [01:09:41] Yeah. So the one thing I think is that Bayesian statistics is less controversial than it once was. Bayesian statistics was like a revolution before the causal revolution. It also used to be very much [01:10:00] heterodox. And now I think you go to a statistics meeting and you won't see the arguments that you once did over Bayesian versus frequentist, the other main category.
So that took a long time too, and with the causal revolution we're at an earlier stage. But anyway, Judea came from the background of Bayesian statistics, and causal inference in fact uses a lot of Bayesian tools. The nuts and bolts, the mathematical details, are all Bayesian statistics. So basically the idea of Bayesian statistics is that a probability reflects a degree of belief. So when you ask, what's the probability that I'll have a heart attack if I take this drug, it's a degree of belief, whereas in classical frequentist statistics, it's a percentage of the time. So you have, say, a hundred thousand people in your population, and you give them this drug, and twenty thousand have a heart attack, and that means the probability was 20 percent. There are many troubles with the frequentist approach. One is that there's only one me. I'd like to know my chance of having a heart attack. I don't care about those hundred thousand people. But putting that aside, what were the problems with the Bayesian point of view? The Bayesian point of view is that this is a degree of belief, and what Bayesian statistics lets you do is update that degree of belief in the light of new evidence. And there's, again, a straightforward mathematical procedure for doing this. The trouble is, where do you start? I said Bayesian statistics lets you update your degree of belief, but what's your degree of belief to begin with? You might initially [01:12:00] begin by believing your risk of heart attack is 20 percent.
Dana: [01:12:09] But how do you know? That was one of the sticking points for many years: it seems Bayesian statistics required you to make a subjective statement right at the beginning, "My original degree of belief in this is X percent." And unfortunately, science has this mythology that it is supposed to be one hundred percent objective. You're not supposed to have any subjectivity in science. This is another thing that we've really been instilled with in our training. So it's very difficult to persuade a scientist that, no, you have to have a subjective degree of belief here. So that's one sticking point. Now, there are various ways around this. One is to take what are called uninformative priors. This is a jargon term I'm sure many of your listeners have heard. So if there are five possibilities, you just assume each one has equal probability, something like that. Still, it's questionable whether that really gets around the subjectivity problem, but that's one way it's dealt with. I think a better way, though, is to realize that science does have subjectivity, and this also is part of the way we process reality. We might start by adamantly believing something's not true, but then we see lots and lots of evidence, and eventually we are persuaded by that evidence. That's how human reasoning works, and that's how science ought to work, too. So.
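The updating procedure and the role of the starting belief can be sketched in a few lines of Python. This is an illustrative toy, not anything from the book: the five candidate risk levels, the "stubborn" prior, and the patient counts are all assumptions. It applies Bayes' rule over five possibilities, once from an uninformative prior (equal probability for each) and once from a prior that adamantly favors a 50 percent risk.

```python
def posterior(prior, events, trials):
    """Bayes' rule over a discrete set of candidate heart-attack risks.

    prior  -- dict mapping candidate risk -> prior degree of belief
    events -- number of heart attacks observed
    trials -- number of patients observed
    """
    unnormalized = {
        risk: belief * risk**events * (1 - risk) ** (trials - events)
        for risk, belief in prior.items()
    }
    total = sum(unnormalized.values())
    return {risk: w / total for risk, w in unnormalized.items()}


def most_believed(dist):
    """The candidate risk with the highest degree of belief."""
    return max(dist, key=dist.get)


RISKS = (0.1, 0.2, 0.3, 0.4, 0.5)

# Uninformative prior: five possibilities, equal probability each.
uniform = {r: 1 / len(RISKS) for r in RISKS}

# A stubborn prior (hypothetical): almost certain the risk is 50 percent.
stubborn = {0.1: 0.02, 0.2: 0.02, 0.3: 0.02, 0.4: 0.02, 0.5: 0.92}

for label, prior in (("uniform", uniform), ("stubborn", stubborn)):
    little = posterior(prior, events=1, trials=5)
    lots = posterior(prior, events=20, trials=100)
    print(f"{label:8s}  after 5 patients: {most_believed(little)}  "
          f"after 100 patients: {most_believed(lots)}")
```

With only five patients the two starting points still disagree, which is the subjectivity objection in miniature. With a hundred patients both land on the same answer, which is the point about being persuaded by lots and lots of evidence.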
So I think that scientists should admit that they have subjective priors. That's a whole other debate. Then another objection people had to Bayesian statistics [01:14:00] was whether degree of belief really is expressed by probability. Do I believe I have a 20 percent chance of heart attack because I've observed it a hundred thousand times? Well, no, I don't think I have done that. So there must be something else going into this degree of belief. That's really an issue for philosophers. I sort of feel that I have maybe less patience for philosophy than Judea does. I feel like you could talk about it till you're blue in the face. I think that in the end, Bayesian statistics is a set of rules that lets you deal with degrees of belief using the laws of probability. It is a mathematically consistent framework for doing that. And if you don't believe that degrees of belief are probabilities, OK, then don't use Bayesian statistics. But I think you're missing out on something very useful. Harpreet: [01:15:00] Yeah. I mean, I, for one, feel like we need to instill more philosophy back into our training as mathematicians and statisticians, just a little bit more, because back in the day they were all philosophers, right? Dana: [01:15:16] Yeah, that's right. Science was all natural philosophy in the beginning.
So one thing I really admire about Judea is that he takes the philosophical debates about causation and about Bayesian statistics very seriously and knows a lot about them. That, I think, is really cool. There are not very many scientists that do, and we talk about them in the book. So it's good to recognize that these questions have been around for a long time, and we have to respect what other people have said about them. Harpreet: [01:15:53] So let's do a last formal question before we jump into what I like to call the random round. And [01:16:00] it's this: it is one hundred years in the future. What do you want to be remembered for? Dana: [01:16:07] Wow. So, certainly in the context of this discussion, if my work made Judea's really revolutionary insights more mainstream and more understandable, I would be thrilled with that. And if causal models became part of the standard practice in science and statistics, that would be great. And then if it actually mattered, if it made a difference, that would be the best thing of all. One thing we haven't talked about, and I know you wanted to, was the role of causal models in machine learning. I strongly believe that any artificial intelligence that is to live in a world with humans is going to need to be able to think causally, with causal models. Right now, machine learning is still at rung one of the ladder of causation. We need to get up to rungs two and three. So if a hundred years from now we really do have strong artificial intelligence, then to me that can only come about if that intelligence is using causal concepts.
And so I would be delighted to have had a tiny, tiny, tiny role in that process. Harpreet: [01:17:28] Yeah. I mean, I'm definitely looking forward to using some of the stuff laid out in this book at the first opportunity that presents itself, because it's fascinating stuff. I'm looking for ways to apply this. So thank you so much for contributing to that and writing the book. I'm sure that one hundred years from now, when statisticians have changed their viewpoint, they'll be referring to this book as the reason why. So let's jump into the quick random round here. What are you currently reading? [01:18:00] Dana: [01:18:02] So I am currently reading a book called Tears and Amber. I hope that's the name of it. I literally got this book because it was a free Kindle book. I am a subscriber to Amazon Prime, and you get one free book a month. There's a list of eight books each month that they pick, and usually I'm not interested in any of them. But this month there was this book called Tears and Amber, which is a novel basically about families, women and children, who were caught up in World War Two, German families, and the effect of the war on them. And the "tears and amber" refers to the fact that this is taking place in East Prussia, which I didn't even really know where it was. But it's on the Baltic coast, and I guess that amber is a sort of jewel. Amber washes up on the beaches there and is turned into jewelry and stuff like that.
So one of the world's big sources of amber is the Baltic Sea. Anyway, I really like this book, otherwise I'd have given some other answer to your question, but I'm not done with it yet. What's really interesting to me is seeing the very profound negative effect of war, and the way that the German people were actually victims. This is kind of unfamiliar territory for me, because I've always seen the Germans as the offending party: they caused the war, so they deserved what happened, or something like that. But you read the story of these women and children, and they did nothing to deserve what happened to them. Or if they did anything to deserve it, it was no more than the rest of us deserve adverse consequences [01:20:00] for belonging to the societies we belong to. I mean, they were in a society that made it impossible not to subscribe to this warped system of beliefs. They resisted it, but they were powerless in some ways. And then, when they lost the war entirely, what the book is really good at is showing how complete the collapse of society is. And it just makes you realize how in every war women and children are victims. And to say they deserved it because they were the Nazi aggressors, I think, is too bad, or too something, I don't know. Yeah.
So this book really opened up something that I hadn't really thought about before. I think that I often think, "Oh, they deserved it," and I really shouldn't be thinking that. Harpreet: [01:21:02] I definitely have to check that one out. I've got Amazon Prime as well, so hopefully in Canada we get the same selections. Dana: [01:21:08] Well, it's the one for April, so it's probably not free anymore. But anyway. Harpreet: [01:21:12] Yeah. What song do you have on repeat? Dana: [01:21:16] Yeah, you sent me the list of questions in advance. This is hilarious, because I'm from an older generation than you, and so I don't even know what it means to have a song on repeat. I assume that means I'm listening to it a lot, but maybe you have a playlist that you, I don't know what it means. I'm generationally handicapped. OK, so first of all, I'm not listening to as much music as I used to, and I feel really sad about it. I'm not listening to the radio anymore, and I don't really know where else people go to find the music they like. Sometimes [01:22:00] I'll look for stuff on YouTube, and I think that's one route by which people find it. But actually, the two songs that I've liked the most recently, I found out about them because I read about them. So here I am, a creature of the printed word. I read an article at the end of last year about the best songs of 2020, and there were these two songs I read about that sounded interesting.
One of them was John Prine's "I Remember Everything", which was the last song he recorded before he died of COVID. And, you know, Prine was a guy I'd heard some of. I'm not a huge Prine fan, but I kind of like him. He's very irreverent, and that's something I liked about him. And he was probably the first person who died of COVID that I knew of, where I thought, "Oh, I know who that person is, I've listened to his songs." So that was a little bit of a wake-up call for me. So I listened to this song. It's a great song, and the last line is just fantastic. I won't bore you by revealing it, but it's just great. For the last song of your career, and the last line you sing, to be this one is just fantastic. So that was one of the two songs, and the other one is its total opposite. I also read about a song called "The Blessing". I don't listen to Christian music, evangelical music, but it came out of that milieu, and it's basically a very devotional song. All the words are from the Bible, basically saying, may the Lord bless you and be with you, and your children, and their children, and their children, for a thousand generations. They're very inspiring words. And [01:24:00] so I listened to the song, and I'm like, yeah, that's a very simplistic song. But then they just get more and more into it and more passionate, and I just really loved the song by the end. And what I really loved was the words they sing: "He is for you." They're talking about God, of course. God is for you.
Dana: [01:24:21] And I don't know. I'm not really a Christian, never paid that much attention to the Bible. I've never had someone say, God is for me. You know, it's like I'm supposed to be for Harpreet: [01:24:33] God, you know, like, Dana: [01:24:34] You know, to be a Christian. This is saying God is for me. Harpreet: [01:24:38] And that really, that Dana: [01:24:40] Surprised me Harpreet: [01:24:41] Know that. Dana: [01:24:42] So it was such a great line. Anyway, so I love those two songs. One is just a guy with the guitar, one guy with the guitar, totally irreverent, you know, just a dry sense of humor. Harpreet: [01:24:58] The other one is this chorus of passionate Dana: [01:25:01] Religious people who love their God Harpreet: [01:25:03] And who are trying to tell you, Dana: [01:25:05] I want our God to bless you forever and ever and all your generations to follow you. And it's just so moving, and in such a completely opposite way. So those are two great songs, Harpreet: [01:25:17] And I love listening to them both. Harpreet: [01:25:20] Definitely check them out for sure. So let's go to a random question generator. We'll do a couple of questions off of this one. First question is what is one of your favorite comfort foods? Harpreet: [01:25:32] I like chocolate cake. Dana: [01:25:35] So and I recently discovered it's really good Harpreet: [01:25:38] To follow the chocolate cake with peppermint tea, Dana: [01:25:42] Because then you get the taste Harpreet: [01:25:43] Of chocolate and the taste of peppermint at the same time. Dana: [01:25:46] And it's just really a great mixture. Harpreet: [01:25:50] I'd love to try that one because I like both chocolate and peppermint. So what have you created that you are most proud of? Dana: [01:25:56] Well, you know, The Book of Harpreet: [01:25:57] Why, you know? Dana: [01:25:58] Yeah, I mean, I [01:26:00] can't think of a Harpreet: [01:26:00] Better answer than that.
Harpreet: [01:26:03] Who inspires you to be better? Harpreet: [01:26:06] Ok, I'll tell you something, my other Dana: [01:26:09] Hobby besides chess is hula dancing, and I started this Harpreet: [01:26:13] Fairly recently. Dana: [01:26:14] Well, 15 years ago, not that recently. And I'm inspired by my fellow hula dancers. So this is a group of all women, except for me, and mostly older. They're mostly in their sixties and seventies. And I just love their way of Harpreet: [01:26:30] Looking at the world. Dana: [01:26:32] And one thing that I really like is that in this group, everyone supports everyone else and really cares about how other people are feeling. Harpreet: [01:26:42] And this is Dana: [01:26:43] Something that I hadn't really seen to the same extent Harpreet: [01:26:48] In any other groups I've Dana: [01:26:50] Belonged to. But in this hula group, if somebody is feeling depressed or down about something or other, Harpreet: [01:26:57] Which has certainly happened a lot in the last year Dana: [01:26:59] Because of the pandemic, other people will come to their aid. They'll try to say something to make them feel better. And the group as a whole will not proceed unless everybody in the group is together. And I just think that's so admirable. And I even wrote a little essay about it that I haven't published anywhere, but just Harpreet: [01:27:25] Saying for men like me, it's really good to join a group Dana: [01:27:30] Sometime where women call the shots. Harpreet: [01:27:33] Mm hmm. Just look, just participate Dana: [01:27:35] And don't dictate, to see how they do things. And it's just really different. It's gentler Harpreet: [01:27:41] And it's more considerate, Dana: [01:27:43] And I just Harpreet: [01:27:44] Think it's really, Dana: [01:27:46] Really inspiring. Harpreet: [01:27:47] Yeah, that's really cool. I really like that. Harpreet: [01:27:49] Definitely have to get in on some Harpreet: [01:27:51] Type of hula dancing.
I could use some movement nowadays. Dana: [01:27:55] Hello. Harpreet: [01:27:56] Yeah, it's great for Dana: [01:27:58] Actually for older people [01:28:00] like me getting there. You know, it's really a Harpreet: [01:28:03] Good form of activity Dana: [01:28:04] And exercise because it's very low contact. Harpreet: [01:28:07] You know, it's not like running or Dana: [01:28:09] Something like that where you're going to hurt your knees, you're going to hurt your ankle or something like that. Harpreet: [01:28:14] So yeah, it's Dana: [01:28:16] As you get older, if you like dancing, a really good thing to do. Harpreet: [01:28:20] Then how can people connect with you and where can they find you online? Harpreet: [01:28:24] Thanks. I do have a Dana: [01:28:25] Website, Dana Mackenzie, and I don't know if you can, Harpreet: [01:28:30] I could spell it out. The main thing is Mackenzie Dana: [01:28:32] Has an a in it. Yeah, I actually have a blog about chess. If you're interested in chess and willing to just Harpreet: [01:28:39] Read about that, it's Dana: [01:28:41] At Dana Mackenzie slash blog, but probably most of your Harpreet: [01:28:45] Listeners aren't that into it. Well, the Harpreet: [01:28:48] Netflix show, what was it called, Harpreet: [01:28:50] Queen's Gambit, which seemed to have Harpreet: [01:28:53] Ignited a lot Dana: [01:28:53] Of. Yes. Yes. It did. And yeah, I got to write an article for The New York Times. So inspired Harpreet: [01:29:01] By that because Dana: [01:29:03] There is a, this article was about a woman who is the three-time U.S. blind champion. Wow. And I thought it was really interesting that here's this woman who is a national champion. Harpreet: [01:29:15] And you know, there's Dana: [01:29:16] A TV show about a fictional woman being the national champion. Here's a real one. And so I wrote an article about her, which was really well received. A lot of people were moved by that. And so that was a great experience.
Harpreet: [01:29:31] Well, I'll definitely link to all of that right there in the show notes. So for everyone that's interested, definitely check the show notes out. You'll see all those links there. Dana, thank you so much for taking time to come on the show. Appreciate it. Harpreet: [01:29:42] Thank you. Thank you for inviting me. Dana: [01:29:44] Thank you for your patience. I know that I tend to go on and on Harpreet: [01:29:47] And on, Dana: [01:29:48] But that's just me. Harpreet: [01:29:49] So well, absolutely love it. Nobody's tuning in to hear me talk. They're here to hear my guest talk. So thank you. Ok, thank you.