2020-06-10-santona-tuli.mp3

Santona Tuli: [00:00:00] You get to shape your path. You get to do what you want to do. If you're curious, if you are excited and want to try things and want to learn things, things will fall into place and you will figure it out. And that is not to say that I haven't had setbacks or failures. It's just about how you look at it and it's about getting back up and keeping going.

Harpreet Sahota: [00:00:37] What's up, everyone? Welcome to another episode of The Artist of Data Science. Be sure to follow the show on Instagram @theartistofDatascience and on Twitter at @ArtistsOfData. I'll be sharing awesome tips and wisdom on data science as well as clips from the show. Join the free open Mastermind Slack community by going to bitly.com/artistsofdatascience, where I'll keep you updated on the biweekly open office hours I'll be hosting for the community. I'm your host, Harpreet Sahota. Let's ride this beat out into another awesome episode. And don't forget to subscribe, rate, and review the show.

Harpreet Sahota: [00:01:21] Our guest today is a physicist and data scientist who loves delving deep into data to learn insights that may be hidden by noise. She's earned a bachelor's in physics and mathematics from Trinity University and has gone on to a PhD in physics specializing in nuclear science and quantum chromodynamics from the University of California, Davis. She currently leads a team of five doctoral and post-doctoral physicists studying a new plasma phase of matter and the elusive nuclear effects in high-energy proton and nucleus collisions at the Large Hadron Collider at CERN in Geneva, Switzerland. She's got a knack for thoughtful feature engineering to extract maximum value from data while simultaneously reducing the data significantly. She also emphasizes avoiding overfitting, identifying systematic bias and validating all results. Her favorite part of data science?
It's all of it, and she enjoys end-to-end project oversight and everything from designing and developing to testing and productionizing using statistical data analysis pipelines. So please help me in welcoming our guest today, a woman who is excited by decision intelligent data science, Dr. Santona Tuli. Dr. Tuli, thank you so much for taking time out of your schedule to be here today. I really appreciate you coming on to the show.

Santona Tuli: [00:02:37] It's an absolute pleasure to be here. Thank you for having me.

Harpreet Sahota: [00:02:40] Talk to me a bit about your path into data science or what sparked your interest. Where did you start? How did you get to where you are today?

Santona Tuli: [00:02:49] Yeah, absolutely. So I was on a path to studying the fundamentals of our universe, but it turned out that I was also on a path to data science. So since a lot of physics is trying to explain phenomena that we observe, it involves making a lot of observations, right? So that's collecting data and then identifying patterns in the data. And that's the analysis aspect. So by doing physics with massive datasets from particle collisions, which we will go into, I have been doing data science for the last several years and that's been a lot of fun.

Harpreet Sahota: [00:03:20] OK, so I gotta ask, what the heck is quantum chromodynamics?

Santona Tuli: [00:03:27] So it's a theory of the strong nuclear force. So there are four fundamental forces in nature. And one of them is nuclear physics. Just like gravity and electromagnetism, the strong nuclear force is another force, and it acts at very, very short scales, like inside a proton. So quantum chromodynamics is the model, if you will, the theory that describes the behavior of this force.

Harpreet Sahota: [00:03:54] That is very, very interesting, very fascinating stuff. How does all this tie into data science? How do you see data science affecting the study of nuclear forces in, you know, the next two to five years?
Santona Tuli: [00:04:05] So our field of data science has been around since before data science was kind of hot, right? So we've had these mysteries in particle and nuclear physics, trying to understand why the universe is as it is. And we built these large particle accelerators and detectors and produced massive amounts of data. So without very clever data science, there's no way we could be looking into that data and coming up with the answers to these questions. It plays in as we go through, and as we evolve our techniques and we borrow from what other practitioners are practicing around the world, we can only get better at going through that data and answering those fundamental physics questions.

Harpreet Sahota: [00:04:49] So what do you think is going to be the next big thing in data science in the next two to five years?

Santona Tuli: [00:04:56] That's a really tough one. So I can think of three ways of responding to that. In terms of technology, I think NLP, natural language processing, and like semantic understanding, graphs, and using machine learning is booming. And so we're sort of in a similar way to how computer vision was booming maybe five years ago. So in terms of the field of data science, that's where my thought goes. In terms of the big picture in tech, the next big thing, I think, is going to be empathy, particularly for data science: ethics, making sure we mitigate our biases. That's gonna be a very, very important thing going forward. So there have been more and more discussions recently about thinking through data science projects holistically and not just as a chain of discrete engineering tasks. And so when you think about things holistically, you're better able to think about their impact, the impact that they're going to have on real people on this planet. And then third, and finally, in terms of applications, I am really excited about data science in AgTech and sustainability.
Of course, there have been lots of advancements in healthcare, which has been wonderful to follow, and in EdTech, another field where there's a lot of ongoing work, especially with NLP, like with a lot of text documents that can be used to make education better and more accessible. So those are very, very interesting as well. Data science applications are often about increasing efficiency by teasing out more value from current systems, like systems that are already in place. We apply data science, supplement these systems with insights from data and make them more efficient. And there is huge potential to do this in the space of agriculture globally, I think. So that's the thing that I'm most looking forward to.

Harpreet Sahota: [00:06:42] So we've talked about some of the positive applications of data science in the next two to five years. You mentioned a few right there. But what do you think would be the scariest applications of data science and machine learning in the next two to five years?

Santona Tuli: [00:06:54] I'm going to answer that by sort of weaving together my previous two answers. So data science will have a positive impact in the immediate future by not having negative impacts. So by becoming more introspective, data science applications can have a ton of positive impact everywhere, as I said, in all of these different fields. So we've seen examples of machine learning and AI that try to mimic sort of human behavior by learning from data, and in that they have amplified some of our -isms, right? We've seen those things happen, and that's sort of one of the worst things that can happen: the AI or the machine learning algorithm sort of learns the worst aspects of us as a society and then somehow magnifies that. So in my mind, that's the scariest timeline, right, if we collectively fail to be responsible and conscientious human beings.
And so by being cognizant of that, by being empathetic and ethical, I think we can definitely avoid that. And you'll notice that I'm talking about us humans as data scientists, not about, you know, AI. I'm not scared of the artificial AI overlords. I'm scared about how we approach this field.

Harpreet Sahota: [00:08:17] What can we start doing today to become more empathetic, more conscientious data scientists?

Santona Tuli: [00:08:24] Introspection, just in very general terms. Every project that we work on, just thinking through where the data's coming from, why the data looks as it does. So, identifying our biases starting from the data collection level. Like, this problem that I'm trying to solve, I have to be able to solve it for everyone. Let's say, you know, for the product-market fit there is a certain subset that I'm targeting. But at the same time, within that population, I shouldn't discriminate in any way. So thinking about where my training data is coming from and why it looks the way it does. Balancing classes if that's something that needs to be done. And going out of our way to get more balanced and less biased data. Starting from there and then just throughout the entire process, really thinking through, asking ourselves the question, why are we making this choice? Does it make sense? Is it going to have a positive impact? If not, rethink.

Harpreet Sahota: [00:09:27] So you may have covered this in your response right now, but I'm curious, you know, what you think will separate the great data scientists from the good ones in this vision of the future that you have?

Santona Tuli: [00:09:38] We often think of greatness in terms of success metrics, like, you know, if you've gotten five promotions in the last five years then you're a great data scientist.
But in order to separate truly great data scientists, we have to think about what they're doing, their actions and their willingness to say no when asked to work on something, you know, that doesn't jive with their values and views. So, yes, I think this is the way I'm framing things. We have to think about it holistically and we have to sort of be introspective and look forward. And by doing those things, not only can we benefit society, but we can also set ourselves apart as great scientists or great data scientists.

Harpreet Sahota: [00:10:31] Absolutely love that response, and I 100 percent agree with that as well. We need to be more thoughtful about the way we're doing our work and the implications that it will have downstream. Even if you don't think that your end result, whatever product it is that you're building, is actually gonna have an effect, you should still consider the effects that it has.

Harpreet Sahota: [00:10:48] Really interested to get into the work that you're doing at CERN. First, tell us, what is CERN?

Santona Tuli: [00:10:53] So CERN stands for the European Organization for Nuclear Research. It's a weird acronym, I know, because it's an acronym from the French term. And this is a huge research complex. You can sort of think of it as a parallel to NASA, only to the extent that it is a huge research lab facility that is focused on one sector of physics. So like NASA is about space and sending astronauts out. We at CERN are interested in going sort of inward and going to the smallest scales and figuring out what fundamentally things are made of. We have a large hadron collider. It's called the Large Hadron Collider. It is a circular particle accelerator. It's 27 kilometers in circumference. So it's very large and there are towns that live on top of this. So the accelerator ring is underground, roughly 100 meters underground on average. So, you know, it crosses, it straddles France and Switzerland.
So there are, you know, Swiss towns and French towns just, like, above ground. And people have no idea that this thing is underneath them in some cases. But so what we do is we accelerate particles. We use this massive ring in order to keep adding speed to little particles until they get to very high energies. That's when we collide them. Hence, collider. And then we study the product of those collisions. So we try to take a snapshot of the collision by building particle detectors around the collision point. So those are constituted of hundreds of millions of sensors. Very simple sensors, in the sense that, you know, it's like a bit reading out whether there was something that the sensor picked up or not. And, you know, when you have hundreds of millions of those, you can put different signals and different sensor readings together in order to get deeper knowledge about what was actually happening in that collision. So that's what we try to do. We try to capture what happened and then analyze that data downstream and really figure out why particles interact the way they do, why forces interact with each other, or why forces are applied in the way that they are.

Harpreet Sahota: [00:13:13] So speaking of particles, while I was doing research on you, I came across this concept of this Y particle. What is this Y particle?

Santona Tuli: [00:13:22] The Upsilon particle. It's called Upsilon. And so the Greek symbol for that looks like a Y. So we often just write Y for this. So this Upsilon particle is a meson, which means something to physicists. But basically it's a particle that's made up of two quarks that are oppositely charged. And more specifically, they're a quark and an antiquark. So quarks are fundamental particles. If you go inside a proton, which we don't really think of, going inside a proton, because, you know, it's like an elementary particle and it makes up the nucleus along with neutrons, which makes up an atom, and atoms are building blocks.
So that's sort of how we learn about how things are made up. But if you actually look inside a proton, you'll find that that also has building blocks. And those are quarks. So there are a few different types of quarks and a lot of them have interesting names. There's top and bottom and charm and strange. So the Upsilon particle is made up of a bottom quark and an anti-bottom quark. And it's a very heavy particle. So it doesn't exist normally. It's only created in these very high energy collisions. Then we try to really leverage that in order to see how this particle behaves and what it can tell us about the early universe.

Harpreet Sahota: [00:14:50] Are you an aspiring data scientist struggling to break into the field? Well, then check out DSDJ.Co/artist to reserve your spot for a free informational webinar on how you can break into the field. This will be filled with amazing tips that are specifically designed to help you land your first job. Check it out, DSDJ.CO/artist

Harpreet Sahota: [00:15:15] That's super fascinating. What do all these things have to do with data science and machine learning? Like, you know, talk to us a bit about how you are applying it to solve these problems. Like, what type of problems are you solving, how are you using data science to make progress against them? What's your workflow like in this space? How do you go from data to decisions?

Santona Tuli: [00:15:34] Yeah, absolutely. So we collect these particles and we have all this data, right? So we sort of covered that. The data is readouts from the sensors. And in real time, when the particle collisions are happening, we're actually generating petabytes of data per second. So that is a lot. We're, of course, not able to store all of that or even process it in real time. We have, you know, bandwidth limitations about how much data we can siphon off.
So a lot of data science, more on the data engineering side, sort of goes into building out those filtering algorithms that are going to decide what to siphon off and filter on in real time from these collisions. And a simple example: in one particular collision event, we might get thousands of ordinary particles, let's say electrons, being created. Now, that's not a super interesting event because we see electrons every day. That's not what we're after, right? Remember, the Upsilon particle, in my case, for my team, that's what we're after. So what we try to do is figure out what sort of signature an Upsilon particle will leave behind and use that to filter on this massive data so that we're only keeping the events where the Upsilon particle may have been created. Now, one of the cool things about this field, the physics applications of data science, is that we never have labeled data. So we have huge data, but none of it is ever labeled, because the whole idea is we're trying to figure out what was created and, you know, what's going on. We have to have a very thorough understanding of both the physics of what we're looking at and the data. So in terms of what is my capability with my sensors, with my particle detector, what am I going to pick up? What am I going to leave out? How do I compensate for what I'm not able to collect, what data I'm not? So really thinking through how the data is biased in the first place and working to compensate and correct for those effects. So to me, that's one of the interesting bits. And then, so starting with building these algorithms to collect the data, all the way through, we reduce the data. Still, even after we've filtered on the right data, as a data scientist, you'll know that the signal and noise still have to be sort of separated. And there are going to be regions where the signal is more.
There, the signal-to-noise ratio is greater, and there can be selection cuts that we can make to strongly leverage that.

Santona Tuli: [00:18:08] So all of that is one aspect. To me, I really enjoy the feature engineering aspect, actually, personally. Like, ML modelling is fun and sort of the end product is fun as well. But what you feed into the machine learning model, of course, is extremely important. And so spending that time feature engineering, really looking at my data, thinking about the physics behind it, thinking about my constraints and finding the best features that are going to be useful to me, and like putting different features together to make better features and so on and so forth. So that's actually, I would say, probably 60 to 70 percent of what I do on one of these projects. And then ML models. So for machine learning at CERN, we typically tend to use simple models, more like what is called traditional machine learning. So a lot of regression, classification, clustering. But more recently, we have had efforts to look into like deep learning techniques for the same sorts of analysis. But again, the fact that our data isn't labeled throws in a wrench. And then the other thing is that we tend to only say we have seen something when we're absolutely sure we've seen it. So you may have heard of this, the five sigma rule in particle physics. So it's only when we have very high statistical confidence that we're going to go ahead and actually say that, OK, we've seen the Higgs boson or this particle or that particle. So that's another reason thinking through the machine learning algorithm and having it be very explainable is important to us. So a lot of thought goes into the machine learning models as well. And we try to err on the side of simple rather than complicated. And then processing whatever is put out. Again, just like you can't provide any data into a machine learning model.
You have to put forethought into how you're going to shape the data. On the other end as well, when something is spit out by that model, that still doesn't mean anything, right? You have to transform that into an actual indicator that means something that other stakeholders are going to care about. So that's the other end of the process. And as you said in my intro, I really enjoy, you know, the whole end-to-end process and every part of it.

Harpreet Sahota: [00:20:25] So fascinating, and so awesome. You can really hear your passion for the subject as you're describing everything. I'd really love to delve a little bit deeper into this. It's pretty interesting, right? Because typically when we're working on, let's say, a fraud detection type of problem, we have a situation where we're trying to sample the class that's not as frequent, but you're in a situation where you have to kind of downsample the noise, in a sense. So it's kind of like the reverse problem. Is that what you mean by data reduction? Like, can you talk to us about what data reduction is? How important is it in the work you're doing? Can you touch on some bottlenecks that you're facing?

Santona Tuli: [00:21:02] Yeah, you're absolutely right. It is sort of the reverse, in some sense, of that problem. We are not really able to amplify the signal, just because a lot of the time the item that we want to report is how much of a particle was created. And if we upsample, you know, the production of the particle, that sort of brings it into question. Like, you know, we don't actually see this, or why are we sort of basing our ML models on this upsampling? So, yes, you're absolutely right. What we instead have to do is prune the noise away.
A lot of the time, some of the techniques that have worked well, at least with my analysis, are looking at the feature space in its entirety and placing various selection cuts and really seeing what effects they have. Moving around the selection cuts and seeing what it does to the signal class and the noise class, and sort of optimizing those selection cuts. And, you know, the other aspect of that is rule-based clustering, which is also something that I use a lot. Just because I know something about how this particle should behave, and again, what my detector is like, I'm able to set certain rules about, OK, if you see this, this, this criteria being checked off, then, you know, you can probably classify it as this Upsilon particle or this other particle. So those are some of the techniques. And yes, so because the data is big... So in this last analysis that I've been leading, we're just about to put the results out. In the end, I was able to extract about 5000 Upsilon particles. But the data that I started with was tens of terabytes, right? So really, we've identified the events where this particle is produced. And then from the thousands of particles created in that event, I have to identify that the Upsilon was created. And so it's really, really interesting work. And again, we try to use really simple techniques whenever we can. And really what I mean when I say simple is explainable. Does this make sense? Is this something I can defend and explain to someone else? And will they be convinced that, yes, this choice makes sense? So that's been the secret to dealing with various data bottlenecks and data reduction problems.

Harpreet Sahota: [00:23:43] Isn't that beautiful? Even when trying to explain what makes up the universe, parsimonious models are the best. Questions based on what you're saying here, just kind of for definitions.
What do you mean by selection cut? And can you also kind of maybe, you know, name-drop a couple of these rule-based clustering algorithms that you're using so that, you know, our audience can go look this stuff up on their own time?

Santona Tuli: [00:24:06] Sure. So our data is event data, because the collision events are separate. We're able to take a picture of a single collision event, which is really cool. And the other descriptor that we use is sensor data because, yes, we are getting these readings out from the sensors. So using what my sensors are reading out per event, I can reconstruct. So it's sort of like back-propagating. I can reconstruct what was happening in the collisions to some extent and with some degree of confidence, right? So I will say, let's say, that I saw an electron in these few sensors that, when I combine that information together, tell me that this electron was traveling in such and such a direction, let's say, you know, upwards, north, and whatnot. So that's one point of information. And then from a different set of sensors, I'll be able to say, yes, that was an electron because it was a negatively charged particle. And I think it had this momentum or this mass. So all of these things. So a lot of cool stuff goes on really at that hardware-software interaction level. And really reconstructing back to what happened in these collisions, which to me is always really interesting because we're trying to look back and answer questions, right? A lot of physics, cosmology or early universe physics, is: OK, what happened? You know, how the Big Bang happened, what happened after that, and so on and so forth. So this looking back is sort of very, very ingrained in physicists, the way physicists think. So I really think it's super cool that we're able to have this data and be able to really answer the questions of what was going on that created this data.
And a lot of data science problems are the same, right? I mean, we try to build predictive models in certain spaces. And that's really useful as well. But a lot of the time, the way we achieve that is by looking at the data we have today and thinking back or looking back and discovering what caused that data to exist in that way. And based on that, we do the predictive, you know, the algorithms to figure out what's going to happen next based on the data we're getting today. So this is something I wanted to mention. I think it's really cool. And then when I look into this feature space, right, once I've processed the sensor data, what I have is reconstructed particles and various attributes that these particles had. Again, going back: the direction in which it was going, the charge it carried, its mass, its momentum. All of these different attributes. So that's my feature space, essentially. I mean, there's a lot of other metadata as well, like, you know, what was the energy in this collision? How head-on was the collision? So if you imagine, like, just throwing two tennis balls at each other, right, they can either be completely head-on and jump off, or they can scrape, and so on and so forth. So that becomes an interesting aspect of this problem as well. So I have all this data. My goal is to sift through these data and find my Upsilon particle. So it's like finding a needle in a haystack in terms of how much data I have available and the right signal. What I do a lot of the time in terms of the selection cuts is I will look at these various attributes of different particles and I will see where I can draw my analysis box.

Santona Tuli: [00:27:56] I don't want to swim in this massive amount of data and just, you know, move around in a random walk until I hit an Upsilon.
What I want to do is systematically, strategically figure out what my most effective analysis box is going to be, where my Upsilons are hiding, in which area of the feature space. So I'll put selection cuts on, let's say I have a muon, a muon is a heavier electron, essentially, that I've reconstructed, and I'm able to pair it with another muon and say that, OK, these two muons probably came from an Upsilon. So what I'll do is I'll play around with the momentum range, you know, the mass range or something like that. So there are ways in which I can classify my Upsilon, right? The rule-based clustering that I talked about a little bit. So basically I'll say if these checkboxes are checked, then it's an Upsilon. So I will hold those attributes constant and then pick one attribute that I'm going to play around with, that I'm going to sort of change the value of. The cuts that I'm going to place in order to define this box, I'm going to move. And it's like turning a knob. I'm going to move it around and see, well, statistically speaking, how does turning this knob affect whether I'm identifying this particle or this event as having produced an Upsilon, as opposed to just other things that might look like an Upsilon in certain ways or just be noise. So that's what I really mean when I say selection cuts. It's fine-tuning the whole space of my analysis so that I'm not just randomly looking.

Harpreet Sahota: [00:29:43] That's really, really interesting. And just like you, I like feature engineering. It's my favorite part of the process. I was wondering if you can maybe share some tips with our audience so that we can be more thoughtful in our feature engineering. And maybe if you're able to provide us like an example of how you're doing feature engineering in your work.

Santona Tuli: [00:30:03] So the way I approach it is that I know very little and I want to learn from the data.
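
[Editor's sketch, not from the interview: the knob-turning described a little earlier (hold the other rules fixed, move one selection cut, and watch what it does to signal and noise) can be illustrated with a toy example. Everything here is an assumption for illustration: the function names, the muon dictionaries, the 9.0 to 10.0 GeV mass window around the roughly 9.46 GeV Upsilon mass, the massless-muon approximation, and the S/sqrt(S+B) figure of merit. Because real collision data is unlabeled, the sketch assumes a simulated sample where the truth is known. It is not the analysis code used at CERN.]

```python
import math

def dimuon_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass (GeV) of a muon pair, neglecting the muon mass."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))

def is_upsilon_candidate(pair, min_pt, mass_window=(9.0, 10.0)):
    """Rule-based selection: every 'checkbox' must be ticked."""
    mu1, mu2 = pair
    opposite_charge = mu1["charge"] * mu2["charge"] < 0
    both_above_cut = mu1["pt"] >= min_pt and mu2["pt"] >= min_pt
    mass = dimuon_mass(mu1["pt"], mu1["eta"], mu1["phi"],
                       mu2["pt"], mu2["eta"], mu2["phi"])
    in_window = mass_window[0] <= mass <= mass_window[1]
    return opposite_charge and both_above_cut and in_window

def scan_cut(pairs, is_signal, cut_values):
    """Turn one knob (min_pt) while the other rules stay fixed.

    Returns one (cut, signal_kept, background_kept, significance) per value,
    where significance is the assumed figure of merit S / sqrt(S + B).
    """
    results = []
    for cut in cut_values:
        kept = [sig for pair, sig in zip(pairs, is_signal)
                if is_upsilon_candidate(pair, cut)]
        s = sum(kept)
        b = len(kept) - s
        significance = s / math.sqrt(s + b) if kept else 0.0
        results.append((cut, s, b, significance))
    return results

def muon(pt, eta, phi, charge):
    """Hypothetical reconstructed-muon record (pt in GeV)."""
    return {"pt": pt, "eta": eta, "phi": phi, "charge": charge}

# Toy simulated sample: one true Upsilon pair and one look-alike background pair.
toy_pairs = [
    (muon(4.73, 0.0, 0.0, +1), muon(4.73, 0.0, math.pi, -1)),
    (muon(3.0, 0.0, 0.0, +1), muon(7.5, 0.0, math.pi, -1)),
]
toy_truth = [True, False]

for cut, s, b, z in scan_cut(toy_pairs, toy_truth, [2.0, 4.0, 5.0]):
    print(f"min_pt >= {cut:.1f} GeV: signal={s} background={b} S/sqrt(S+B)={z:.2f}")
```

[In this toy sample, raising the cut first removes the look-alike pair and then, if pushed too far, the true pair as well, which is the trade-off the scan is meant to expose.]
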
Of course, there are assumptions that I do go in with. So if it's a particle physics problem, I do bring in my domain knowledge. If it's an NLP problem, then I do know something about how languages interact. But when I look at a particular dataset, I try to learn as much as I can from the data, first and foremost, even before I start thinking about, you know, what my features are going to be and what my ML algorithm is going to be. So for me, this includes slicing and dicing the data, looking at lots of plots of different variables against each other, looking at their correlations and so on and so forth, even studying why certain features might have null values and why others don't, and how to sort of interact with that. So that's sort of where I begin. And then as I learn more from the data, I'm able to say, OK, this is a feature that makes sense for my ultimate goal, right? So that's the other aspect of it: I should always have in mind from the beginning what I want at the end. So that's also part of the holistic sort of decision process: not trying to manipulate the data in order to get a result that I want, but knowing what I wanted to test, knowing what a positive result would be and what a negative result would be and what a null result would be. And that's when I start thinking about the features in context of that. Like, really, which of these features or what combination of these features is really going to add value to that fundamental question that I'm going to answer at the end. And just at a high level, these are the ways in which I approach feature engineering and why I enjoy it so much, because I feel like it's where you get to be the most creative. You get to exercise those muscles the most, because with ML algorithms, yes, there are pros and cons to different ones. And, you know, that's also fun, sort of figuring out which is the best fit and so on and so forth.
But the math is sort of fixed, right? You get to, again, play with knobs, but you don't really rewrite any fundamental ML algorithm when you apply it. You're just, you know, fit, predict. But so the really creative part of the process to me is figuring out what features are going to help me. So that's how I tend to think of that.

Harpreet Sahota: [00:32:31] I absolutely agree with you. Like, feature engineering, I think, is only really limited by your own creativity. How do you view data science? Do you view it as an art or a science?

Santona Tuli: [00:32:42] Both. Definitely both. So the science versus arts divide often boils down to procedural versus creative processes. So like, science is sort of regarded as, you know, you take step one, then you do step two and so on and so forth. It's very procedural. Whereas with art you can be, you know, sort of more creative and randomly swing your paintbrush. But of course, there is plenty of creativity in science and plenty of procedure in art as well. So it's not really a fair divide, right? Because you can be as creative as you want with science, and even in art, if you're trying to paint a picture, let's say you're drawing a portrait, there are rules about how you go about doing that. You can't just start, you know, arbitrarily. So, yeah, for me, it's very, very much a mix of both art and science. I mean, it's called data science because there are certain scientific techniques that we often use. It borrows heavily from decision science, from statistics, which is math and science. So there is definitely that aspect of that. But the thing that we just talked about, right, with feature engineering and being able to imagine how the data is going to tell a story, not strangling it, not, you know, wrangling it into telling a story that you have preplanned in your head, but really showing the creativity in bringing different pieces of the data together to tell its story.
I think that's a very creative process. Harpreet Sahota: [00:34:17] How does the creative process come to life in data science? Santona Tuli: [00:34:20] So for me, the creative process is thinking outside of predefined paths. So being able to step back from approaches that are known to work and come up with approaches that haven't necessarily been tried before, but that could work. And then, you know, it would be so cool if they did work, right? This approach you just thought of. So, you know, trying to apply that to your specific data science project and just seeing it through. Maybe it doesn't work out quite the way you thought it might. But just being able to step outside and think of alternative approaches, stepping outside the predefined paths. To me, that's how the creative part of my brain is really engaged when I'm doing data science. And the other aspect of that is in science in general. We do a lot of jumping around, thinking about different approaches and trying to pull them together. Right. So this is a little bit redundant. It's just emphasizing the same point that I made earlier. But we're trained since we're little on the history of how knowledge has evolved and how people have thought about things and why certain ideas were good and why certain ideas failed. Right. When you're problem solving, it's about you. You're the one who is thinking through this problem. You have these guidelines based on what you've learned and what you've seen other people do. But in the end, the canvas is yours. So you get to have as much of a creative impact on this particular project as you want. Harpreet Sahota: [00:36:10] What's up, artists? Be sure to join the free, open Mastermind Slack community by going to bitly.com/artistsofdatascience.
It's a great environment for us to talk all things data science, to learn together, to grow together. And I'll also keep you updated on the open biweekly office hours I'll be hosting for our community. Check out the show on Instagram at @TheArtistsOfData Science. Follow us on Twitter at @ArtistsOfData. Look forward to seeing you all there. Harpreet Sahota: [00:36:39] You're featured in an IMAX movie, the first movie star I've had on my show. Tell us about this movie that you're a part of. Santona Tuli: [00:36:45] Such an amazing experience. So the vision of the film was to depict scientists doing science, real scientists doing real science. So my PhD advisor was asked to be on the advisory board of this film, which was funded, among other sources, by the National Science Foundation. And at this point, many, many others, so CERN, the LIGO experiment, the Perimeter Institute, UC Davis and various other educational institutions, sort of rallied together to fund it. One of the advisors was my P.I. And as the team started to develop the narrative, the creators of the film got increasingly excited about the idea of having the atom, so the fundamental particle, the atom, be the protagonist, which is a really interesting approach. You don't have a person as a protagonist. You really want to study the journey of this atom. And so that's where it began. So once they figured out exactly what our research group works on, they were like, yes, this is what we want to follow. So it's me and a couple of my team members and then a couple of the other people from my research group as well. We all spend time at CERN while the data is being collected. It's very much a team effort. Very collaborative. We're all there, you know, staying up late together. I mean, not year round, but when our collisions are happening, it's a very hands-on and, you know, all hands on deck kind of process. So they were like, that's perfect.
We want to be there and just film you as you're doing this work. When making a film, you have to make some adjustments, like they put these lights sort of on top of a computer so that our faces would be lit up and all of that. But at the end of the day, they were just filming us as we were. So all of the excitement that showed up on our faces, you know, the curiosity, all of that is authentic and real. And that was extremely rewarding for us to be a part of. And then they interviewed us. So we did the whole green screen kind of talking about our journeys and how we made it to CERN and, you know, what we're really hoping to answer. That was one aspect of it. And then at the end of the data taking period, when we were shutting down the experiment, we always go to a bar and just, you know, hang out and celebrate. So they wanted to follow us there as well. So there are some shots where we're just like, yeah, this went really well, and we're just drinking and saying that. And so it was very cool. And the whole objective, as I look back on it now, picking the atom as the protagonist, following us as we collected data at CERN, all of that really feeds into the eventual goal of this film and this project, which is that the audience should be able to look up at this screen and see themselves reflected in it, being able to understand the physics that's going on. It's not about going into the depths of the physics. Of course, we don't do that in this film. But just to understand that physics is very much within their reach. Science is very much within their reach. I hope that the audience will see me and think, oh, yeah, I mean, there's nothing special about that girl. She just, you know, has this background and is doing this really cool stuff, so I can too. Harpreet Sahota: [00:40:11] That IMAX movie is called Secrets of the Universe. And when does that get released, or is it already released? Santona Tuli: [00:40:17] It's called Secrets of the Universe.
So it has premiered at the Smithsonian in D.C. We all got to go to that red carpet screening. And that happened last summer. And since then, we were actually supposed to have another red carpet screening here in California in April, but it got canceled because of COVID. But very soon, I mean, when things get back to normal, we'll have that second screening. Harpreet Sahota: [00:40:43] I'm looking forward to a chance to watch it out here. You spoke about this a bit earlier, about the need for interpretable models. I saw a really well-written post from you on LinkedIn around interpretable and explainable machine learning and how they're different, which might speak to that point. Santona Tuli: [00:40:59] Yes, this is something I've been thinking about a lot recently. So to me, the distinction is that an explainable machine learning model can be explained before the fact, and an interpretable machine learning model can be interpreted after the fact. So let me dive a little bit deeper into that. When I am building a model to extract some particle physics phenomenon or some nuclear physics phenomenon, because of things we've already talked about, it's very important that I can explain my choice. We have so many levels of review within our collaboration because we really don't want to make any claims that we can't take back. So, you know, by contrast, in industry, of course, it's driven a lot of the time by the bottom line, like the actual revenue. Right. So if you do something wrong, it's terrible. You might have lost, you know, millions or billions of dollars or whatever, but you can sort of learn from that and really change your approach and hopefully make up for that. In our field, I mean, it's, of course, still true. We're learning about something. So it's not like we can't ever say, oh, yeah, we were wrong. We scientists go back all the time.
But even with that being true, we really have a lot of imperatives in place to make sure that we really understand the things that we're claiming. So that's why the explainable part is really important to us. So even before I fit my model to data, I have a good sense of what the different parameters are going to represent, why I'm setting certain hyperparameters to certain values and so on and so forth. The other end of it is I apply my machine learning model to data and I get some results. Maybe it's a classification. I have my records classified, and then I can also look at the importance of the various features. Right. So which features played a part in this decision, which were most important and which were least important. And being able to look at that and understand why certain features ended up being important in this decision, even if when we started we didn't really know why, being able later to interpret that and understand that, this to me is interpretable machine learning. That's the way that I see it now. Of course, these two things are not exclusive. And one doesn't imply the other. So just because starting out I have an explainable model doesn't mean that at the end I'll be able to interpret my results necessarily. Similarly the other way. Oftentimes when I talk about explainable machine learning, people will sort of think that I'm saying interpretable and think that I'm making a statement about why traditional machine learning is better than deep learning. But that's not the point at all. Deep learning can be highly interpretable. You may not be able to explain exactly why or exactly how some deep learning algorithm is going to work ahead of time based on your data. But afterwards you may be able to, you know, interpret the heck out of it. So they're really not the same thing.
They don't imply each other and they can simultaneously be true or untrue. Harpreet Sahota: [00:44:22] And another thing you've been thinking about is decision science. Can you share your thoughts around that? First, can you maybe help us understand the distinction between decision science and data science, and what you have been thinking about? Santona Tuli: [00:44:32] To me, data science is a part of decision science, but decision science is in some sense bigger. It's more of an umbrella, and data science fits underneath that umbrella. One of the ways we can distinguish those two is quantitative and qualitative aspects. So actually it's not a distinction. It's more that data science tends to be more quantitative by definition, but by nature decision science comprises both that quantitative part and a qualitative approach where you're making value judgments. As a decision maker, you are thinking through things and not just relying on data. You have to be able to interpret what the data is telling you. And based on that, you make decisions. In some sense calling it qualitative isn't even that fair, because there's so much science that goes into making good decisions. Right. So, again, these are all very fluid. Science and art are so fluid. So as a decision scientist, you have to check so many different sources of bias. And that's why I've been reading up and learning a lot about decision science recently, because, again, feeding into what we started this conversation with, mitigating our biases, being able to make sure that our decisions and our processes have a positive impact going forward, it's a picture that you can only paint once you have all of these pieces in place. And when you're making a decision, a part of data science that you tend to rely on the most is sort of the statistical significance. So you have done some analysis on data and you're claiming to have seen some phenomenon with a certain confidence level.
And that's very important to the decision scientist, right. Not only what you're claiming to have seen, but also the confidence you have in this claim. And that's going to factor into the decision scientist's outlook on this particular decision. And only when they incorporate that in with their sort of decision framework, having thought through all the different biases, like, you know, confirmation bias, et cetera, that they could have, only once they have sort of put everything together are they able to make good decisions. So what we can do as data scientists, even if we're not making, you know, big decisions, even if we're not the ultimate decision maker on a certain project, how we can be getting better at data science, is by incorporating some decision science into our work. The way we can do that is by thinking about the whole data science project end to end when we start. Santona Tuli: [00:47:07] We have to think about the end goal and we have to, like, put ourselves in the decision maker's shoes and test all of our assumptions. Does it make sense that I'm assuming this about the data when this is the thing I want to decide using the data? Like, am I making a fallacy in making this assumption? Am I twisting my results at the very beginning? Because it can't just be about numbers. I can't just, you know, give someone, OK, this is what I'm seeing, X percent, Y percent, and they go and make the decision. I have to be a conscientious person as well and think about what that number means. So thinking through that process is, to me, the best way the marriage between data and decision science can happen. And the other aspect of this is that if these two individuals or these two roles, data science and decision science, are completely separate, there is no back and forth and there is no collaboration. There's no iteration of the process.
So you might be working with a decision maker consistently year after year. But if you don't have a back and forth, you'll be making the same mistakes you've been making. Right. So the data scientist has to have a way to speak to the decision, and the decision scientist has to be able to sort of look at the data to inform their decision, not just the final result of the data, but the data in its entirety. So that marriage, I think, is extremely important. Both sides should be able to talk to each other, collaborate with each other and iterate on the process to get to better processes and better understanding from data. Harpreet Sahota: [00:48:49] Let's switch gears here now and try to pick your brain on another couple of things. For the people out there who are trying to break into data science, maybe they feel like they don't belong, that they don't know enough, or they don't think they're smart enough, or they're just intimidated by everything you have to learn. Do you have any words of encouragement for those people? Santona Tuli: [00:49:07] Yeah, the same thing I say to anyone who is trying to break into anything new and lacks confidence, which is that I genuinely believe that it's not a capability thing, that it's never a capability thing. If you want to do data science or science or tech or coding or, you know, learn a new language, you absolutely can. There are some skills that you'll need to pick up, but we're used to that. We have been learning new skills every day since birth. And beyond that, what else would you need, especially to break into data science? You need to meet other data scientists to observe how they think, what they do, how they interact with each other. And to me, that's the real skill set.
Earlier this year, I started thinking about product management in addition to data science, because I like sort of taking holistic approaches to things and really sort of strategizing from the beginning. So, you know, I interacted with some product managers. I tried to pick their brains, get coffee with them and really understand what this field was. And one thing I kept hearing over and over is think like a product manager. But what does that mean? Now, having spent more time sort of thinking about it and learning from product managers, I understand what that means. And I have the same advice for data scientists: you need to think like a data scientist. And that's not just a wise man saying that doesn't really have meaning. It absolutely has meaning. The way you can achieve this is by meeting data scientists, observing them, learning from them, thinking how they think, watching how they speak. The reason I say that is, speaking may not be an important aspect of being a data scientist, but it definitely gives you insight into what's going on in their head, like the content they put out there on LinkedIn, for example, or various other sources. I think that's one of the best things you can do to pick up the more nebulous skills around data science, beyond the harder skills, so to speak. Harpreet Sahota: [00:51:04] What does it mean to think like a product manager? I think a lot of our audience would love to hear your take on that. Santona Tuli: [00:51:10] So I took this sort of short program. It's called She Aspired. So I guess I'm giving them a shout out now because I have the platform. But so I was part of the first cohort. And it was a number of people who wanted to either break into product management or just learn about product management, and what we did in this program is we approached sort of the transition to product management as a product itself. How this helped is.
It's hard when you talk about things in a nebulous way. Like, someone might advise you, OK, you want to be a product manager at X, Y, Z company? Go pick a product that they actually have out there and then see how you can improve it and maybe write up something on that and then, you know, try to sell yourself in that way, which makes a lot of sense. But it's still difficult if you don't know what product management is. Right. You can't just go and pick up a product and, you know, tear it apart. But by having a more tangible product that you're working on, a.k.a. your career or your transition, or you can think of your breaking into data science as your product, then you dismantle that. Right. You can sort of really take it apart and think about how different parts of it play with each other. Like, we were just talking about the skills that you need to break into data science, right, to get that data scientist job. So being able to separate or distinguish what the end goal is and the steps that you need to take in order to get there. So making the roadmap, which is a common term used in product management, setting the key objectives and then sort of having a timeline for those, or having these checkmarks about, like, when do I say that, OK, this aspect of this project or product is done, and then I can move on to the next part. So to me, product management is very dependent on how you can break apart a bigger project, a bigger idea, into smaller bite-sized chunks. And then you can strategize whether you can complete them in series or in parallel. And, you know, you have to pull your resources cleverly. Let's say I'm working on something now and I have this resource that helps me do it. And then tomorrow I'll work on something else and it might require the same resource. If I had noticed this before, I might be able to pull that resource once and work on these tasks, one and two, at the same time.
And that would save me a lot of time and make it more efficient. So thinking like a product manager or being a product manager is really about strategizing and figuring out the most efficient ways of approaching a problem or a product, and also having the flexibility to sort of iterate, learn from what you're doing and keep applying it. And I truly believe that that applies to so much in life as well. Harpreet Sahota: [00:54:14] Very well put. And I think we can all end up being product managers after that really great explanation. I like what you said. Observing data scientists, not necessarily downloading their brain, but absorbing the way they think about things, how they tackle particular projects, particular problems, the vocabulary they use. So a lot of it is just developing the right mental models for yourself so that you can apply those mental models in the right scenarios. I think, you know, a lot of scientists are out there working on projects and they might feel like a bit of fear or hesitation trying to make the project perfect. Right. They don't release it until it's perfect, whether that's a professional or personal project that they're working on. Do you have any tips for anyone who is in that type of mindset to try to get over it? Santona Tuli: [00:55:01] I used to be a perfectionist when I was in college. There are still some aspects of that that I carry. I can really get hooked on one thing and sort of spend time on it sometimes. But I've also been actively trying to not do that. So get over this idea that it has to be perfect before, you know, I push it out or send it or whatnot. So I would just try to give the same advice that I give myself to anyone else out there who is struggling with this. Just put it out. What's the worst that can happen? Maybe someone criticizes it in some way.
You know, I'd never be heartbroken about purely negative criticism, you know, like some criticism that isn't meant to help you. But it might turn out that this criticism that you're receiving is actually going to help you iterate on that project and make it better. And you would have never gotten that feedback if you didn't actually put it out there. So, yeah, just put it out. Ask the people that you'd feel most embarrassed about if they thought that it wasn't good enough. Like, you have it in your head, right? Like, oh, if this person sees it, then they're going to think that it's not good enough, and I'm embarrassed. So just go and ask them. You know, build a rapport, talk to them. People love being asked to do things, in the sense of helping you out, to, like, look over your projects, look over your resumes. And there are so many kind people out there that are genuinely interested in helping in that way. You know, build that relationship, and you know what, they'll take a look at it. Absolutely. And they're not going to be mean about it. They're going to give you that feedback. And all of a sudden, you know exactly what you need to do to make exactly that person or that kind of person think that the product, or project, is good. So it's very much that the more you step out of your comfort zone, the bigger your comfort zone will get. And the more feedback you'll get on your work, and you can keep iterating on it. Harpreet Sahota: [00:57:00] So as someone who's a recovering perfectionist, sometimes when we're working on a project and we're getting some feedback, we're getting some criticism, we might feel down or feel like a failure. We might want to give up. So what can we do to kind of get over that, get through that feeling? Santona Tuli: [00:57:19] For me, it's a couple of different things.
One is just being able to step away, take a break, take a breather, and that can be long or short. But that really helps me refocus and it puts things into perspective. You know, the thing that you're sort of sweating over, you know, staying up nights and working on, may actually not be that important. So it sort of helps put things in focus and into perspective. And then the other aspect is just keeping at it, which might seem like it's in tension with the first thing. When you're working on something and you're not fully confident about it, rather than just getting disheartened, give yourself some slack. Figure out what it is about this that is giving you that anxiety or that stress. What's causing it? And, you know, face it head on and really dig at it until you are able to beat it, so to speak. So you can get over any hurdle, as long as, you know, it has to be a worthwhile hurdle. But once it is, you can definitely do it. Having that confidence in yourself and in your process is very important. Harpreet Sahota: [00:58:31] Because the most uncomfortable part about that would be just sitting there with the fear, digging through, trying to understand what is causing it. On the other side of that, that's where the most growth happens, right where you're sitting in that uncomfortable phase. You talked a lot about, you know, the technical skills that are needed to be a data scientist. What are some soft skills that you think data scientists are missing? Santona Tuli: [00:58:51] I'll use the word soft skills as well, because that's often used. But I do want to just quickly say that I think that these are all skills, as important as any of the quote unquote hard skills. And if it were up to me, I would just rephrase it all as skills. But I know what you mean. So the top ones that come to mind are communication.
So I think being able to communicate clearly, it doesn't mean it has to be in perfect English or in perfect sentences or that you have to express all of what's in your head in one go. But it does mean that you are comfortable talking about what you're doing with others. Communication is a two-way street. It's not just about talking at someone. It's about giving space, listening and taking feedback. I think that really, really helps. It helps with a lot of processes, but especially with data science. I think when you think through things, like, you know, we have the rubber duck idea of, you know, talking at the mirror and so on and so forth. Santona Tuli: [00:59:55] It is when you talk out loud, speak with someone about something that you're working on, that you get the smartest ideas, the brightest ideas, or you realize why some approach you're trying isn't great. So that's a huge one for me. That relates to the second one directly, which is presentation skills, which I think are very important. So, yes, gradually honing that skill of expressing yourself and what you're working on, getting some buy-in. One way of putting it is, when you're sold on your idea, learning the skill set to sell others on it through presenting. And that has hard and soft aspects to it as well. Your actual data visualization, your actual slide deck, et cetera, is a part of it. And also how you present yourself and how you speak about what you've been working on. Those two things are extremely important. Let's see, soft. Yeah, I mean, I feel a little bit like it's emphasizing the same point again, but triaging is another really important one, rallying people, getting people to buy into your ideas so that you can maybe push it to someone, you know, at the skip level or the C-suite level or something.
So being able to have that engagement and being able to triage people into basically sharing your view with others such that they can actually see it from your point of view is an extremely important skill. Harpreet Sahota: [01:01:29] Yeah, definitely influencing type of skills, I guess, is another way to put that. I totally agree with you. Like soft skills, I don't really like that name either. I think they're really the hardest skills because these are skills that nobody can teach you. You have to actually learn them for yourself through experience, through experimentation and, you know, through self-reflection and trying things, seeing what works and what doesn't work, and, again, being uncomfortable. I was wondering if you could speak to your experience being a woman in STEM and if you have any advice or words of encouragement for the women in our audience who are breaking into tech or currently in tech. Santona Tuli: [01:02:07] The first thing I would say is that I feel you. I know that it's hard. It's always hard to be the only person in a room that is different in some way. But if you don't push through that, then you will always be the one person in the room. And it's only when you're able to overcome those insecurities and make an environment that's more welcoming that you're gonna get that second person in the room and then the third person and so on and so forth. So, yeah, I mean, I think there is a general recognition that diversity of all sorts is important and good and ultimately helps everyone's bottom line. Of course, there's still a lot of pushback as well. And, you know, we have systems that work in certain ways, but find yourself a network of people who believe in diversity and in encouraging and supporting people of various kinds of minorities, be it gender or race or whatnot.
And that network, that support system, if you're able to build that up, that really helps. Even if in that room you don't have that other woman in tech, if you're connected to women in tech through other organizations or elsewhere, then you can reinforce your beliefs and they will support you through that. So I'm part of Women in Machine Learning and Data Science. It's a nonprofit organization. We have a strong network of women. Same thing with this product management program that we did. We now have that cohort, that connection, a community of women who are there to support each other and to help overcome these barriers that we see across the board. I think that that's a big one. And I've done that for myself. Finding that community, those communities, it's not just about one. It's sad that we have to resort to this. But that's definitely an aspect of it. Just a different way of looking at it. I grew up in a family with lots of encouragement and no distinction ever based on gender. So I went on to major in physics, right? I mean, first of all, it was a small physics cohort. Not too many people go into physics. But there are even fewer women, of course. In grad school, in my graduating class, I think we were really seven out of thirty-five or something women. So those numbers are staggering and they will exist, you know, further into the future. They're not vanishing tomorrow. But going in, knowing that you're equal to everyone else, you're just as good, if not better. I mean, there are definitely things that can set you apart from other people. Maybe you're new to it. Of course, you can be a woman, let's say, and the smartest one in a physics graduate program. Like, that is a very true reality. Sometimes, oftentimes.
And so just being able to walk into those spaces, having that confidence in yourself, it doesn't matter what you look like, what, you know, the population is made of. You have to have confidence and faith in your ability in the thing that you're trying to do, be that data science, tech, physics, whatnot. That's the only thing that matters. The system that's in place does not matter. And we're working every day to try to overcome those barriers anyway. But in the meantime, you have to be confident about yourself and your abilities. Harpreet Sahota: [01:05:48] Thank you so much. This is a very empowering message. I'm sure the audience really took away a lot from that. Thank you so much. So talk to us about the My Hero Award that you were recently awarded. How do you hope to be a hero for women in STEM? Santona Tuli: [01:06:02] That is a dream. This award was for a short video that we did. Partly the goal was to have a promotional video for Secrets of the Universe, the IMAX movie. We decided to do it because they wanted the audience to get more than just me in a few shots during the movie, but also, like, understand my story and my journey. So this video is played before screenings of the movie and will continue to be played once the movie is out more officially, before the screening of the movie, to set context around who I am in this movie. And so it's really the thing that's meant to frame it. So in this video, and you can look at it, you can find it on the Secrets of the Universe website. There's a character profile on me and the video is included there. So I just talk about, yeah, I talk about where I come from, my journey into physics, how I went to CERN, what it means, what we're doing there. It's just a frank conversation about me. And I think because I am a woman and because I'm a person of color, it potentially has even broader reach.
So the My Hero Film Festival is a yearly festival, and they try to recognize individuals and groups that are setting an example in some way or another, something that other people can aspire to or be inspired by and, you know, again, see themselves reflected in. So they picked it in the sciences category as one of the top submissions, or however you want to put it. So that was really cool. I never expected, you know, as a physicist. Yeah. I mean, getting to go to CERN was super cool. Getting to be in this IMAX movie. It was, well. And then on top of that, you know, this video gets this award. So yeah, it's definitely very cool. I'm very, very appreciative of all of this and very happy about it. But at the same time, the goal throughout all of this has been about reaching out to people and showing them that it is very doable. If I can do it, you can do it, too. And you should definitely be thinking that. I'll share one quick thing as well that I just remembered about this movie. So at the second screening that I was at, high school and middle school kids were included in the select audience group. And I happened to be sitting next to this 11 year old or so. I mean, I didn't ask him how old he was, but, you know, he looked about eleven. And so at the end of the movie, he just looked up at his or whoever he was with, and he was like, mind blown. And to me, that was such an empowering moment. Like, this kid is getting inspired by this. Getting inspired by the physics, by the people. And so that is extremely rewarding. And I just hope that more and more people and more women are going to watch this and be like, yep, I'm gonna go do that next. Harpreet Sahota: [01:09:08] What can we do in the Data community? What can men do, in particular in the Data community, to help foster the inclusion of women in STEM, in tech and Data? Santona Tuli: [01:09:17] People are learning to be better allies.
There's, of course, a long way to go. It's difficult to have conversations about supporting minorities in tech because it often gets intertwined with ideas of productivity and intelligence and things like that, which is, of course, extremely sad, and it's part of the systemic problem that we have. But, you know, because of the way that these fields have been dominated by a homogeneous set of people, there is this idea that anyone else is somehow inferior and doesn't belong in this field. So these conversations are tricky. But the only way that we can move forward is by having them. I mean, a lot of people don't know about the biases that exist around them and how they're propagating them. They don't understand that, you know, standardized tests, just as an example the SAT or GRE, are systematically discriminating against, you know, women or people of color. These are facts, and I think the more we write about these, the more we discuss them, the more awareness we can spread about what exists. And, you know, it's, again, a privilege thing. Like, you don't realize what your privilege is, but your life has been set apart from others from the day you were born because of certain factors. Just, you know, being able to spread awareness on that, I think, can go a long way. To my male colleagues, I think the one thing I would say is never, ever go in with presumptions about your female colleagues. I think most people will be like, oh, yeah, I'd never do that and stuff. But I think a lot of these can exist subconsciously. So just take a moment to think, like, okay, I'm about to review this person's, this female colleague's code. Am I subconsciously thinking something already about what I'm going to see? And if you are, then just stop and, you know, maybe revisit or maybe excuse yourself from doing it.
If you think that you can have that harmful effect, reflection is always a very positive thing. Harpreet Sahota: [01:11:28] Thank you for that. So last question before we jump into a lightning round. What's the one thing you want people to learn from your story? Santona Tuli: [01:11:34] You get to shape your path. You get to do what you want to do. If you're curious, if you are excited and want to try things and want to learn things, things will fall into place and you will figure it out. And that is not to say that I haven't had setbacks or failures. It's just about how you look at it and it's about getting back up and keeping going. Harpreet Sahota: [01:11:57] I love it. Let's jump into a quick lightning round here. What's your data science superpower? Santona Tuli: [01:12:01] Curiosity. Harpreet Sahota: [01:12:02] What would you say is the most fundamental truth of physics that all human beings should understand? Santona Tuli: That physics is a model, that it's approximations, that the whole idea is we try to explain why things are happening, but we don't know anything. Harpreet Sahota: [01:12:19] So what do you think is the most mysterious aspect of our universe? Santona Tuli: [01:12:24] I think as a physicist I have to say dark energy, but personally, I think that it's really cool that we exist, that, you know, this small planet has the right conditions for human life to exist and we're able to ponder our own existence. I think that's mind-boggling. Harpreet Sahota: [01:12:43] So what's an academic topic outside of Data science that you think every data scientist should spend some time researching or studying? Santona Tuli: [01:12:50] Social behavior, maybe human interactions. Harpreet Sahota: [01:12:53] So what's the number one book, fiction or nonfiction, or if you want to pick one of each, that you would recommend our audience read? And what was your most impactful takeaway from it?
Santona Tuli: [01:13:02] I think my favorite book is 1984 by George Orwell. I read it very early and I was very sort of enamoured with it. What I do like about Orwell is he has some essays on communicating and writing, and those can also be very, very helpful in figuring out the best ways to communicate things. So that's more of an endorsement of an author than a particular book. Most recently, I read a book set in World War II Germany, which to me was a really cool way to observe something that we learn about in history books. But this was a narrative, right? So what I'm trying to say is, historical fiction based in historical context, I really enjoy it. And maybe others can benefit from that as well. It gives a human touch to something that actually happened, and people suffered, and so on and so forth. Harpreet Sahota: [01:14:02] So if we could somehow get a magical telephone that allowed you to contact 18 year old Santona, what would you tell her? Santona Tuli: [01:14:10] Keep doing what you're doing. You can do it. You know the thing when you look back on what you did, like, six months ago, and you're like, oh, my gosh, I was so stupid? So that happens and that is real. So there are, of course, things that 18 year old Santona did that I would very much be like, oh, my gosh, what? But no, the real thing is that things turned out fine. Right. Things turned out well. So just keep doing what you're doing. This is the same advice I would have for any 18 year old: have faith in yourself. Be curious and learn and do what you do. Harpreet Sahota: What motivates you? Santona Tuli: Learning. It's a bit cliche, but I am excited to learn more in different fields. So yeah, it ties into curiosity and learning. It's not just about, OK, honing my Data science skills or my physics skills or something. It's about learning what other people are thinking about. And it's really exciting.
Harpreet Sahota: [01:15:09] So what song do you have on repeat? Santona Tuli: [01:15:12] I really like Resilient by Rising Appalachia. It has a very strong message. I listen to it almost every day, especially in times like these. A couple of the lines are: we are resilient, we trust a movement, we negate the chaos, uplift the negatives. Harpreet Sahota: [01:15:28] I definitely have to check that out. So how do people connect with you? Where can they find you? Santona Tuli: [01:15:32] I am available on LinkedIn. That's probably the best way. Just my name, Santona Tuli. That's without an H. Although you pronounce it Shantanu, it's spelled Santona. Yeah. So LinkedIn is the best way to reach me. Please connect with me. I love meeting new people. Harpreet Sahota: [01:15:49] Dr. Tuli, thank you so much for taking time out of your schedule to be on the show today. Really, really appreciate having you here. And I know there's so much here that our audience will learn from. Thank you. Santona Tuli: [01:15:58] True pleasure. So much fun. Thank you.