Today’s show is sponsored by strongDM. Transitioning your team to work from home? Managing a gazillion SSH keys, database passwords, and Kubernetes certs? Meet strongDM. Manage and audit access to servers, databases, and Kubernetes clusters, no matter where your employees are. With strongDM, easily extend your identity provider to manage infrastructure access. Automate onboarding, offboarding, and moving people within roles. Grant temporary access that automatically expires to on-call teams. Admins get full auditability into anything anyone does: when they connect, what queries they run, what commands are typed. It’s full visibility into everything. For SSH, RDP, and Kubernetes, that means video replays. For databases, it’s a single unified query log across all database management systems. strongDM is used by companies like Hearst, Peloton, Betterment, Greenhouse, and SoFi to manage access. It’s more control and less hassle. strongDM. Manage and audit remote access to infrastructure. Start your free 14-day trial today at strongdm.com/GTC. CHANTE: Hello and welcome back to another episode of Greater Than Code. This is episode 178. This is Chante Thurmond and I'm introducing my awesome co-panelist, Rein Henrichs. REIN: Hi, Chante. I don't really believe that it's episode 178, but people were telling me that that's true. I'm here with my friend, Jacob Stoebel. JACOB: Hello. And I'm here with Avdi Grimm. AVDI: Hello. And we are here today with Emily Robinson, who's a senior data scientist at Warby Parker, maker of very nice glasses, I might add, where she works on a centralized team tackling some of the company's biggest projects. Previously, she worked at Etsy and DataCamp as data scientist, holds a master's degree in management and gives talks across the country, as well as writing on her blog, hookedondata.org, about A/B testing, programming in R, and data science careers. 
And she published a book, Build a Career in Data Science, with Jacqueline Nolis about all the non-technical knowledge and skills you need to get and succeed in a data science role. Welcome, Emily. EMILY: Thank you. Happy to be here. AVDI: So, Emily, what is your superpower? EMILY: I would say my superpower is finding all the dogs to pet. I'm a big fan of dogs and my husband has a photo album of me petting all these -- I always ask permission before petting any dog. And we have one for each trip we take: there's petting dogs in France, there's our trip to Japan, there's across the country. Unfortunately, in these times, with coronavirus, I've had to scale back, but I'm looking forward to resuming my dog petting. AVDI: What a rewarding power. CHANTE: Yeah, says something about you if the dogs can trust you, huh? EMILY: Yeah. Normally, I've had pretty good luck, except for one yappy dog actually in our building that does not like me very much. But mostly, it's always really fun to see the different personalities of dogs and to chat with the owners. I actually have a part-time dog myself, as I like to call her. My parents have a Cavalier King Charles Spaniel and we'll keep her sometimes. But when I don't have her, it's nice to find the other dogs around New York City. AVDI: Nice. REIN: So you're in New York City usually? That's a whole thing. EMILY: Yes, that's a whole thing right now. Actually, as of recording date, we came out about a week and a half ago to Utah to stay with my parents. So, we've been isolating in the guest house since we arrived for the full 14 days. But yeah, it's definitely scary times across the country and the world right now. CHANTE: Yes, indeed it is. It's interesting because I think, I'm in Illinois. You sound like you're in New York. Jacob, you're in Kentucky. Rein over in Seattle? Or Portland? REIN: Portland. CHANTE: Avdi, where are you? AVDI: At the moment, I'm in St. Louis. CHANTE: Are we all on lockdown? AVDI: Yeah. 
JACOB: Yep. EMILY: Yep. REIN: More or less. CHANTE: All right. REIN: Well, we are. The city should be more than it is. CHANTE: Yeah, I know. I think the people probably don't necessarily agree with me, but I think that that's probably true. I just see a lot of people. I hear lots of activity. I'm like, "I've been cooped up in the house for more than a week now." REIN: So obviously, this is on everyone's minds, but we already did a coronavirus podcast. I feel like I've taken this towards [inaudible]. So maybe we can talk about data science or all this stuff around data science. [Laughter] CHANTE: Yeah, data science and coronavirus? That sounds great. [Laughter] EMILY: No, there are too many of those on Twitter already. Too many graphs. AVDI: Yeah. CHANTE: Well, maybe you could tell us about how you got into data science, that will be awesome. EMILY: Absolutely. My data science journey, I would say, started somewhat in college. I went to Rice University in Houston from 2010 to 2014. And I minored in Statistics there. And at that time, Hadley Wickham was a professor of Stats. And so for any of your listeners who don't use R, Hadley is one of the most prominent R programmers. He's the creator of some of the most popular packages: ggplot2, dplyr, all the stuff you need in data science, visualizing, manipulating data, cleaning it, et cetera. I got a really good start there. And then I went on to get my Master's in Management specializing in Organizational Behavior, and that was doing social science research, which is a pretty similar process to data science. So, you have to come up with a question, find data to analyze, to figure out the answer to that and then present it to audiences ranging from professors who have been specializing in this field for decades to professors who may be in totally different departments. I decided academia wasn't quite for me. 
So I was in a PhD program but decided to leave with my master's and went on to do a data science bootcamp called Metis, so I could up my Python skills in addition to R, and then also some machine learning because that hadn't been covered that extensively in my undergraduate or master's. And then since then, I got my first data science job at Etsy. Went on to DataCamp, now at Warby Parker. At Etsy and DataCamp, I ended up specializing a lot in A/B testing, but also doing general analytics, been working on some forecasting projects, dashboarding for Warby Parker, and just been really enjoying getting into this field, getting to know people since I moved back to New York about three and a half years ago. REIN: So I really want to talk about organizational behavior stuff. Maybe we can get to that later. EMILY: Yeah, absolutely. I found that helpful both in terms of kind of the traditional skills you would think of, like, I took stats and math and other stuff. But actually, a lot of my research there has helped with writing the book and with things like negotiation and communication. Obviously there's a lot of teamwork, there's lots of studies around this and it was really helpful to have that background in the literature. REIN: I have a sort of specific question that I'm really curious about. When you were designing these studies, were these all -- I'm assuming these were all quantitative studies? EMILY: Yes. In Organizational Behavior, you could have either qualitative or quantitative. Some of my colleagues would work on things -- and even with qualitative, though, you're usually doing some sort of data analysis. Like you record interviews and you look for [crosstalk]. REIN: Like coding stuff? EMILY: Yeah, exactly. Mine was quantitative and specifically mostly experimental. REIN: What kinds of things did you study? EMILY: I studied the experience of women in STEM fields. REIN: Cool. EMILY: Sometimes in academia, people call it "me-search." 
Very often people are studying experiences that they've had or they've had trouble with. Like one of my professors just published a book on dual-career couples. That's her research. And she is part of a dual-career couple, a two-person academic couple. And so, specifically one thing I was interested in was the idea of passion. I feel like especially in technical fields, you often hear this idea like, "You have to be really passionate about your work." And I do think it can be dangerous when it's narrowly defined how people would tell that you're passionate. For example, I think one thing in computer science, there is a really great book. I think it's called Unlocking the Clubhouse, in which some Carnegie Mellon professors back in the 80's or 90's were looking at computer science students and they were finding women would often say, "Well, I don't know if I'm passionate enough because I didn't grow up doing this. I don't want to do this all the time. I have other hobbies." And so they got this idea in their heads that (1) you need to be passionate to succeed and to do computer science. But (2) what it looked like to be passionate was this very narrow definition that men often don't fit either. But because women already were in the minority and had other signals, this was just something added on, like, "Oh, I guess I don't belong here." So it was an idea that really fascinated me. REIN: I feel like I could talk about this for another hour, but that would be selfish. EMILY: [Laughs] I think it's good because I think you see it in data science as well. And I do think it can be a form of gatekeeping. So in general, I'm very passionate about being against gatekeeping and being against these laundry lists of like, "Oh, you have to know all of these things or you're a fake data scientist." I just think that's a very damaging viewpoint. Often the goalposts people set are based on where they are. 
And especially because data science is such a broad field, you'll never know everything. Even the world's expert in natural language processing may know nothing about forecasting, and may be worse at forecasting than someone who just came out of a bootcamp because they happened to study that for a week. And so, I think the idea of making this list, I mean, yes, you could try to get some foundations, but after that, it's just such a broad field that it doesn't really make sense to be like, "Oh, you need to do deep learning to be a real data scientist." No, there are tons of data scientists doing very impactful work that has nothing to do with deep learning. And deep learning wouldn't help at all with the work that they're doing. REIN: You have statistics, a field that has existed for [inaudible]. [Laughter] CHANTE: I'm coming at this from an executive search point of view. I definitely have referred to data scientists, but I have a Master's Degree in Organizational Leadership and Development where I spent a lot of time in Behavioral Science, just kind of examining why I do something or why others do something. And even before we talk about that, I would just love for everyone listening, I think that data science is one of those catchall terms now. And so, I would love to hear how you're defining data science so that we can kind of get that on the record and kind of set that container for people. REIN: Yeah, that way we will all know that people are wrong. [Laughter] EMILY: This is something that can be controversial, but I will define data science as the process of making data useful. That's pretty broad. And I do think I want to distinguish, for example, that people can be doing data science even though they may not be called data scientists. In some cases, maybe they couldn't get a job as data scientists. But similarly, all of us do writing as part of our jobs. We wouldn't necessarily say we're professional writers, but we can maybe work on the craft. 
It's helpful if we're better at it, and so on. And I do think one thing we talk about in the first chapter of our book is dividing up data science into three components, which I think is helpful. The three ones we see are analytics, decision science, and machine learning. Analytics is basically about taking data that already exists or collecting and just kind of presenting it as is. Analytics would be we ran a survey and we said, "Twenty percent of people responded to this. We got 800 responses. They were 40% women," and so on. Just sort of taking the data at face value, which is often very, very useful for a lot of companies. It's been surprising to me, especially in earlier stage companies, how often they don't have some basic metrics. And then also haven't looked at certain data that can be very illuminating for the path forward. So that's one part. And then decision science is kind of taking that, but adding on some statistics, adding on maybe some of that psychology, knowledge. "What do we do based on this data?" With the survey, for example, you could say, "What was the non-response rate?" There are methods to adjust for the fact that like, "Only 30% of people we sent it to responded. We know the demographics of who we sent it to. And it turned out, a high proportion of women versus men responded. So how do we deal with that?" So that's a second component. And then finally, machine learning, which I think is often what people think of when they think of data science. Like Amazon's algorithms for deciding what should show up when you search for Harry Potter, making predictions of how likely someone is to default [inaudible] and so on and so forth. Machine learning is really a lot about predictions. Because underlying that recommendation algorithm is they're trying to optimize for something. And I assume it's something like purchasing. And so, underlying all of that is a prediction of which arrangement is going to give us the best outcome. 
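[Editor's note: The non-response adjustment Emily describes for the survey example can be sketched in a few lines of Python. This is a minimal illustration of one common approach, post-stratification weighting, and not necessarily the method she has in mind. The function names, the 50/50 population split, and the eight made-up responses are all assumptions for illustration.]

```python
from collections import Counter


def poststratification_weights(respondent_groups, population_shares):
    """Give each respondent a weight of (population share / respondent share)
    for their group, so over-represented groups count for less."""
    counts = Counter(respondent_groups)
    n = len(respondent_groups)
    return [population_shares[g] / (counts[g] / n) for g in respondent_groups]


def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: the survey went to a 50/50 population,
# but 6 of the 8 people who responded are women.
groups = ["woman"] * 6 + ["man"] * 2
answers = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = answered "yes"
population_shares = {"woman": 0.5, "man": 0.5}  # assumed known from the mailing list

weights = poststratification_weights(groups, population_shares)
raw_rate = sum(answers) / len(answers)
adjusted_rate = weighted_mean(answers, weights)
print(f"raw yes-rate: {raw_rate:.3f}, adjusted yes-rate: {adjusted_rate:.3f}")
```

[In this made-up data the raw rate overstates "yes" because women both responded more often and said "yes" more often. The weights only correct for the demographics that are known; they cannot fix differences between responders and non-responders within the same group.]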
CHANTE: That was awesome. Thank you so much. I love how you just break that down. I think that that is like setting that container and kind of saying that there's sort of at least as of right now, these three buckets that you might fall into helps people understand, especially those who are considering a career in data science, how they might go about starting to actually get to that. And I want to say in your book, at one point I read, this is about building a career. So are you now doing any specific outreach to people or going on a book tour to kind of explain to different demographics on how they could actually get into this field? EMILY: Yeah. Unfortunately, some of that has been delayed because of the virus. We're going to do like a book launch with my coauthor, Jacqueline who's in Seattle. We're going to do a book launch party there and one in New York. Obviously, that for now is pushed off indefinitely. But I do hope that I can, I think, for example, bootcamps could really benefit from the book. The book is divided into four sections. The first quarter is for people kind of thinking about data science and wanting to know what it looks like at different companies? What are the options for getting the skills? The second quarter is, I have the skills. How do I start applying for jobs and what's the interview process like negotiating offer? The third quarter is, I've got the job, but what's it going to look like in the first months, how do I communicate with stakeholders, how do I make a good analysis? And finally, the last quarter is, I've kind of settled into the role, but how do I continue to grow? So things like dealing with failure, starting to connect to the community through things like speaking or contributing to open source, and then even thinking about leaving your job for the next one. So I think really, there's a lot even if someone's already maybe got their first data science job. 
As anyone who's written a book will know, writing this book was not a financial decision. The number of hours we've put into it versus what we'll get back. Jacqueline is an independent consultant. She could have made much more by doing consulting. We didn't write this book to make a lot of money. We wanted to write it because this is the kind of resource we wanted, or would have liked to have, when we were entering the field. So, we're definitely trying to think of how do we help get this into the hands of people who could really use this? Another example, a professor from Chile actually reached out and asked about using part of our book for his class. And so we're going to see what we can do there. CHANTE: That's awesome. Thank you. REIN: Can I just really quickly dig into the bit about dealing with failure? What did you write about dealing with failure for data scientists? EMILY: Yeah. Jacqueline and I split the book, so we each wrote half the chapters; this was Jacqueline's chapter. But obviously we added to each other's extensively. A big part, she actually gave a talk about some of her data science failures, which I think would be great to link to for people to look at. But in the book, one of the things we talk about is failure is pretty much inevitable in data science because often you're trying to do something that's never been done before. This is a little less true with analytics, because with analytics, you may fail to find the data, but often you can find some data. You can work around it or you can add to the data. But with something like machine learning prediction, there may just not be a signal. You may be trying to predict something, you're trying these different models and nothing's working really well. And it can be really frustrating because you wonder always, "If I was a better data scientist, would I be able to do this?" But often, there may just not be a signal. 
A couple of things we talk about is how to emotionally handle it. We also talk about how to keep from, for example, surprising your stakeholders with a big failure. So one thing to do there is check in weekly, for example. So it's not like you disappear for months and they don't hear any updates. Or they hear like, "Oh, everything's great." And then it's like, "No, actually, we have nothing." You don't want to give people bad surprises. And the final thing is realizing that failing is actually good. It means that you're growing. And actually, I do think it's possible to have a data science career without facing a lot of failures. But that in and of itself is a bit of a failure because it means you're not taking risks, you're not taking on projects that potentially are stretching your abilities or have the chance not to succeed and you're playing it safe. But ultimately, these failures are what's going to really help you grow in the long run. REIN: That almost sounds like advice that would be helpful for anyone. EMILY: [Laughs] Yes, definitely. There is certainly some of this. Some of this book is kind of like general good life advice or general career advice, like the negotiating chapter. There's some [inaudible] to data science, but a lot of the principles we talk about really apply to any job. REIN: Did any of your academic work help prepare you to write this book? EMILY: Yeah, definitely. Especially for example, the chapter on negotiations, there's a lot of literature out there which I drew on to write about it. And I think one thing that was interesting to me is, so I have a blog and the post that got the most attention, like I don't really pay attention to page views, but in terms of people on Twitter not just retweeting but like quote tweeting it and saying how it impacted them, was a post on sponsorship. 
And it's funny because this wasn't something I had realized would be so impactful because I didn't realize how many people weren't familiar with the concept. And so, for your listeners, you're probably very familiar with mentorship. Like the idea of a mentor is someone who gives advice. You keep hearing mentorship is very important. But sponsorship actually is often the bigger driver of people's career success because sponsorship is about someone giving you opportunities. So that could be someone recommending you to give a talk at a conference. It could be your boss bringing up your name in a meeting for an important project. It could be getting funding from someone. And this is something that I've been very fortunate to benefit from. And I do think part of what I write about is what it means, how to look for it, and also how to be a sponsor. Because you might not realize, even if you're pretty early in your career, you can already start being a sponsor for other people. And the idea of sponsorship was something that I had learned about because that's something that's very studied. And for example, there's a lot of studies that women are over mentored but under sponsored. And actually the mentorship ends up not being that critical where the sponsorship really is for advancement. JACOB: You were talking a little bit about how there's data science as in the practice and then data scientists who are something you are. But lots of different people in many different roles can perform data science. Do you see the practice of data science being spread out across more roles in the future? Like would you see a full stack engineer dipping their toe into data science more or do you see it as the opposite of it being sort of consolidated among a core team of data scientists? EMILY: Yeah, absolutely. I definitely think it's going to be spread out more. Again, it depends. 
Some people would say like some basic numerical literacy or being able to work in spreadsheets or be like, "That's not data science." But I think [inaudible], that's getting more and more common. You see in marketing, for example, it's become more needed in the field or if you have them, it's very good if you have some quantitative skills. Because understanding how ads are performing, working with a little bit blackbox like Facebook's ad algorithms, I do think that's becoming more and more common, especially as you hear organizations like to say that they're data driven. And so, it's becoming more, like people, even if they're maybe not inclined to do it, they're getting this top-down mandate of, "All right, we need metrics." Personal goal setting, that's becoming more popular, and department goal setting. So I do think that we're going to be seeing more and more people upskilling on this. I think that's good because I think no one can really have a job now if they can't write and write fairly well. And that can really advance your career. I think we're going to see something similar with like the ability to work with numbers. It's going to become more and more important. CHANTE: That's just a really interesting way to think about it, I think, because like you're saying, the writing and reading, these are critical skills that anybody should have when they go into the workforce. I do wonder in terms of math, I've never been a person that excels at math, but I think now that there's so many tools out there available to me that perhaps my gaps in understanding of something can be augmented with me using a particular tool or resource. So I think as we have more and more emerging technology or technologies becoming available to everyone, that this should be a skill set or a competency, that we'll see people kind of be able to grasp a little bit more and perhaps use more strongly in their current roles. 
Do you have anything in terms of recommendations for somebody like me who wants to kind of strengthen this skill but wants to do it in a way that's not intimidating and is digestible, I guess, is probably the word? EMILY: I actually haven't read these books, but I've heard really good things about them from colleagues whose opinion I trust. So I would definitely recommend them. The Art of Statistics is one. I also want to clarify, a lot of this is not talking about you need to be able to do advanced calculus or something. But it's understanding things like, what are potential biases that could happen in the data? Like what is survivorship bias and why would that matter? When might I use a mean instead of a median? And why might one of them be misleading? So The Art of Statistics, I think, is a good introduction to different Stats terms. I've also heard good things about How Charts Lie, which again is trying to -- I think a big component is also being able to thoughtfully consume information. So this could be a popular science article, like having a little bit of understanding of, what's a randomized controlled trial? Why would I maybe not want to trust something that has 20 data points? Or looking at a visualization and being like, "I wonder, maybe they didn't control for this." So I think those are two books that would be good to get started with. And I see also someone recommends The Cartoon Guide to Statistics. JACOB: You know that book? EMILY: I've heard of it. I don't know it. And I would recommend starting with books like that rather than trying to pick up like An Intro to Statistics textbook. Because I do think there's some books that are written to be accessible, not dumbing it down, but just helping you know what you need to know and being really accessibly written. JACOB: Even though it is helpful and is useful knowledge, understanding the math and having an intuition around statistics are different things and serve different roles? 
EMILY: Yeah, exactly. I don't think you need to necessarily start with the math, especially if you're not necessarily planning to become a data scientist. You just want to be more savvy. REIN: I will say that having done a few different kinds of math for fun because I'm weird, statistics is by far the hardest math I've ever tried to do. And it's not because the math per se is hard, it's because knowing what the right thing to do is, is very hard. EMILY: Yeah, I think Stats is really hard because for example, statistical tests, they will always give you an answer. So, for example, with the t-test, comparing two different means, are they significantly different? You'll get back a number, but it may not be at all meaningful for answering the question that you had if the underlying assumptions aren't met. And I do think that is something that is very, very challenging. Same thing with forecasting and with all the other things. It's going to spit back a number, but it can be hard to tell sometimes. Is that actually representing the answer to the question or am I just going to fool myself? REIN: Yeah. Or like for example, I'm trying to model this system and it has this property that I want a statistical sort of representation of. Which of these 50 different distributions is the right one to use to model this particular part of the problem? EMILY: And I do think there's some things that are not very accessibly written, and it's like, "Oh, you're diving into these massive academic papers." One book I'm excited to read, which I just ordered, is Statistical Rethinking. I haven't done much Bayesian statistics, where you have sort of these two fields: frequentist and Bayesian. But I've heard really, really good things about this book that is basically an introduction to Bayes, like how you would use it. The person who wrote it, I think, actually is in the anthropology department. He says he's not someone who necessarily planned on writing a stats book. 
But I think that's part of what makes it very accessible is he doesn't come from this background of like, "Oh, this is something I love for the pure joy of the mathematics of stats." He's like, "I need this to be useful. Stats is a tool for me to answer these questions that I care about." And I think that's what really helped him write a book that a lot of people have benefited from. REIN: If you're interested -- this is for our listeners, obviously, not you -- in trying to parse studies and understand causation and confounding and all of those things, a book that I would definitely recommend is The Book of Why by Judea Pearl. It is extremely good. Avdi, I think you wanted to talk more about sponsorship. And I would also like to talk more about sponsorship. Do you want to lead us into that? AVDI: Emily, I want to return, if we could, to the concept of sponsorship. I really love that you brought that up and that you're making the distinction between that and mentorship. I think that's something that we don't think about enough. And I definitely have observed, especially as I've gotten older, I am more and more aware that there's a lot of implicit power just in like who I know, the groups that I'm part of, the private Slacks that I'm part of, and like you said, who gets recommended to somebody else for an invitation to a conference. There's a lot of little things like this that we don't always think about that really do make the difference, I think, or can make the difference in people's careers. EMILY: Sponsorship is one component of generally how important your network is. And I think networking can be a term people really hate. A lot of people find jobs through their network, as you said. For a lot of people, if you want to start speaking, it often happens through people you know, and that is kind of the snowball effect. 
It's interesting because I think traditionally in some fields, most of it was about your network in your company, especially if you were at a larger company or one like a law firm or something where it really mattered that you impress the partners, that's going to matter when you go up for partnership. But now I think especially in tech, because people move around fairly often between jobs, like I found my most important network has been often outside of my company. This is how I've gotten speaking opportunities. It's also been really nice just to have a community. I'm a really big fan of and a part of R-Ladies, which is a global organization to promote the advancement and experience of women and gender minorities in R. Now it has, I think, more than 80 chapters around the globe, in like 50 different countries, and there's a New York City group and that's been huge for me because sometimes you just need someone to talk to who's going through maybe a similar experience or having similar frustrations. And you might not have that at your company because maybe you're the only data scientist or one of a small team and your work is very different than the others. So, it can be really nice to have that outside community. AVDI: Something that I really love about sponsorship is it can actually be easier than mentorship because when I think about mentoring someone, I think it is shouldering a huge responsibility. But taking a few minutes to think, "Okay, here's this person. I think, really, their voice should be heard more. Who can I introduce them to?" It doesn't take long at all. And I think you're right that a lot of times, it can have a disproportionate effect. EMILY: And I also think the upside and the downside for the sponsor is that it does require some "capital". What I mean by that is let's say you recommend someone to speak and they don't show up or they do a terrible job, that reflects on you. But on the other hand, if they're awesome, that also reflects very positively on you. 
You recommend, "Oh, I think my direct report, I think she should work on this project." [inaudible] that again, reflects very positively on you. But like you said, sponsorship can be something fairly easy. One thing I mentioned in my blog post is Hilary Parker introduced me to a former colleague at Etsy. I talked with him a little bit. He referred me and that's how I got my first data science job. Or RStudio gave me a scholarship to attend the RStudio conference back in 2017, which was a great opportunity. But I do want to close, for mentorship, I think it can sound like a really big ask, like, "Can you be my mentor?" But I do want people not to shy away necessarily from smaller acts of mentorship. For example, a reference in our book is actually Trey Causey's post on reaching out to people. And one of the things he recommends is to have a specific ask and to have done your research before. So, for example, it could be like, "Hey, Trey. I really liked your post on interviews. I was hoping you might have 20 minutes to chat next week about whiteboard coding. Can I buy you a coffee at Pike Place? I'm available at these times." That's certainly not sponsorship. It would be like mentorship when that talk happens, but it doesn't have to be the super extensive, like, "Now, we're in a relationship and we must meet weekly." It could even be this one-off thing that's very helpful. AVDI: That is a fantastic point. Thank you for laying out that template for asking. EMILY: Yeah, I definitely recommend, we can link to it in the show notes, Trey Causey has that post and he actually shows how like, the first is sort of an example thing he'll often receive and then he shows how he reworks it and why. Like I said, things like offer to buy them coffee or lunch, give a specific question rather than like, "Can I pick your brain?" 
I think it's really important if the person does have some public work, which they probably do, since that's why you would reach out to them to show that you've read that or listened to their podcasts or whatever. And that you'll be asking them questions or asking about things that they haven't already covered in a place that you could find. REIN: Yeah, I love that you brought that up. I think one thing that's different about sponsorship is that it has a different focus on sort of what the advice is about or what you're trying to accomplish. But the other difference is it's often much more goal directed. It's not just, "Will you be my sponsor?" It's, "I'm going to help you get this raise." "I'm going to help you get assigned to this project." "I'm going to help you accomplish this concrete specific goal." And I think mentorship works better that way, too. Like you were saying, it's not 'will you be my mentor' the sort of general way. It's, "Can you help me with X?" EMILY: That's exactly it. And I think that can be really helpful for people to think about, too. And also to build it up gradually. To not necessarily start with like a big ask. I don't know. For a sponsorship, that might be like, "I want to be the lead on this project and I've never done anything like that before." Think of ways you could show that you're ready. One example might be if you want to start speaking, maybe you could start with a lightning talk, which is like a five minute talk. Lower pressure, at a local meetup group. If they don't do it, maybe ask them. It's less pressure for the meetup group, too, because you'll have maybe 10 people given five minute talks. It's not like they're all coming for one person, and you get that recorded and then you can show people that. Or if you can't do that, maybe write a blog post. It's not the same as speaking, but you can show that you can communicate well. 
And so, thinking about ways that you can help build up the confidence of the people you're asking or looking for sponsorship from that you'll do a good job if they do give you that opportunity. JACOB: I was speaking with a friend who is a social worker and she was telling me how in her field, it's kind of considered a norm that you mentor. There's a formality around it. And when you're in your formal training, it's expected that you're going to take people in for internships so they can get their field experience and graduate and get a job. And it's sort of considered, "This is just what we do." And I'm not really sure what to say in terms of how that translates to our field, because there's people who are freelance who might say, "I'm paid for what I do and my time is money. And I can't necessarily mentor for free." There's lots of different ways that could be -- there's not really a standard in terms of how would you get some kind of -- there's no such thing as like a formal internship necessarily. So I think it's kind of a really uncharted domain. EMILY: A couple of things here. One is, there's a website called DataHelpers.org. Angela Bassa put this together and basically it's people volunteering to help folks and to mentor or get into chats. And often, they'll have short bios saying specifically what they could help with. What I'm reading now is like, "Helping out with R, like releasing your first R packages," or, "I am happy to help people who are looking to transition into a data career, interested in the nonprofit sector." That's one place, because I do think many people want to help with that, though there are things that make it easier, for example, like I said, coming prepared. One reason we wrote this book is it's a really scalable way to help and to offer a lot of the advice rather than like, people needing to find our email and having 30 minute chats with everyone. Often what I'll do now is I'll ask people to reach out. 
I'll be like, "Oh, I wrote this book." I don't necessarily expect everyone to read the book, so I'm like, "I also wrote these blog posts. Let me know if you have any follow up questions, because I think that's really the best use of both of our times." And I do think within companies, there can be a lot done for having mentorship recognized as part of the formal career ladder, because I think that can be something for people to think about. If it's like the team may be better off overall, but maybe you have fewer commits. So your other projects are a little bit slower because you're helping onboard someone or helping a junior in data science. But I do think a healthy team will recognize that that's important work for you to do. And that in the long run, obviously that helps that person, but it also will help the team. CHANTE: That's awesome. I was just going to say that I'm going to assume that writing a book not only has solidified some of these things and reinforced the learning that you're kind of recommending, but also it holds you accountable as a leader. It puts you out there more so than probably some of your peers who have never written a book or who write blog posts. How has it changed or made your career better? EMILY: Manning where we published it, they do this thing called an early release program. The first five chapters first came available in May 2019, but the final book only came out about a month ago. So, it's still pretty new. I think we'll see the effects longer term. But one thing that has helped is being able to get back to people who ask us questions and being like, "Hey, because I was doing this book, I was able to take hours to think about this." Jacqueline and I think made each other's chapters a lot better. So it's great for us both to write. Obviously, we got feedback from Manning, sent it out for reviews. We got feedback from them. I think it's like much stronger than if I had just written a blog post about that. 
And it's been really heartening to see, because it was in this early release, people already responding, especially when they write about specific chapters that helped them. Yes. I'm excited to see who else this can help or how it might impact my own career. But I think I wrote it less of like, "Oh, this is definitely going to advance my career," which I think maybe a more technical book might have more. If you become known as the person who wrote the book on natural language processing in R, that might help you get a natural language processing job. I don't think necessarily I'm going to get my next job because I wrote this book. But I do hope that it can help other people with their own careers. CHANTE: Was there anything that you learned that you weren't expecting out of the process of partnering with her to write this book, or about yourself, the field, or anything in general or specifically? EMILY: Yeah. I think one thing we definitely both learned was -- so at the end of every chapter, we have an interview with a different data scientist, which was we really wanted -- we also have like sidebars and blurbs from data scientists around the book. But we did this because we wanted to get more perspective than just the two of us. So we have engineering managers using data science. Amanda Casari, she's an Engineering Manager for Google Cloud. She was in the military. She has one perspective. We also have someone from Airbnb, we have someone working for the ACLU. So I think it was great to hear that diversity, but also that there were a lot of commonalities across it. And I think in our epilogue, we talk about three of them that I think we had an instinct for when we began the book, but that really crystallized through doing the interviews. And that's like data scientists need to be able to communicate. That was a huge theme that came across the book because often people are saying like at the end of the day, it's not your technical skill. You do need a baseline of that, obviously. 
But at the end of the day, what makes the biggest difference is things like communication skills. Also our second part was being proactive, so you're not going to get handed this perfect problem [inaudible]. Unfortunately, it's not like a [inaudible] competition where it's like, "Here's all the data that's available. And here's the well-defined question you need to answer." It's often much more messy than that. You need to be proactive in working through that. And finally, as we talked about earlier, community. And that's really important both for advancement but also just for, I think, your emotional health and your happiness as a data scientist. CHANTE: I love that. In true data science fashion, you paid attention to the qualitative things and quantitative things. But I do think that those are interesting insights you gained after kind of stepping away and looking at the whole process. REIN: I wanted to circle back to the mentorship thing briefly. You mentioned that mentorship should be part of all formal career ladders. It should be something that's valued and incentivized at work. One of the things I've noticed is that very few companies actually train people to be mentors. And so for me, this is sort of like when you get "promoted" from an individual contributor to a manager, you're suddenly expected to know how to do management things and there's very little training there. And I think there's some awareness that that's a problem. But when you get promoted to a senior engineer and you're now supposed to be a mentor, I know very few organizations that actually do train senior engineers on how to be mentors. EMILY: Yeah, I think unfortunately, you're right. And I think that's a great insight. Because I do think, as you said, that's starting to get more recognition in the management part. One, there's a good amount of popular books out there that I think are really good, we recommend some in ours. 
It's more common now to maybe have coaches available or to do programs, like this formal management training. But you're right, I haven't ever seen something of like, "Okay, we need to train senior engineers how to do it." But I think there's training. But I do think one thing that helps, though, is that the company could do is just recognition and telling people like, "This is an important part of your job." Because I think some folks do have that mindset or natural inclination that they're like, "I really want to help people and I'll sort of prioritize that first." Some others don't. And so for them, it can be helpful to be told like, "No, this is a really important part of your job." One of my former colleagues at Etsy wrote a great post about being a tech lead, which wasn't really a formal position at Etsy, but he basically took on. And he talked a lot about what that means to him. And a big part of that was basically like, "My top priority is unblocking other people. And this means I'm going to do a lot less committing and that sort of stuff on my own. And my primary job is to help my team through things like code review or talking things through." You're right. I'm not sure what a good solution to that is. And I think there could be more information like books out there about what's it like being a senior individual contributor? How do you mentor? I think also I saw some on Twitter. When you're changing orgs, what does it mean to come in as a principal or very senior engineer? Because I think that can be quite challenging. And not a lot of people have written about that experience. REIN: That's something I'm going through right now. And the resources that I can find to help me are very thin on the ground. So I'm mostly reaching out to other people who are in this work to say, what is it that people who have our title do? [Laughter] REIN: What do you do? And I get different answers. 
EMILY: I think maybe you need to write a blog post of your own, not necessarily about all the specifics of your org, since that might be different, but how you went about finding it out. REIN: Maybe if I blog, that's a thing I could blog about. [Laughter] REIN: I think another issue with mentorship is that mentorship is a system 2, like a slow thinking thing. It's about learning, but being productive is a system 1, like fast thinking thing? I don't know if folks have read Thinking Fast and Slow. But Martha Acosta would call this a paradox, which is you want people to be better mentors, but what you incentivize them to do is produce. And those things seem to be incompatible. EMILY: Yeah, that's also hard because sort of very related to that is it can be harder to measure. It's like, "Well, now do you start measuring your mentees' commits? Is it their satisfaction, like with your mentorship?" And I think that's why people can shy away from it versus something like, "Oh, you shipped this many products." Or, "You squashed this many bugs," or so on. That's a challenge that I think part of that is -- so Etsy released a career ladder on their blog, an engineering career development ladder. It's actually very interesting, so I think it's a great ladder. But it was interesting. One thing they actually shied away from was doing too much quantitative because of the problem that they found, like for example, rather than saying like, "You need to speak at two or more conferences," because that could be limited by your role, your team or a personal situation. And it might not actually get at the real intent behind what this competency is trying to -- The reason for speaking is maybe to have like a presence in the broader community. Maybe that's the core thing you're trying to do, you contribute to it. And there may be a lot of ways besides speaking that you can do this. So actually, they say they don't have score calculation or graphing for an individual. 
And they find that they can still work to reduce subjectivity. Otherwise, there's a lot of potential for bias to creep in, but that doesn't necessarily mean you have to do like rigid numerical guidelines and quantify everything. REIN: So on my bookshelf, I have the book, How to Measure Anything. And next to it, on my bookshelf, I have the book, The Tyranny of Metrics. And I like to think that they're fighting it out. EMILY: There is also that saying, it's like kind of whenever anything becomes like a metric or a target, it becomes useless basically, because people can figure out how to game it. And I do think you're right that there is some push and pull and there's not necessarily a right answer here, but just trying to keep things in balance because there are benefits to having some metrics and there are drawbacks. And so, sometimes there's no shortcut to just doing kind of like thoughtful work. REIN: There's actually a question I've been meaning to ask for a while, which is about data science that's directed externally at customers, at markets, at the environment versus data science that's directed internally, at how teams work, how the organization is performing, all of those things. My very limited perspective is that most data science is focused on analyzing customer data or marketing campaigns or ads or things like that. How many organizations and people are doing data science on internal organizations? EMILY: One area that that's common in is sometimes they're called people scientists, but data scientists in HR. Google, for example, has a big team for this. And there were these things like Project Aristotle, which is around understanding how teams work. They also do tons of work on trying to understand hiring. And there, it's a mix of some social science. There's a lot of good literature out on this of like how do we mitigate bias in hiring. Big point there is doing structured interviews. 
There's also literature on teamwork and then mixing that with the data they collect internally about the org. So, I think that's the biggest place that I've seen of people analyzing internally because a lot of other data are like what are our most popular courses? It's kind of internal data, but they're taken by external people. So it wouldn't really exist if we had no interaction with the outside world, versus Google and people scientists, where that kind of data would still exist. CHANTE: I want to ask this question because we have you on here and we don't have female data scientists on the show very often. I would love to just kind of talk about the obvious, that we don't see enough women in data science. And so, wondering if there's anything that you can kind of talk about here in terms of your experience. I'm sure that if you reflect back to your younger self, you maybe didn't foresee that you were going to be a data scientist. And I'm really thinking again about these women or young women and people of color, marginalized folks, who may think this is a great career. We see that this is a top career choice right now, but I don't think people really know how to get there. And I think as we see with the trends in terms of STEM, for instance, it's the same situation. We just don't see as many girls and women getting into STEM and kind of keeping with it, or people of color. So any recommendations and kind of firsthand or secondary experience you can offer in response to that? EMILY: I have lots of thoughts about that. The first is, I think for a marginalized group, having a community is even more important and especially finding a community of other people with similar backgrounds. I mentioned R-Ladies. Gabriela started that, and she also just started a new thing called AI Inclusive to make artificial intelligence more inclusive. There's PyLadies. 
There is, I think it's called URG, a group that just started for sort of wide ranging, like all underrepresented groups in data science. There's a group called, I believe, Black in AI. So I do think it's really important to try to find some role models. I also do think, honestly, it's important to seriously consider the types of companies or programs that you're going to join and to look at. I think it's easy for companies to say like, "Oh, we really --" The first step is even them maybe recognizing there's an issue with having, say, their engineers be 5% women or 1% people of color. Some companies aren't even quite there and some are but they really think of it as like, "If we'd only just hire more, then everything would be fine." But like, "Oh, maybe they don't exist." And there's a great post by Rachel Thomas that's debunking the pipeline myth of like, actually there are people there. And often a bigger problem is the environment they face when they get in the field. And so, I do think it's really important to ask some critical questions. I actually was very fortunate. Etsy, I think is around 30% women engineers, which is definitely better than the field in advancing people of color. And it's funny, I didn't realize how many senior women engineers I knew and that's not very common at a lot of companies. And even though I don't work directly with them, I realized, "Wow, that was pretty cool," and really important. Etsy's leadership team is 50% women. And this is partly reflective of like the Etsy sellers are, I think 85% women. But I do think it's important for folks to really think about like, is this going to be a good environment for me? You can look at the representation. You can look at what initiatives they're doing. I think in general, companies have become more aware of this. But to be a little bit skeptical and to really try to find an environment that will not tokenize you and takes these kinds of things seriously. CHANTE: Yeah. 
I appreciate that, because those are all great resources you just listed. I'm going to try to grab links and put them in the show notes for everyone who's listening. And I just appreciate the fact that you're thinking about it and you've clearly had some conversations with folks throughout the industry. I would love to know, maybe hear in your own voice, why you think it's important to have diverse representation, especially in data science, as we move towards a more data driven world. EMILY: One is, as we're building these products, some people might have heard about, there were -- I think it was maybe something like hand soap or like Google products that wouldn't recognize darker skin tones because that wasn't in the training data. Facial recognition as well. Hopefully, that would be less common if some of those people were represented in the teams making it. But I also think you don't want the -- there are a lot of companies looking for data science that would benefit from these skills. You want people that are good, and if we do believe, which I hope people do, that it's not the case that white people or men are inherently better at this, then by opening up these opportunities, you're going to get stronger people. You're going to get stronger teams because these folks have these skills or can gain these skills. And so, it's important to help encourage that. And I do think there are very encouraging signs. Like in our book, I think half or more of our interviews were women and that was not even a challenge. Especially in R, you just can't -- I don't know how you could make a -- and unfortunately, I've seen it at one conference in particular. But making a speaker lineup that's all men, you would have to very deliberately do that. If you ask a bunch of people, "Who are some of the top R programmers you think of?" Or blog posts or other things like influencers, a lot of women actually rise to the top of the list. 
I think there's a lot more work to be done for having other underrepresented groups like people of color. I don't want to make it seem like, "Oh, it's so horrible and bleak out there." There's definitely been -- and I think R-Ladies is a big part of why you have that. And so, I think there are efforts and work that people are doing to make the field more inclusive. I also saw, I think it's Guido van Rossum, the creator of Python, making a commitment to mentor underrepresented groups. I've seen that with the Python core, how they're changing their practice to bring in a more diverse set of people. CHANTE: That was great. That was really good. I agree with all that you just said. And I would love to see more diversification and a bigger push for this as we move closer to this kind of place where people have, like I said, access to technology and resources like we've never seen before. I have a gut feeling that we're going to see some things shift with education in universities, kind of. After Corona virus, that maybe we'll help folks get into more programming and stuff like that, more like university programs get accepted and change the way that we're teaching people in the first place. So I don't know. The world could be different next year. We'll see. EMILY: And I do think there are some successful efforts. Harvey Mudd, which is one of the top computer science programs in the country, I think now has 50% women computer science undergraduates. And this didn't happen by accident. There was some effort. But it is certainly possible. And I think there's a lot to learn there about how other groups have succeeded, not just in raising the representation, but in making it an inclusive environment. CHANTE: Yeah. Touche. I want to say, I think Rice, too, have been making some efforts to try to bring in more females. Is that right? You attended Rice, so maybe you know. EMILY: Yeah. 
Although it's funny because statistics actually has historically been, even at the graduate level, almost, say, 50% women. Unfortunately, there's not great data on non-binary folks. But yeah, it's interesting and I wonder if that's why R is more inclusive, because a lot of people come from a statistics background rather than a computer science background. CHANTE: That's an interesting point. I don't know. But I want to hear the data on that. I want somebody to do a project on that. I will definitely read it. EMILY: Actually, I'll send you the blog post. Reshama Shaikh actually wrote one. Why are women flourishing in R but lagging in Python? I do actually want to add a quick note. I did take some computer science classes at Rice and there were a lot of efforts around the curriculum to make it more inclusive. One of which being redesigning the intro course to make it more project based and also more accessible to people who had never programmed before, and to go away from this assumption that like, "To succeed in computer science, you need to have programmed before," because it was less likely that the women in the courses would have and that could really dissuade people. CHANTE: Well, shout out to Rice University for that. That's awesome. REIN: Who has a reflection? AVDI: I think mine's very simple. It's just mentorship does not have to be a huge commitment to be useful and sponsorship is often as important, or more important than mentorship. CHANTE: One of the things that feels overwhelming is I do get hit up a lot about helping people. And I think sponsorship is a way that I can maybe more strategically try to kind of nudge people down that path where I can sponsor you. And you mentioned that it requires some capital, but then you kind of went further and said or you explained there's like some social capital there. I would wholeheartedly agree with that. And I think it doesn't necessarily mean you have to go in your pocket and write a check. 
But it could mean that your reputation's on the line. So just like, what are the things you're willing to do for people who need the extra boost or a push or some support? And I think that that's going to be a big thing for the year, given all the things you're seeing already, people experiencing some layoffs or folks who maybe were getting excited about graduating and they're nervous about getting a job right now since our economy could be in a recession. I love that you brought that up and that was something we spent a lot of time with today. Thank you. JACOB: I was just thinking about how mentorship is actually really possible without the mentor even knowing that they were mentoring someone. I can share that numerous panelists on this podcast, including several on this call, have said things either on the podcast or elsewhere that have stuck in my mind and I've used to shape my career. And it wasn't because I asked about it or solicited. It was just something that I was able to draw upon. So yeah, I guess that's a call for action for everybody. Just sort of think about how the ideas you're putting out in the world could be useful to someone, even if you aren't intentionally trying to be a mentor directly. REIN: I have two very quick ones. One is that if you're a mentor and your mentees aren't coming to you with already formed concrete goals for what they want to accomplish, it's your job to coach them into that as a mentor. And the second is, I guess I'm writing a blog post now. Great. Thanks. I feel like I've been assigned [that one]. EMILY: [Laughs] I think you have because I was actually going to say my biggest reflection is this: I don't think there are enough resources for senior engineers on the non-technical side of things. Both for maybe advancing their own career, but also for part of the responsibility now is to help others and how do they do that. 
So I think that's something that now I'm going to be keeping an eye out more for, including your blog posts hopefully. Maybe one day contributing that myself, because I think that stuff is a gap that I'm seeing. REIN: We just want to thank you, Emily, for coming on the show today. I hope people go check out your book, Build a Career in Data Science from Manning, that you co-authored with Jacqueline Nolis. In fact, Manning has been kind enough to give our listeners a discount code. They're giving listeners of this show a permanent 40% discount, which is good for all products in all formats for everyone. Again, this is a PERMANENT discount code for all Greater Than Code listeners. Use the code PODGREAT20 every time you shop Manning. Thank you again, Emily, and thank you to Manning. We'll talk to you all again next week.