happy-hour-dec-4-2020.mp3 [00:00:06] Hi, what's up, everybody, welcome, welcome to the @TheArtistsOfDataScience Data Happy Hour. I'm so glad you guys are here to join me. It's been an awesome week at the podcast. I released an episode with Nir Bashan earlier this week. I hope you guys had an opportunity to check that episode out. I absolutely loved talking to Nir. I know you'll love the conversation we had. Before we get started, guys, I just want to do a quick shout out for my sister. My little sister Jase has built an amazing app called Close By. Close By is a tool that lets you know if what you're shopping for on Amazon can be found at a small business. It's one hundred percent free, helps you discover new indie merchants and see what they have in store this holiday season. Find unique gifts for your loved ones. And you're doing all of this while helping support small and local businesses. Check it out, guys. I hope you guys get an opportunity to download the app. And yeah, welcome, welcome. Welcome, everybody. Wow. The office area is popping. Thank you guys so much for coming in. I'm really excited to have you guys here. So, yeah, let's open it up for questions. Does anybody have any questions? [00:01:27] I do. I have been delving into Tableau recently and I am trying to understand where the line is between when you would say, I can do this in, say, Python, versus, I should do this in Tableau instead, or maybe, I can do this in SAS or I should do this in Tableau instead. For an operator, that's like what you do all the time. So can you kind of help me understand what the limits are, or where it'd be just smarter to do one or the other? [00:01:58] Reya, I'll let you take that one. Oh no, I can't hear you. Yeah, it's still not. So quick question. If I could just repeat it is. Okay. There you go. You can hear me? Yeah. Hear you now. [00:02:12] OK, I don't know what's going on with this.
I recently got a new laptop for work and I'm using that. Anyway. Well, I'm still learning on the Python side, but I do use Excel and SAS. Sometimes I do some analysis in there versus having it done in Tableau. So it's kind of like the standard consultant's answer: it depends, you know. So I'll think about how hard it is to do what I'm trying to do in whatever language or tool I'm in. If it's simple stuff like summaries, averages, you know, some basic stats, medians and stuff like that, I'll do that right there. If I'm querying stuff and I want to be doing more slicing and dicing, I'll take it into Tableau because it's a great tool for doing data analysis. So in this kind of roundabout fashion, I'm not really answering it. But that's kind of my approach. [00:03:12] And Eric, when it comes to using Tableau, is Eric here? Where's he going to see him? I think he's here somewhere now. Eric up. Not here. Come in, Joe. Sorry. Sorry, Eric. Joe, from an architecture standpoint, if we're using Tableau, what do you see being an issue? Right. Because typically if we're using Tableau, it's not just for us to do our own exploratory analysis as data scientists. It's usually to give to some stakeholders so that they can interact with it and play around with the data. Right. And usually that's going to require some type of back-end processing, some type of pipeline that takes the raw data, models it, and then puts it somewhere where Tableau can sit on top of it, right? Yeah. [00:04:01] Yeah. [00:04:01] I mean, sometimes you'll see people using Tableau Server and just doing all of the data transformations there, and I think we're typically called in to help people out of that problem when the Tableau servers are crashing. So we recommend it also. My colleague also, he's on the line too. So we use this not arbitrary.
So in your case, I'd recommend backing your Tableau with something like a data warehouse, dimensionally modeling that data, and then serving that dimensional model to business users. That also facilitates things like self-service if you go that route. So it's basically a project that you and your team definitely consider ways you don't have to someone to make the data always available for your performance. Right. So under one of our clients and architected. So they're using a data warehouse where they. So that was one thing, so many queries and transformations in that data warehouse, and the warehouse was also at capacity. So the business wasn't feasible. The person in charge of analytics is getting calls, a lot of the calls you don't want to get. So let's see if it's something besides analytics or something. [00:05:30] Is this a good time for a shout out to maybe dbt or some other back-end tools that take care of some of these details in SQL that maybe you're tired of doing directly from Tableau? Shout it out. Yeah. Yeah. How many of you here have used dbt before? No, not used it. OK, interesting. Yeah. So dbt is basically a tool that brings almost object-oriented capabilities to SQL. That's the huge headache with SQL: SQL can do all kinds of transformations, but there's no notion really of libraries or managing code or anything like this. And so dbt takes care of some of those details and then also manages transformation workflows for you. And so in terms of putting a back end behind Tableau, that SQL base lets you do that when maybe it's not practical to put Python in that back end, and you can potentially take some complex Python flows as well and then translate them into dbt's language. [00:06:25] So I just wanted to go ahead and go ahead. [00:06:31] Does it work like any do that we use for scheduling their services? Sorry.
Does it work like I need you to do the BBB thing? [00:06:42] It's different. So dbt is a package where everything is built for analysts. Right. So it's part of this trend, more and more, called analytics engineering. [00:06:52] It's a new buzzword that's popping up. But yeah, it's actually a new job title, and some people are hiring analytics engineers now. But the point is dbt allows analysts to create their own data transformation pipelines using nothing but SELECT statements. So that's pretty cool. So you can make slowly changing dimensions. [00:07:14] You can make fact tables and dimensions using simple statements. dbt adoption is growing at an insane clip right now. [00:07:22] I would say that there's two schools of thought right now: traditional data tools that you're probably more familiar with, like your Informaticas, your Talends, or whatever. But then there's more of the analytics engineering by code and infrastructure as code, which is rather particular. [00:07:38] If you haven't seen it yet, you're going to start seeing a lot more of it in 2021, especially since the company behind it, Fishtown Analytics, raised like forty-five million dollars a couple of weeks ago, and that's on top of like 30 something million or whatever the Series A was a few months ago. They got enough money to plow into this project and it's getting widespread adoption. So I'd say that might be an alternative: just running dbt and then, back to the original point, pointing Tableau at the warehouse. And that frees up a lot of administrative burden for days for that stuff. [00:08:13] So I wanted to swing back around, if I could, to Eric's original question. Eric, were you talking about you as the individual analyst when you pick one or the other? Generally speaking, when you're talking about sharing something with Tableau in an organization, because it costs so much money, that's usually decided for you up front.
There's usually some sort of standard practice. So I'm just wondering, are you talking about you by yourself, or when do you decide to actually buy Tableau versus Power BI or something else? [00:08:41] Yeah, I think I'm more talking about me by myself, because if I'm in an organization and we have Tableau available, then, like you said, there's going to be some standard where people are like, well, I need this in a dashboard, so please put it in Tableau. I was like, no problem. But I was wondering, as I'm looking at it, you know, code versus low-code or no-code solutions, are there certain times when it's just way easier to do it one way or the other? Anything that you've come across. So that's why. [00:09:09] So just rule number one, I want everyone to understand this. There is no such thing as no-code. No-code does not cut it. I do not buy it. Maybe low-code if you're lucky, maybe, but never no-code. So what I would say, Eric, in answer to your question is whatever's fastest. If you know you can get stuff done faster in Tableau, use that. If you know you can get stuff done faster in Python, use that. I typically use Excel most of the time. And then when I know I can do something faster in R, that's when I'd fire up RStudio. [00:09:36] Well, thank you. And hopefully that answers your question. I want to take a moment real quick to just point out that there are three people in this chat right now who have been guests on the podcast. [00:09:46] Obviously there's Carlos, but my good friend Mikiko is also here today, who I miss dearly. She was a mentor with me at Data Science Dream Job. Mikiko, thank you so much for showing up. And we also have Brandon Kwok, which was hands down one of my absolute favorite episodes that I recorded this year, so I hope you guys get an opportunity to check that one out. Thank you guys so, so much for coming. A lot of awesome people here today.
And I know a lot of you guys have questions. So, Eric, thank you for that. Great question. Let's go ahead and move on. If anybody else has a question, how about this? [00:10:24] Just raise your hand and I'll call on you. I have a question, because you brought up Tableau. So Salesforce acquired Tableau, and Salesforce bought Slack recently. What are we expecting to see from these major companies and how will it affect data science? Because I think we're seeing a lot of aggregation of tools. And like a small group, I want to know what Salesforce is up to, and Salesforce Einstein. I don't know it. And I'm surprised how little I hear about it in the data world that I know. [00:10:52] Chuckling, Ben, go for it. [00:10:56] I chaired a conference once where I had to introduce that guy. What is the name? Richard. The Salesforce guy. So he had that company, was it MetaMind, that was bought by Salesforce. So he's in charge of Einstein. He did NLP research at Stanford. Not an important thing, but just a funny fact: the conference freaked out. He didn't show up until like five minutes before the talk. Slacker. [00:11:23] So, Carlos, I know that didn't answer your question. So if anybody has the answer to Carlos's question. [00:11:30] What was the first part of the question? [00:11:32] Salesforce bought Tableau, and now they're buying Slack. What's the play for them as they grow? And what should we expect to see in our community related to that kind of fourth power coming up? [00:11:43] I don't know. The word on the street is that the Slack acquisition is more targeted at Microsoft; at least that's what The Wall Street Journal pundits say. Right. So who knows, and Salesforce is a juggernaut, though, so it remains to be seen. But as far as Einstein goes, I mean, I need to echo your point, at least in circles that I run in. It was more of a running joke after that.
People had implemented it in the Salesforce environment, and I don't think they got a lot of value out of it. That's the people I talked to. And maybe there are other people that have gotten value out of it. And Salesforce has enough of a sales and marketing arm to make things look successful too. So who knows. [00:12:21] But Mikiko, as I need there, go for it. [00:12:24] Yeah. So when I was over at this little startup called WalkMe, I worked for the VP of Operations there, and he was a director over at Salesforce, and he told the story about how one day he had walked past this glass room where finance was meeting about potential future acquisitions. And he saw a bunch of names on that list. And this is like maybe three, four years ago at least. And he was like, yeah, you know, and this was when they had acquired Tableau. So from his perspective, he's like, it's very interesting to see which companies they acquire next. And then when Slack was acquired, he was like a company, too. So I think like I think no one likes Salesforce, you know, like, hats off to them. I worked as a consultant implementing Salesforce CRM and Marketing Cloud solutions, and they are a money-making machine. And I think one of the reasons why they're a money-making machine is because they do plan things out years in advance. So, for example, the Tableau acquisition was actually planned out many years in advance. But I think part of it is that some companies, there's the IBM model where they will just kind of acquire companies and they'll run independently. And then there are some companies, and I'm sure we can all think of which these companies are, that will buy companies to kind of kill them, just to kill the competition in that space. And then there are other companies who will buy companies out to incorporate and augment their own products and services.
[00:13:57] Autodesk is very famous for doing this. I worked there, too. They had about 70-plus products out there, even with their biggest wave of layoffs in the last couple of years, across mobile, desktop, and across these different industries. And their whole thing was: we're going to buy it, we're going to kind of white label it or rebrand it, and then incorporate it into the stack. I don't think that's what Salesforce is doing. I think Salesforce recognizes that, first off, tools like Slack and Tableau, although they have B2C adoption, like people love those tools, they're really moneymakers at the enterprise level. So I don't know who's ever been in a vendor conversation with Tableau or Slack, but they're actually really expensive at the enterprise level, but they're really good. So it's one of these things where if you know your worth, you can kind of charge accordingly. And so I think Salesforce recognized that, first off, these are actually enterprise tools, they're not really B2C. Right. And secondly, they're all tools that their current customers use. I think anyone who's worked with Salesforce data and Tableau can tell you that it'll crap out after the first X gigabytes, right? You can't even run those tables. And so I think they bought Tableau, and they're going to improve the connection to Tableau. They're also probably going to bring in a lot of their expertise in security. Salesforce also has not been broken into as much as other tech companies. [00:15:24] They still have. But their security breaches have not been as significant, I think, as other companies in the space. And then they're kind of going to let them run independently. At some point they might try to rebrand them. But I think it won't be for a while. Now, in terms of you as an analyst or as a data scientist or as a machine learning engineer, I don't see that your role will change.
Hopefully it just means that when you work with a dashboard in Tableau that is Salesforce-data heavy, it's not going to crash, or they will try to build some intermediary tools for data prep. Data prep is probably the sub-product of Tableau that will see the most impact and change, because that's the one that is not as used, frankly, by a lot of analysts and data scientists. But it's also something where it could be super useful. They've already got a skeleton there, so they can really boost it up even more. So I don't think your role as a data scientist or machine learning engineer, anything will change in that regard. It just might get a little bit easier. If you're a director, then you don't have to figure out the bundling on that. But they will probably, like Tableau did recently, change their pricing to make it easier to acquire those tools. So that's just kind of my hot take on a lot of this business talk. [00:16:38] Yeah. Brandon, do you have any comments on that? [00:16:41] No, I visited Salesforce and met a couple of people from Einstein, but that was a couple of years back. So, you know, I would imagine that a lot more at that time. They were just meeting each other. I mean, I was there as a potential customer, and people from Salesforce were introducing themselves to each other. So that's how I knew, like, this is a very recent thing. But I don't have the latest news. So like I said, that was like two years ago. [00:17:07] Thanks. Thank you, Mikiko. That was very, very in depth. I love that. Let's see if anybody else has questions. Again, at this point, if you guys have questions, just wave. [00:17:15] I just want to point out what I think. There's Microsoft, Google, Amazon. Right. And we know which one then. Third place in cloud. I feel like Salesforce is a Microsoft Office suite, a G Suite away from joining that group in a lot of enterprise ways.
I think that I was kind of expecting to hear more about that. So, I mean, that's what I imagined as their five-year vision. They really want to do that and be enterprise and have cloud that's like gatepost and stuff like that. All right. Next question. [00:17:46] Thank you. And I just want to shout out that David Telo has also joined, who was also a guest on the podcast, also one of my absolute favorite episodes as well. So check it out. David, thank you for being here. All right, cool. So if anybody has a question, feel free to just flag me down with a wave. If not, then I'll just scroll down. Yeah, definitely go for it. [00:18:10] So a lot of big names in here, a lot of people who do hiring. Has there been anything that stuck out on a resume for data science, machine learning, for any kind of position like that? What are the top three things you've seen on a resume that really stick out to you, maybe on a recent hire this year? Yeah, yeah. [00:18:32] For me, it's an interesting project name. That'll immediately capture my attention, and then following that up with a well-thought-out, well-constructed project. I think that is something that I personally look for and that really does grab my attention. How about you, Brandon? What do you think? [00:18:51] Yeah, I was thinking about that when you asked. Nothing stuck out particularly, like I can't think of a specific thing. I think that's what you're asking me, like a specific thing that I saw. [00:19:02] I mean, just generally, like a highlight from this year, a recent hire that you're like, wow, I'm really glad I got this guy, and I only got him because this part of his resume said X, Y, Z. Not at the resume level. [00:19:13] At the interview level, yes. At the resume level, it was more that I look for people who are doing outside projects, passion projects on their own. And if they happen to have that, that helps a lot.
That would be my short answer for that. [00:19:25] I know Ben does a lot of, you know, he's well versed in this recruiting space. [00:19:30] Well, we actually did a project where we analyzed four hundred thousand resumes, not just data science resumes. Those were part of that. So we built a neural network on that data at HireVue. But I think the thing that's really surprising when you get on the hiring side is the disappointment that all the resumes look the same, and these are good resumes. So for everyone on the call, if you have your resume and you're very proud of your resume, maybe you wrote it in LaTeX, or however you pronounce it. Right. So you wrote your resume. It looks amazing. It looks like LaTeX. It looks like 20 other resumes or 50 other resumes. Everyone puts in the same keywords. So the thing that really stands out, like Brandon said, is the passion projects. And I've joked with other data science managers about the theoretical resume that doesn't exist. Imagine if you had a resume and it was like four git commits that went into public projects, and you could just go look. And I think the last thing I'll throw in there is interviews. They suck because you don't have enough time to really evaluate the candidate. You don't have enough time, and resumes are even worse. And so when you're interviewing to go get a job, you're perceived, especially as a junior scientist, as not as good as everyone on my team. [00:20:41] So everyone on my team is better than you. And that's the perception, even if you have a project with that. But if you have a git commit that got accepted into one of those projects, what does that do to my perception? It actually turns that completely upside down. I think you must know more about that framework than everyone else on my team. And I'm not saying that's an easy thing to do.
People get frustrated when I say this because it's not an easy thing to do, for you to go get something committed into a public project. But that's something we've talked about that would really stand out. Does cleaning up variable names count? I guess they'll go look at that commit. But it's still impressive that you found something. There's a lot of low-hanging fruit. I've been really surprised. Open source projects that we just rely on all the time, there's a lot of technical debt that is alive and well in these projects. And when you start looking at the code, you realize, yikes, there's one that has, like, an exception to check whether or not you have OpenCV, and the way they've written it, you're like, whoa, what is this doing buried in the code and why is this still here? Well, it's not a priority and they don't have enough people working on it. [00:21:44] So has anybody else made any hiring decisions this year? And if so, what was something that stood out to you? Interest me. [00:21:49] So why I ask that is, the last couple of years I've been involved on the other side of recruiting, bringing people into our company. And so I get to review the resumes. And I know that every resume that comes across gets like half a minute of face time before you're like, you know what? No or yes. I had an old manager who used to make the joke: I take half the stack of resumes and throw it in the trash, because he doesn't hire unlucky people. So resumes are your first few seconds in the door. So for a software engineer, we'll list out every language. You don't tell the world when we'll we'll do all the projects, all the GitHub. So we have but I don't know what it looks like for a data science resume. So if you look at my resume, it says machine learning in maybe two spots, but it's two pages of software engineering because I've been in the space for eight years.
But now coming into data science, it's like, what should I be putting? That is what I'm after. [00:22:40] I think the thing about data science is that it's a meta-skill, right? Data science isn't just one particular skill. It is a skill that comprises several other distinct, discrete skills. Right. You've got to be able to code. You've got to be able to think like a businessman. You've got to be able to think like an engineer. Right. You've got to be able to think in all these different ways, wear these different hats. Right. So data science itself is a meta-skill. And it's hard to evaluate, I think, a meta-skill on a resume unless you have a very concrete, tangible artifact that showcases your ability to flex these meta-skills. Dave, what do you think? [00:23:15] Yeah, so I was going to say I haven't done any hiring this year because I'm a solo entrepreneur right now, so I'm not hiring anybody. But in years past, what I would typically do is I'd get somebody's resume. And to be honest with you guys, I wouldn't even read it. First thing I would do is I'd look at the name and then I'd go find them on LinkedIn and I would check them out on social media, because, to Harpreet's point, yeah, you know Python, you know this, you know that. Ba ba ba ba ba ba ba. Can you communicate? Have you done public speaking? Do you have projects? Do you post? Do you have intelligent conversations with people in the space? So that's what I would typically do. So I'm not as bad as taking half of the resumes at random and throwing them away. But I would say, look, people who don't have a social media presence would definitely get put on the back burner. And I would concentrate on people that have a social media presence, because that was a way for me as a hiring manager to ascertain, do they have this eclectic mix of skills that we call data science these days? [00:24:13] Yeah, I haven't done any hiring this year either.
But I mean, I've led several data teams in the past, and I think, to echo Dave's point, I scan the resumes. I wouldn't say I read them in depth because it's a lot of in-depth reading to do. But the first thing that I would focus on is, do you have any real-world experience? That kind of bumps you up. Right. [00:24:36] And I can't remember how many candidates I've rejected just because they didn't have experience, for somebody who did have experience. [00:24:45] I don't really care if you have a PhD. That's nice. But, you know, if we could find somebody who has real-world engineering experience, machine learning experience, for example, I'd much rather bet on that. I put emphasis on that precisely because my team has work to get done and I don't have enough time. I mean, that's particular to my problem, but again, that's the sort of role that I was hiring for in particular. But I think, to echo what Dave is saying, do a lot of homework on underestimates. I do like and I always think about the resumes that I rejected, too, because there's a lot of false negatives in there, right? Yeah. Well, I like to talk about your job as a hiring manager. To say it quickly, you're not there to say yes. You're supposed to filter people through and get to a candidate as quickly as possible. A good candidate, preferably. But your job is to find as many reasons as possible to reject somebody, not reasons to accept somebody. And that sounds horrible, but that's how it works. [00:25:45] So, yeah, when I was a hiring manager, if you had put yourself out there, like you did a meetup and it was recorded and you were on YouTube and I could watch you explain a technical concept with just great eloquence, you're coming in for an interview. It's just that simple. [00:26:03] Ben, after you go, I'd love to hear what Jennifer looks for in a resume.
[00:26:07] So, Jennifer, you're going to be on the spot after Ben. I just had a memory. I worked at a hedge fund and, oh my gosh, the resumes. The hedge fund would freak out about these resumes they could find, and I don't know where they found these resumes. So the hedge fund manager's reading this resume and he's like, this is amazing. I love this resume. Like, what does it say? At the very top of this kid's resume, in bold, it says, I won the 2011 international graduate school coding competition, and you'd react like, wow. And then the next line says, but I was disqualified because I was only 17. And you start to think, what the hell? I'm pathetic. Like, that's what you think. So the hedge fund would find these resumes of these people. I'm not saying that's normal. And that's not even a useful thing to tell this audience. I just thought of that when I think of a resume, these legendary hedge fund resumes they would find. Oh my gosh, have you guys ever seen a resume like that? Like some like two line statement, the top to shut up, like an 11 year old kid. [00:27:06] I've seen quite a few resumes like that in my career, where I would say even for pedigreed people, it's like a coin flip almost. [00:27:13] Right, because I've worked with people who are, you know, Ivy League graduates and, you know, top in their field and all this stuff. And I swear to God, these people like included half the time. So that's where I just learned not to really care about the resume too much. I mean, I've worked with somebody who is one of the world's leading endocrinologists deal, and that went in a different direction. And you just never know. People are people. So there's the resume part. There's also the part of whether it's going to work out, with any hire. Right. [00:27:44] So, anyway, is the 11 year old kid that presents a dedicated conference? That's a resume. A headline. Jennifer, let's hear what you look for.
[00:27:52] A lot of what we do at Intel is really similar to what I'm hearing here. The first thing I do is I go out and look at LinkedIn. It's a very canned resume, usually. They do look alike. That does not surprise me a bit. Do something to make it look different. If you do data visualizations, why is that not on your cover letter? Why is that not embedded in your resume? Do something that really stands out. I've seen some that are just different columns with things in them. It just helps it stand out. A single word is not going to make or break something, but if your LinkedIn profile has additional files that I can click on, has things I can view, a YouTube presentation, you're right, Dave, that would be fantastic. [00:28:42] So I just want to shout out Nicole real quick. She wrote a really awesome article this week talking about the interview that I did with Carl Gold. Thank you for writing that. That was a very well-written article. I'm wondering, do you have any contribution to this discussion regarding resumes? Quantify everything. [00:29:01] I don't think we've touched on that point yet. And figure out how to make everything internally consistent. So if you are putting periods at the end of points, just keep that rolling throughout the whole resume. And don't, like, have different sections formatted in different ways. And consultants like to have things go in groups of three. [00:29:23] So I had my first point on quantification and adding numbers, the second point on having consistent grammatical formatting. And then the third point is just, if you can figure out some way to always have a third bullet, or at least an odd number of items in each list, I think that's nice to have. [00:29:46] Thank you very much, Michael. I want to open it up to some of the newer people that have come in. First of all, Greg, OK, he is here. Thank you so much for hanging out, Greg. I see Manpreet has her hand up.
Manpreet, if you had a question, go for it. Then after Manpreet, I'd love to see if Timothy Gordon has a question or maybe even. Centeno hopes he might be a a go for it. As you work for a corporate company, you do multiple projects. [00:30:13] How do you incorporate that in your resume? I've been working on different technologies for different projects, and it seems like you need to have a one-page resume when you apply for a job. So how do you incorporate those things and explain them in the STAR format? [00:30:34] I would say you can bake that into your resume. You can have a one-line opening sentence that says, as an X, Y, Z. So, say, as a junior data scientist at this company, I had the opportunity to work on several initiatives, including the following. [00:30:50] And then just bullet point: the situation for this project was dot, dot, dot. My task was to do dot, dot, dot. The analysis I performed or the actions I took were dot, dot, dot. And as a result, I observed or achieved dot, dot, dot. That's pretty much my boilerplate answer to that. Mikiko, what do you think? I know you've had a lot of questions like this as part of. [00:31:14] I know. Right. And I always have the same Mad Libs answer too, which is like: I used X, Y, Z tools and technologies to solve a big business problem by accomplishing like I take a solution which had an impact of the quantity. [00:31:34] I feel like if you give those, that's like one bullet point, and you can kind of dive in deeper if you want to. But I think that's usually pretty clear. Yeah. It's always just those four. But going back to that resume thing, I think a lot of times people put so much information on the resume that they don't need to, because sometimes they're a little bit worried about not having enough, or not having the right stuff. And then it kind of clouds the clarity. And so I think like and like.
So for my team, we've hired a lot of people, like when I was over at Livongo. If we saw that someone had just those four elements, right: the tool or technology used, the business problem you were trying to solve, the impact as it was quantified, maybe even the challenge for each project, that would be pretty good, even if you don't have all of it. That kind of form, I think, is pretty clear and translates really well. Hopefully that [00:32:33] answers your question. Let's go to Timothy. Timothy, if you have any questions, feel free, go for it. And after Timothy, we'll go to know how in me. [00:32:43] My question is, how do you manage imbalanced data? [00:32:50] So where our target variable makes up around maybe twenty percent of the actual dataset, what are the different techniques that can be used to help better classify that target variable while still taking into account the alternative? So it's a classification problem. How do we classify something that has only maybe 20 percent of the actual representation, and what are the different techniques people use? [00:33:26] I think a common technique, a textbook answer, so to speak, would probably be some type of synthetic creation of examples. So SMOTE is a good technique, but I'll open the floor up to either Monica or Dave. If you guys have any suggestions, apart from the answer I just stole from you. [00:33:48] Yeah. So the classic ones are: you can downsample, you can upsample, or you can create synthetic examples using SMOTE. Those are the top three. Under the no free lunch theorem, you never know which one's going to work the best, so usually you try all three. You can also use something like the random forest algorithm, for example, which allows you to do stratified sampling, which essentially is a form of downsampling.
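
A minimal sketch of the three resampling options named in this answer (downsampling, upsampling, and SMOTE-style synthetic examples), written in plain Python so it runs without extra packages. The function names, the toy 80/20 dataset, and the simplified `smote_like` interpolation are all assumptions made for illustration; nothing here comes from the speakers, and `smote_like` is not the imbalanced-learn implementation of SMOTE.

```python
import random
from collections import Counter


def split_indices(labels, minority):
    """Return (minority_indices, majority_indices) for a binary label list."""
    mino = [i for i, y in enumerate(labels) if y == minority]
    majo = [i for i, y in enumerate(labels) if y != minority]
    return mino, majo


def downsample(rows, labels, minority, rng):
    """Randomly drop majority rows until both classes have the same count."""
    mino, majo = split_indices(labels, minority)
    keep = mino + rng.sample(majo, len(mino))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def upsample(rows, labels, minority, rng):
    """Repeat minority rows (sampling with replacement) to match the majority."""
    mino, majo = split_indices(labels, minority)
    extra = rng.choices(mino, k=len(majo) - len(mino))
    keep = list(range(len(rows))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def smote_like(rows, labels, minority, rng, k=3):
    """Add synthetic minority rows by interpolating between a minority row
    and one of its k nearest minority neighbours (simplified SMOTE idea)."""
    mino, majo = split_indices(labels, minority)
    minority_rows = [rows[i] for i in mino]
    synthetic = []
    for _ in range(len(majo) - len(mino)):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return rows + synthetic, labels + [minority] * len(synthetic)


# Hypothetical toy data: 80 majority rows (label 0), 20 minority rows (label 1).
rng = random.Random(0)
rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(80)]
rows += [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(20)]
labels = [0] * 80 + [1] * 20

for name, fn in [("downsample", downsample),
                 ("upsample", upsample),
                 ("smote_like", smote_like)]:
    _, new_labels = fn(rows, labels, 1, rng)
    print(name, sorted(Counter(new_labels).items()))
```

Which of the three works best is an empirical question, so in practice each resampled training set would be used to fit a model and the results compared. Resampling is normally applied to the training split only, so that the evaluation data keeps the real class balance.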
But since you're doing multiple bootstrap iterations, it helps even things out. So that's a good thing you can try as well. [00:34:17] Monica or Brandon, any advice or tips when it comes to working with imbalanced data for a classification task? [00:34:24] I would just echo what David said. And also, a lot of data science that I think doesn't get discussed is the thinking about it, right? The thinking like, OK, I tried these three sampling methods and this one works best numerically, right? And then the thinking about, do I still want to consider this other one that maybe works second best, but somehow makes better business sense for what the objective is? You know, because you might do something different if it's fraud, where it's a very low sampling rate, than if it's, I don't know, you just happen to have data from 2008 or some other, like, coronavirus data, where you're like, this is totally skewed, this is not a normal year's data. So you've got to think through what it is, and that's when the subject matter expertise comes into play. And then that's when the general thinking comes into play. And that also goes into hiring for me as well, because I think about, if I have another data scientist here, what do I need help thinking through? And it's always questions like this, where I know what the textbook answers are, right? I can read that in a couple of minutes. But which one do I want to choose that really makes the most sense given the situation? Do you have any other contribution? [00:35:37] So I think everyone summed it up really well. But I was going to say that domain expertise is really helpful when you're looking at any data. So if you don't know the data that's within your industry, it's kind of hard to really decipher anything: any variables, you don't know what the categories even mean.
So that's, in my opinion, the hardest skill to obtain as a data scientist. So if you think about the mystical data unicorn, you know, you have your math and statistics, you have the hacking knowledge, and then you have that domain expertise. And you can always learn math, you can always learn IT stuff; those are kind of the hard skills. But the domain expertise is the longest thing to get a grasp of. [00:36:28] Timothy, hope that answered the question. Thank you, everyone. Awesome. Let's open the floor up to, where do we go, Shantanu? And then after Shantanu, we will see if Jacqueline has a question, and I see Dave had his hand up. We'll get to you as well, Dave. Shantanu, go for it. [00:36:50] Thank you so much. I'm really happy that I am going to ask some questions. I have always followed Harpreet here on LinkedIn, and it's really great to see all of you on video. I have two questions, actually. The first one is along the lines of what you guys were discussing a little while ago on the resume part. I have heard a lot of people saying that all the resumes kind of look the same and they don't stand out. So what would you say would be a good example? I have seen two kinds of resume: one where you have a photo and you have some numbering of the skills, like I'm at seven, five, then eight or something. And there is one, I think the original ones, with the chronological order and all those things. Which one would you say would work, or is there anything else that we might be seeing nowadays that can stand out? [00:37:45] I'll open this up to either David or Carlos. Carlos, looks like you feel very strongly about this. [00:37:52] I hate seeing seven out of ten Python, five out of ten. I'm like, that doesn't mean anything, man. Data science is not the field you want to give me fake numbers in, and stuff like that doesn't make any sense to me.
Like, you just have the skill or you don't. And don't put the skill if, when I ask you a basic question, you don't know what to do. You're allowed to Google in real life, but people will put crazy stuff like, oh yeah, C++. And I'm like, OK, what have you written in C++? And they say, Hello World. If you have fifty skills on your resume, focus on the ten you actually know. You don't want to look like a clown. You want to avoid losses. That's what I want to frame it as: avoid losses. Don't go for big wins. Stick to what you've got, on LinkedIn too. Go to fundamentals, avoid losses. [00:38:39] My point, just to piggyback on Monica and Carlos: you know, for example, in banking, there are two questions that I think are driving the financial industry right now, where everybody wants to know what's going to happen. The first question is, how are we going to deal with negative rates? Because there's a sense that they're coming our way. The second question is, what's going to happen with commercial real estate? I mean, are we going to have the same decline that we saw back in 2008 with the housing market, now that the scenario is with commercial real estate? [00:39:16] So if anybody goes and does a project on that right now and puts it on GitHub, I guarantee you you'll get an interview at a bank tomorrow, because no one knows what's going to happen here in the United States. Some countries are seeing negative interest rates, but not the United States. And this is the first time commercial real estate is going through what we're seeing today. So avoid your typical project and go do something, even if you don't get hired, that nobody is doing right now. That's something that the employer actually is comfortable. You know, once you get into the interview, then you call in the big guns. I remember Mikiko helping me out, not Michael, Mikiko.
I don't know if you remember. I asked you last year, do you know this model? And you pointed me to this one book, and that's how I went from the second to the third interview, because during the interview I said, you know what, I saw that model, blah, blah. So, you know, I had to go through five interviews to get the job. But it was one step at a time, building it one interview at a time. So if you want the big job with the big paycheck, you've got to go one step at a time. Don't try to do everything at once. That's my opinion. [00:40:51] And I hope that answers your question. Let's open it up to Jacqueline. If you had a question, go for it. And after Jacqueline, we'll go to Greg Coquillo's question. Then after Greg, we'll see if Matthew has a question. [00:41:06] Hi, everyone. So going back to the topic of domain expertise that Monica brought up, what would be the advice that you would give to someone like me who's starting in the field? How could I develop this skill? [00:41:24] Monica, do you want to tackle that? And after Monica, let's hear from Nicole or Jennifer on that one. [00:41:30] Sure. Do you have an industry in mind already that you want to get into? [00:41:35] I'm still keeping my options open, but recently I've looked into maybe retail, or more the tech industry. [00:41:43] OK, so that would be the first thing: to kind of nail down an area or a few areas that you want to dive deeper into, because I've gotten a lot of questions where, you know, people want to get into data science, and my next question is, where do you want to work? And they say, well, I want to work on a data science team. Well, there are teams all over the place, and that domain expertise really will get you far. So pick a couple of those industries that you want to focus on and understand what kind of problems they're trying to solve.
So if it's the retail industry, or if it's general tech or maybe a health care tech industry, just really focus on those problems that they're having and how you can solve those problems. [00:42:27] Nicole, what do you think? I think Monica's suggestion to start with the industry is a great one in terms of potential projects. I also really love projects that are focused on your local city, because that data is really easy to find. Most municipalities have an open data portal, and everybody that you encounter at any networking event in that city is going to immediately be interested in the topic. You know, they're going to be like, oh, yeah, what did you find out about my neighborhood? So that's another great place to start if you're looking for interesting questions that haven't been answered yet. So I definitely recommend that. You know, you can't go wrong with just brushing up on your Python skills and also learning a data visualization tool. And you talked about Tableau a lot in this session particularly, but I think people maybe don't know that Tableau Public is free. And so long as you're using publicly available data, data where it doesn't matter if it's exposed, because the underlying data will be shown publicly, then you can even post your visualizations to the cloud using that platform. So it's a great tool not just for EDA, but also for, you know, if you build a model and then you have a resulting data set, then you can visualize that. [00:44:01] In Tableau. Awesome advice, Nicole, thank you. So, Jennifer, do you have anything to add on to that? And then after that, OK, Greg, any advice on how to get domain expertise, and then after that we'll go to your question? [00:44:11] So domain expertise, can you guys hear me? Domain expertise is research, right? So if you're independent and you're not going through school,
there's so much information out there. Become part of a community that is doing the same thing as you, you know, that is of common interest. I'm not a data scientist, but yet I joined this community so I can learn, right? And that gives me some level of domain expertise. So it starts with your research, and it's definitely needed. It gets you more than half of the way in terms of getting somebody to hire you. So, Jacqueline, you say you're interested in retail. Well, if you research more about retail, you'll discover that retail has limits, right? Then you will search for people who solve those limits or issues. Then you start thinking, OK, what are the projects that I can work on that address these issues? Then you become a domain expert. So it starts with research and then it grows from there. [00:45:22] So, to put it simply. I wanted to say one thing, too, about resumes, because I couldn't help it. Right. A lot of times we want to put together a killer resume because we want to impress. We want to make the biggest impression on that person. So we have a tendency to put so many things in that resume. But think about this. The only thing you need is to become the best storyteller with a few words and connect with the hirer, connect with what you feel is the biggest issue. And that also takes research. But you already have another thing. You already know what you're capable of. You already know what you've worked on. Then map those to what you think that hirer has in terms of pain points, and then say it in simple words. Harp did say, you know, STAR. That is key. I had my resume changed to the STAR method when I applied to Amazon and it worked wonders. That first sentence said everything. Whatever project I worked on, I said, what was the biggest thing that I did in that project and why was it important? What was the result? That's it. STAR method, and then a couple of bullet points, and no more than four.
Make it clean and sweet. It connects directly and makes that person understand why you are interested in that position. So with that, my question. I want to build on what Timothy was saying, because, well, I want you guys to help me understand. [00:46:49] Let's take DeepMind. All right. So if you think about this, these guys at DeepMind, they achieved something that's amazing, I think. But they only needed one hundred and seventy five thousand samples to train their model, out of a population of two hundred million. That's less than one percent of that known population. And I don't know if you guys read these articles, but who can explain to me how come less than one percent was enough to determine that this model is strong enough to speak to a population of two hundred million? It feels amazing, and it also speaks to, OK, what kind of sampling do you need to do? How do you determine what that sample population is, to make sure that it's representative of what you need to be able to predict at such precision? Right. And then the second level to that question is, when you enter these competitions that these guys got into, the CASP, I can't remember what it stands for, when CASP releases those samples for these guys to compete on, do these samples tell you what the task for these proteins is ahead of time, to give you a heads up on how to predict the shape of these proteins? All right. Maybe I'm speaking a little bit too long. So for the first question, who can help me figure this out? This is about sample population. [00:48:26] Sorry, I have a quick comment. Something that we sometimes forget is that statistics and stuff is about identifying the fundamental data generation process. And with protein folding, it's a mechanical process. So it will either fail, in which case it'll be like a junk protein, essentially, or it will succeed. So they're discovering a truth.
That's mechanical. And that's interesting, because it might have a true randomness that's very, very small, while other processes might have very large true randomness. So I don't know if that answers the question slightly. But when you ask why one percent is enough, well, that's a function of what the true randomness is, because that determines how the sampling works in terms of being effective for representing the population. If that makes sense. It doesn't answer your direct question, but just a quick note. [00:49:18] It makes sense. Any thoughts on that? It looks like he's frozen, but for me, yeah, I was just thinking a lot when you were asking that question. I'm just going to say I don't know the answer. Yeah. Yeah. What about what about a theoretical mathematician, Jacklin here. Yeah. He's stumped. Ask you something. [00:49:40] So what Carlos is saying kind of makes sense if it's a mechanical thing, where it doesn't matter what the population is; it will either pass or fail. But think about this, though. The folds: you have ten to the three hundredth power possible ways of folding. I've been going nuts all week about this one, so that's why I was asking. Sure, but it's like flying an airplane, right? [00:50:08] Like the engineering problem. Those are deterministic problems. The airplane is going to go up or it's not. The fact that it can go up at a technically infinite number of angles, like the wings, and still fly isn't indicative of the difficulty of getting into the air. So I don't think the idea that there's an infinite number of possibilities actually implies anything about the difficulty of understanding the underlying process. We're getting into a weird philosophy, like philosophy of statistics, but I'm not sure the ideas connect in the way that you're connecting them. [00:50:38] I have an example from the audit world. So I have an audit background, and we do a test of one if we're testing out systems. So think about you.
You're putting your password into your computer. It's either right or it's wrong. You can put it in as many times as you want. You only have to test it once, because it's either right or wrong. So I think that's that same kind of thing that we're trying to explain, which I believe makes sense. [00:51:07] So thank you. [00:51:08] And a great question that stumped all of us here. Let's see if Matthew Barzun has a question. And then after Matthew, I will open it up to see if Mark, Sasha, or Austin has a question. [00:51:20] I'm just quietly listening, but I'm not able to. Thanks for coming out. Appreciate you. Mark, Austin, or Sasha, any questions? [00:51:28] Yeah, I actually have a question that I'm facing right now. My role has really switched from more of an analytics role to a product role building data products. And so I'm curious, for others, what's your decision process for bringing a pre-built package or module into your code base? Sometimes, you know, scikit-learn has some package that makes sense, versus things where you can easily build it yourself, or implementing this new kind of module. For analytics it was just like, I used this one, whatever. But now I'm making decisions for how our products can move forward. An example is choosing what NLP package to use, and why you'd choose spaCy over other packages. So I'm curious what other people's thought process is around that. [00:52:20] Matt Housley, any comments on that? [00:52:22] Sorry, my space bar is not working for some reason. OK, so in terms of kind of mapping product and data science together, my understanding, and correct me: how do you choose what assets to deploy in terms of successfully realizing a project and choosing the correct framework? [00:52:41] Pretty much. Like, when choosing certain packages, that balance between let me build it myself versus implementing this new package into our code base. [00:52:52] Yeah, that makes sense.
I mean, in terms of product deployment like that, I think our bias is always toward starting maybe a first draft with a managed solution, the next draft with a popular solution, then maybe something more obscure, and then moving on to customization. I feel like in the data science world there's a lot of not-invented-here syndrome and a lot of hobbyist syndrome. And I feel like every data science team maybe has a secret superpower, and that's the one you want to focus on. And so for the other pieces, just maybe grab off the shelf if you can. Sorry, that's pretty vague, but that's kind of my thought process. What Matt says. [00:53:29] I mean, we finish each other's sentences at this point, but there's a term, undifferentiated heavy lifting. Right. So if you're doing undifferentiated heavy lifting, by all means, find, like, we're big fans of managed solutions or something like quarter to you, that's undifferentiated heavy lifting. Typically find somebody. There are a lot of companies out there in the data space right now making great tools and open source projects. If you're going to go that route, use managed open source. There are teams of highly intelligent, very qualified people working on these systems and building these systems out. All of them will tell you they're also some of the best things of the Earth and so forth. You should totally use these products, as your focus needs to be on what you are great at. As Matt pointed out, you're not going to be great at everything. You're going to be good at like one or two things, realistically. Be really good at those.
And the rest of it, farm out, because it's not just choosing the tools for your project; there's a huge opportunity cost. The more you're managing software with open source, for example, stats show that twenty five percent of the time when using open source, you're also maintaining your implementation of that open source package, on average. So where do you want to spend your time? By all means, do open source if that's what you want to be good at. But as far as we're concerned, you can't be good at everything. So pick your battles. [00:54:57] So, Mark, can I ask you a qualifying question? When you say you're a product owner, what is the scenario? Is it an internal product, like an internal IT system, or are you building something that's for commercial resale? Because I used to be a PM on SQL Server at Microsoft, and one of the first things you want to do if you're building enterprise software to resell to people is check all the licensing, all the open source licenses of everything you're using. [00:55:20] And thankfully, we have a legal team. So whenever I bring in another thing, I go into a security section and I'm like, can I use this? So that's been really helpful. When I call myself a product owner, it's more so that we're a team of people who are building out products that will go out to our customers. And on the data science team, we're kind of taking the statistics and the models and actually putting them into production. [00:55:47] Yeah. So it can be kind of tough. Generally speaking, when you're building software and you're selling it, you typically want to decide what IP is really differentiating for you as a go-to-market strategy. And typically you want to own that.
And that's typically what I did with my teams when I worked at Microsoft. It was like, OK, do I need to own that? Cool, then we might want to build it ourselves. We own that IP because it differentiates, and then we can push everything else out. Of course, at Microsoft we have the other teams like Azure and all that kind of stuff to build everything else. But what differentiated us in the market, we wanted to own that from an IP perspective. [00:56:24] I wanted to add real quick, and I fully agree with you, Dave: from a business perspective, strategy is key, right? So what is your strategy for releasing it to your target customers? Is it speed, or is it scalability? Is it something else, like a cost-effective strategy? All of these count. Right, if you're trying to do it faster, is it open source that's going to give that to you? Or if you want to build it in-house, is that the best strategy? So always look at the final business metric that will give you the biggest bang for your buck. Right. So that's pretty much what I can say. [00:57:08] Here is a counterexample of what you probably don't want to do. I remember at this meetup, before it started, this guy was telling me about the database that he wrote for his company. [00:57:18] The battles I'm talking about. Are you talking about Europe, about this thing? And I asked him what it did is to go to of Data. And I said, how is it different from Postgres? He's just, like, making a database. But now that's used in production at this company. So this guy kind of has a job for life unless he wants to leave. If he does leave, I can't tell you what's going to happen to that company, but it might not be fun. So that's an extreme example, but that is a real-world example. [00:57:44] So that's a prime example of why, when I was an enterprise architect, I did not like developers. The reason I didn't want them writing code at all was crap like that.
[00:57:53] In the book The Phoenix Project, they talk a lot about the constraint, and I think his name is Brent. And he's so important because everything goes through him. Every database, every code base, he knows all the answers. By the end of the book you find out how they manage to maximize the constraint. At the end of the book, they say, oh, after we wrote this book, people wrote to us and said you would have solved the problem faster by firing him. And it completely blew my mind. I was like, that's almost true. Because if he's forcing his way into the middle of everything, you don't want to approve and incentivize that. So it was very interesting when you said that code base and database stuff. [00:58:35] We have a real-life case of that that we met. There was a guy who was maintaining this IBM DB2 database mainframe at a customer. [00:58:44] He'd been there since the 80s, had been writing all of the code, didn't document anything, and wanted to have the silo. Right. So he didn't want to share anything with people because, God forbid, somebody takes his job. So he'd been there for a really long time. Then one day this guy just doesn't show up to meetings, and, you know, a couple of days go by. He ended up passing away, actually. And sadly, this mainframe also ran the order system, which was really key. And, you know, it took a lot of excruciating work. They had to bring in a whole team to figure out what this guy had written. Right. And so, again, extreme examples, but this is what happens when you talk about technical debt that keeps accruing at high interest rates over years and decades. [00:59:27] This is what happens. These things happen. Go ahead, Monica. [00:59:35] Sure, I was just going to ask Joe: in that instance, was it written in a language that was still alive and not something like Perl? [00:59:44] Oh, no, no.
This was written in, [00:59:46] I think, a very archaic language. And I've had experience as well with people who just go and build bespoke systems, and then the companies rely on them, and then they're retiring soon with no documentation. It's written in a language that nobody understands anymore, on a mainframe. So it actually happens more commonly than some people would think. [01:00:09] It's funny hearing these stories, because I can relate to all of that. I come from manufacturing and I've seen folks with 20 years of experience. They're the worst trainers. They know all the processes. They know where the documents sit and everything. And when it comes to training newcomers, oh, my goodness, they're the worst. They will not share, because they're thinking somebody is out there to take their job. I mean, it's everywhere. And it's amazing that I'm hearing the same comments right here. It's crazy. We all do that. [01:00:40] I guess it's a human thing. Nicole, did you want to add anything? Oh, yes, go for it. All right. Apparently not. So let's see if either Austin, Sasha, Venkataraman, Naresh, or Nick have any questions. Let's start with Austin. [01:00:56] I don't have any questions here today. I've just been voraciously taking a bunch of notes from everything in the chat and what everyone's saying, because it's so much great information, and I'm just happy to be here and be able to absorb right now. [01:01:11] And it's recorded, just for awareness. Oh, I know. [01:01:13] I watch it. Yeah. It's recorded, the transcript is up, and the chat will be up as well. [01:01:18] I have an NLP challenge on the transcripts. [01:01:20] So, like we talked about, yes, I'm going to be launching this early next year. So you guys keep an eye out for that. And also, you guys should be like Austin, who has listened to thousands and thousands of minutes of my podcast. We like Austin. Why is the metric minutes?
Because that's what Spotify put out for a record. So let's see if either Sasha or Nick or Venkataraman or Naresh has a question. Sasha, any questions? [01:01:49] I'm just hanging out. Some nice people offered to give me feedback on my resume, but some of the advice that they gave I'm not sure I agree with. I mentioned it earlier in the chat, about the soft skills. For me, it's just a waste of real estate, I think, because I don't think I should be listing things like leadership and written communication. If I didn't have those skills, I would not have made it through college, so I think it's redundant. [01:02:18] Yeah, I think if you have more valuable things to put, you know, real estate on the resume is prime. I wouldn't recommend putting that above some other more important skill, for lack of a better word. [01:02:35] But yeah, I'm kind of in line with you on that. LinkedIn came out with an article saying that the most in-demand skill in America is oral communication, according to their analysis. And I was like, nobody talks like that. So nobody's going to put that on their resume. And I know that you're just saying it makes no sense that that would be a skill. So I agree. [01:02:54] I think some people might disagree with that, depending on what kind of oral communication they're talking about. Fair. [01:03:01] Fair, fair enough. I just felt like the phrasing of it was off. They did a LinkedIn skills gap analysis of job posts versus what people put on their profiles. And I would have never thought to type oral communication into my LinkedIn skills. So I think there's a mismatch there. I would say show this skill through the narratives; don't just list it. [01:03:21] I agree with you on that. So, yeah. Great point. Sasha, thanks so much. Let's see. Let's open it up to, I think, Naresh, Venkataraman, or Nick. Let's start with Naresh. Any questions, Naresh? All right.
Doesn't look like it. [01:03:34] Venkataraman, any questions? Not actually. Nice session. I like it. But earlier I had to follow along through the LinkedIn postings. I would like to see how I can access this chat window for today's event and the other events. [01:03:53] You have to subscribe to the podcast and keep an eye out in the show notes, and it'll all be there, my friend. Nick Urban, how's it going? [01:04:01] Well, thanks. Great. No questions specifically, but I couldn't tell if that NLP challenge was a real thing or not. What was the plan there? [01:04:12] I'm still ideating on it. It is going to be a thing next year, because I've got a ton of chat transcripts, so this will be a thing. Let me just think through what the thing is going to look like. [01:04:24] And is the idea to do a better job than something like Rev.com or even some of the big players? [01:04:31] I don't know what the idea is. I still haven't thought completely about it, but yeah. Yeah, keep an eye out for that. [01:04:37] Cool. So I see Naresh is unmuted. So if you want to take over, go for it. A question on creating a portfolio. [01:04:46] I mean, I've been waiting for years, but I know what I mean. [01:04:57] Yes, it's difficult to hear you. So if you can just stay steady and speak right into the microphone. [01:05:03] You guys hear me now? Sorry about that. Yeah, OK, so my question is about creating a data portfolio. I always get confused about which one is the best practice. Can anybody suggest the best way to create a portfolio? [01:05:23] Yeah, that is a massive, massive question. The best way to create a portfolio is, um, and I don't have a great answer for you, but let's just say it starts with a really clear problem statement, a really good, clear definition of what it is that you intend to do, followed up by, you know, a good analysis plan that you can put up.
And then you stick to that analysis plan and you execute on it by having well-written code, not spaghetti code, making good use of functions, doing some good exploration, good data modeling. And then, you know, I'm being very generic here because it's a huge question. Yeah. Do you have a specific part of that question that I can answer for you? [01:06:12] I can you kind of it coming back and I'm just taking it back. I guess I and go in and then figure out if it's best for me, but it's just that I'm just thinking back. [01:06:25] But yeah. Yeah. [01:06:26] And in terms of which project would be best for you, I think that has to be very much in line with where it is that you are trying to go ultimately with your career. I think I'm going to just flip this one over to Mikiko because I know she's got some great advice. Um, yeah. Go, go. [01:06:45] Yeah. So there's kind of like three pieces, right. First off is knowing who you want to become, which sounds like a very sort of existential question. But I feel like not a lot of people ask themselves that when they are considering how to build their career capital, because that's what you're essentially building. Your portfolio, assuming that it's in GitHub or a personal website, most likely in GitHub, your resume, your LinkedIn, those should all really be just extensions of the idea you have for your, like, future destination. Right. So that's the first piece: understanding the kind of work that you want to do, and not just the title, like data scientist, ML engineer, data analyst, machine learning researcher, but specifically which bucket. Do you want to be doing strategy and analytics, like, do you want to be that internal consultant? Do you want to be doing research? Do you want to be doing more like engineering work? Right.
If you understand what bucket you sort of want to be doing work in, that's a good first step, because I think that will sort of determine to some degree the kind of portfolio you put together. Right. So, for example, if you are someone who is aiming for a research role, right, let's say at Google Brain, right, [01:08:03] or DeepMind, then most likely what they're going to want to see is familiarity with some of the more recent cutting-edge tools and also the ability to implement research papers. That's going to be a really, really big thing, right? If you're someone who is going more towards the strategy and analytics bucket, there I think communication, clean analysis, and being able to really properly scope questions are going to be really key. If you're someone who is going more towards engineering work, then there you might not actually have to do, like, a computer vision, full-stack deep learning app, but maybe you instead put together something that's just really cleanly architected, has good test coverage, good documentation. So I think that's the first piece: what kind of work you want to be doing. Once you have that idea, then it's figuring out, OK, so what are the key responsibilities you will have within that bucket, and you can curate your portfolio accordingly. Now, what I will say is that when you're first learning skills, you don't have to go find the most niche, out-there data. You don't necessarily have to go and randomly scrape data off of websites or anything. When you're first learning skills, it's actually better to find really good portfolios or really good projects or Kaggles and see how people approach those tasks on well-benchmarked datasets. [01:09:35] And you can find plenty of that on Kaggle.
I think once you go through a couple of those, then it becomes a lot easier to understand, first off, what you might need to develop in skills or the kinds of problems you want to work on, and then also what would be a good structure for how to represent that project. There's a few I'd recommend to people out there. If you're really kind of hurting for a really clean structure, there is Cookiecutter Data Science. I would look that up. They do have a template that to me is frankly a bit robust for what I've typically used. They have a very specific layout with multiple folder hierarchies, and, you know, you don't have to use the really big one. But that can be a really good start for understanding what needs to be in a portfolio project. But for portfolios, I always tell people, aim for quality over quantity. Right. If you have two to three really, really good projects, I would spend more time on those as opposed to trying to do 20 million sort of half-baked GitHub repos. That kind of doesn't really look good. [01:10:48] Yeah, absolutely. Cookiecutter Data Science. You can trim down their structure and make it suit your needs. And believe it or not, I actually have a presentation that I'm giving at that DATAcated conference, talking about tips to make a portfolio project that will get you hired. And I promise you, it'll be chock full of great tips and great advice. And hopefully that helped. If you've got any more specific questions, definitely feel free to ask or just, you know, let's not have the time to. [01:11:24] I've been learning a lot. Thank you, everybody, for your time. [01:11:29] No, thank you very much, man. Thank you for coming through. Let's see if Toshi has a question. And after Toshi, we'll see.
I haven't even asked if Jennifer or Nicholas have questions. I just assumed you guys are here to give advice and hang out. So if you guys have questions, let me know. My bad. Toshi, if you've got a question, go for it. I have no questions as of yet. All right. All right. Jennifer, Nicholas, you guys good? I'm just chillin. Right on. And we'll do this: any last-minute questions? Now's your chance. Otherwise, I'll start wrapping up office hours. Greg, I see you. [01:12:07] Yeah, no, no, it's not a question. Naresh's question just made me realize, and also Mikiko's answer made me realize, why we struggle so much to determine how to position ourselves in the real world. We spend so much time in school being guided through projects already planned for us, right from high school or university, that we don't think about, OK, handling it ourselves. Right. So no wonder we're struggling to figure out how to position ourselves. And if we get conscious about that sooner, in college, you know, maybe we struggle less. I don't know how to fix that issue. Just an observation that I made. And I think that Mikiko's answer was sublime. Very, very awesome. Thank you, Mikiko. [01:12:55] Thank you so much. And to, you know, just piggyback on that, I think a skill that is not taught, and it really should be, and I was very fortunate to work with these teams, is sales and marketing: how do you brand and message your contribution. So when I started my career, right, five years ago, I actually had a massive stuttering problem and I graduated college and couldn't find a job. Right. So I got a job working at the front desk of a hair salon in one of the biggest areas of San Francisco. So I had to talk a lot. Right. And it was something that hit me at one point, because after that I did analytics for sales teams. And I was like, man, these sales guys are so good at, like, selling.
Like, they could be at a company office hour and selling the fact that they played ping pong for like two hours in the afternoon, drinking, tossing back a bunch of, you know, Mickey's and brewskies. And they could sell a message that almost sounds like, yeah, we're team bonding. [01:14:01] You know, we're really gelling, we're improving communications. You know, we're meeting with clients. I'm like, well, that is a lot of, you know, wrapping up a pig in lipstick, you know, making it glorious. But I think sales is something that's not really well taught, you know? And I think especially for people who come from immigrant families like myself, we're kind of taught to really keep our heads down, you know, like be cool, right? Like, you know, don't make waves, don't put yourself out there. Right. Like in a Japanese family, right, we have this saying, the stalk that goes above the grain gets cut. Right. So I feel like, in some regard, to be really successful in analytics or any career, it's that sales bit. It's being able to confidently understand what your value is and to courageously package it, to say, look, this is why you should take notice of me, you know? And I think that's something that I'm still developing, but sales, [01:15:00] that's the skill that is worth developing, because it translates through everything. For me it's such a rock star ability to focus on, because I think the thing that gets lost a little bit is that when we talk about all the skills we want to develop and how we want to apply them, we always think of it in terms of what can we do for the business. And we lose track of the fact that, maybe a little bit selfishly, we're individuals with aspirations and ambitions and things that we actually want to achieve.
And there's nothing wrong with selling results and being prepared to say, this is my contribution, and being able to package that up and really put yourself out there for your own career progression. But it's not just about the skills that you've got and what you've contributed to the business. It's actually taking a little bit of yourself and thinking, you know, this is how I separate myself from the crowd of other people. They are focusing so much on the processes and the day-to-day job. Being able to sell yourself a little bit, I think, is so important for yourself and your own career progression. [01:16:02] Oh, yeah. This is the biggest investment, in the link I posted here. Warren Buffett. Right. I mean, he actually used to be scared of public speaking, really, and he had to buy a Dale Carnegie public speaking course. And he said that was the best investment he ever made. And when you hear him speak, he speaks with a lot of confidence and a lot of competence. And I think being good at sales comes down to being confident in your abilities to communicate. And it's weird, because the more confident you are in your abilities, the better you can communicate stuff. I think for data people, though, it's an interesting one, where there's this tension between showing that you're the smartest person in the room and communicating effectively. And there's sometimes a dichotomy between the two. And I think what we found is, as an old boss told me once when I started out my data career, twenty years ago, I came to him with a bunch of numbers and tried to get them across, and he was like, look, when I ask you for the time, don't tell me how to make a watch. Just tell me the time. Right. I see the tactic and let's talk about some of it.
But I think after that I realized communication is everything, and the people I've seen succeed wildly in their careers are the ones who can sell and communicate to peers at other companies and other people in general. Communication is, I think, an underrated skill, especially in the data community, you know, as has been discussed on numerous occasions. I would say it's the most important ability, more than anything else. If you can't communicate what you're trying to get across, with data or without data, you're just talking, right? [01:17:39] Yeah, I've said it once, I'll say it again. Learn to build. Learn to sell. If you can do both, you will be unstoppable. And this is exactly why I interviewed Brenden Kumarasamy for my podcast. We had an entire episode all about how to master public speaking. So go check that interview out. It is chock full of amazing, amazing tips on how you can go and improve your public speaking skills. I spent a lot of time reading up on how to be a better salesman. One book that really helped me this year was The Art of Selling Anything. Another one was To Sell Is Human by Daniel Pink. Those are two books that I highly recommend. There's also Cialdini's book, which is Pre-Suasion, an excellent book to read as well. I highly recommend those. Yeah, those are excellent reads. Any other last-minute questions here? All right, guys. Well, I've got a very, very special episode releasing on Monday for the podcast, special to me at least. I interviewed Donald Robertson. He wrote the book How to Think Like a Roman Emperor. He wrote Stoicism and the Art of Happiness and a few other books. He's releasing a graphic novel about the life of Marcus Aurelius. So I think it's going to be a very good episode, right for the holidays. You guys are really going to enjoy it. So definitely tune into that. That'll be the last interview episode of the year. On the fourteenth I'm releasing a year-end wrap-up. Two more office hours left in the year.
You guys, thank you so much for coming. Thank you so much for hanging out. Remember, you've got one life on this planet, so go and try to do something big, my friends. Take care. Have a good rest of the weekend and we'll catch you next Friday at the office hours. [01:19:28] Good day, everybody. Thank you. Thank you.