open-office-hours-dec18.mp3 [00:00:07] Hey, what's up, everybody, welcome to the @TheArtistsOfDataScience Happy Hour. It is the holiday edition of the Happy Hour. Welcome, everybody. Hope you guys got an opportunity to check out the very, very special episode that I released on Monday. It was just me talking to you guys, and I made you a mixtape. So definitely check it out. People are funneling into the waiting room like crazy. It's going on and it's happening. Welcome, everybody, to The Artists of Data Science Happy Hour. So happy to have you guys here. So I want to kick this session off with a question that was sent to me by email by one of our listeners named Jay. Jay wants to know: what are the day-to-day activities of a data scientist in a non-research organization? Do they work from home, or have a similar lifestyle to software engineering or other types of jobs? What's the work-life balance for a data scientist? I mean, I'd say the work-life balance is pretty good, considering that all of us are here in various time zones hanging out in a massive Zoom chat. So I think work-life balance is quite nice for data scientists. I'd love to hear from Tom on this. And also, I can't believe there's so many freaking people in here. This is awesome. [00:01:30] Hey, everybody, I have to personally rejoice and I almost want to do a jig because Susan Walsh is in the house. Susan, and guys, I know I'm not putting her on the spot. [00:01:44] She's planned a special lip-sync for us, right? Oh, really? [00:01:49] Yes. I'll just go watch one of her many lip-sync posts. [00:01:57] Yes, I will. We need to have a proper singing session, Susan. [00:02:01] We need to do it. That's what we need. Yes, that is. [00:02:05] Well, so Harp, your question, and I don't say this completely jokingly: I don't know that balance and data scientists go together in any sentence or even any paragraph. 
I'd like to borrow from the extreme wisdom of Gary Keller, the founder of Keller Williams real estate. He wrote a book called The One Thing that I had to read because I'm a try-aholic. That means I still have to fight to not do too many things. And he made a point: there's no real such thing as balance when you're trying to excel and do good things. It's more that every once in a while you've got to stop and try to rebalance. And I've found over my many years around the sun that that's pretty accurate, and I'll shut up. [00:02:54] And it's a cliche thing that it's not work-life balance, it's work-life integration. I don't know if I believe that. I don't really know what that means, but it's the whole thing that's trending right now. [00:03:03] Maybe it's crap like that, work-life integration. So let's hear from my good friend John Sebastian. Josh Bashan is one of the mentors at DS Dream Job, one of my good friends. Super excited you could make it, man. [00:03:17] Yeah, well, I'm super happy to be here, of course. So it is my first time, which is kind of strange considering that we work together and I still haven't had the time to stop by and say hi to everyone. But hey, here I am. So yeah, what was the question again? How is it to be a data scientist outside of academia? [00:03:39] Yeah. What are the day-to-day activities in a non-research organization? And you come from a research background, right? [00:03:46] Yes. Yeah. I spent almost ten years, if not a little bit more, in research, and then I switched to the industry maybe a year and a half ago. [00:03:57] So yes, I would say when it comes down to my day-to-day, of course, you know, every day is different. But generally speaking, I would touch base with my colleagues and we would, well, we don't do this every day. [00:04:13] But still, you know, just to make us look very, very good. 
Basically, at the beginning of each day, we would talk maybe fifteen, thirty minutes just to have an idea of what we did the day before and what we're planning to do today. And we would just, you know, exchange some ideas about how we could tackle some problems or issues that we'll probably be having during the day. And it's a good time for us to explain our problems, which is always the first step to everything, and also to get feedback from colleagues as well. And then, you know, basically it starts: meetings after meetings after meetings. And then you realize that, hey, you know, I kind of have to do some work. But yeah, so that would probably be how it starts. And then, of course, you know, we all have our individual projects and we try to do as much work as we can. Being home is awesome. But at the same time, it's awful, because you really need to get, you know, a specific room for you to work. Otherwise, you know, your work-life balance is just out of the window. So, yes, I have my office here, so it's OK. [00:05:40] And yeah. [00:05:41] And I think the most important thing is to make sure to take a break every hour, because it's easy just to sit here and just work. I think there's something that is missing when we're not at work, and we don't realize that. But, you know, every once in a while someone will come up to you and say, hey, you know, let's grab a coffee. And without thinking, you just go. And 15 minutes or half an hour later, you come back and you start working again. And unfortunately, we don't do this, you know, in this situation. So we don't realize we work a little bit too much. And after a while, we just get tired and we don't know why. And now it's Friday. It's five thirty here on the East Coast, and it feels good to be here. So. [00:06:27] So to answer your question: data scientist is just a regular-ass job, just like any other job. It's not that much special, except we just get to use data and science. 
Anybody else want to talk about what their day-to-day is like? This was a question that came in from one of our listeners, Jay. He just wants to know what the day-to-day activities of a data scientist in a non-research organization are. We'll get one more response, then we'll open it up for other questions. Vin, what's your day like? [00:06:56] Need to be better on the turn on and off. For me it's kind of crazy. I think I'm a little bit unique because I do the strategy side of the house. So I take a lot of meetings. I spend a whole lot of time not talking and then maybe say two or three sentences, and that's, you know, an hour meeting. Sometimes I also do the hands-on building, developing some ideas, chaos. And I'll take breaks for like an hour to actually sit down and think. I've got that huge whiteboard behind me that sometimes helps. Sometimes it's just ideas that sit on it and will never go anywhere. And I think creativity is what I struggle the most to keep in my day-to-day sort of work-life balance, because a lot of times I'm building something or trying to figure out a problem and that just bleeds into my life. My brain doesn't stop working on that problem. The fact that I'm working at home, and I have been forever, you know, it bleeds together. I start to take the calls downstairs. And I'm at the kitchen table having lunch. And, you know, again, it's just kind of bleeding in. I code late at night or early in the morning. And again, you know, it's so easy because it's right here. I can do work any time I want to. And that's supposed to be the excuse for me to be able to do only eight hours a day or less than that. And it ends up that recently I've worked 10-, 12-hour days, and sometimes end up working on the weekends. I've got a talk, so I'm going to be working through the weekend to prep for it. Work-life balance: I think if I was just a pure data scientist, it'd be a little easier. And if I was in an office, I think it'd be a little easier. 
But it's a combination of COVID and what I do. It's integration. I think you're right. Maybe invasion's the right word. Work-life invasion. [00:08:45] That's an interesting take, man. So Jay, if you're listening or tuning in, that's a day in the life of a data scientist. All right, guys. Well, I just wanna take a minute here just to recognize everybody that showed up. A lot of friends of the podcast. We've got Vin Vashishta, you just heard him. He was on an episode with me once. We've got Srivatsan Srinivasan here, doing two live recordings in one day. Man, he's got some stamina. And who else we got? [00:09:10] We got Tom Ives. We got Dave Langer, Joe Reese. We got Giovana. We got Greg. Susan Walsh. [00:09:18] Oh, my God. George, Leona, Jennifer, Mark, Rey, Monica. We got so many awesome people. You guys are absolutely amazing. Thank you so much. Yeah. All right. We pretty much have all of the LinkedIn data community in one spot. It's awesome to see you guys. Thank you so much for taking time out of your schedule to come hang out with me on this Friday right before the holidays. I couldn't pick any better place to be than right here, right now, with all of you guys. So thank you so much for being here. So we've got a question in the chat from Saurabh. Saurabh, you want to go ahead and unmute yourself and we can help you out here. [00:09:58] Hi, everyone. This is my first time at the open office hours. Harp, I'm a big fan of your podcast. A lot of big names here, so excuse me if I'm asking some stupid questions. So I am a project manager, basically, with 15 years in the industry. I'm really curious and keen to switch to data science. I've been reading about it and have been connected to most of the big names here: Dave Langer, and mostly everyone. My question is as someone starting new. There are so many avenues. So for example, we have the SAS data scientist program. Then we have the AWS data scientist. 
Then we have the Azure data scientist. How relevant or how useful is that in the industry? So I myself am in the industry, but not in data science. But if someone has to pursue any of these programs: when you're interviewing people, when you're recruiting data scientists, how relevant are all these branded certifications out there? [00:11:17] All right. So that's a great question, actually. So if I can distill that down: there's so many different tools out there to pick up. And some of these tools end up offering some type of certification at the end. How do I decide which one I want to take, and what's the benefit of picking one over the other? Did I kind of get that right? So I can say that when it comes to SAS, SAS is really used in very highly regulated industries. [00:11:44] So when I was a biostatistician working in a pharmaceutical company, SAS was huge. We had to do everything in SAS. And then when I was actually in the insurance industry, again, everything had to be done in SAS because it was a, quote unquote, I guess, validated software, as they call it. So for certain industries, you can't really use open source. Those are two examples right there. [00:12:05] That being said, I'd like to flip it to, let's see if Srivatsan wants to help tackle this one. [00:12:10] Yeah. So I think you rightly pointed out, Harp, SAS is still used if you take banking or many regulated industries. Some of their traditional risk models or even forecasting models still run on SAS, you know. [00:12:26] I would just come back to the question: what do you want to do? Because SAS and Tableau or other tools are for different parts of the lifecycle, and you're coming from project management. Now, you have other avenues as well to enter into data science. From a project management perspective, you can just kind of move yourself to more like data project management. I know it's not an easy transition. 
I don't want to kind of say, like, it's going to be easy, but at the same time, take the domain knowledge from the project management that you're working on. You'll be working in a particular domain. Can you take the domain knowledge and get into project management where you work with multiple stakeholders, set up the backlog, and also create a road map for the data project? Right. That is one part. And then slowly see what tools to use, because you have 15-plus years of experience. If you are good at the domain to start with, then, coming to tools, you can decide whether you want to be on the visualization side or on the modeling side. That's where SAS or Tableau comes into play. Because if you take Tableau, it's one of the leading tools in the visualization world, along with other tools. So definitely these tools are useful, but it all depends on how you want to transition your career into data science. Excellent point. [00:13:43] So that's a great point about which tools to pick. And then when it comes to picking which resources to pick, I don't think there's any one magic course that's going to teach you everything or is separate from anything else in terms of content. But there are courses that are done by trusted, reliable, long-term-thinking individuals who really put the work and effort into how they lay out the content. For example, Dave Langer, he's got an excellent course. So Dave, I'd love to hear what you think about this question, about which tech stack to pick up and what's the benefit of one over the other. [00:14:19] Yeah. So my perspective is based on being in the technology industry for more than 20 years. Technology comes and goes. So R was around for a long time before it became hot. Python was around for a long time before it became hot. Python and R will eventually be cold technologies over a long enough timeline. Trust me on this, it's going to happen. 
So picking a particular technology stack, I would say, is going to be a tactical decision based on researching the kinds of jobs and companies you want to work for and the kinds of things you want to do for those companies, first and foremost, because a lot of hiring managers are just going to check the boxes. Do you know this? Do you know that? Great. Understanding the base concepts is what's really super important. So I tend to focus my content on the kinds of things that will stand the test of time, like SQL, for example. You cannot go wrong with learning SQL if you don't know it. And that's independent of a technology stack. However, if Snowflake is really super popular right now, sure, learning SQL with Snowflake makes sense. But I would keep that in mind: tech stacks are a tactical decision at a point in time. What you really want to focus on are the core concepts that are going to be reused and useful for you long term. So that would be fundamentals of machine learning, statistics, data access and data management like SQL, things like that, rather than worrying about R versus Python or AWS versus Azure or whatever. [00:15:44] So it sounds like principles never change; tools will change over time. So the focus should be just on picking up the principles that will lay the foundation for you to build on later in your career as the tides change, and just having the mindset of adaptability to be able to change whatever tool you're using and be open to experimenting with it. So thank you. [00:16:07] Yeah. Dave. [00:16:09] Awesome. So I've gone through the chat here and I've organized some questions as best I could. So next up, we got Ashan with this question, so Ashan, go for it. [00:16:18] Hey everyone. Sorry, let me scroll to my question because I forgot what to ask. All right. So Harp, so yeah. So what are some trends you've noticed on LinkedIn recently? Did you take a course on how to LinkedIn? 
I've noticed that people, you know, like their own posts to make them more visible to their connections, or certain tips like that that I picked up. But I don't know if there's like a course on how to LinkedIn that people are taking or, you know, what's going on. Is there something I'm missing out on? [00:16:52] That's a great question. So in terms of courses on LinkedIn, I've signed up for a few and maybe I've watched a few of the videos and just didn't finish them. So I'm notorious for buying courses on how to LinkedIn and then just not figuring out how to LinkedIn. So I guess I'm kind of not the best person to ask for that. How about Susan? What do we need to do to LinkedIn on LinkedIn? [00:17:16] So I have not done any courses, and I've been pretty successful without needing to ever do a course. I find that courses get you to not be yourself. So the most important thing is: talk in your own tone of voice, you know, present in your own style, how you would talk to your friends, your family. Do those kinds of things, and don't be afraid to test things and fail and learn. And, you know, it doesn't matter if a post bombs. Nobody will remember it tomorrow. Try loads of things, and be patient. It's taken me a good year and a half to build up my network, my brand, and to just figure out who I am and who I want to be. I like to comment on lots of posts. I do like my own posts, which is just habit now. I don't know whether it makes a difference or not, but I do it. I don't use any automation. I do everything myself. I think, especially with the more followers you get, you're putting yourself at more risk if you have any kind of automation at all. What else? Have a mix of content as well. So it's really important to show your skills, show your knowledge, but show yourself as well. So show some of your personal side, maybe some fun bits too. You don't need to do lip-syncs, but that's working for me. 
[00:18:45] Somebody who I see being super positive on LinkedIn, like just everywhere, is Giovana. So Giovana, what are your tips for LinkedIn? I can't find you on my screen. I hope you're still here. [00:18:57] Hi, thank you. I think the most important thing is, I agree with everything that Susan said. Susan, I love your style, your tone of voice. I'm your fan. And I think the most important thing is to be authentic, because you have to show what you are. Don't pretend to be another person. Just be yourself. The most important thing in our community is sharing and caring. And I think sharing knowledge can help our community to grow. It is the most important thing that we can give to everyone, and all the people who are here today, they know how to do that. [00:19:47] And I love it when I read every post of everyone that is here today. It's amazing to see everyone here. Everyone has their own style. And I think this is the human touch that we have in every post that we publish, and we need to maintain that. And I think that's why we have almost the same followers, but they love to go from one place to another: maybe the same idea, but with a different perspective. And I think this adds value to the information that we share. [00:20:27] Thank you so much. We actually have a couple of LinkedIn Top Voices and Top Voices alumni here. So, Greg, let's start with you. Greg is a newly minted LinkedIn Top Voice in data science. So we talked about this, and by the way, guys, I have an interview releasing with Greg early next year, recorded maybe a couple of months ago. It's been a while, but I remember during our conversation we had this talk about how to maximize LinkedIn. So share some insights with us. [00:20:59] Yeah. So for me, it's kind of like different steps, right? You pass the threshold of caring what people think, and don't take it personal. Right. 
And then be in a mindset of, you know, sharing, just giving off new information that you learned out there. For me, I see a lot of content that is super tailored to a crowd with knowledge about a subject, like SQL or data science. For me, I get lost. Right. I can understand it. But from a content creation standpoint, I kind of look at it a little bit, scope it out, and look at it from a distance and understand why. [00:21:49] So why do we have, for example, computer vision? What is it good for? What are the use cases? And then I kind of pull stories that are related to new discoveries. But I cannot go into the specifics of how an algorithm works, so I kind of take a different spin on it, where I discover something new that can connect with not only the specialist in data science but also a businessperson, who can also connect with my content. So in this case, to me, that's how I grew my base. And also another thing that's good is to create some sort of cadence. Do you want to do it on a daily basis? It does work, in that people subconsciously expect something new from you from time to time. So it could be a time range that you choose, whether it's between nine and 12 p.m., when you want to put something out. And then the other one, too, is: always be open to somebody else just slapping that post right back at your face with something that you didn't really know or expect. So create that way of communicating and getting the conversation going. One of the things that they told me when they nominated me was that the conversation was good whenever I posted something. And to foster that, I make sure to respond to everybody's answer. If you comment on my post, I'll make sure to respond to it. And most of the time, my learning growth happens when somebody shares something new with me in the comments. So that's what I did. [00:23:33] Thank you very much, guys. 
For everybody just chilling, a reminder that the chat is popping off. So definitely be active in the chat; it's going to be saved and published when this episode releases as a podcast episode. So keep an eye out for that link there. So, I see you've got your hand up. I added you to the queue, so we'll get to your question. I'll get to your second question later on in the program here. Next up, though, I've got Mark. Mark, you've got a question. Go for it. [00:24:06] I was just curious about people's thought process for how they go through debugging. For my current project, I'm basically building out a whole NLP pipeline and coming across a lot of different bugs that I create. And so I'm able to slog through them and get them done. But I definitely want to get more efficient at them. So I'm just curious about other people's steps, like, what's their critical thinking process? Like, what's the first thing I do, then I check this. I know it's not the same every single time, but I feel like it's almost an art. [00:24:38] And so I'm curious how other people approach it. And I'd actually love to hear from another fellow student. Looking at you, Leona. Leona actually works at IBM, if I'm correct, and she does big things over there. So I'd love to hear how you go through debugging code. [00:24:56] Hi everyone, that's my first time here. And that's a really good question, I guess. Debugging is really important and it can be challenging, but as you do your coding, you're kind of learning where to look and how to deal with it. And when I started my career as a data scientist, I was panicking when I was seeing errors or bugs. [00:25:21] But as time passes, you kind of know where to look, like the last line, or which specific lines. [00:25:28] If you are coding in Python, you should pay attention to that. And just like anyone else, I keep Googling what a certain error may mean or how other people, specifically on Stack Overflow, deal with it. So it's just like trial and error sometimes. 
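[Editor's sketch] The point above about looking at the last line of a Python error, and at which specific line failed, can be illustrated roughly as follows. This is a minimal sketch, not something shown in the session: `parse_age` and `describe_failure` are hypothetical names made up for illustration, and only the standard-library `traceback` module is assumed.

```python
import traceback


def parse_age(raw):
    # Hypothetical helper used only for this illustration:
    # converts a piece of user input to an integer age.
    return int(raw)


def describe_failure(func, *args):
    """Run func; on failure return (last traceback line, name of failing function)."""
    try:
        func(*args)
    except Exception as exc:
        # The last line of a Python traceback names the exception type and message.
        last_line = traceback.format_exception_only(type(exc), exc)[-1].strip()
        # The final frame of the traceback is where the error was actually raised.
        frames = traceback.extract_tb(exc.__traceback__)
        return last_line, frames[-1].name
    return None, None


last_line, where = describe_failure(parse_age, "forty")
print(last_line)  # the line worth reading (and searching for) first
print(where)      # the function in which the error was raised
```

The last line is usually the most useful thing to paste into a search engine, while the final frame says where in your own code to start looking.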
[00:25:50] And how do you suggest doing some debugging? [00:25:53] Ok, so in the whole application world, debugging was pretty easy because the tools and everything supported it. Let me maybe go into one level of detail on the main thing I noticed. When we are looking at a data scientist, debugging is the toughest part, because you have an entire pipeline that you built, from data sourcing to your final deployment or insight completion, and in the entire pipeline it's very difficult to know at which place it is failing, because we as data scientists sometimes complicate it by not writing structured code. We just take a single notebook and push everything inside and think, OK, we are done. The very first thing is: modularize your code. Have a clear directory structure for your preprocessing step, for your visualization outputs, for the models, and also for the model in production. It cannot all be one long notebook. And use the right libraries, be it your testing libraries or other things, before going further. The second is logging. It's very important, it's critical: you need to make sure you log everything to a file. Also, in the cloud, you can use cloud services where all the logs are centralized. Now, writing 10 different logs for each and every process is going to be even more complicated. So your logging has to be thought of as a horizontal capability in that process. So set up your logging framework upfront, try to integrate the logs, and try to also create links to the model metadata and the model. [00:27:36] But that's where the analysis comes into play if you're talking about machine learning in general. So try to create more kinds of outputs, because your insight cycle is going to be multiple experiments. Each experiment will give you separate results, right? Or a separate metric. 
If you keep on changing your pipeline and generating results, you don't know which metrics you got for which model code. Right, so start logging each and every model's metrics. And this logging framework, any company has to invest in it upfront, or we have to invest in creating it, so that tomorrow, if you want to go back, like, what did I do two weeks back, when that was the best insight that I got, you can always go back and revisit, and the logging will allow you to go ahead and get exactly there. Right. And then, maybe in the code itself, there are ways to search Stack Overflow for the error and get the results. You can do that, or you can do a manual search. So that's what typically we do. [00:28:35] On logging, Carlos, is it different with R when it comes to debugging? Do the principles change depending on the programming language? So let's hear from Carlos about debugging, then after Carlos. [00:28:48] Yeah, I think everything he said is exactly right. [00:28:51] I would throw in that R has the pins package that lets you save the status of various things. I think in Python, Metaflow does a bunch of stuff for that too. But I was just putting in the chat that at RStudio Conf 2020, Jenny Bryan had a whole workshop on debugging. I went to all three days. There is also a keynote speech about debugging. I highly recommend that; I linked it in the chat. But also, the number one way to debug is to not make bugs. And the way to do that is to really have in your head what the desired output is before you throw your input into a function. I see a lot of programmers who are like, yeah, I took this code off the Internet and it worked for that guy and his data frames. It didn't work for me. Why is it broken? And of course, you know the proper things to do: make a reproducible example, use traceback, use browser, that stuff. I really like to have your input and output in your head as you code so you're not debugging uselessly. 
As I said earlier today, try to get out of the situation where you're accidentally causing bugs, and you do that if you think about your inputs and outputs. Excellent points. [00:29:53] I think that Metaflow is an awesome package, actually. I just recently found out about it. It's developed by Netflix. It's open source, completely open source. I plan on experimenting with this in the next coming weeks. That's actually a really great topic that I'd love to hear from more people on. I think we can even get into some of the philosophy of debugging. So let's ask a couple more people what they think. I'd love to hear from Monica about that, and after Monica, let's go to George. Hi, everyone. [00:30:21] I came with my bill line. [00:30:25] I just have two general things that I wanted to share when handling errors. So what I like to do when I'm learning something new is kind of fail on purpose. So I just break it and see what error pops out. And then if you do that enough times, when something comes around again, you would notice: oh, this has happened to me before. I know how to fix this. And then another one is: be very active on sites such as Stack Overflow. You probably use that to research your own errors, but really get into answering other people's questions. I think that really helps you solidify what's going on in the back end as well. [00:31:15] George, there's one thing I do want to mention. Great answers, by the way. I started my data career as a software developer, and I think there's one commonality between software developers and data scientists, and it's the fact that we tend to go at tackling the problem solo. And that's not a bad thing necessarily. But it's sometimes valuable to try and ask for help, try and ask your colleague or somebody else, who might not know the issue at hand as well as you do. But just getting that fresh pair of eyes definitely helps. 
And I know if I'm asked to do debugging for somebody else, I'm not always happy. But sometimes it's nice to tackle that challenge as well and see what you can find. And you know what, it can save you a few hours instead of digging into it yourself and finding out where the bug is. [00:32:03] Thank you very much for those awesome, awesome tips and advice on debugging. [00:32:08] I just have one more thing to add on debugging. Oh, definitely. Go for it. Yeah. One of the things I've noticed when I debug my code is that 90 percent of the time it's something really stupid, like a misplaced comma or a misspelled word or something like that. So I would say if you find yourself spending more than 30 minutes trying to solve a problem, there's a very good chance it's something that you missed, and it's usually something really stupid and simple. [00:32:33] So maybe check the simple stuff first and that might save you some time. [00:32:38] Awesome. So I guess to also address that issue, have a good IDE. I use VS Code. Whatever you guys use, type it out in the chat. I'd love to hear and see what it is that you guys use, so definitely type that out. So next I have in the queue Floran. After Floran, I've got Jennifer, who will spark off a Python versus R debate, which should be a good one. And then after Jennifer we got. Okay, so Floran, are you still here? Yes. Awesome. Go for it. [00:33:09] So my question was regarding scraping and building a dataset. So I'm planning to kind of scrape the whole world wide web, in a sense. And there are different websites: on Goodreads you can find what people read, on other websites you can find where they travel and other things like that, and do, like, a hobby database discovery. And I'm curious if people have worked on this. And also, I know legally it's somehow complicated, but also I don't know what the challenges are. So how do people see it? 
[00:33:39] So you're asking us to give you advice on how to scrape personally identifiable information on the web? Did I hear that right? [00:33:47] So I know how to do it technologically, having the database and putting more data in. Also, maybe there are projects that are doing this in different universities. So I don't know. [00:33:59] Yeah, you're saying that it's a bad thing, but there's an entire industry called open source, and that doesn't mean open source like R, that's focused on scraping everything on Reddit, everything on Twitter, everything on Instagram, everything. If it's public, a government that you don't like is recording it. [00:34:15] So that's just one of the things that I'm doing. I'm mostly scraping the users. Not exactly, not that much, though, because this is too much capacity for me to handle. So I prefer to have, like, metadata about the user. I don't get that much of the posts because I don't have that much memory. [00:34:36] Anybody want to take a stab at that? [00:34:38] I guess I'll go ahead. Go for it. I guess one question I would ask is: is this for research purposes or is this for commercial use? [00:34:47] I like data, so for me it's like playing with the data and seeing. I don't know how I could monetize this. It could probably be different, but it's mostly for me, because I like to see what the results would be. [00:34:59] Got it. So in general, I don't want to make this a blanket rule. Everyone, please feel free to argue with me on this, because I'm sure that is wrong. But if you're an individual and you're doing it for, like, research or, you know, toy dataset purposes, typically they won't go after you for that. Some websites, if they do have, like, you know, a 
kind of scraper sort of initiatives going, it will be a little bit harder to do that if you are doing it for like eventually if you want to monetize or commercialize it, that's where you could kind of get into trouble potentially. [00:35:35] Well, it depends. But in general, just like don't do it. I just think it's safe to say don't do it. Secondly, like also from I guess like a lot of places they offer APIs. That's honestly like the better way to go about is if you can get access to, like, legitimate API, I would just go do that. And then like the other thing I would consider is like, if you really want to, you can you can look at scraping surfaces. I've already prescribed it. It's one of these things where it's like figuring out like, are you are you doing it because you're just like interested in like learning how to scrape that can be very, very useful. Or are you trying to like, do a project, Alvah? Are you trying to spin a product or service? If you're trying to do product or service, just don't do it. If you're doing it for just like an individual project research purpose, you probably don't need a whole lot of Data just to go like, oh, I can write like, you know, like a beautiful soup pastor. Or I can do like little, you know, crawling spider. So I would just say, like, consider that because like even those are LinkedIn racing, like they had some Rolling's right about like how you could approach their scraping. But it's it's still tricky because like if you, for example, take a project that you eventually want to monetize in turn, like if it's someone else's intellectual property, then that will get you into trouble, like long term. So I would just say, like, you know, consider some of those questions, you know, are you doing it for individual purpose? If you do eventually want to monetize it, just don't do it, you know? Like what? What do you really need it for? 
Everyone else can kind of... I think there's a lot of people here who are experts who can speak to that a little bit more. [00:37:11] Yeah, this is just a topic I'm going to move right past. [00:37:15] I'll give you a white hat hacker answer. [00:37:17] Yeah, you do that in the chat. So there's a resource posted in the chat on web scraping, but we're just going to move past that. Next up is Jennifer Nurdin. You have a question. [00:37:32] So I don't intend to start a war here, but I like learning about data. I'm very much on the business end of data pipelines, but I've got a lot of databases to get to. I need to start learning some code over vacation. I want to deep dive into either Python or R. Which one should I deep dive into? You can give me one reason why, and one reason why I should not do the other one. Go. [00:38:02] I was going to say one or the other of them, so I would recommend one of those. Yeah. Oh, good. Excellent. You're on the right track. Either one of them is a good place to start. Then, any contrarian views? [00:38:16] Let's hear it. [00:38:16] R and Python are the actual correct answer. [00:38:20] I have been there and done that. Now I want something new. Oh yeah. [00:38:28] Yeah, I always forget the answer to this, but the way I remember it is that Python's for products and R's for research. That helps me remember, because I forget when it comes to picking up TensorFlow over Christmas break. [00:38:42] So, I'll repeat this: I would just flip a coin. [00:38:53] Right. Heads or tails. I mean, the holy wars are kind of silly, in my opinion. I mean, I'm more of a Python guy and we're getting Python like most of the twenty, twenty two. So that's the interest in Python, but that said, R is great too. [00:39:10] I think the language wars are just utterly stupid. You can flip the coin and go along either way. Yes, real quick? [00:39:18] Important, just a real quick one.
For those of you Python people that are privately jealous of R Shiny, you can pull it into Python. [00:39:30] We also have Dash and Streamlit, which are really nice. I will admit R has made me jealous for many years. [00:39:37] So Jennifer, I'd say this. I'll give an unbiased answer, and then this one has to go. So take a look at, you know, I don't know if it is just for your own personal development that you're trying to code, but take a look at people in your organization that you're close with, around you, and see what they know. Right. Because if you're in an organization and that organization highly favors use of R, that means you'll have a lot of people that you can tap into and be like, hey, [00:40:03] I'm stuck on this thing, rather than having to post it on Stack Overflow and search for hours and hours. You just go to somebody, like, hey, need some help, help me out. So I'd say just take a look at people around you, people that you know you can reach out to, who know some programming language, and just see what the consensus is among them. [00:40:21] It is pretty balanced. And that's why I'm like, well, I've got to pick one. I only have a couple of weeks. [00:40:27] Yeah. And I think that kind of points to R versus Python being sort of one of those academic debates, because at the company I was at, which wasn't big, [00:40:38] but, you know, it's around a couple of thousand people, for our data science team of 30 or 40 individuals, and we're serving different teams, it really was split 50-50. And all the managers had to know R and Python. Right. Because sometimes we would get code from a different part of the business and then we would need to adapt it. And so if I was like, oh, you know, I can't do this, the manager or the team leader is still usually on point to help mentor and guide that sort of translation.
I know for me, I started learning R in academia, and when I went out, I then learned Python. But I think at some companies I still had to do both. Right. I had to read R code and then understand, if issues came up, what was going on. So if you just pick one and rely on the people around you, you know, you can't really go wrong, because I think honestly there's so many tools and libraries in both languages that you'll be sitting pretty on either one. [00:41:39] So we're going to do this. I'm going to actually just fire up a poll and everybody can vote, and you can pick whoever the winner is. And we'll just go on with the next question. The next question that we have in line is from Ashley. So everybody, let me set up a poll real quick and we're going to decide Jennifer's fate. So in the meantime, Ashley, go for it. What's your question? Is Ashley still here? OK, you might have left. I think Ashley said it. [00:42:05] I think I had to leave. [00:42:07] OK, let's take a minute to recognize some of the folks who have jumped in, just because I finally got a chance to see who's here. I just realized Ben Taylor's here. Ben, what's up? We got Sarah. [00:42:17] Sarah, welcome back. Glad to have you here. Cameron is also here. Keep an eye out for an episode that I have with Cameron. We got Ray in the house, got David Telo. Beautiful, wonderful, amazing people. Thank you so much for being here. [00:42:31] So let's go to Karen's question next. Are you still here, Karen? Has Karen left? I think Karen left, actually. So, Ashley, what is your second question? I really like this question and I'd love to hear from everybody here on this as well. [00:42:47] Yeah. Hi. Second time. So what are your expectations from someone in a junior role? I'm currently a junior data developer, so I feel like I'm kind of all over the place as far as learning goes. Like, I'm handling, you know, a couple of different tools at the same time.
So are you expecting me to, like, know it all or just, you know, document stuff? So, yeah. What are your expectations for someone junior? [00:43:14] So, Sarah, I'd love to start off hearing from you. And then after that we'll go to a Shreveport's, and then we'll hear from Monica, Mikiko. We'll hear from everyone, because this is a really, really important question. Go for it. Awesome. [00:43:30] Yeah. I like this question. I guess the way I would guide someone who is junior is to not get super overwhelmed, to test out different things, but find out where your strengths are. I think when I had first started, it was really, really easy to just see the sea of everything you don't know, and feel like the expectations of you are extremely high, and feel kind of down on yourself that you can't meet everyone's expectations. And so I think that also, coming from above, we should be a little bit more level set on what our expectations are of junior folks, and recognize that it is so widespread and you're not going to find a data unicorn. So I would lean into projects that you find very interesting and that are very much related to the industry that you want to stick to, and focus on developing the skill sets for those. I know we harp on communication skills. I think documentation is extremely important, and then focusing on industry-specific skill sets that can help you along the way. So that would be kind of how I would guide you there through Butson. [00:44:42] What do you look for? [00:44:43] Yeah, so if you're specifically looking for a skill, I would say the very first and most important skill that is required is SQL, because most of the time as a junior, when you're coming into a project, you will be given data to analyze, and 90 percent of the analysis can be done with SQL. Right. Like, all the way down.
Tableau will visualize and show you the output, but SQL does all the job, so that's where you'll be spending most of your time, right? So that's a good start. You need to go advanced into SQL, I would say. Right. Like, everybody can write a SELECT and all. But when you're doing analysis, you may need some of the analytical functions of SQL and all the CTEs and everything. So go into detail with that. That's a good start. And as you get into the project, right, maybe if you know Python you can go into detail on it, and you can learn on the job. But that's what I would say. But the industry is completely polluted with all the jargon and literally everything that's asked of juniors. That's kind of unfortunate, but I would say that SQL is a good start. [00:45:53] Monica, what do you look for in a junior data scientist? [00:45:57] Yeah. So aside from any of the technical skills, what they don't usually put on job descriptions are those soft skills, mainly curiosity. So the main point of your job as a data analyst or a data scientist really is to solve problems. So if you're curious to understand what that problem means and what you need to do to solve that problem, and you're continuously learning those types of skills, that's what I lean towards. Because you can always improve your SQL or Python or what have you, those skills, along your journey. Then what do you look for? [00:46:39] I think, you know, everybody's brought up most of the main points: soft skills, curiosity, usually one or two technical skills that are really important based on the business. SQL is obviously huge. But every once in a while you'll run into a position where you might need something else as well. [00:46:57] What I'm going to do is partner you up with a technical lead. I'm going to partner you up with somebody who can do mentorship from a career perspective.
And those are the two things, not so much that I'm looking for, but I want you to be able to develop under mentorship and under technical guidance. I want to see that you're progressing quickly and learning from someone who's essentially on the same project, handing out smaller tasks to you and frequently checking in with you. I think there's got to be some sort of a framework and especially a safety net. That's a good way to say it, so that you never get too far out on your own. And like I said, from my perspective, that's what you can expect from me. But then the expectation of you is the rapid learning. I want to know that I've hired somebody that learns quickly and someone who's willing to try new tasks. [00:47:51] Actually, I'd love to hear from a listener as well. [00:47:55] What do you look for in a junior data scientist, maybe specifically at your organization? What kind of candidate do you look for? [00:48:04] So soft skills, as pretty much a lot of people mentioned, are really important. And we're looking for people who are curious and ask questions, because a lot of things can be new when you just start your role. So asking a lot of questions and clarifying things is really important. And the other thing is networking with others. Talking to other junior data scientists or senior ones is another thing which we really, really value. [00:48:38] And what do you think then are you mean? [00:48:41] Yes. How are you guys all engaged in a minute past me on this one? [00:48:44] Yeah, definitely. So let's hear from a listener. I don't know who wants to go, but there are so many people I would love to hear from. So, Amaia, do you have an answer to this question or do you have a question? Because I have you in line at number four from now. So if you have a question, just sit tight. Mikiko, what do you look for in a data scientist? And then after this... [00:49:09] Yeah, I will caveat that.
I think I will caveat that there are two factors that kind of impact what is expected of a junior data scientist. One is the company size. So if you are in an early-stage startup, the mindset typically is, if you're really early stage, you're in a small business, it's like: we might not be able to pay you a whole lot, but we'll try to treat you with respect, but also give you a lot of responsibility, and the learning can be really fast. So I think there, personality and grit and creativity are super important, a lot more important than having advanced technical skills. There is still the bare minimum of technical skills. Right. So that's SQL and some kind of scripting language, just because, once again, there's a lot of additional analysis that you might need to do, and visualization. [00:50:03] But the personality, the grit, the creativity, being able to go work on a problem, search Google for issues and bugs as they come up, but then coming back to the team lead or the manager if you run into issues. So that's for an early stage. For a more established company, you might not be wearing multiple hats. You might be wearing kind of one hat. And so if the central focus of your work, for example, is strategy analytics, the expectation there is not so much technical, but, first off, that you can ask really good questions. Secondly, that you can kind of scope and do some kind of time and project management. But, you know, being willing to ask for help if you run into scope issues. Right. If, for example, the product of your work is more engineering focused, then I think it's having the technical skills. Yes, maybe you might not be as up to date on best practices. Maybe you might not be up to date on Scrum or Agile or anything else. But those are some things that we can kind of teach you and show you.
And if you're doing research, once again, it still goes back to: you don't need to have advanced knowledge. You just need to have just enough to be able to do the work, but still come into it with that sort of being willing to ask questions, being able to talk to people, not isolating your business partners, all that good stuff. The technical bullet points, honestly, are SQL, some kind of scripting language, and then some way of documenting your work and communicating it. And it could be visual, like a BI tool. It could also be some kind of document management tool. I mean, that's pretty much what I look for. [00:51:55] So, yes, I definitely think there's something to be said about having a collaborative mindset and approach when you're dealing with stakeholders. And that's definitely something I would look for in a junior data scientist, given that they oftentimes are having to translate a lot of the heavy business problems into a technical solution. So definitely that's something I would look for, as well as the ability to communicate with data. So, someone who's very analytical, who knows how to utilize different types of data visualizations or set up dashboards in such a way that it helps translate recommendations very soundly, so that your stakeholders are not confused about what to do with the information that's being said. I definitely think that's a great way to help the teams, and the more advanced people, know where to spend their efforts. So I definitely think that's a good roadmap to helping drive someone's career as well. So that's something I would look for, more on the soft side. As someone who is interested in being in data science, it's wonderful to have so much great advice. [00:53:05] Great tips. Ashan, I hope you're taking notes. Even if you weren't, I transcribe this, so it'll all be there for you. Next up... First of all, cheers, everybody. I hope everybody has their holiday beverage.
I've got a cranberry stout, getting really festive with it. So thank you guys so much for hanging out. [00:53:24] Next question we've got up is from Lolita. Are you still here? Then after Lolita we'll do Eric and Greg. And then I'll circle back and see if the others are still here. So next up, Lolita, still here? Hi. [00:53:39] Hi. Yeah, I'm here. So it's my first meeting, and I will introduce myself. I'm a graduate student at the University of Minnesota and I'm applying for full-time opportunities in data science and machine learning roles. But I have had hard luck so far. Like, I'm not even getting calls from the companies. And when I apply on LinkedIn, I go to the job description and I feel like this is the right role for me. But even then, I don't hear anything from the recruiters, from the company. So what advice would you have for a person who just wants to enter this data science community? Like, I have done academic projects and I have professional experience of more than two years, and I'm also doing an internship with a company over here in the U.S. So I think I have a pretty good background. But still, I'm not getting anything so far in the industry. [00:54:47] Yeah, definitely. That's a great question. So I'll chime in with a couple of bits of advice. And I'd love to hear from Sarah, then Mikiko, then Ben. I would say this. First, make sure that when you apply for a job, you don't just apply for the job and just assume [00:55:03] that they're going to call you back. There's still work to be done after you hit the apply button, and part of that work is to let people in the company know that you applied for the job, whether that's going to LinkedIn and trying to find somebody that's a technical recruiter and then shooting them a message to say, hey, love what you're doing with the company. I thought it was amazing. I'd be so happy to talk about how I can contribute positively in this role.
I'm just, like, making stuff up off the top of my head. But you get the picture. You want to make sure you're actively letting people know that you've applied for the role. Now, if you can't find anybody on LinkedIn that's a technical recruiter, then maybe try to message a data scientist and just let them know that you've applied for the role, and that's about it. I wouldn't ask for a recommendation or anything like that. Next bit of advice, I would say, is just be persistent. It's a numbers game. For any application that you submit, you need to assign a prior probability of even getting that job at, like, less than one percent, because you're one of maybe a hundred thousand applicants, maybe. So just keep opportunities in your pipeline by applying, applying, applying, and then following up and following up. So let's hear from Sarah, Mikiko and then Ben. [00:56:24] OK, I guess the first thing I would say here is that I had done a talk on networking. You know, forging relationships in the community is extremely important. The timing, though, is also important when you create those relationships. So I did a talk during Keith's data science conference that talked about the importance of those relationships and when you can actually put them into play. And so you're currently in the stage where you're on the job hunt. If there are connections that you've already forged with people who can help you in this stage, I would leverage those. In the case that you don't have that, I would say creating, I don't want to say noise or chatter, but your own personal brand within this community can also be helpful for people to see your profile. Since I started, I think my first job was the only one that I actually applied for. Since then, I've been approached by recruiters, which means that I'm now on their radar.
And it's the same for many of us who are active on LinkedIn: engaging with the community, putting your name out there, having a personal profile where people can look at what you've created and what you've generated and be interested in having you on board. And to what Harpreet said, I think it also plays into... what did you mention that I liked? You said something really good. [00:57:57] Now I'm forgetting. But yeah, just forging those relationships, talking to people within the company. And shoot, if I remember it, I'll come back. Definitely. So let's hear from Mikiko and then Ben. [00:58:14] And then after that, let's hear from Tom. [00:58:19] Yeah. So like Carlos and Matt have kind of said in chat, right, part of it is about your sales pitch. So for context, right, for my particular background, I only have the bachelor's in anthropology and economics. I did a boot camp at Springboard, but for the most part, I think by most measures I would be considered very uncompetitive or unviable for data science roles. But for me personally, the way I was able to leverage it was, first off, I had sort of domain expertise, specifically in sales, marketing and revenue operations. But also, from working with sales teams a lot, I learned just really how to develop my pitch. What is your unique value proposition? And, I hate to say it, it cannot be: you have a PhD in X, you have a master's in Y, you did your undergrad in Z. It depends on what kind of role you're going for. If it's a research role and you have published papers and all that, like for Google Brain or Facebook Research, then that would work for them. But, for example, if you're applying to data scientist roles that are closer to strategy, or senior analyst roles, they actually don't care as much about your education. They tend to care more about your problem-solving abilities and how you can demonstrate that you pushed the needle.
And so they care about if you've worked with product data, if you've worked with sales and marketing companies. So one thing I would say is that, first off, really understand the kind of roles you're going for. [00:59:49] Right. Essentially, there's three sort of personas within the data science world. You're either doing engineering work, either on models or data pipelines, you're doing some kind of research work, or you're doing some kind of analytics. You know, they each have their kind of things that they like to see, so I would say figure out which of those roles you want to go for. So definitely don't spray and pray. Then secondly, develop your pitch such that you are focusing on your unique contribution or what you could bring to that role. You know, Harpreet and a couple of other managers like to talk about your superpowers. Your superpower doesn't have to be technical. Sometimes it can be, for example, that you're a really great facilitator between, you know, totally different teams. That was my pitch: that I understand sales and marketing for companies in early-stage startups, and both engineers and sales teams like to have me in the room, which apparently is rare enough that people's ears perk up. So I would say focus on those two things, or three. One is understanding the roles and work that's available. The second part is understanding which of those roles and what type of work you would like to be doing. And then thirdly is understanding what are your unique value propositions that you can bring when you are tailoring your resume and your CV. And this is where getting other eyes to look at your resume, like, for example, Carlos made that offer, can really help with bringing that out. [01:01:17] Ben, I'd love to hear your perspective on the second part of her question, which was essentially, she looks at these job descriptions and she's just overwhelmed by the requirements.
What are your thoughts on that? And also, can we please hear the story behind the hashtag #unrecruitable? [01:01:37] Yeah, hopefully this isn't too unflattering. So, really cool, people: I'm at Brighton, Utah, the ski hill, right now. So it's good. It's got the powder. It'll be good. It's about 20 minutes more. So, people don't know how to write job descriptions. [01:01:58] And so the joke that people throw out there is: I want 20 years of deep learning experience, or something that's stupid, that just doesn't exist. Or you'll actually see requirements where it's impossible to fill that need. So imagine: I want an R expert and a Python expert. That's super rare. If I add the big data stack there, that's even more rare. So I don't read too much into descriptions. I really like what people have been saying about your brand. So, professors really don't know that much about getting you a job. They know how to teach you, but they don't know how to get you a job and they don't understand branding. So I tell people, go give a presentation. Go give a presentation at a meetup once, help build your brand, get yourself out of your comfort zone. The other thing I wanted to say is, if you're normal, I'm so motivated to not hire you that I want my team to suffer. And if you have curiosity, that's great. But if you have a passion, that's huge. So get that passion and show people you have it. On the unrecruitable story: I was just sick of getting hit up with recruiter spam, which everyone on this call has been hit up with, where they want me to be a Java developer or something. I'd rather just see what's next after this life than be a Java developer. So I put that up there, thinking it would help. It didn't help. Still in there, still on the list. [01:03:19] The comments are blowing up. But just for the record, answering that while on a ski slope.
[01:03:23] Yeah, those are awesome tips from an awesome place. [01:03:28] I want to be answering this while I'm in the air. I'll be like five on something stupid. [01:03:36] Jennifer, so far Python is in the lead, thirty-three to six. There is still time to vote if you haven't already. So who else wants to share some insight on this? So the issue here is, she's got a couple of issues: she's applying to companies and she's not hearing back, and she's getting kind of intimidated by these job postings. Monica, what do you think? [01:04:01] Sorry, I was typing in the comments. What is the question? [01:04:04] So she's been applying for roles and just not hearing back, but she feels like she's got a solid background for the roles. Like, there seems to be a match between the work she's done in grad school and the job that she's applying for. But she's not hearing back. And also, when she goes to apply for jobs, it seems like the description is just like, what the hell, this is insane. Mm hmm. [01:04:32] Yeah, I don't have much to add. Everyone's brought really good things to the table as far as reaching out to potential recruiters on LinkedIn. If you see somebody that can help you out, or even if they're not a recruiter on LinkedIn, if they just work in a department that's related to where you had applied, you can ask them maybe if they know a specific recruiter, and kind of just weave your way through to find the correct contact. It's just all about networking. Reaching out, don't be shy about it. The worst thing that they can do is say no. [01:05:09] So that's what I would add. [01:05:12] So I think somebody who has some valuable insight here is Joshua Bastion, because we answer questions like this multiple times at Data Science Dream Job. So this is something that I know you have an answer for. So go for it, John. [01:05:25] Right. So, I mean, there were a lot of great answers before me.
But if I could add just one thing, just remember that this is just basic human psychology. So you're trying to develop a relationship with someone and you're trying to sell yourself to someone so that they will consider you for the position. So if you're just on LinkedIn and you're sending your resume and sending messages to recruiters saying, hey, you know, pick me, well, that might be good, actually, because if you apply for a position and there's like a hundred candidates, then you're no longer just a piece of paper. Suddenly you're someone, you're an individual connecting with someone else. And actually, that could help you out, just being picked out of the bunch. You might not get hired, but at least you will be considered, or you could be considered, because it's all about maximizing your chances of being picked. So that's one thing. Now, when it comes down to developing your network, I mean, I agree with everything that's been said so far. The only thing that I would add is, and, you know, we hear this all the time, people saying, oh, you know, I sent an email to that person and they didn't get back to me. Well, OK, but what did you do? So basically, don't ask for a favor unless you have something to give them. And if you don't have anything to give them, which is probably what will happen, and it's totally fine, start developing a relationship. [01:07:00] And a way to do that is to, I would say, bring value to the other person. So let's just say I want to connect with, I don't know, Harp, and I want to connect with Harp on LinkedIn. What I would do is go on his profile, see his connections, see articles he has talked about. You know, I can figure out that he's actually the host of a podcast. So first things first, you know, just send a message saying, hey, you know, I just saw that you're hosting a podcast. This is awesome. It looks very interesting.
And that's it. Don't ask for anything. Just create that relationship. And then maybe a week later, come again, bring something else, and then, you know, start trying to develop some sort of a connection in which you can actually start to exchange with that person. And then maybe you can bring up some concerns, say, hey, you know, I've been looking at this position in your company. Would you be able to help me out, or how can I show that I would be a good candidate? So this is probably what I would recommend. Just remember that you are in a relationship with someone, and just remember that you're connecting with another human being. Can I add something real quick? [01:08:25] Yeah, definitely go for it. So if you're looking for a data science job and you're not getting anything, maybe you want to change the strategy a little bit, and maybe you want to focus on companies where, in their culture, like at Amazon, for example, they really encourage people to move to different positions. So maybe what you want to start with is a data analyst position and then you move after, because what they usually look at is: how long does it take me to train that person to gain some knowledge in the area that I'm hiring for, that I need help with? So if they feel like that training time is long, you're not going to hear anything back. Also, like everybody else said, it's all about selling yourself. Right. What can you do for yourself in terms of domain knowledge? And how can you back that up with sound data to showcase that you know what you're talking about, [01:09:25] that you've solved this issue already, and things like that. And for young ones, time is your best asset right now.
So there's nothing in the book that tells you that you have to start as a data scientist. A company like Amazon wants a data scientist who's done so many things that maybe it's good for you to start somewhere else and then work your way into it. And now somebody will look at you as someone who already knows a little bit about the company, who can be trained a little bit faster and get up to speed faster. So look at different strategies. Don't just look at I want to be a data scientist and get locked in on it. [01:10:13] Thank you very much, Greg. So the last person to hear from on this question is Leona, because I know you went through a similar journey to the one Lolita is going through here. What tips can you share with her? And then after this, we'll get to Eric's question and Greg's question, then Amaia's question, and so forth. I love the question. I was in the same position that you are in right now. [01:10:37] Almost two years ago, it was really hard for me to get any interviews. [01:10:42] I was applying on LinkedIn, trying to talk to people on LinkedIn, but I found this conference which was related to my field plus data science. My field is economics, and I just put my resume there. [01:10:59] There was supposed to be a job fair at that conference, and I got some interviews there, and I got my offer from IBM from there. So the point is, if you can find any conferences, either through your advisor or by yourself, on Twitter, LinkedIn, anywhere you can find them, just try to attend. [01:11:18] I believe it's a little bit challenging with the pandemic, not talking to people directly versus when you could go to conferences, but still consider that I got my job that way. [01:11:31] So, Lolita, lots of great advice there. Don't worry. This session has been recorded.
There will be transcripts and you will definitely have access to these answers. Hopefully that helped you out. And good luck in your job search. Next question. Let's go to Eric, and then, just preemptively, for the answers to Eric's question, I'll go to Dave and Cam. Cool. [01:11:51] So I have one. I wanted to throw out one quick thing to Lolita, as a fellow job searcher: here's something that's worked for me. I didn't make this up. This came from Reno Perry, who's way smarter than I am. Go to LinkedIn, go to the search bar, and just press enter. You don't have to search for a word. Then go to content, then posted in the past twenty-four hours or past week, then go to the companies that people work for and type in the name of the company you want to work for, and see what people have posted. And I have gotten two interviews from people posting, hey, we're hiring, DM me. So I send them a message. I'm like, hey, saw this job. And I post on LinkedIn, so if they want to look at my profile, they can see that I'm there, I'm active, I'm a real person. I've had two interviews from it, so it works. Try it. And then you're actually talking to the people that you want to talk to. So there's that. So my question is: I had someone reach out to me on LinkedIn to talk about a project, and they basically said, I have an unlabeled product [01:13:00] data set. I wrote all this down so I can remember it. So: I have an unlabeled data set with thirty-six variables that seem to show correlation into six major categories. I want to use the data to identify similarities and hopefully create clusters or profiles of those groups. I want to label this unlabeled data, basically. Right? So I used latent factor analysis with varimax rotation to create six factors for categorizing the data. They didn't come labeled as being six categories, but the rotation works. So two quick questions.
One, does it make sense to cluster on the scores of each observation for those factors? I think it makes sense, but I just want to double check. And then my second question is: using this model that now exists, can I use it on validation data to get scores and assign a cluster or group to those new observations? Does that make sense? I'm trying to figure out if I can use that predictively. [01:14:06] So the short answer is yes. The long answer is it depends, because there's no such thing as a free lunch, right? It might work and it might not. So what I would recommend doing, if you are interested in using clustering to create labels, is to then use the labels with the original data and create a classification model, which is what I'm hearing. Right? Because you've got six distinct labels, you can certainly do that. And what I would do is incorporate as many additional features back in as you can when you train the model, and then try to use a more sophisticated algorithm that has nonlinear boundaries, ideally. Because what I've found typically, when I've tried to do this in the past, is that things like random forests or boosted decision trees tend to work better, because they can form arbitrarily complex decision boundaries based on the nature of the algorithm. So that's what I would do. And then, of course, the problem is going to be that once you train it, it's hard to verify what your generalization error is going to be, really. So always keep that in the back of your mind: yes, you can do it, and it will work sometimes and sometimes it won't. And generally speaking, you want to factor in as many inputs to training the classification model as you can.
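As a rough illustration of the second question (scoring new observations and assigning them to a group), here is a minimal sketch in plain Python. It assumes the factor scores have already been computed elsewhere; the toy two-factor scores, the function names `kmeans`, `nearest`, and `farthest_first_init`, and the choice of k-means with nearest-centroid assignment are all assumptions made for illustration, not the method anyone on the call used. The classifier suggested in the answer (random forests or boosted trees trained on the original variables) would replace the `nearest` step.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def farthest_first_init(scores, k):
    """Deterministic init: start at the first row, then repeatedly add the row
    that is farthest from all centroids chosen so far."""
    centroids = [list(scores[0])]
    while len(centroids) < k:
        far = max(scores, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(scores, k, iters=50):
    """Plain k-means on rows of factor scores; returns (centroids, labels)."""
    centroids = farthest_first_init(scores, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in scores]
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(scores, labels) if lab == i]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in scores]
    return centroids, labels


# Hypothetical factor scores (2 factors here instead of 6, to keep the toy data small).
train_scores = [[0.1, 0.2], [0.0, -0.1], [0.2, 0.0],
                [5.0, 5.1], [4.9, 5.2], [5.1, 4.8]]
centroids, labels = kmeans(train_scores, k=2)

# New observations must be scored with the SAME fitted factor model before this step.
new_scores = [[0.05, 0.1], [5.2, 5.0]]
new_labels = [nearest(p, centroids) for p in new_scores]
print(labels, new_labels)
```

The point the sketch tries to make is that the fitted pieces (factor model, then centroids or a classifier) are frozen after training and only applied to validation data, which is what makes the setup usable predictively.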
So what I've done in the past is I've actually used multiple clustering algorithms, then created different models for each one of the clustering algorithms, and then seen if I could use them as an ensemble together to create the final predictions for the final data set. Well, that's super helpful. Thank you. [01:15:38] So would anybody else like to chime in, Cam or Van, on that one? Cam, if you want to go, go for it. [01:15:47] Well, just chiming in here, a lot of what I was thinking was actually along the lines of what David was mentioning, just stress testing different types of cluster analysis to see what works. In the past, I've had to cluster different types of personas and had to test k-means or k-medoids or k-modes, trying to see which one is the best of the lot, and then really going from there. But I agree completely with what David was just mentioning. It's a sound approach here, then. [01:16:17] Any tips? [01:16:18] I caught the end of the question. I'm sorry, I didn't get the whole thing, so just skip me on this one. Sorry. [01:16:26] Yeah, no problem. Eric, do you feel like your question was answered satisfactorily? [01:16:31] Yeah, that was definitely helpful. I hadn't really considered taking the cluster definitions and then backing that back out to using all of the variables instead of just the factors. So that'll definitely be my next step. [01:16:47] So, everybody, thank you so much for hanging out, sticking with me while I get through these questions. Sorry if I have not called on anybody in a while. Peer input is always welcome. [01:16:57] Feel free to just unmute yourself and jump in on any answer at any point. Next up is Greg's question. Greg, eager to step up again. [01:17:06] So Harp, no, I tried to stop, but, so at some point in my career, I would like to run a company, create a startup.
And I have a short-term plan and a long-term one. Short term is within the next five years; long term, ten plus. [01:17:28] And I've been reading about federated learning. One of the things that I want to attack is distribution. We don't have a shortage of resources like food. We have a problem with distribution. Does anybody know a little bit about federated learning, and whether it's one of those things where the big data gobblers are the only winners, or whether a startup can bank on this? I know Google started this idea of having a model train on your data on your device without taking that data itself to centralized data storage to do things with it. Right. So in this case, it's partially GDPR sound, but there are still some concerns that the model could memorize that edge device's data. So I'd like to know how federated learning is moving now, with some use cases that you know of, so that I can have a better understanding of whether it's the solution, or part of the solution, to fixing some of those distribution issues that we have. [01:18:50] Can I take a stab, Harp? Yeah, over to Tom. I love this question. I'm not shocked that Greg would ask it. So this is actually brilliant. It's actually a form of integrating brilliance, because you get to keep people's data private and you get to train on that data. But it's a hive mentality, too: if the parameters can be shared, well, that's still keeping your data private. And this is one of the tricks. Reinforcement learning models, too, suffer from this, of course: they're trying to reduce the curse of dimensionality. That's why they don't train on all the data, and that's why they're so popular when the data gets too big. So the cool thing about this, Greg, is the model parameters can be sent back up to the cloud. And there's kind of a hive learning going on with balancing the parameters.
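The idea of sending parameters, not data, back up to the cloud can be sketched in a few lines. This is a toy version of federated averaging under stated assumptions: a one-parameter linear model, three hypothetical clients whose private data all follow y = 2x, and made-up helper names `local_update` and `federated_round`. It is not how any particular vendor implements it.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient steps for the model y ~ w * x on ONE client's private data.
    Only the updated parameter leaves the device, never `data` itself."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(global_w, clients):
    """One round: each client trains locally from the current global parameter,
    then the server averages the returned parameters weighted by sample count."""
    total = sum(len(data) for data in clients)
    updates = [(local_update(global_w, data), len(data)) for data in clients]
    return sum(w * n for w, n in updates) / total


# Hypothetical private data sets, one per device; every pair follows y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.5, 3.0), (2.5, 5.0), (0.5, 1.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
print(round(w, 4))
```

Note that sharing parameters alone does not settle the memorization concern raised in the question; in practice this is usually paired with other safeguards, which the sketch leaves out.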
It's almost like an ensemble approach, just speaking at a high conceptual level. And this way all the models can benefit from the small sample subsets. But then there's also the balance of this: you're trying to narrow down on that specific person. So the art of getting that right would take some research, but that's just my thought. [01:20:06] Any thoughts, Carlos or Ben, on federated learning? I just wanted to throw out some keywords on federated learning. [01:20:13] Useful keywords to look into are edge computing and transfer learning. But at a high level, the idea is, like you said, it's very dangerous to be recording data from users in a way they don't understand. And Apple's iOS 14 coming out is going to be a whole thing over the ability of people to block data. It can have a huge impact on the apps you use, so pay attention to that. The idea with federated learning is that it gets you out of the problem of: I'm going to record someone's audio, ship it up to Google Cloud, they're going to translate it for me and then send the results back to me. Federated learning would put a kind of stock model that's transferred from the cloud model onto your local device, allowing you to have local model learning on a device, which is extremely useful because, like Tom said, you get out of this curse of dimensionality. No longer are they trying to understand your words in the context of all of the words that are similar to your sound. They just get what you said: you said this and you didn't like the results, so they clearly don't understand you. So there are a lot of benefits to federated learning around that. It can actually learn more accurately, because it's just this one guy.
But I'm not going to go too deep into it. I do recommend you look into the intersection of blockchain and AI, because with federated learning and edge computing and blockchain, there are a lot of people in that space trying to fuse all those together for those concerns: the privacy concerns, the GDPR concerns, and the actual computing cost of storing and accessing data at high frequency in the cloud. So it's a huge space. It's going to blow up. Look at those keywords. [01:21:46] So what you want to try, and I don't know what kind of phone you have: I've been tinkering around quite a bit with ARKit for Apple, and I think that's a really good way to understand how Apple is handling stuff like federated learning. Embedded in the augmented reality kit are also machine learning models, computer vision, language recognition and so forth. And the fact that you can run very sophisticated models on a couple-of-year-old iPhone is really cool. So I would say, again, it depends: if you're using Android, get the equivalent, which is ARCore. But it's an interesting way to segue into it in a way that's kind of fun. You can check it out. It's an interesting conversation, because on one hand, you have federated learning going on, or distributed machine learning. On the other hand, I was talking to you last week about this, and you have almost a mega API thing going on right now where you have GPT-3. Right. And so you're having these two almost polar opposite worlds, where on one hand you have this all-encompassing NLP API, [01:22:51] and on the other hand, you have the individual. Awesome, great, great insights; that's probably what they use all these things for. Then, anything to add to this conversation about federated learning? I'm trying to dance around and figure out how I can say this. [01:23:14] I don't know if I can. [01:23:15] You're in a really, really interesting space.
I think you've already figured that out from everybody else's comments. I would agree with everything that's been said, but the point that you really want to look at is especially blockchain. I would look at blockchain and think about how you would maybe use blockchain to understand what your data has been doing, what it may have done before it got to this point, as a sort of metadata tag. Just think about blockchain that way and see if you can embed a whole lot of data about your data in a blockchain in a secure way, so that you have access not only to a particular data point, but also to some of the transformations and things that happened to that data point along the way, as part of a sort of blockchain-ish concept. There are some other people out there who have written about this; it's not coming to the top of my head right now. Like I said, you're in a very, very interesting space, and a lot of these answers are really, really good answers. And like I said, the blockchain piece of it would be something I would look at as an important component of privacy, but also of knowing a little bit more about your data itself: not just the individual data points, but maybe a bit of the journey, the data behind that journey, and the provenance. I think he's under an NDA, and I can also add some blockchain stuff. [01:24:44] And I'm not under your NDA. The idea there is that when we create models, we're creating them with certain data under certain parameters on certain devices with certain inputs. And what blockchain can do is let you keep a permanent record of all of those steps, a kind of dependency management in some way. And what it lets you do, though, is you have this metadata that's also the second layer of transfer learning, so that on the same device, in the same conditions, you have a slightly different input.
Well, like Tom said, you're actually transferring parameters without transferring data, and it lets you get around a lot of concerns on paper. So definitely dive in on the blockchain. And I'll send you a paper on blockchain for that, which details how all of this stuff works, like how the blockchain secures the status of devices and the stuff that's being recorded in a chain. [01:25:37] Super. Thank you. Thank you, Joe. Thank you. Thank you, Carlos. I appreciate that, Ben. [01:25:41] Thank you, Tom. We had you on spotlight here. So while you were making your way down the slopes, we all got to join in on that. Ben, do you have anything to add about federated learning? He's probably enjoying himself on the slopes. [01:25:58] Um, I do, really quick. So federated learning became top of mind with covid. Hospital networks were unwilling to share their data. So in Utah, actually, I talked to a senior health informaticist. They literally said not enough people have died, and I feel that they didn't understand the disease. And to say that out loud sounds so stupid. They weren't sharing their data. They can share it because it happened. So, yeah, federated learning needs to happen. It's super important stuff. Happy to send people a provisional patent I filed on tokenized learning, kind of in the spirit of anonymizing learning. Happy to send it to others interested. OK, that's it. Sorry for the distraction. [01:26:42] Oh, good, man. Thanks. Greg, we had your one question. Was that satisfactory? [01:26:48] That sets me in a nice room. So I appreciate the answers here. Thank you so much, everyone. [01:26:53] Right on. Next question we've got up is Amaia. Are you still here? And again, thank you guys for being so patient and waiting for your questions to come up. After Amaia, we can open it up or we can call it an evening, because we've been hanging around. It's been awesome. But Amaia, go for it. And first of all, am I saying your name right?
[01:27:13] Yes. Yes. Thank you so much. Hello, everyone. My name is Amal. This is my very first happy hour. So very excited to see all of you, and it's really very inspiring. Thank you for all of that. So, quick introduction: I've been a scientist at Dow Chemical for the last year, and I put my question in the chat as well, but just to say it again: for the last year, my team has been using SAS. So we are a SAS shop. Starting last month, my company has decided to actually transition from SAS to Azure, and we are learning about it. It's a very different landscape. SAS is prepackaged with a lot of things, and I'm learning that in Azure there are a lot of manual things we really have to do. The company is going to use Python as a development language. So I'm kind of frustrated and struggling with how to switch from the SAS mentality to this cloud technology landscape. So if you have any advice on how to transition and how to change that mentality from SAS to cloud. [01:28:29] So I think first it's get good at Python. I mean, you can probably simultaneously get familiar with cloud technology. [01:28:35] But if you're doing everything in Python, get familiar with using Python. I made the transition from SAS to Python myself. When I was a biostatistician, SAS was the language that we used for everything. I was in that role for almost five years, and I made the transition to Python, and it wasn't too difficult. If you Google something along the lines of pandas for SAS users, the actual pandas documentation shows you the pandas equivalent operations for SAS code, and you can pretty easily pick that up. One thing you could do is, you know SAS, you're good with it. You know exactly what your output should look like. Great. That's a baseline. That is the comparison for you.
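The baseline idea (treat the known SAS output as the expected answer and check a Python rewrite against it) can be sketched as follows. Everything here is hypothetical: the inline CSV text stands in for a raw data file and for a summary exported from SAS, and the column names `region`, `sales`, and `mean_sales` are invented. The sketch uses only the standard library so it runs anywhere; in a real migration the recomputation would more likely be a pandas group-by.

```python
import csv
import io
import math
from collections import defaultdict
from statistics import mean

# Hypothetical raw data, inlined so the sketch is self-contained.
RAW_CSV = """region,sales
east,10.0
east,14.0
west,7.5
west,8.5
west,11.0
"""

# Hypothetical summary exported from the existing SAS job (the trusted baseline).
SAS_BASELINE_CSV = """region,mean_sales
east,12.0
west,9.0
"""


def group_means(raw_text):
    """Recompute the per-region mean in Python from the raw rows."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(raw_text)):
        groups[row["region"]].append(float(row["sales"]))
    return {region: mean(values) for region, values in groups.items()}


def compare_to_baseline(python_result, baseline_text, tol=1e-9):
    """Return a list of (region, expected, actual) for every row that disagrees."""
    mismatches = []
    for row in csv.DictReader(io.StringIO(baseline_text)):
        expected = float(row["mean_sales"])
        actual = python_result.get(row["region"])
        if actual is None or not math.isclose(actual, expected, abs_tol=tol):
            mismatches.append((row["region"], expected, actual))
    return mismatches


python_means = group_means(RAW_CSV)
mismatches = compare_to_baseline(python_means, SAS_BASELINE_CSV)
print("all outputs match" if not mismatches else mismatches)
```

Comparing with a tolerance rather than exact equality is a deliberate choice, since two systems can round floating-point results slightly differently.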
Now you can learn Python and recreate all of your SAS work in Python and check your answers and see: OK, did I get the same output? Was that exactly what I was expecting or not? I did a whole bunch of that, and I just got really, really good at Python and pandas really, really quickly. [01:29:37] Um, so I'd love to hear from Dave on this as well. Mr. Microsoft, go for it. [01:29:43] So just so you guys know, Amaia and I have a little bit of history. He was a student in one of the boot camps I taught at a former employer for data science. What's up, Amaia? I look a little bit different. My hair is longer, and I grew a beard, and it's gray now. OK, so the first question I would ask is: if you're moving to Azure, are you planning on using the Azure SaaS offerings in the cloud, for example, Azure Machine Learning? Because if you are, they have a drag-and-drop-based interface. The code behind it is actually what's known as TLC, an internal Microsoft framework, which is all based in C#, not surprisingly. So from that perspective, if you're relying a lot on Azure Machine Learning, it's not really a one-for-one transition from SAS to Python, because ideally you're using all of the uplift that you get from using Azure Machine Learning, which is not going to be Python at all. It's going to be this kind of drag and drop. So it's more akin to Enterprise Miner, in a way. So I guess my first question would be: are you guys really planning on building everything from scratch in Python in Azure, or are you planning on relying on the services? [01:30:45] No, everything will be mostly built from scratch. So writing Python code for all the pipelines: the data pipelines, the code, and then the deployment pipeline. [01:30:56] Okay, well, there you go. So what Harpreet said is gold. You need to learn Python. Of course, I might be asking: why are you moving? What's the benefit of moving to Azure? Is it just for the company?
[01:31:11] Strategies are always moving to cloud, and SAS is expensive, and the company is already with Microsoft on all of the different systems. So they are asking all data scientists to move over. [01:31:25] Yeah, that's interesting. So as a former enterprise architect, I'd be like: if you're moving to a platform like Azure, why aren't you taking maximum advantage of the stuff that it offers? [01:31:34] That seems kind of odd. I mean, there are a lot of great managed services on Azure that you might want to take advantage of, especially as a data scientist. Azure Studio, for example. Awesome. And so, yeah, everything from scratch, I don't know what the advantage would be. It actually might be more expensive, because it's mostly the cloud, right? [01:31:55] Yeah, most of the use case is hierarchical forecasting, and so far we would have to write all of that model ourselves. I don't know how we would drag and drop the model yet. [01:32:09] I would suggest using, I mean, every cloud framework now has the ability to write a model and deploy it. And I highly, highly suggest you do it with the structure of the cloud. And much like if you're using the class or something crazy like that would be weird. [01:32:24] Yeah. So, for example, to Joe's point, Azure allows you to say, look, I want to use all the managed services except for right here, and I want to put my Python code right there. And you can do that. And that's what you want to do if you can. OK, yeah. [01:32:40] So Matalote, right. We worked with us and you can see the reaction, and the mistake you see people make when they move into a cloud environment is that they try to replicate what they did before in the cloud.
And if you try to do a model-to-model sort of setup, it almost inevitably means more expensive infrastructure running 24/7, and it's just more expensive if you run it that way. Try to do things the cloud-native way. I would suggest reading the docs or maybe even getting a cert or something. I think they have the AZ-900 fundamentals. Understand how Azure wants you to do stuff and then do it that way; it'll make your life a lot easier. The biggest trouble is that people new to the cloud say, oh, I know what I'm doing, obviously. And then they go along, and it's: well, why am I suddenly spending all this money? [01:33:28] And so each cloud has its own way of doing stuff. [01:33:34] Oh, I was just going to say, generally with managed services, you've got to watch the costs, though. Some of these services are quite expensive, but generally you're going to save money on the autoscaling aspect and on reducing your operational overhead. Your DataOps team can just be smaller if you're not constantly managing these instances or spinning them up and down; it just happens automatically behind the scenes. [01:33:53] So what I always said, back when I was an architect, was: if you need the same number of software engineers when you move to the cloud, you're not doing it right. Go on. [01:34:04] Awesome tips there. So hopefully that's enough to get you started. Thank you, Eric. Thank you for reminding me. I saw this pop up on LinkedIn earlier this week as well. Matthew, Vltava in the House just started a new job with, I believe, is it Brinks? [01:34:24] Yes, it's Brinks Home Security, not Brinks the bank. It's the Brinks making sure people don't get into your house. [01:34:32] That is awesome, man. So congratulations. Everybody, help me congratulate him. That is awesome. What was your journey like, man?
What was your journey from when you first decided that you wanted to get into data to finally getting this job? [01:34:48] Well, basically, I started out in digital marketing, building out Facebook campaigns and Google campaigns when that first happened. So it was more questioning. First it was: OK, so why do I have [01:35:01] KPIs here? And then it was: OK, so how do I set up these KPIs? And then it was like: wait, is there some statistical significance to these KPIs, and what can I dig out of this? It's just always been: what can I learn that's new, and what kind of insight can I pull out of this? So basically, I would say it was a natural progression because of curiosity. [01:35:23] And so your role is data governance analyst. So talk to us about that role: what is the company expecting you to take care of? And did you have to train or learn on your own to be able to come up with the knowledge base for this role? [01:35:40] Well, as far as the knowledge base for this role goes, I did have some data governance background. Ever since the pandemic started, and a little bit before the pandemic started, I worked with startups and I was a consultant. Most of these are medium to small shops. They're not big, big organizations. [01:36:03] So I'd be in there not only doing the data analysis, but I would also be working with ETL. And part of the thing I found out, as a lot of you guys have probably experienced, is the data governance: inconsistent tables, not very built-out ETL documentation. So it was mostly just experience from consulting and working with startups. As for the actual job itself, it's more that I'm sitting in the middle. I'm not the data scientist, I'm not the data engineer or the data analyst.
But I am the guy who's there trying to work with multiple departments to standardize definitions and make sure data integrity is correct. And then I'm that annoying guy who shows up in your GitHub, in your comments, saying: do we actually have this KPI correct? Is this right? I'm in the middle between different departments. [01:36:58] That's cool, man. Congratulations on getting the new role. I'm sure everybody here is just as happy as I am for you. That is freaking amazing, man. Great job. Looking forward to seeing you move along in your career and continue to do great things, man. So let's see if anybody else has questions. I've gone through the queue here. I'll open it up again. Thank you guys so much for sticking with me and hanging out while I try to get to everyone here. So hopefully you guys all had an opportunity to provide some assistance. For right now, let's roll it back up. Does Akshay still have a question? Because I think you had a question earlier, but we missed it. So if you have a question, go for it. Okay. I see you right there. Okay. [01:37:47] Hey, I'm not sure, because there were like two Akshays in the chat, whether that was me for the question or somebody else. [01:37:53] But if you've got a question, man, go for it. I'd love to hear it. [01:37:57] I did have a lot of questions coming into the call. But being part of this call, a lot of those were answered. So thank you to everybody. This is my first ever podcast session. And I'm impressed with the kind of responses and experiences everybody brings. So I'm looking forward to continuing that. [01:38:15] Right on. And well, thank you for dropping in and thank you for joining the happy hours. And make sure you tune in to my actual podcast. Listen to my other episodes. You've got a couple of weeks to catch up before I start releasing new episodes, so tune in. Plenty of time for you to do that. Anybody else have questions that we didn't get to?
I'm looking at either Greg or Juan or in shop. I have one, Harp. Yeah, go for it. [01:38:39] Yeah. Hey, everybody, it's great. My Fridays have become more memorable with these sessions. I'm just loving it. There's so much to gain. Coming to my question: what does it take to transition from data analyst to data scientist? I keep reading and listening, and people are saying there's not much of a transition, there's not much of a difference. [01:38:59] But why are there two different titles? For this one I'd love to hear from either Monica or Giovana or Mikiko. Let's start with Monica, then we'll see if Giovana has anything to say, then Mikiko. [01:39:15] Well, I guess it depends on what the specific job roles are, because I've come across data scientist positions which are truly data analyst positions. And I think the distinguishing factor is really those advanced analytical techniques: machine learning or NLP or any of those more advanced methods, versus just your fundamental statistics, finding trends and anomalies and such. That's at a very, very high level. There's probably so much more involved, and they blend into each other very much. [01:39:53] Yeah, definitely. It's a great question. So I'd love to get a lot of people's input on this. Giovana, do you want to go for it? And after Giovana, let's hear from Ben. [01:40:01] Yeah. It's always dealing with data in the end. I think that 80 percent of our time is cleaning data, exploring data to see the correlations, all these amazing things to know the story behind the data. So I think everyone has to have the data analysis skill. And sometimes people ask for data scientists, but in the end, a lot of people are working as data analysts. So I think one is inside of the other one, so it depends. But I think it's a very good start to learn about data analysis.
So I think it is important to know everything about how to handle data, because if you have done this first when you build a model, your model is going to go well and give you predictions in a good way. [01:41:04] So, yeah, actually I've got to ask Eventa to chime in here, because we've had conversations about this, once on my podcast and a couple of weeks ago when you were in a happy hour as well. I'd love to have you break it down for us, because I love the stance you take on this. Then as you can hear me. [01:41:28] Ok, well, then I heard Ben. I'm sorry. Oh, don't say a clip. The silence on the actual pothead episode because I appreciate it, Ben. [01:41:42] This is actually something I've studied: the misclassification of people using job titles. The data scientist job titles are actually horrible, absolutely horrible labels, and we consistently use them. So if we're talking about the difference between a data analyst and a data scientist, or a data engineer and a data scientist, or a machine learning engineer and a data engineer, you're talking about a classification problem. And like I said, the labels are terrible. There are reasons why people will say, hey, there's no difference between a data analyst and a data scientist, or some people even go as far as saying there's no such thing as a data scientist, they're all just glorified data analysts. And it really just depends on what you've interacted with as far as skill sets and capabilities, as well as what sorts of results you're used to getting from data scientists, or whatever you call data scientists. So when you hear people talk about the difference between data analysts and data scientists: data analyst is the second most frequent job that comes prior to having the job title data scientist. It's simply what is most commonly the pre-data-scientist role, the role before.
And so when you hear a lot of these myths, what you're really hearing is people talking about jobs they don't understand very well. And it ties directly into companies not understanding very well what data science is. Is it different from machine learning? Are the skills required to do deep learning different from the skills required to do data science? And companies don't have a good point of reference because they haven't seen any of the stuff in production, and those who have, have seen very little of it actually be effective or functional in production. [01:43:28] And if you want to understand what a data scientist is and how you're going to transition into the field, you have to look at a couple of target companies, a couple of groups that you would want to work with. Look at what the data science and machine learning teams are working on. Look at what they're actually accomplishing, which means look at what's getting into production and what's working, what's making money. Anything they put on a quarterly statement. Anything where they say, we have booked revenue of a real value, like they put a number on it and then they talk about it being tied to their machine learning efforts. And this is really rare. But if you look at that level of specificity, you can see how very few companies have come to terms with monetization. And that's really it: you start at the end with "we're making money," and then you start working backwards to "these are the actual people we need in order to continue to make money and to make more money on machine learning." And so if you want a career in the field, follow the cash. Look for any sort of skill that you can use to create a tangible outcome. And when I say tangible, you've got to get beyond the buzzword of machine learning or data science and beyond "model." I mean, what kind of model are we talking about? Are we talking about marketing impacts? Are we talking about business cases around pricing strategy? Are we talking about decision support?
And so you really have to dive into your career and where you want it to go. [01:44:47] What niche of the field will help you to build value working for the types of companies that you want to work for? The only way you're going to get there is to sort of be smarter than the employers and the people that are trying to hire you right now, because the people that want to hire you don't know why they want to hire you. They want to hire five years of experience and 18 years with a technology that's two years old. And this is where the ridiculousness comes in. It's, again, back to that misclassification. I can call my dogs data scientists. Does it mean that they're creating, you know, five times the value of their salary for the company? I'm Pamela Tibbles. I mean, you know, it's really that ridiculous. That's the point that we've gotten to. So if you want a good career, forget the job title. Go after skills that will allow you to create, build and add value, and get really good at creating specific use cases within the business for machine learning. Whatever that machine learning looks like, even if it's just analytics, get really good at tying what you just did to a person. That's really where machine learning is going, is that dollar sign, being able to monetize and be clear about how you have used a capability. Not just a word like Python, but that you've used the capability to build something in Spanish. Sorry, didn't mean to hijack that. First, I absolutely love that, man. [01:46:14] Actually, shout out to Al Bellamy in the house. Al, do you want to talk to us about the difference between data scientist and data analyst? [01:46:21] Love to hear what you have to say about that. Um, I'm not sure if your mike is functioning or not. Um, I don't see my name on my screen. Al Bellamy. All right. I'd love to hear from Dave. And Al, you are here. [01:46:38] I am. Yeah.
I've been kind of ducking in and out, so I keep waiting for this Friday where I actually get off work at a normal time. So that's my back door right there. Yeah. So OK. Yeah, I sort of helicoptered in here two minutes ago. So I mean, I personally don't have the kind of hard skills to claim the title data scientist. If you told me to model something or predict something, I, you know, I could do some regressions and kind of basic predictions. But if you told me to analyze something, then, OK, cool, I can do that. You know, let me look at the data. Let me do some media, you know, some basic stuff in Excel. I'm working on some better things. I can combine stuff in SQL now, but yeah. As far as a path forward for me to go from analyst to scientist, that's what I'm still trying to figure out. But I'm new to even really thinking about that path. [01:47:34] So I and people are in their dreams. And thanks for it. Thank you for sharing that. I would say that after I die, I forget my little thing here. I'd love to hear from Dave or Tom on this, but I think the main difference is that an analyst analyzes, a scientist discovers. Right. And I think that also teeters on the subtle difference between inference and prediction. I would say that an analyst might spend more of their time on the inferential type of tasks, descriptive type of tasks, whereas a scientist would be more, I guess, predicting and forecasting and things like that, if that makes sense. [01:48:14] Dave, Tom. I hope I don't intersect with Dave. I'll just chime in. I love Eric's answer in the comments. I think he's getting to the heart of it. And I want to point out, we are the historical figures of the data age. I mean, we're just now seeing an explosion on something that's going to radicalize what we're doing, attention and transformers. And the terminology is just going to be messy for a while. And you know what a Data analyst does? What a data visualization.
That's what a data scientist does. That is going to overlap a lot for a while. I find that the more I'm trying to do a good job, I stop and constructively criticize myself. I realize, you know what, you need to be more like Kate Strachnyi in your pipeline work so you'll do a better job, so that you really understand the data better before you just dive in and apply a model. And so I just love it that we're all still learning from one another. And I don't mean this critically, but I kind of get irritated by being hyper-concerned with titles and differences in roles. I know we've got to do it a little bit, but we're just so early in the data age. We've got to be flexible and learn on the fly, and it's just still going to be messy for a while. That's all I'm saying. But I really did like Eric's answer. I think he's getting at the spirit of what we need to think about. [01:49:47] Eric, do you want to share your answer with the podcast audience, who cannot see it? [01:49:55] So I did just a little bit of research about it and found out that physicians didn't actually have specialties at all until about two hundred years ago, which is when it finally started. And so as medical science has progressed, roles and titles have become super specialized, because it used to be that, you know, all of your physician tools could pretty much fit in the bag that you would take wherever you were practicing your medicine. Right. But then things got more complex through trial and error and regulation. There were a lot of medical schools just handing out degrees for basically no work. You know, it was kind of a bonanza, right? I don't know where else we might have ever seen a bonanza of any sort like that. But anyway, it's not going to take two hundred years to subdivide data science, obviously, like we move at breakneck speed now.
But it's helpful to me to keep that in mind, that there's no point in chasing after a title, because right now it may be super valuable, and then in like five years it's going to be like a dinosaur title because all these cooler, newer, you know, junior data unicorn titles will have come out that people are way more into. This isn't the first time an industry has matured and specialized. [01:51:07] Dave, let's hear from you. And then after Dave, let's hear from Mikiko. [01:51:11] Yeah. So seven years ago, I would have said, if you're not doing machine learning, you're not a data scientist. And I was wrong. I was completely wrong, because I just work at an insurance company and actuaries would be like, dude, we've been doing data science for decades, man. We've been using data to drive business results. And then the older people, the operations research people, will be like, no, no, no, wait a sec. We started doing this in World War Two, man. We were data scientists before the actuaries were data scientists. So I wouldn't worry too much about the title per se. I really like what Vin had to say, which is: target what you want to do, grab the skills that you think are going to be useful to deliver business value using data. Like, in most of my content on LinkedIn, I typically use phraseology around this idea, like, are you a professional that wants to drive business results with data? I don't say, do you want to be a data scientist? I don't say that, because maybe you work in HR, or you work in supply chain. These days, I don't think it really matters. Focus on how you can drive business results with data and then work back to the skills that you need. And that's going to be all the basics. [01:52:15] It's probably going to be SQL. It's going to be some sort of scripting language like R or Python. It's going to be some statistics. Not nearly as much statistics as you think, by the way, in practice, generally speaking, not even close.
And it's definitely not as much machine learning as you think, generally speaking, as well. You can get away with a very few simple and powerful techniques unless you're in some sort of specialized area like self-driving cars or something like that. So the advice that I give is, first, take my opinion as one data point. Go get other data points, by the way. That's the first thing. Second of all, do research. Look at the companies, look at the roles that you want to do, cross-reference the descriptions of what you're going to do and what kinds of technologies they list, and then that'll give you some idea of what you need to do in terms of skills and capabilities to actually make it to a data scientist title. Or maybe you can just be a data-savvy professional. Maybe you work in marketing, and you figure out a better mousetrap in marketing. Awesome. Guess what? You're going to get rewarded for it and you're going to build up your portfolio, and then that can take you anywhere you want to go. [01:53:18] Thank you very much, Dave. By the way, everybody wants to see 2013 Dave with the short hair and the earrings, apparently. Mikiko? Yeah. [01:53:26] So, yeah. So just some background. So I started off in, like, ESOPs, growth hacking, like whatever the title was in Silicon Valley at that time. [01:53:39] Then I moved into, like, a data analyst role, title-wise, right. And then I moved to a data scientist role. And then now I basically do everything under the sun except sign the vendor contracts for the startup that I'm at. Right. So the lines there are a little bit more blurry. Right. But I think, to kind of summarize everyone's points, right. So first off, historically, data scientist was just the catch-all bucket when DJ Patil wrote his piece, and since then we've had some more roles that have specialized. And so it used to be this idea of, like, the PhD who is also the super computer engineer and all sorts of stuff.
And they talk to people and not make them feel like idiots, you know, in the boardroom, because that was just kind of like the first, like that was the V1. Right. And the V2 is we saw some other roles pop up. It was like the data analyst, the data scientist, and then I think the data engineer. I remember seeing like a bunch of blog posts where those were the three. Right. And it was basically, like, the analyst is strategy slash living with the business. In the middle is kind of still the research person who would build their own models, do their own analysis, [01:54:49] write like a white paper for the company or do a prototype or understand, like, oh, this new algorithm, can we incorporate it into the product? And then you have the data engineer who worked on the infra. And I think now we're kind of going to like the V3, V4, or for some companies, let's just skip that and go to like V8, right, where you're seeing some more specialized roles like machine learning engineer and all sorts of stuff. So, you know, one thing to remember is there's this historic trend, as Eric pointed out. [01:55:20] The second reason why you see those titles, and they sometimes are the same thing, is because companies will use them as traps to try to get people into the role. They'll say, like, well, we can't offer you as much money, but we'll offer you a title, because we know we can sell it to you as, like, if you have a data scientist title, it's easier to get recruiters in the future. [01:55:40] So there's that. And then I think the third point that people brought up is that sometimes hiring managers actually just don't really have a good understanding of what those needs are. And typically you'll see that in, like, an embedded sort of structure, where a hiring manager on the business side is hiring for a data scientist or data analyst who lives with them. Right. So you do have those kind of three factors going on.
Now, that being said, right, some companies like Facebook and Microsoft, they're specialized enough where they have what are called decision scientists. Right. So they have data analysts, they have a data scientist or a research scientist, and they have a decision scientist. The data analyst is still living with the business partners, still helping guide the business. The decision scientist usually is more involved in the experimentation and inference aspect. And then the research scientist is just doing research. Right. So part of making that switch potentially is, first off, looking at companies where you could make that title change just by becoming more senior in your existing role. Some companies will be willing to negotiate that title, [01:56:44] you know, if they're flexible enough. Some companies like Google, Apple, they do have a ranking ladder, so you can't just negotiate a title. You have to actually have a switch in responsibilities. [01:56:54] So that's one method, right: you negotiate at your own company. The second way you do it sometimes is just going to another company where maybe they're doing the same work or something similar and they're just calling it a data scientist title, you know? But I think those are sort of like shortcuts, right? I think, as everyone pointed out, you really want to identify the kind of work you want to be doing, whether it's working with the business and all that, doing research, or doing engineering, and then just try to figure out, well, what is the next iteration in that particular track? And I think that's probably the best way to go. My particular trajectory: I started off on the business side and growth as a growth hacker, moved over to analytics because at that point I had really enjoyed using data to help inform the business, and moved to data science because I like the idea of focusing in on a problem, you know, researching it.
And now I'm moving over to the engineering side because I like the idea of building products, and that's just kind of how it is. So you can make the switch either through the title or through getting more senior in your responsibilities, or moving to another company. But I would encourage just understanding what's the kind of work you want to be doing over, let's say, the next one to two years, and then sort of planning around that. [01:58:12] Absolutely. Love it. Thank you very much, Mikiko. Last call for questions. If anybody has a question, just type it out into the chat. I also just want to say, obviously, ignore the titles. When I was leaving grad school in 2013, I was interviewing for roles that were called Predictive Modeler in the actuarial field. And now that job title doesn't exist. But that was data scientist and data analyst work, which was just like random forests in SAS. Tom, how are you doing. [01:58:41] Real quick shout out, everybody. If you quickly hold your hands like this and then start doing this for Harpreet Sahota. Harpreet, for your entire pattern of happy hours, you've built an incredible community. We praise the good job, brother. This is awesome. [01:58:58] Thank you so much. We had forty people show up today. I mean, there's so many people that I wanted to hear from that I just didn't get a chance to. Matt Housley, thank you so much for being here and being so active in the chat, man. Appreciate you. Everybody else who's been so consistent showing up, people here from day one. I started this thing 14 weeks ago. 14 weeks ago it was just like me and four people. And then one week Vin came and then one week three and came. And then after that it was just like thirty people minimum. It's been awesome. [01:59:28] Thank you guys for spending news with me. Yeah, definitely. So why did you start this win. Why? Why. So like the podcast in general. Just the open office hours. [01:59:38] Yeah, the office hours. Yeah.
[01:59:41] So for Data Science Dream Job I do, like, multiple office hours a week, like, you know, six hours a week, pretty much, of office hours. And that's great if you can afford the program, because it's expensive. And I figured, how can I help more people by using a skill that I already have, during a time of day where I'm usually doing this already? And, you know, I figured I would just do open office hours and invite people in and help them out somehow, and that's kind of how it started. So Fridays at 4:30. Most of my office hours are usually at 4:30 or 6 p.m., and so Fridays were open. Had nothing to do. Obviously I can't go to the pub anymore after work. So let's do office hours and hang out with people. That's kind of how it started. And, you know, just inject more of the data science into the podcast, because I know I've kind of shifted in a new direction where I'm interviewing authors of books that I find amazing. And I've been fortunate enough to convince people to come on to my show, like landing Robert Greene. Like, that's mind boggling to me. But it's more of a way to keep data science as part of the podcast while I venture off and explore other areas for interviews, and still keep that element, um, with the podcast. Yeah, that's. Yeah. [02:01:05] So thank you guys for joining. Happy holidays. Merry Christmas. Happy New Year. For those of you that joined the festivities and wore your festive sweaters, thank you so much. For those of you that joined me in a drink, cheers. I really appreciate that. And if you're the average of the five people you spend the most time with, I spend the most time with 25 to 30 amazing individuals every Friday. You guys are like the only people I hang out with nowadays. So thank you so much for being here and just helping me raise my average that much higher. Take care. We'll be back January 8th for the next happy hour.
And they're happening every Friday after that. Um, new stuff happening with the podcast next year. I've got some awesome, awesome guests coming on the show. I've got some awesome podcast episodes recorded. I'm really excited to share all of this with you guys and hope you guys keep coming back. Hope you guys continue to show up and help support everybody. Thank you so much, guys. Take care. Have a happy holidays. And remember, you got one life on this planet, so why not try to do something big? Take care, everybody. [02:02:20] Yeah.