Chanin_Nantasenamat_mixdown.mp3

Chanin: [00:00:00] So I would believe that the scientific method would be the science part of data science, and the data could be biology, chemistry, physics, business data, economics, ecology. So I would believe that it's pretty much like a plug and play, like data could come from many disciplines. And then the analytic part, the machine learning part, would be to take that data and make it into an interpretable model.

Harpreet: [00:00:36] What's up, everybody? Welcome to the Artists of Data Science Podcast, the only self-development podcast for data scientists. You're going to learn from and be inspired by the people, ideas and conversations that will encourage creativity and innovation in yourself so that you can do the same for others. I also host Open Office Hours. You can register to attend by going to Bitly.com/adsoh. I look forward to seeing you all there. Let's ride this beat out into another awesome episode, and don't forget to subscribe to the show and leave a five-star review. Our guest today has a passion for data science, machine learning, bioinformatics, research and teaching. He earned a PhD in medical technology from Mahidol University in 2006, where he was awarded the excellent thesis award by the National Research Council of Thailand. Among many other pursuits, he is also an associate professor of bioinformatics and the head of the Center of Data Mining and Biomedical Informatics at Mahidol University, where he leads a research laboratory that harnesses data science for unraveling [00:02:00] the hidden knowledge of big data in medicine. In over a decade and a half, he's published more than 100 research articles, review articles and book chapters, and has been invited as a visiting professor at many universities, including one I am very familiar with, Cal State Fullerton. 
However, you may recognize him from YouTube, where he's gained nearly 100,000 subscribers across his two channels, Data Professor and Coding Professor. So please help me in welcoming our guest today, a man with a mission to educate the world about data science, the Data Professor himself, Dr. Chanin Nantasenamat. And thank you so much for coming on the show today, man. I appreciate you being here. I know it's quite late for you in Thailand, so I appreciate you taking time out of your schedule to be here, man.

Chanin: [00:02:53] Right. Yeah. Thank you for the awesome introduction. Yeah, it sounded like a boxing match, your introduction. So cool. Awesome.

Harpreet: [00:03:02] Thanks, man. Dave fan? Yeah, definitely, man. You know, you 100% absolutely deserve it, man, because you've put out so much good content, so much good work, man. I remember using some of your resources to help overcome hurdles that I was facing when I was learning some stuff, and stuff that, you know, I still come back to for additional help. So I appreciate you doing all that. Before we get into some of your content and, you know, get your views on data science, where data science is headed and all that stuff, let's get to know you a little bit better. Talk to us a bit about where you grew up and what it was like there.

Chanin: [00:03:34] Right. So I grew up in the US, in California, and actually in Los Angeles. I went there when I was about four. I went to elementary, went to high school, and then I took the equivalency examination, the California High School Proficiency Examination. It was just like the equivalent of a high school diploma. And then I went back to Thailand, I entered university, and then I [00:04:00] got my bachelor's in biological science and then I did my PhD in medical technology. And then I've been a professor for the past 14 and a half years, and I started my YouTube channel two years ago. 
And yeah, so we're here today.

Harpreet: [00:04:18] That's crazy, man. I didn't know you grew up in California. That's actually where I'm from as well. So I'm from Sacramento. Yeah, from Sacramento, California. And actually, I went to Cal State Fullerton for my undergrad back in, like, and I'm dating myself here, but that was like 2003, maybe 2003, something like that. So quite a long time ago. But that's really cool. So what took you back to Thailand? Was that just family? I'm assuming your background is Thai. Like, what brought you back to Thailand?

Chanin: [00:04:47] Yeah. So at the time my dad was in Thailand. I mean, he migrated back to Thailand and my brother was in the university there. And so I had to migrate back to Thailand as well. And then I did my college there. I almost had a chance to go to community college in Irvine, Irvine Valley College, and was then planning on doing a bachelor's degree there at UC Irvine. Like a transfer. Yeah, but then we moved back to Thailand first.

Harpreet: [00:05:15] Yeah. I actually took a class or two at Irvine Valley College. I think it was like a biology class. Oh, cool. Just to make up some classes in between sessions, because I was a horrible student and flunked a lot of classes because I just did not know. But so how different is your life now than what you thought it would be growing up?

Chanin: [00:05:34] Yeah. So like, when I grew up. Yeah. So actually, growing up, I was pretty hooked on the TV show Bill Nye the Science Guy. And so I think that I got motivated to get into science because of that show. I mean, he's a very talented educator, and at the time there was no such thing as YouTube. So I think that's like the closest thing to science [00:06:00] entertainment. Fast forward, at the time when I was starting my YouTube channel, my daughter, she gave me an idea, like, why don't you start a channel? 
Because at the time she was watching kids' YouTube channels. And then at the time I was thinking of a YouTube name. I mean, like, my name is very long, Nantasenamat, right? So I don't think anyone would be able to remember that. And then at first I thought of getting a name like Data Guy. I think that someone had that name already, and then one day it just hit me, like, okay, I'm working as a professor, so why not be the professor? And so, yeah, I used Data Professor as my name on YouTube and yeah, that's pretty cool. I never had thought that I would be a YouTuber, you know. At the first impression there, I could remember that some of my students were kind of surprised. It's like, wow, you're doing YouTube.

Harpreet: [00:07:03] That's cool, man. Your daughter is the one that kicked off the idea for you to do that. When it comes to making YouTube videos, what is your most favorite part about making the YouTube videos and what is the part that you just kind of like the least?

Chanin: [00:07:15] Okay. So my favorite part would have to be making new friends. Like, for example, you can do anima, you know, like all of the great YouTubers. And it's always a pleasure to learn from all of you, and also from the awesome audience as well. So, you know, all of the comments, all the suggestions, whether good or bad, I try to incorporate that and use it as a lesson to improve upon, because there's a lot of things that I don't know, like content creation, video editing, graphics, you know, blogging, all of that. I pretty much learned it from the internet as well. Yeah. Like making YouTube [00:08:00] videos by learning it from YouTube.

Harpreet: [00:08:02] Yeah. That's like super meta. That's super meta, man. So what about your least favorite part? So what part of it is the toughest? Is it just the editing and the blogging and stuff like that? 
Or are there some parts of it where you're just like, Oh, man, I hate doing this?

Chanin: [00:08:18] I wouldn't say that I hate doing, I mean, I don't think there would be anything that I hate about it, but it's just like there are things that I think I could improve upon. Like at the beginning, you know, like my very first couple of videos, it took me an entire week to edit the video, and then over time I pretty much learned the tips and tricks. I could maybe customize my keyboard and then, you know, like playing video games. When you play video games, you have keyboard shortcuts. So I have that for Adobe Premiere Pro. Like, for example, I would use only two buttons, the Z and X, for cutting and for ripple delete. So after cutting, you know, it'll combine the clips, just with keyboard shortcuts. So I would use my mouse, highlight an area. Cut, ripple, cut, ripple. And after that I dabbled in using Python to do automatic editing, like deleting clips that have very low audio and then cutting that out, piecing it together. And I mean, that made my editing super fast. It would take maybe 10 minutes for editing, or 20 minutes.

Harpreet: [00:09:34] Yeah, that's pretty cool. Using Python to clip out the low audio portions.

Chanin: [00:09:39] That's right.

Harpreet: [00:09:41] That's something that I am. I need to get the source code from you to do that.

Chanin: [00:09:45] Voiceover.

Harpreet: [00:09:47] Make life a lot easier. So let's get into bioinformatics. First, that's not something I'm very familiar with. So, kind of at a high level, what is bioinformatics and how did you get [00:10:00] into that?

Chanin: [00:10:01] Right. So bioinformatics is essentially taking an informatics approach, computer science, in order to make sense of biological data. 
And so the term bioinformatics is essentially biology and the interface of biology with informatics. And so how do you do that? Like, for example, you could develop tools, you could perform data analysis, you could, you know, like there's a subtle difference between bioinformatics and computational biology. And like my good friend, I've also tweeted about a recent article that we published together. His name is Dr. Malik. And so he suggested to me that there are subtle differences between bioinformatics and computational biology. In bioinformatics, computer scientists would develop bioinformatics tools, and a computational biologist would be someone who would make use of the bioinformatics tools that scientists have developed and then solve biological problems, make sense of biological data. And so, yeah, there are some granular differences between the two. So in a nutshell, for me personally, I'm trying to use bioinformatics and data science in order to design new drugs. So for drug discovery.

Harpreet: [00:11:22] That's super fascinating. I would love to kind of get into that. Yeah, that's a great breakdown of what bioinformatics is. That really helps me kind of understand that. I used to be a biostatistician for a while. So just hearing the names bioinformatics, biostatistics, I thought maybe they're similar, related, but it turns out they are actually vastly different from your description here. I guess through that coursework that you did for bioinformatics, it sounds like it really did set you up quite well with a foundation in data science. Was there any additional upskilling that you had to do in, you know, machine learning or data science topics? And if there was any additional upskilling, what was your process [00:12:00] like to acquire that knowledge?

Chanin: [00:12:02] Well, that's a great question. 
So actually, I've never taken any structured curriculum, even in bioinformatics. So all of that is pretty much self-taught. So I had my bachelor's degree in biological science, so it's similar to a pre-med program. But the thing is, at first I thought I would become a medical doctor after my undergraduate. But then, you know, like some of my friends when they graduated, they went on to do. And for me, at first I wanted to change my degree totally. Like, in my fourth year, I wanted to change to a computer science major. But then I thought to myself, I mean, I'm almost graduating, so like one more semester left. So I might as well graduate with a biology degree. And so at the time, I always had this passion for computer gadgets. I like to build computers on my own and, like, upgrade and build from scratch. And yeah, so I never thought I would enter the field of data science. So it pretty much came to me after I graduated. No, actually during my PhD. So during my second year, I came to know about data mining from a mentor who just graduated from RPI in the US. He was doing data mining research, and so that was back in 2005. I heard the term data mining and then from there he gave me a book. I still study from that book. And then in a Ph.D. there's a lot of self-studying, you know. At the time data mining was such a new field, even for Thailand and even, I guess, for the whole world as well.

Chanin: [00:13:40] And so I pretty much Googled. There were no YouTube videos at the time. I hit the library. You know, I discovered that if I performed tutorials from the first page to the last page of a book, it didn't really help. It didn't click. And so what I did was I selectively selected [00:14:00] particular chapters that would help me to solve specific problems. 
And then based on my own biological problems, data problems that I would like to solve, I would Google for solutions specifically. And then over time it kind of magically fell into place, you know. I didn't really have a structured approach to learning, so it's pretty much like, you know, you have a problem, you Google for it. You find a solution either from a book or from a tutorial, and then you solve that problem. And then over the span of three years of the PhD, I would have solved maybe 100 small problems. And then, you know, those 100 problems, they pretty much fall into specific domains. And if you could cluster it like first time, it will fall into like data splitting, data clustering, feature selection, machine learning algorithm selection. So I mean, if you look back, it's kind of structured into well-defined areas.

Chanin: [00:14:59] And I think for beginners in the field, I always believe that it's possible to break into data mining or data science if you have the passion for it. Like, I mean, the journey is not going to be easy. So the immediate gratification for me is being able to solve the problem. That gives me big joy and it kind of gives me this boost. So whenever I feel down, you know, I would feel better if I could solve small problems. And so for the span of 3 to 4 hundreds of problems. And that kept me motivated. And at the time, you know, being a professor for ten years, probably seeing it kind of felt a bit boring, you know, like things are becoming more repetitive. You publish, your paper gets published somewhere, but then you're unsure whether people are seeing your work. It's like you're unsure whether, the work that you're doing, anyone is reading about it. And so [00:16:00] YouTube pretty much gives instant feedback, you know. You publish a video, and then there would be a lot of people commenting, suggesting, like, this is wrong. 
Like, you know, I would take that to improve future videos, or like the font I'm using is too small, you know, okay, I'll increase the font size, or maybe viewers would suggest some improvement to the code. And so I would also learn that as well.

Harpreet: [00:16:26] Well, that approach that you took to learning, I kind of call that on-demand learning, because it is such a huge, broad field that it's impractical to try to learn everything all at once. So the best way you should learn is actually by doing it. So doing something, and then through the process of doing that something, you're going to inevitably face an issue that you don't know how to do, and research and figure it out. And then that's how that iterative learning kind of happens. Maybe that's not the word, but cumulative learning kind of happens that way. Oftentimes I get messages from people like, What kind of projects do I do? Like, you know, what should I research? People message me all the time asking me what they should research for their bachelor's thesis or master's thesis. And I'm like, I don't have an answer for you, but you need to find that answer out yourself. Like, when people come to you with this type of question, you know, how do I figure out what project I want to do, how do I figure out what I want to research, what advice do you typically give to them?

Chanin: [00:17:26] Right. So I would give them advice like, for example, everyone, I believe, has their own strong points. No one is perfect. The thing is, what are you good at? What domain are you coming from? Say, if someone is coming from biology, then figure out a biological problem and then apply data science to solve that. I mean, if someone's coming from an economics background, I'm sure there's a lot of economics-related problems that they could solve. 
So if you solve something that pertains to your own field of knowledge that you have [00:18:00] deep knowledge in, I think that will give you immediate gratification, immediate joy if you could solve it. And I think it's about pushing the boundaries further and further. Like, you know, there's a lot to learn about. So it's always great to be able to have problems and then make solutions to them, and also sharing that solution with peers and then learning as a community.

Harpreet: [00:18:27] I love that advice, because you're taking previous knowledge that you already kind of have a good core competency in, and then you're using that to push yourself to learn something new. But when you do kind of hit those moments of, oh man, like, I don't know anything, like imposter syndrome moments, then you're like, Oh, actually, I do know this thing. I just need to learn how to take this thing and apply it to that, right? It kind of helps with that momentum and helps with that confidence as you're moving through it. Thanks so much for sharing that. Now I want to get into, you mentioned drug discovery a little bit earlier. Let's talk about that real quick, man. Like, what is drug discovery? You know, how are drugs discovered? In my mind it's just like a mad scientist in the lab mixing things into a pot and having all these things blow up. Is that like mixing chemicals and stuff? Is there more to drug discovery than that?

Chanin: [00:19:13] Right. So I would believe that that is kind of like in Dexter's Laboratory. There's like a cartoon on Cartoon Network. So we would imagine drug discovery would be something like that, like a mad scientist, right? But in reality, it's not like a one-person team. So drug discovery is a team effort, and most likely it's advancing very rapidly in academia, in industry. More in academia, you know, progress will be a bit slower because of limited funding. Like big pharma, like Pfizer, AstraZeneca, yeah. 
There will be teams of hundreds of scientists, you know, coming from the traditional wet lab and also the computational lab as well, working together to get a drug to market. I [00:20:00] mean, one drug would come from maybe thousands of chemical ideas in order to come up with one drug. And the cost of getting that drug to the market, it takes at least maybe $1 billion and it takes more than ten years to do that. So, yeah.

Harpreet: [00:20:19] Where does data science kind of enter into the mix here? Like, we talk about chemicals and stuff like that. How does data science kind of fit into the mix here?

Chanin: [00:20:29] Great question. So there's a field called cheminformatics. So in bioinformatics, you have biology and informatics; in cheminformatics, you have chemistry and informatics. So actually, sometimes I use them interchangeably. Biology and chemistry are quite related. There's a field called biochemistry, like chemistry pertaining to biological science. And so the thing is, if you could quantitate chemical compounds into numerical descriptors, that is where you essentially take the numerical descriptors and then you can apply machine learning algorithms. So essentially, you apply the same data analytic methods, the same data science processes to that. So the thing is to take a compound, a chemical compound, and then make that into a quantitative or qualitative description.

Harpreet: [00:21:19] So what does that data kind of look like? Is that like tabular data? Is that like unstructured data? How would we kind of have like a mental representation of how we quantify chemical data? Like, what would that look like if it were a table? What would the rows and columns be, I guess.

Chanin: [00:21:37] Yeah. Like, for example, if you have a data set, you have the name of the compound, like aspirin, and there's another chemical representation. It's called SMILES notation, SMILES. 
I can't remember the full name, but essentially it's a representation of a chemical structure in one dimension. Like, for example, the C would [00:22:00] be carbon, oxygen would be O, so C=O would be carbon double bond oxygen. And so it essentially tells the connectivity of the atoms in a molecule. And so if you quantitate that, there's a Python library called RDKit; you could compute molecular descriptors out of the SMILES notation. And then once you have the molecular descriptors, you apply machine learning to build a model.

Harpreet: [00:22:28] That's super fascinating, man. So do you have any interesting use cases or studies you can share with us that talk about, you know, the involvement of machine learning in drug discovery, like a friendly, easy-to-read paper or maybe one of your YouTube videos, if you got something like that?

Chanin: [00:22:45] Right. Yeah. So on my YouTube channel, I have a playlist, I call it the Bioinformatics playlist. And so in that playlist I have several bioinformatics-related content. Like the first six videos in the playlist, I call it the Bioinformatics from Scratch series. And so that starts from the basics, like how to collect your own biological data set from the Internet, and anyone, you don't need to have a biology background. If you just follow the tutorial step by step, you could collect a unique biological data set to make analysis on. Then in part two to part six I show how to calculate the descriptors, how to build the model, and finally, how to build a web application using the Streamlit library and then eventually deploy that on Heroku or on Streamlit Cloud.

Harpreet: [00:23:42] That's super cool. I think I might actually do that. So part of my job at Comet is to create cool projects using the Comet software, because we do a lot of experimentation management type of stuff. 
So I think I might go through your YouTube tutorial series, build out a project, write about it and stuff like that. I think that would be awesome. If you [00:24:00] don't mind if I do that, I think that'd be really cool. Cool? Yeah. We'll be in touch regarding that. Definitely. Can machine learning in drug discovery, can that ever, like, bypass the need to run a clinical trial? Is that kind of an absurd idea to have, or is that possible? Have there been cases where that happened?

Chanin: [00:24:20] Yeah. So clinical trials are required to figure out the safety of a putative drug in humans. So that would take a couple of years. It would have to involve enrolling several cohorts of people to test the efficacy of the drug. But there is another area where data science could help through the entire drug discovery process, which is to, you know, take an existing drug that you have in the market, but then to find a novel indication, a novel therapeutic indication. Like, for example, if I have an antibacterial, let's say I use data science, cheminformatics, and then I figure out another indication, meaning that I figure out another treatment, that the antibacterial drug could be used as an anticancer drug. And so what that means would be I would be able to, you know, save ten years of drug discovery efforts, because I would take an already FDA-approved drug and then find a new treatment for that. It's kind of like teaching a dog a new trick, because the thing is, when researchers find an antibacterial drug, they would have performed testing related to only antibacterial activity, but then they wouldn't have expected that it would also have anticancer activity, so they wouldn't have had the opportunity to try that out. But let's say that computers could be used to perform some sophisticated simulation. 
So aside from data science, there is a field called molecular simulation, molecular docking, molecular dynamics [00:26:00] and, like, using ab initio, like using physics-based algorithms in order to show molecules and proteins coming together, binding, calculating the interaction energy. And so, via data science combined with a molecular simulation approach, if we could find a novel treatment for the existing drug, then that would save ten years.

Harpreet: [00:26:26] Yeah, that's really, really fascinating. Do you know of anything that's kind of been released on the market that has used this approach? Is it widely used? Is it commonly used? Or is this kind of something that's right now just kind of like a theoretical idea? That question.

Chanin: [00:26:43] Yeah. So that's a great question. Yeah. So there is a company called Insilico, and I think in the past two years they published a paper. I'm not sure about how fast it was. Like, it was in a matter of months that they found an existing FDA-approved drug and they found a novel treatment. I believe it related to an antibacterial indication. Yeah. I could find the paper and share it with you.

Harpreet: [00:27:09] Yeah, that'll be great, man. That's super cool. It's like a whole side of, you know, the methodology that I'm familiar with that I've never seen, like, you know, gotten exposed to. So I think it's super cool how there's this combining of disciplines to create these really, you know, amazing use cases that are really helping humanity. I think that's so cool, man. I mean, speaking of helping humanity, you're on this mission to help people develop as data scientists. And, you know, we talked a little bit about how you got into the YouTubing, but where did that spark to help other data scientists come from?

Chanin: [00:27:47] Yeah, so for my full-time job, I work as a professor. 
So one of my duties is to supervise or mentor younger [00:28:00] researchers, PhD students, master's students, undergraduate students, and also to teach and also to do research. And so, you know, YouTube is another extension of my passion. So I really like to teach. And, you know, over the course of the past, what, ten years before I went to YouTube, I felt like, although I teach every semester, after the semester ended, it kind of felt strange, you know, like, okay, no more class. But, you know, on YouTube, you're able to do it over and over again. And so, like, the fun never stops, the party never stops. So, you know, whenever I have something interesting that I heard from somewhere, I would like to share it with the world and, you know, being able to engage with the community and learn together, I think that is an awesome experience. And even as a professor, I never felt that I know everything. You know, there's so much that I don't know and I'm very open to that as well. And so I learned a lot from my students, I learned a lot from the viewers, and learning, it's a very fun and eager and like, for example, I think I've remembered hearing for the first time about experimentation monitoring. I think Comet's got not implicit.

Harpreet: [00:29:26] Yeah, yeah. That's real.

Chanin: [00:29:27] Work. Yeah. Awesome. Yeah. So when I saw that for the first time, I was like, wow, this is so cool. It's like, you know, in biology, when you're doing experiments, you would track the experiment, you would keep a notebook, like a physical notebook. And what Comet is doing is essentially that, but then you're able to monitor the experiment, or machine learning model building; you could visualize, diagnose the model, features, parameter optimization and all that. 
You know, the great thing is it's [00:30:00] much more trackable and reproducible than the actual wet lab experimentation. And actually, two years ago I wrote a paper, a review article, on the computational reproducibility of drug discovery projects. And in that paper I argue that a lot of the research in computational drug discovery is not reproducible. You know, like sometimes it's not even in a Python or Jupyter notebook; some experiments are done via point-and-click GUI software. And let's say that if I mistakenly clicked on the wrong button, then that would give the wrong result. So it's not reproducible, and you know, the things that Comet is doing, or other related tools, I think that is a game changer.

Harpreet: [00:30:49] So it's an interesting kind of idea, something I've been wrestling with a lot recently. Kind of like, I don't know, maybe a philosophical, I guess, contention or discomfort that I'm feeling is, you know, we call it data science, but how much science is there in data science? Because I feel like there's really two different breeds of data scientists. There's scientists like yourself that are actually, you know, doing science and doing research. But then there's data scientists who are in the business and, you know, they're in organizations and they're business-minded types of data scientists. You know, they seem very different to me, and I don't know if the business side, like the business data scientist type of people, if they actually understand the science that they're doing. I don't know if where I'm going with this is making sense, but let me kind of reframe the question as, where is the science in data science?

Chanin: [00:31:42] Mm hmm. Yeah. So, like maybe from one of the episodes from Bill Nye the Science Guy, it would have to be the scientific method, about the hypothesis forming. 
Like, for example, in business, you would have A/B testing, you would have a null hypothesis, [00:32:00] you would have some variables, you would have some variables that you fix. And then you would like to see whether that influences the outcome of the experiments. So I would believe that the scientific method would be the science part of data science, and the data could be biology, chemistry, physics, business data, economics, ecology. So I would believe that it's pretty much like a plug and play, like data could come from many disciplines. And then the analytic part, the machine learning part, would be to take that data and then make it into an interpretable model. And there's so much in data science. There's data collection; before data can be collected, there's research experimentation, research design. I believe in biostatistics there's so many, like Latin, what you call it, refined stratification. Yeah, that is where stratification and, like, sampling, all of that forms the core of big science. And then once you have defined the experiment, you have to figure out, like, okay, from which population you would like to sample your data, and then you collect the data.

Chanin: [00:33:16] Once you have the data, then you would perform data processing. After that, you would do maybe extraction. Like in cheminformatics, you would take a SMILES notation and then you would extract molecular descriptors. In ecology, maybe you would have some biographical data. In business you would have some click-through rate data or turnover, or in business or in data, you would have churn rates. And so the domain-specific variables, I would believe that they're all the same. If I borrow Shakespeare's, you know, all the world's a stage, and whatever the data is, if you're able to represent it and it represents something, they're essentially the same thing. And then you could use analytic [00:34:00] methods, machine learning. 
EDA, I think that's very important. In recent times I believe there's this hype over deep learning, but the classical EDA, the classical linear regression, I think they offer immense value. EDA also helps you understand the data, and most of the time you don't need deep learning in order to build a meaningful model. Yeah, EDA and basic models would be very effective. Harpreet: [00:34:30] Yeah. Thanks so much. I really appreciate you breaking down that process for us. Something I've been really getting into lately has been deep learning. I mean, I'm not an expert at it by any means, but it's something I've just been intellectually curious about. It's really interesting, man, the methodology, you know, going from a traditional machine learning problem or project to a deep learning one. The process, the methodology, I feel like is a little bit different, I guess. Have you worked with both of those? Compare and contrast them for us, if you would. Chanin: [00:35:08] Right. So most of the analytic methods are based on our research projects, so we haven't really dived much into deep learning. My personal favorite algorithm would be random forest. For one thing, it works on practically most tabular data, you don't need to do any feature tuning or parameter optimization, and it essentially works out of the box. Another great thing I like is the interpretability of the model. When I'm talking to a biologist or a chemist, the first thing they would ask is what features or what variables are important. And so the best way to explain that to them would be to use random [00:36:00] forest, because it provides the Gini index that you could interpret. And another recent addition, very powerful, is the Shapley value. And there's the library called SHAP, and it's so good.
I really love the visualization that comes out of the library. And in terms of deep learning, I believe there are a lot of unique areas. Like, for example, GANs: you could use them to generate new molecules. You could train on existing data and then generate a new molecule, the same way you could generate a new human being, like a photo of a hypothetical person. Yeah. So same thing: it could be used to generate a hypothetical molecule that chemists wouldn't have thought about in that area. Harpreet: [00:36:50] Those kinds of models are so cool, man. Something that I just like is the interplay between deep learning and how it can help augment human creativity. There's this awesome book called The Creativity Code by Marcus du Sautoy, a really, really good book. It's mostly about deep learning, art and innovation in the age of AI. I'm actually interviewing the author of this book later this week. If you guys are watching live, tune in on Wednesday for that. So let's talk about a couple of blog posts that you've written. There's one that talks about how to learn anything and why it doesn't actually take 10,000 hours to learn something. Those are great blog posts. If you could talk to us a little bit about that. Chanin: [00:37:30] Yeah, actually, I got that from one of the YouTube videos that I watched, and I believe the speaker is the author of a book. I've pretty much summarized what I watched on YouTube. So essentially, there's a misconception that in order to learn and master something you would need 10,000 hours. But from the video, I believe that what you need is only 20 hours to actually learn something; you would learn something [00:38:00] and you would be able to do it at a good level. But in order to become a master, that would require 10,000 hours.
But in order to become good at something, 20 hours. In the video, he spent maybe 20 hours or so and he was able to play some simple songs on the ukulele. But to be a master, for example to be a pro golfer, that would probably take 10,000 hours. Harpreet: [00:38:31] And even, like, iterations, right? Just 10,000 tries at something, I think, you know, that's kind of where the learning happens the most, across the iterations, where that learning curve really starts to pick up for people. Because at some point you've got to stop watching the videos, you've got to stop reading the books, and just start taking action. Right. So speaking of that, is learning data science the same thing as learning anything else, or, you know, do we need a more specialized approach to learning data science? You've got this awesome post about seven effective tips for studying data science. I guess you could share some of those tips. I know that we've talked about a few of those throughout the episode, but we can kind of condense it down here as well. Chanin: [00:39:11] Right. Yeah. So for myself, I believe there's this awesome concept about the open mind, which is the open mindset. The growth. Harpreet: [00:39:22] Mindset. Growth mindset, yeah. Chanin: [00:39:23] Yeah, the growth mindset, yeah. So if you believe that you could learn something, you know, it's just a matter of time. Then if you keep improving 1% at a time, as you have already mentioned, let's say each day you aim to learn one new topic, so you're improving 1% a day. In a year that would amount to, what was it, 300%. So, you know, I don't think there's a secret recipe to learning, but for one, I really like to break things down into the individual components. So, [00:40:00] for example, if you're working on a PhD thesis and it looks like a formidable task, it looks like a while.
But if you break it down, let's say you break it down into, okay, the data collection and what you need to do. And then you make it as detailed as possible within data collection. But at the high level as well, you have data collection, you have data processing, you have feature extraction, you have machine learning model building, model interpretation, and then you have model deployment. And let's say, okay, in the data collection phase you make a list: which database would you like to collect the data from? And once you have collected the data, what do you want to do with it? It's pretty much like you're going into the granular detail of that. And then let's say you make a chart, and you can figure out, okay, for the course of the next two to three years, you plan out your time, and then every week you work on getting some percentage done. For myself, one iteration would be maybe three months to get one paper, and then you would iterate again, perform the same cycle again. Harpreet: [00:41:13] I like that approach: breaking it down into smaller, manageable pieces and then just attacking it. Just go, right. So there's publishing papers now; that must be a lot of stress, having to do that. How do you feel if you spend all this time on a paper and then maybe it doesn't get accepted, right? For example, Yann LeCun was talking about how he, you know, submitted a paper to NeurIPS and it ended up getting rejected. And he's, like, you know, the godfather of deep learning. I wonder, how do you deal with those types of situations? Chanin: [00:41:47] Yeah. So with publication, the obvious first thing would be that you have to get comfortable with rejection, because maybe 99% will be rejection and 1% would be acceptance.
[00:42:00] So if you're comfortable with failure, you're comfortable with rejection, then you would enjoy the process, right? So the gratification would not be from getting published; the gratification would be from learning, and publishing would be an end product, one of many things that you will acquire in addition to the knowledge that you would have. But then, you know, there must be publications as well. For example, if someone is doing a PhD, or someone wants to be promoted to professor or associate professor, then getting published is quite tricky as well. You would also need research funding. You would also need to hire talented individuals to pursue a PhD in order to help advance the group's research projects. And so I would believe it is a very stressful area if you're aiming to publish in order to flourish in the academic system. But then there is this joy of publishing just for the sense of finding new knowledge. So one of the great things in academia would be that you're able to explore some area that is interesting to you. But then, as the head of the center of data mining, not only that, I also have to manage the center, looking at the strategy, looking at the KPIs, and so on. Some granularity about that as well. And yeah, I guess KPIs, getting funded, publishing, like you have to get at least so many papers per year, I think that would be stressful as well. So yeah. Harpreet: [00:43:43] I can imagine that. But I guess the main takeaway there, when it comes to creating work and pushing that work out to the world, is to actually just enjoy the process of doing the work itself and make the objective of learning and writing kind of the [00:44:00] focus of your efforts. Let go of, you know, whether or not anybody is going to publish it. You know, that's not up to you. Just let go of that.
But just focus on doing the thing that you are in the middle of doing to the best of your ability. I think that's an important point there. Let's start winding this down, man. Let's do one last kind of what I call formal question. Then we're going to jump into the random round. So this question is: it is 100 years in the future, what do you want to be remembered for? Chanin: [00:44:29] Well, 100 years into the future, that's tricky. I'm not sure. Probably as a... I'm thinking. I'm not sure. If I could find a niche in data science and contribute to that niche, that would be awesome. Harpreet: [00:44:45] I think that's the work you do here: bioinformatics with data science, machine learning, drug discovery. I'm excited to see where you take this and how far you go down this niche. Let's go into a real quick random round here. I guess a good first question to kick off with is, you know, when it comes to the future of data science and machine learning, what applications are you most excited about in the field of drug discovery or bioinformatics? What kind of gets you hyped up when you think about it? Chanin: [00:45:18] Yeah. So, as I mentioned earlier on about repositioning, the ability to use data science in order to find an indication for a drug, I think that is a very interesting area to pursue at the moment. Another would be using generative modeling to create new molecules. And another, which I really like because of the data visualization aspect, is to apply graph networks in order to analyze protein-protein interaction data and also protein-compound interaction data. Being able to visualize all of that would also be able to help perform data.
We call [00:46:00] it drug repositioning because of the interaction between, for example, you have protein A and molecule A, protein B and molecule A, and you get to see all of that interaction in a network, kind of like a spider web, like the World Wide Web, where each website would be like one protein, and how they interact with one another. And so all of that complicated interaction, if you could find a way to visualize that in a graph network, that fascinates me. Harpreet: [00:46:26] That's super interesting. I'll definitely have to look into that. Drug repositioning, to clarify the definition of that, is that when you find new applications, new use cases for a drug? Chanin: [00:46:36] Yeah, exactly. Exactly. Harpreet: [00:46:38] So what are you currently reading? Chanin: [00:46:40] Oh, great. I'm currently listening on Audible to Mastery by Robert Greene. And, you know, the more I listen to that, it's like, wow, it kind of hits hard, you know, like you're finding an area that you're passionate about and you're exploring areas, you know, without the influence of other people. For example, growing up, my dad had a lot of influence on my education. Me going into a pre-med program was also influenced by my dad. I mean, he practically researched all of the data. He also did the same for my brother as well. My older brother, he went to UC Irvine and he went to UC Riverside. He did electrical engineering and then he did his MBA degree. And for me, he also researched about biology and, like, exploring medical technology, and he just said, okay, why don't you do a PhD here? So, yeah. Yeah, that's pretty cool. Harpreet: [00:47:43] I love that book, Mastery. That's like the book I've got on, I've got it on Audible, I've got the full book.
And then Robert Greene has these concise versions of his books, which are just real short books, you know, like summaries of his books that you can read in maybe an hour or two. And that's one that I always go back to, actually. You [00:48:00] should tune in to the interview I've done with Robert Greene. I've actually interviewed him. Chanin: [00:48:03] Awesome. Harpreet: [00:48:04] Yeah. Yeah. Chanin: [00:48:05] Wow. I'll listen to that. Harpreet: [00:48:06] Yeah. Yeah, he said during that interview that that was one of the most interesting interviews he's ever had. So that's my claim to fame right there, man. What song do you currently have on repeat? Chanin: [00:48:18] Wow. Ah, let's see, what song? Actually, not myself, but my daughter: Blackpink. There's a new song called Lalisa. Okay. K-pop, a K-pop group. Harpreet: [00:48:28] All right. And we take. Chanin: [00:48:29] That's Blackpink. Blackpink's Lisa, and the lead singer Lisa is Thai also. Harpreet: [00:48:38] That's awesome. I'll check that out. Yeah, I like getting new music from my guests to tune in to. So now what I'm going to do is open up a random question generator. So let's go ahead and fill this up. All right. Let's go for it. First question here is, what are your pet peeves? Chanin: [00:48:56] Pet peeve. The great class ten. Not sure. Harpreet: [00:49:02] You're not disturbed by anything. No worries. Let's jump to another one then. Yeah. Do you have any nicknames? Chanin: [00:49:08] Yeah. My nickname would be Url. Harpreet: [00:49:12] Url. Url. Yeah, I like that. Chanin: [00:49:15] And the Thai way of pronouncing it would be like "earn", as in earning money. Harpreet: [00:49:22] What talent would you show off in a talent show? Chanin: [00:49:28] Talent in a talent show: alto saxophone. Harpreet: [00:49:31] Oh, really? You play that? Chanin: [00:49:33] Yeah, I played alto, and I have a soprano at home as well. Harpreet: [00:49:36] That's awesome, man.
That's pretty cool. Chanin: [00:49:39] Back in fifth, sixth, seventh, eighth grade. Oh, man. Harpreet: [00:49:44] So when was the last time you changed your opinion about something major? Chanin: [00:49:51] Oh, okay. Something major. I think the perception of what success is. [00:50:00] Like before, success was, superficially, like, okay, you have to advance in your academic career, like becoming a professor, going up the career ladder. But then I just figured out, you know, some things don't really matter. It's just something that other people view you as. But the thing is, are you happy with that, are you content with that? So I learned to be happy with the everyday things in life, and just being grateful for the friendship, for the opportunity. For example, being here today, I'm very happy, and, you know, being in the moment doesn't require anything fancy. Just enjoy life. Harpreet: [00:50:53] Absolutely. Love that, man. That's something I have personally been wrestling with as well, kind of redefining my own definition of success to be more in line with what I think is success rather than what the outward world says is success. And it's challenging, man. Just a lot of introspection to get to that point. But I 100% agree with everything you just said. So thank you so much for sharing that. If you had to change your name, what would you change it to? Chanin: [00:51:26] Data Professor. Harpreet: [00:51:29] There, on the passport: Data Professor. Love it. Let's do one more here. What's your favorite city? Chanin: [00:51:35] City of Angels, Los Angeles, and also Bangkok. Yeah. So Bangkok, in Thai, we call it Krung Thep. Thep means angel. Oh, good. And the English version of Krung Thep is Bangkok. And I've been able to live in both cities. I grew up in LA [00:52:00] and I was born in Bangkok. Harpreet: [00:52:04] That's cool, man.
I didn't know that. That's an interesting bit of, I guess, coincidence, going from one city of angels to the next. My friend, thank you so much for taking time out of your schedule to be here. How can people connect with you? Where can they find you online if they don't already know how? Chanin: [00:52:20] Awesome. Yeah. So I make YouTube videos, Data Professor and also Coding Professor. People could connect with me on Twitter; my handle would be @thedataprof. Also on LinkedIn as well. So maybe I could provide the link to that. Harpreet: [00:52:37] Yeah, we'll definitely put all of that in the show notes so that people can connect with you. And thank you so much for taking time out of your schedule to be here, man. I appreciate you staying up late for us and joining us today. Chanin: [00:52:48] Right. My pleasure. Harpreet: [00:52:49] My friends, remember, you've got one life on this planet. Why not try to do something big? Cheers, everyone. Right.