Host: You're listening to Augmented Ops, where manufacturing meets innovation. We highlight the transformative ideas and technologies shaping the front lines of operations, helping you stay ahead of the curve in the rapidly evolving world of industrial tech. Here's your host, Natan Linder, CEO and co-founder of Tulip, the frontline operations platform.

Natan: This week on Augmented Ops, we're switching things up from our usual formula. For this episode, we're introducing Eric Mirandet, Tulip's Chief Business Officer, as your guest host. With that, I'll let Eric take it away.

Erik: Hey, this is Eric Mirandet, and this week on Augmented Ops, we're chatting with Kai Yang, VP of Product at Landing AI with a focus on computer vision. Kai has developed and implemented machine learning solutions for semiconductor, biomedical, and other manufacturing verticals, and is here today to share his expertise with us. We'll explore best practices for developing machine vision solutions, the importance of the right data strategy, and how tools like visual prompting are helping democratize this technology and make everyone in your organization a data scientist. Let's go. Kai, welcome to Augmented Ops. It's a pleasure to have you on the show and to talk about the present and future of AI in industry. For our listeners who aren't already familiar with Kai: Kai is responsible for Landing AI's product vision and oversees its product roadmap. Before Landing AI, he founded a computer vision startup focused on bio applications. He's got over 15 years in EDA, holds a PhD from the University of California, Santa Barbara, and a BS from National Tsing Hua University in Taiwan. Kai, it's great to have you.

Kai: Thank you, Eric. It's good to be here.

Erik: So Kai, you've got an interesting background, and I'd love to spend a few minutes just hearing a little bit about your story. How is it that you ended up in your current role at Landing AI? It's been an interesting journey to get there, and I'd love to hear a little bit more about that.

Kai: Sure. So I actually started my career, including my PhD, in silicon chip design. We built chips, and we also built the software to support that chip development. But along the journey, I was always fascinated by AI and computer vision, though only at an amateur level, meaning my friends and I would build small projects on weekends. At one point, I just felt like probably the only way to really learn that domain was to jump outside the comfort zone I had been in for 15 years and do it full time. I also used that as a forcing function to make myself learn this field I was so interested in. So that's where I switched from semiconductors to AI.

Erik: And how did you make that transition? That doesn't seem like a one-for-one from silicon chip manufacturing into AI. Can you talk a little bit about the course of study or some of the barriers and challenges you faced as you were making that transition?

Kai: Actually, let me share the one triggering point for me to jump from silicon to machine vision, or computer vision. It was one interesting Friday evening about eight years ago, when my wife was a Stanford researcher in biomedical engineering, they call it BioE, working on protein engineering and drug discovery. I would pick her up every Friday, and every Friday afternoon I would stop by her office or lab.
So it turns out that the lab, the Stanford biomedical lab, does a lot of cell culturing, right? They culture biological cells in small petri dishes, and then they actually hire students to manually count how many cell colonies are inside those petri dishes. The number can range from very small, like five, up to 700. And at 700, you can see the student just counting them one by one. It's super labor intensive. That's why researchers like my wife don't want to do it themselves; they hire some poor student. So I was visiting, and I thought, I cannot believe this. I'm doing chip design, building software tools to make chip design easier, and I cannot believe these top research facilities still do this simple counting manually. That's what drove me to the idea that someone needs to build a tool to help solve this seemingly easy computer vision problem. It turned out to be a very difficult problem, but we solved it. That's where I spun up a company from that project, and it led to a career in AI and computer vision.

Erik: That's fascinating. So you're looking at this, you're seeing these very capable, highly intelligent researchers spending their nights counting one, two, three, up to 700, time and time again, and you think to yourself, surely there has to be a better way. I feel like this is similar to the origin story of many great startups: there's a problem, and there has to be a better way to solve it. So that's what put you on this trajectory and, I guess, prompted you to shift gears into this field of study. How did you go from there to your current role as head of product at Landing AI?

Kai: So actually, when I joined Landing AI, we were trying to solve a similar problem. We were working with manufacturers who had these problems, putting humans on inspection tasks, and who unfortunately could not really build an AI or vision solution to help themselves. So at Landing AI, we started by building the solutions for them. For example, we built inspection solutions for everything from computer motherboards all the way to large automotive applications detecting oil leakage under car chassis. And one thing we learned is that we could keep doing this for our customers, but a much better win-win is for us, or someone, to build a tool that enables our customers to do it themselves. That was one important decision we made at Landing AI: are we the ones giving people fish, or are we the ones teaching people how to fish? We found that the second is much more appealing, and it aligns with my personal goal of enabling different people, different personas, to adopt AI, this very powerful technology, with a very simple-to-use tool.

Erik: Yeah, well, you're speaking directly to my heart. Tulip, as you know, has a very similar philosophy here. Every single one of these environments is a little bit different. You're not going to be able to solve all of the problems for everyone, but you also don't need to, because the people who are in these environments, who are doing this work day in, day out, know what the problems are and know how they want to solve them. So rather than giving them the answer, giving them the tools to get to their own answer and lowering the barrier as much as possible is a pretty compelling approach.
Erik: From there, for those who aren't familiar, can you just tell me a little bit about what Landing AI is and what's the core value proposition and differentiator?

Kai: So at Landing AI, the goal is to build an end-to-end, very easy-to-use computer vision tool that helps different personas, like subject matter experts and software engineers, who have a lot of domain knowledge about their problems, for example, determining whether an end product is good or defective, but who do not necessarily have the deep learning and machine learning background, knowing how to change the optimizer or fine-tune the model. The tool lets them stop worrying about those machine learning details. They can focus on their core problems and use a very simple tool to develop and, most importantly, deploy the AI model in a production environment, running 24/7, very reliably and hopefully also very cheaply.

Erik: So the way you describe it, it sounds like this was a very smooth progression, one step after the other, but I know that's not how it goes when you're building companies. I know that challenges come up, and they are numerous. Can you share some of the challenges you encountered as you were developing this capability?

Kai: Sure. There are actually two stories I would love to share. One is that, like every startup, I feel the best way to build a product is when we are the users, when we face the problem first. So let me share a fun story. You've probably heard about data-centric AI; Landing AI advocates a lot for data-centric AI. And people ask me a lot: why data-centric? Why are you building tools around data-centric AI? As a matter of fact, data-centric AI is something that we at Landing AI learned the hard way, in the very early stage, when we had done many, many computer vision projects. One very important and very painful lesson we learned is that most, if not all, of the projects where we failed, where we failed our customers or our customers failed with us, were actually related to data. Bad images, wrong labels, inconsistent labels: that accounts for pretty much all the failures we encountered. We rarely, if ever, failed a project by picking the wrong AI model architecture. For example, in the very early stage of Landing AI, we had a project with a computer manufacturer. They wanted to inspect motherboard connectors. So we met with them and talked about how the connector should work, what's right, what's wrong. The team came back, started labeling the images, built the AI model, and got 99 percent accuracy. Then we went back to the customer, and it turned out our machine learning engineers really did not understand the domain problem. What makes a connector good? If it's seated like this, is it good? If it's bent this way, is it still good? It turned out the domain experts had not contributed to the project, and that sank the whole thing. Even though it looked very good on paper, 99.6 percent accuracy and precision, it was useless in the real world, because the important domain knowledge was never incorporated into the project. So from that lesson, from that day on, every project we work on, and also the tool we are building, LandingLens, makes sure that the subject matter experts are always in the loop of AI project development.
So that's one challenge we learned from, and we really, really hope that other companies doing AI and computer vision do not repeat the mistake we made. It might seem very fancy to have an AI team working on this in isolation, but the fact is you really have to have these two teams, the AI team and the subject matter experts, working very closely together.

Erik: Well, I think it's a really interesting point, because a lot of times we have a tendency to look at these problems and characterize them as technology problems. But you're bringing up a critical point that not everybody appreciates: the tech is only one part of this. It's the tech plus the context that makes the difference. You might have a 99.6 percent reliable model, but it's garbage in, garbage out, and the person building the model isn't necessarily the person with the context needed to appropriately characterize and contextualize the information being captured and fed into these models. It's a critical point, and I think it opens up this broader theme of democratization of technology, something we talk a lot about here at Tulip as well. I'd be curious to hear your perspective on this.

Kai: Yes, democratization is actually the internal codename under which we developed LandingLens; the internal codename is "democratizing AI." Our perspective is that it's a lot easier to package a very complex AI tool so that a subject matter expert, someone with 40 years of experience in pharmaceuticals or on the manufacturing line, can use it. It is actually a lot harder to do it the other way around, transferring that subject matter expert's knowledge and experience back to the machine learning engineer. And the machine learning engineer might not even be interested in learning it in a deep way. So this is where we saw the way to democratize AI: build a tool that really abstracts the AI layer, making development super easy while hiding all the complexity, automating all the things a machine learning engineer or AI scientist would normally operate. Make sure those are hidden and automated, and make the interface super simple, so a subject matter expert can interact with the tool and get a good result.

Erik: You know, in retrospect, it almost sounds obvious. Why weren't we able to do this five years ago, ten years ago, twenty years ago? What's changed? Why is this available now, as a technology and at a price point that makes sense? What's unique about now?

Kai: Actually, I think the technology only became available recently. Take my personal experience from eight years ago, when I started my own AI company. At that time, the AI models were not yet very good. They were okay, but it really took a lot of time for me and my engineers to fine-tune them. Over the past five years, though, the technology has gotten a lot more mature, automation has gotten a lot more mature, and the cloud has also gotten a lot more mature, so we can scale up solutions more easily in the cloud and make the cloud product a place where subject matter experts and machine learning engineers can collaborate.

Erik: When you say mature, what do you mean? What does maturing of the technology, maturing of the cloud, mean?

Kai: A very specific example: think about the computer vision models available today, on the market, on GitHub, in open source.
Many of them are actually good enough, good enough in the sense that they can handle whatever task we are facing or our customers are trying to do. So now the problem of making a computer vision model work is not really finding the right model or tweaking it, making it smaller, faster, or 0.1 percent better. The models out there are good enough. And only when they are good enough can we enable our users to focus on data. If we wind the clock back eight years, when the models were not as mature, having good data was not enough: you could have good data, but your model was not capable of learning all the features from that data. Now, over the past three or four years, computer vision models have become very mature, and we see the problem shifting to the data side.

Erik: So it's interesting, right? It sounds like there's been a transition in where value creation happens. Five, six, eight years ago, the problem was how do you develop these models, and that was the constraint in the value proposition. It sounds like the model generation itself is becoming increasingly commoditized; there are a lot of good models out there. So now the value creation component is making those models available to the people who need them to solve very specific problems.

Kai: Yeah, absolutely. Yes.

Erik: So where is this going, if you were to extrapolate out over the next five years or so?

Kai: AI has to become easier and easier to use. Today we are at the stage where the models are good enough, so having an end-to-end tool that picks the best model and automates everything, I think we are at that stage. But people today still need to follow the pretty typical AI approach: collect a bunch of images, label them, anywhere from 200 to maybe 5,000 or 50,000 images for an inspection project, then train a model, make sure it works, and deploy it. And this is an iterative process. One approach we are leading, if you look at our website, is called visual prompting, and very soon we're going to announce a large vision model. The idea is that this iterative approach, which still needs hundreds to millions of labeled images, needs to be made even simpler. With a large vision model and visual prompting, the model is capable, out of the box, of knowing a lot about the product, about the visual representation, the vision features. One vision we are working toward is that this approach can get even faster: the time to value, from using the tool to having a deployable model, can be reduced a lot. When we started working on LandingLens three years ago, it took us 36 hours to train a model, and you would probably need four or five iterations to get a model to a stage where you could deploy it, 36 hours each time. Over the past two years we have brought that down to five minutes, so you can do a quick iteration on a model, with a lot of automation of model training in the cloud. We envision the five minutes coming down even further, to a few seconds, 10 or 15 seconds. I think that's where the trend is going: more powerful AI reduces the time to value of developing those AI features or the final product.

Erik: It's pretty incredible. And if you go to Tulip's library, tulip.co/library, and you look for the Landing AI unit test, you'll see an example of this.
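To give a sense of what sits behind an integration like that, here is a minimal sketch of an application calling a deployed inspection model over HTTP. The endpoint URL, API key, and response fields are hypothetical, shown only to illustrate the "send an image, get a prediction back" workflow described in this episode; the actual LandingLens and Tulip connector APIs may differ.

```python
# Minimal sketch: calling a deployed inspection model over HTTP.
# The endpoint, credential, and response fields below are hypothetical,
# used only to illustrate the send-image/get-prediction workflow.
import requests

ENDPOINT = "https://example.com/v1/predict"   # hypothetical inference endpoint
API_KEY = "YOUR_API_KEY"                      # hypothetical credential


def classify(image_path: str) -> dict:
    """Send one image to the deployed model and return its prediction."""
    with open(image_path, "rb") as f:
        response = requests.post(
            ENDPOINT,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            timeout=30,
        )
    response.raise_for_status()
    # Hypothetical response shape, e.g. {"label": "good", "score": 0.998}
    return response.json()


if __name__ == "__main__":
    result = classify("connector.jpg")
    print(result["label"], result["score"])
```

In a setup like the one discussed in this conversation, a connector or custom widget would make a call along these lines for each captured image and branch on the returned label and confidence score.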
Erik: And so I've got a little experience with your product, and it's amazing to me, because I'm not a computer vision expert. I think we took ten pictures or something like that, with very basic labeling of those pictures, but I was able to basically take a picture and compare it against the model that had just been developed earlier that day. In one case it was giving me something like 99.8 percent confidence. It was incredible to me.

Kai: Yeah, thank you.

Erik: So if you're talking about making it even easier than that, that's pretty interesting, because, as a person who doesn't know your product deeply from a technical perspective but does have some limited experience from a user perspective, I'll tell you, it's not that hard today. Which maybe opens up another topic of conversation worth touching on. If I think back, and I'm not going to drop any specific names here, to how these types of capabilities were made available and marketed in the past, there was not really this notion of interoperability as a core value proposition. And if you look across the market, I think we're squarely in the era of ecosystems and open architectures. I'm curious to hear your perspective on this, because you solve a really interesting, very specific problem. When you think about opportunities to democratize this capability, how are you thinking about that from a product strategy perspective? What are some of the things you keep in mind as first principles when you ask, how are we going to make sure that the people who need our product are able to receive our product?

Kai: I think one of the most important things is knowing the persona very well. Again, let me share some mistakes we made. An AI startup usually starts with machine learning engineers, right? That's who knows this area very well, and by nature it's very easy: when you start building a product, you think, hey, I'm building for myself, building for machine learning engineers. I'm a machine learning engineer by training, so it's easy. Very soon, we, like many startups, found that there just are not a lot of machine learning engineers out there. Knowing the persona matters: in manufacturing, the machine engineer, the quality leader, the quality engineer, those are the personas. So knowing what they want and what their problems are is very, very important. Eric, as you said, machine learning engineers care about metrics like precision and recall; they live there, they care about them to their core. But those usually don't really reflect the business requirements our personas care about. The same goes for the environment the product runs in. Let me give you a pretty fun story, a pretty painful one too, about personas. I remember three years ago, when we had just finished LandingLens, we had a model and we had a way to deploy it in production. My team and I were so excited to go to the factory. We had Docker, Linux, the whole environment there, and said, hey, we are ready to work with you. Use the platform, the model is great, we tested it, let's integrate it. And the manufacturing line said, why Linux? Why Python? We use Windows and C#. Why are you bringing this alien thing to me? Then they called the IT team, and the IT team said, there's no way I'm installing this machine in my environment.
Then we learned that our own persona, machine learning engineers, live inside Linux and Python and assume that's how the world works. It is not. In that environment, people prefer Windows and their existing systems. So this is where we spent a lot of time making sure our product speaks the language of our persona: we build the features they need and the support they need. Nobody cares about Python on a production line, for example. So we make sure we have C# support in our deployment, and we make sure our app installs with a double click on a Windows machine and just runs. You don't need to worry about the GPU driver; we take care of it. That's one important thing we learned: the persona of the people building this tool, trying to enable others to do machine learning, is very different from the end users who are really going to use it in their daily lives.

Erik: Yeah, it's interesting. Getting people to change behavior is not an easy thing to do. If we don't evolve, if we don't grow, we stagnate, and we live in a competitive world, so some of this is very natural. But there are also times when you simply don't need to change the behavior: you can meet your users where they are and give them the capabilities they need, in their context, with the tools they're familiar with. When you have an opportunity to do that, you should do it every day of the week. That was one of the things as we were going through our integration with Landing AI, just how easy it was, frankly. You build a custom widget, you have a connector, and you call the model; it comes back and returns the results. I don't think we spent more than a day on that integration. And I look at that as emblematic of the future of technology, particularly as it relates to operations, because if you're on a manufacturing shop floor and you start talking to folks about technology issues and challenges, they don't care. They've got a job to do, and they want to know if you can help them solve their problem. That integration piece, we need to make it easy for them if we're going to really get in there and create value for these folks.

Kai: Absolutely.

Erik: Yeah. So I completely agree, and I have appreciated Landing AI's approach to this as a consumer of the tool in this case. I want to ask a couple of other questions here. You can't open up a headline or a news app without seeing ChatGPT or Sam Altman. I'm curious, because something happened about nine months or a year ago that fundamentally changed how the public, or how consumers generally, think about AI, particularly generative AI. I'd love to hear your perspective as somebody who's so close to this technology. What's fundamentally different about this AI? Because AI has been around for a long time. What's different? Why is this interesting and impactful? How is it fundamentally different from what came before it?

Kai: I think the number one difference is at a very personal level. My mom called me and said, hey, I used ChatGPT to do something, and I finally know what AI does. She had probably heard me tell stories about what I work on, but finally it's a tool she can use herself. ChatGPT is a powerful tool I use every day, helping with my messages and email, summarizing articles. But the reality is that it does not do everything.
So our customers also ask me: we've seen GPT-4, the demo seems so appealing, can we use that tool to help? Our team was actually among the first to jump in and test all those capabilities. And one thing we concluded is that GPT and those models learn from data across the wide internet. The conversational side comes from all the text data; the vision side comes from all the internet image-caption pairs. So when we tested them, we found the models work very well for generic tasks, for example, describing the theme of a picture: a tulip, a person there, a flag there. But when we applied them to domain-specific applications, for example, we created a defect on a phone and asked, hey, can you tell any difference on this phone? It cannot, because the training data does not include that domain-specific data. AI is no magic: you learn from the training data you have. GPT and GPT vision are very, very good at generic internet data, but when you go to domain-specific data, for example semiconductor wafer inspection or screen crack inspection, it just does not work, because the training data never saw a scenario like that. So for us, we don't see it as competition. We see it as a good component of the AI world, and it encourages us to focus even more on solving those domain-specific problems where our customers still need help with their inspections. You don't see many of those images on the internet, so there's no way GPT could have learned from them.

Erik: So, double-clicking on generative AI more broadly, not ChatGPT specifically, where do you see this going? Because I think what we've seen here is a step function. I would agree with you that the biggest before-and-after of ChatGPT is basically consumer awareness. People are interacting with AI in a way they didn't previously, whereas ten years ago this was largely reserved for labs and data scientists. Now everyone has access to it, and people are familiar with it. It's becoming a standard part of the technology we interact with every day, in a way that is explicit, not like Netflix suggesting what show you should watch next; that's more AI behind the scenes. This is AI as the entry point. So people are becoming more familiar with it, and there's been this step function in the last year. Maybe ChatGPT constrains the conversation to predictive text, but I think this is relevant for things like zero-shot vision, and there are many, many other implications of generative AI here. What are the impacts you see this having on your business model and on the way you're aligning your product roadmap?

Kai: Yes, actually it inspires us a lot. Think about text prompting and how we interact with tools: before, we built UIs, and even before that, command lines. Right now a text-based, prompting-based interface is definitely a very natural way for people to interact with a tool. So lately we have been exploring how we can make labeling easier and whether we can make image search with text prompting easier. I think that's a trend: the human-computer interface will move toward natural language. The other thing we are looking at very closely is production cost. Again, all AI systems are tools; at the end of the day, they're solving business problems.
Today, if you and I use ChatGPT to ask five or ten questions a day, that's fine. But if you are building a system using any gen AI capability, the computation cost today is very, very high. Internally, we have a few models helping us with internal data management, and even we feel that running those models costs a lot in GPU time, so we have to be very cautious. I can imagine that for applications where our customers run inspection or data analysis 24/7, the cost would add up like crazy. So I think one direction, in addition to getting everyone to adopt the prompt interface, is taking a very powerful model and distilling it, getting latency and cost under control. Our goal is solving our customers' business problems, and cost is always part of that. Our biggest customer runs almost a billion inferences per year. Multiply that by the cost per inference and it's a huge amount of money, so we spend a lot of time optimizing that. I can see the same thing needing to happen for gen AI features, APIs, and models.

Erik: Interesting. You must see such interesting use cases coming out of your customer base. I would love to hear one or two of the most creative, innovative, or unexpected ways your customers are leveraging your technology to solve problems.

Kai: So one thing we did this year is decide to open up the platform for everyone, for all developers. And one interesting part of offering a platform for developers is that now we are able to talk to them and see their crazy use cases. One creative one I saw was at a pharmaceutical customer: a data scientist was supposed to write an algorithm to read time-series data from their machines and then build a time-series model to predict anomalies. And the guy actually reached out to me and said, hey, I found a much easier way using your platform. I connected a webcam looking at my machine, it continuously takes pictures, and I trained a model to separate anomalous images of a very strange dashboard reading from normal ones. They had been running that for several weeks, and the results were amazingly good. I was like, that's not how computer vision is supposed to work, and they showed me a picture of the webcam pointing at the machine, running for two months already; they just wanted to tell us how amazing it was. We were like, ah, this is interesting; we never would have thought people would do this.

Erik: That's one of the most satisfying things for me. You build products, you build capabilities, particularly when you're building tools, not use cases or solutions necessarily, and you make these tools available with a certain expectation and intention, but then you unleash them to the creativity and capability of people who are facing very specific problems. And people do what people do: they hack, they tinker, they solve problems. I never cease to be amazed by some of these stories that I see coming out of our customer base, or the stories I hear, like this one. What a creative way to figure out if a machine is going to need maintenance in a couple of days. It's just great. So before we wrap this episode, I do have one more question that I have to ask, which is: is there anything that you think manufacturers need to understand about implementing machine vision in their operations that we haven't already covered?
Kai: If I get the chance to talk about one thing, it's a very general piece of advice for when I'm working with manufacturers, especially those who have never had the experience: the first step to putting machine vision in place is to start small. There's a tendency, now that AI is so hot, you hear about this, you hear about Landing AI, that when we come in to help, the customer picks the most difficult problem they have. For example, one customer, an automotive one, had a problem that, we later learned, they had not been able to solve for 25 years. One of the consultants from those firms told us, well, this is exactly the problem they tried to solve 25 years ago. It's not because of the AI capabilities; it's because of all the logistics, all the cameras, such a vibration-prone environment. If you choose that as your first computer vision problem to solve, it's going to be very, very hard, not because of AI, not because of the technology, but because of all the headwinds the customer is facing internally. And then, very quickly, those projects lose momentum. Again, it's not the technology; they lose momentum. People feel like, ah, it's not moving forward, do I want to sponsor this anymore? So the one general piece of advice I would love to share is: if you are new to machine vision, pick a problem, and it doesn't have to be the most valuable problem today. Do it quickly, make it work, get the momentum, and then start tackling the more difficult problems later.

Erik: Yeah, I think you're describing an agile approach to optimizing your operations. I love it. I think that's just great. Well, listen, Kai, thanks again for joining us today on Augmented Ops, and I look forward to talking with you again soon.

Kai: It's a pleasure. Thank you, Eric.

Host: Thank you for listening to this episode of the Augmented Ops podcast from Tulip Interfaces. We hope you found this week's episode informative and inspiring. You can find the show on LinkedIn and YouTube, or at tulip.co/podcast. If you enjoyed this episode, please leave us a rating or review on iTunes or wherever you listen to your podcasts. Until next time.