Software 3.0 and The Emerging AI Developer Landscape with Swyx === [00:00:00] Hi, and welcome to PodRocket, a web development podcast brought to you by LogRocket. LogRocket helps software teams improve user experience with session replay, error tracking, and product analytics. Try it free at logrocket. com. I'm Tejas, and returning to join us today is Sean Wang, also known as Swix, here to talk about his latest talk, Software 3. 0 And the emerging AI developer landscape. Sean is a contributor to the latent space newsletter, documenting the rise of the AI engineer and is working on a company called small. That's S M O L A I. Welcome back, Sean. Hey, thanks for having me back. Good to see you again. I'm really excited to get into your talk. Especially cause we've already talked a little bit about AI and address and things, let's zero in scope a little bit and talk about your talk software 3. 0 and the emerging AI developer landscape. I've seen the term web. 1. 0, Web 2. 0, Web 3. Also like in outside of web dev, there's the industry 3. 0 is a trending term. I'm curious, software 3. 0, what [00:01:00] is that? Where does that come from? it doesn't have any lineage whatsoever with Web 3. So that's a very unfortunate comparison there, I think. But the origin actually comes from a very influential article created by Andrej Karpathy about six years ago called Software 2. 0. And he was basically trying to articulate the difference between hand coded Software where we write every single line of code ourselves with like if statements, loops and whatever traditional coding paradigms and then machine learned code where you write the layers of what the machine learning model should do, like the architecture of the model, and then you just run it through a lot of data in order to achieve weights and the weights themselves are the encoding of the knowledge. So He was trying to articulate that difference that the software, the possible space of problems you can tackle with software 1. 0 is the problems that you can kind of code for [00:02:00] deterministically, and the possible space of problems that you can address with software 2. 0 is the stuff that you can address it by machine learning for example, computer vision and voice recognition. It's not stuff that you'll never be able to hand code by yourself. And I think there's a fundamental realization that a lot of people should have with regards to how they write software 1. 0 code, which is a lot of the times, like, what do you do as a programmer, as a software engineer, right? Like you write some functioning app and then you send it out there and you're, you look at your analytics and your metrics and all that. And then you adjust by adding in some features and adding in some if statements and all that from learning and essentially what. Software 2. 0 is accelerated learning from data, whereas in software 1. 0 we learn from data through humans in the loop and designers in the loop. So I think that's a really fundamental realization there that like once you realize that sometimes you are just a very slow machine learning model. you're writing all these algorithms, but yourself, sometimes you can just kind of machine learn the algorithms rather than writing them yourselves. [00:03:00] Okay, so how do you proceed from software 2. 0 to 3. 0 is the arrival of foundation models. 
And that foundation model change has happened more or less in the last three years, enabled by the transformer architecture becoming a thing, which enables deep learning that's parallelizable at massive scale, and obviously a lot more money and GPUs and data thrown at the problem. So now foundation models mean that you do not have to collect a whole bunch of data to create models before you start delivering ML products into production. You can just grab one off the shelf. Whether it's open source or closed source doesn't really matter: you take a foundation model, you put it into production, and then you can start collecting data to fine-tune it if you want to. Either way, the time to MVP of an AI product has been reduced by orders of magnitude in the Software 3.0 paradigm. So hopefully that transition is clear. Software 1.0 is hand-coded code. Software 2.0 is machine-learned code on data that you [00:04:00] collect. And Software 3.0 is just off-the-shelf models where you don't even have to collect the data.

Wow, that was an amazing answer. You alluded quite a few times to model architecture, the architecture of a model, et cetera. How does that play into it? Maybe for our listeners, you could say a sentence or two about what model architecture even is and why grabbing one off the shelf is beneficial in this Software 3.0 paradigm.

There are a lot of ways in which we can take that question. I would say a very typical model these days would be a transformer-based model, decoder-only. A generative pre-trained transformer, a GPT, is itself actually a type of architecture; you can reference the GPT-1 and GPT-2 papers from OpenAI, and they actually published open source code for those. I would say the most definitive open source reference implementation of that kind of architecture is currently from Meta, where they released the Llama 2 code, which is only a few hundred lines of code, actually very little code, [00:05:00] and all the value has now shifted from the codebase to the data and the weights. And that's a very common paradigm going from Software 1.0 to 3.0, right? Meta can open source the codebase; it doesn't matter, because it's only a few hundred lines of code. But they are not open sourcing their dataset, right? Because that's actually now the much, much more valuable thing that they're not giving you. They do somewhat open source the weights; it's not fully, properly open source, but most people can actually use it in their commercial pursuits, and that's what most people care about.

This contrasts, by the way, with some of the other architectures that we might have pursued in the past. So, for example, LSTM networks if you're in traditional NLP, or convolutional neural networks if you're in image recognition, and there are a bunch of other architectures as well. The RNN is one of the oldest ones and is actually making a bit of a comeback as a potential challenger to the transformer. But all of these are possible architectures where you [00:06:00] spec out the model in something like PyTorch, you define the number of layers that you need, you run it through a training process, and then you start deploying it. And I'm cutting out a lot. But I do think that engineers don't really need to know the internals of these things. They need to know how they impact the products that they can make, and that's about it.
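In code, "grabbing one off the shelf" can be as little as this. A minimal sketch using Hugging Face's transformers pipeline as one illustrative route; the specific model it downloads is whatever the library defaults to, not something named in the conversation.

```python
# Software 3.0 in miniature: use a pretrained foundation model with zero
# data collection and zero training. The weights download on first use.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # pulls a default pretrained model
print(classifier("I love this podcast!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.999...}]
# No dataset was collected and no training run happened: that's the
# time-to-MVP win. Fine-tuning on your own data is optional, and later.
```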
And so I think this is classic engineering, where you're not quite a researcher, you're not quite a scientist. AI and ML are at a point where they're crossing over into the engineering sphere, where a lot of the rest of us without that research background can actually get pretty far just by knowing how to use the end product rather than having to make the product ourselves.

Yeah, and that's a conversation I'm really excited to have. Before we do, I have just a couple more questions based on what you said, and they come from wanting to answer the questions I know we're going to get from listeners. You mentioned foundation models. I've heard similar terminology in the space: pre-trained models. Is that somewhere close [00:07:00] to the same thing? Are they the same?

Yeah, I would say they pretty much have 100 percent overlap. Some pedantic people might want me to point out that the term is foundation models, not foundational models. And then there's another term that's emerging called frontier models. Frontier models would be foundation models that are extremely cutting-edge, the largest of them, the ones that will most probably come under some kind of regulatory scrutiny from Congress or other government bodies because they are so big that they are potentially civilization-threatening. But as for the rest of the foundation models: example foundation models would be GPT-3 and GPT-4, Claude, but also Whisper, also Segment Anything. All of these are foundation models where, again, you can get them off the shelf without actually training anything, and they zero-shot transfer to tasks that you actually want to put them to use on in your apps. Stable Diffusion, for example, is another foundation model. Basically, they are big binary blobs of data, sometimes four gigabytes, sometimes 180 [00:08:00] gigabytes, depending on how you quantize the models. These models are just the result of millions and millions of dollars worth of training through GPUs, running data through them based on some predefined recipes, such that you get the end result of these blobs of binary data that you can actually run in inference mode, running data through these models in order to make inferences and predictions.

And when we say inferences and predictions, that's very much machine learning terminology. In terms of products, when we make AI products, it's really much more like generating text or generating images, anything like that. And that's the fun linguistic challenge when you cross fields from the research domain into the product domain. Because at the end of the day, in the product domain, in the AI engineering domain, we just care about what we can do for our users, right? Once we can produce interesting things for our users that people want to pay for, then people get a lot more interested. But we have to respect that there's a lot of prehistory of research terminology that is completely different, because they think about things in a different way.

Yeah, I once heard, I think [00:09:00] it was Ashi Krishnan, say that the term stochastic gradient descent in the ML academic world just means try stuff randomly and have fewer errors over time.

I would say my favorite spin on stochastic gradient descent is graduate student descent, which is, instead of trying stuff with machines, you just throw graduate students at the problem until you find something that works.
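As an aside on those blob sizes mentioned above ("sometimes four gigabytes, sometimes 180 gigabytes"), the back-of-envelope math is roughly parameter count times bytes per parameter. A sketch, with the specific model sizes chosen purely for illustration:

```python
# Model-on-disk size is approximately: parameters x (bits per parameter) / 8.
# Quantization shrinks bits-per-parameter, which is why the same model can
# ship as wildly different blob sizes.
def model_size_gb(params: float, bits_per_param: float) -> float:
    return params * bits_per_param / 8 / 1e9

print(f"7B  model, 4-bit quantized: ~{model_size_gb(7e9, 4):.1f} GB")    # ~3.5 GB
print(f"7B  model, fp16:            ~{model_size_gb(7e9, 16):.0f} GB")   # ~14 GB
print(f"70B model, fp16:            ~{model_size_gb(70e9, 16):.0f} GB")  # ~140 GB
```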
Okay, one final question about what you just said. You mentioned that nowadays, with Software 3.0, it's very easy to grab these foundation models and put them into production. What does putting a foundation model into production mean? How do you practically do that?

Oh, that's a whole topic for a conference. I think there's a bit of a U-curve in the difficulty. If you're just wrapping the OpenAI API, then you just call an API, just like you would call any other API in an app. There's not that much difference there, apart from maybe having to be mindful of things like your context [00:10:00] limits, your privacy considerations (especially when people are putting sensitive information into your stuff), and then maybe your rate limits. There are a lot of bots out there that will scrape any exposed OpenAI endpoints, because these are valuable things and tokens are expensive. So if you leave your API endpoints unguarded, or if you, God forbid, leave your tokens out there, they will be scraped and used, and they'll run up your bill.

And then there's the path towards the local llamas, where you run your models locally. And that is, quote unquote, production just for yourself, right? You're not serving an outside audience; you're just serving personal use, and you probably want to run it on your local machine. That's one level of difficulty up from just calling an API, because typically you would want to run things like llama.cpp or whisper.cpp locally, and there's a whole stack that needs to be set up there.

And then the hardest of all is actually serving your own custom models to a lot of users externally, as though you were [00:11:00] a model infrastructure platform. There are many of these out there; you can buy them off the shelf or you can set them up yourself. I would say you probably have to be an infrastructure expert to be running these by yourself, from what I can tell. It's not that hard mechanically; you just have to understand basic principles like saturation, like what the bandwidth is of individual parts of the model architecture you've chosen, to be able to serve things well. But ultimately, the secret is high model flop utilization, right? When you buy a GPU, you have a certain amount of theoretical flops, and most people only operate at like 40 to 50 percent model flop utilization. So if you want to go down that path, you are basically going to have to become a GPU infrastructure expert, which I am not. But in my mind, that is how the landscape flows: either you use something off the shelf or you build your own.

Can you quickly just [00:12:00] define flop for our listeners?

Floating point operations. A lot of this math really is just multiplying matrices again and again; that's how we do everything from embedding tokens to predicting the next token, which eventually ends up either building a diffusion model or predicting the next token in a language model. It's all math at the end of the day, which is actually pretty interesting and fun, but very intimidating. So ultimately, every operation reduces to a certain number of flops, and larger models require more flops to operate. So how many flops can you generate in order to serve those models quickly? That is the fundamental problem of infrastructure serving.
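A rough sketch of that serving math, with ballpark numbers that are assumptions for illustration rather than figures from the conversation (the "2 flops per parameter per token" rule of thumb and the A100's peak throughput are common reference points, not claims from the talk):

```python
# Compute-bound upper bound on serving throughput for one GPU.
params = 7e9                  # a hypothetical 7B-parameter model
flops_per_token = 2 * params  # ~2 FLOPs per parameter per generated token

gpu_peak_flops = 312e12       # e.g. an A100's peak fp16 throughput, ~312 TFLOPs
mfu = 0.45                    # model flop utilization: the "40 to 50 percent"

tokens_per_second = gpu_peak_flops * mfu / flops_per_token
print(f"~{tokens_per_second:,.0f} tokens/sec (compute-bound estimate)")  # ~10,000
```

In practice, single-stream decoding is usually memory-bandwidth-bound rather than compute-bound, which is exactly why the batching mentioned next is the lever for utilization and cost.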
There's one concern I'll point out for people, which is the ability to batch, and that is primarily the key trick to reducing cost per tenant.

Let's pause for a quick break and then come back to our discussion. LogRocket offers session replay, issue tracking, and [00:13:00] product analytics to help you quickly surface and solve impactful issues affecting your user experience. With LogRocket, you can find and solve issues faster, improve conversion and adoption, and spend more time building a better product.

Okay, Software 3.0. I want to spend the rest of our time talking about a topic that I personally am really interested in. We started this discussion earlier and I'm really excited to continue it: the emerging AI developer landscape. This is a topic that I absolutely love. And swyx, you'll be proud to know I've been officially wearing the badge of AI engineer since our discussion. But I want to clarify this for everyone. In your talk, you mentioned AI is shifting right, and you mentioned a new role. I'm curious if you could expand on that a little bit.

Yeah. Basically, I think the arrival of foundation models makes it such that you actually don't need an ML team in-house to ship an AI product. And that is a very different [00:14:00] situation than 10 years ago, when you would very much need to do that in-house. So what does that mean? It means there are a few orders of magnitude more people who will be trying to ship AI products, who don't have the traditional ML and research backgrounds to fully understand all of it or to make all of it themselves. But when you give them a shapeable tool, like a foundation model, they can actually go make lots of money and make a lot of people happy by building AI products, which is exactly what we're seeing happen in the indie hacker sphere and increasingly in the B2B sphere.

So I would basically call this the sub-specialization of software engineering. If you imagine a spectrum from left to right: on the left would be the research scientists, the people who are innovating on the algorithms and the architectures, like the ones we talked about. Further right of those are the machine learning engineers: the people who are not research scientists, who aren't as good at calculus as the other folks, but who are good at model [00:15:00] infrastructure and serving and data pipelines and all that fun data science stuff. At that point there's a permeable boundary, and I basically draw a line at the API layer: when stuff gets thrown over the API, whether internally within a company or between companies, as an API from a foundation model lab, then you can consume it on the software engineer side as an API and put it into products. And so on the right of the spectrum are the software engineers. I do think the traditional full stack software engineer, the one that is typically front end only, or serverless, or front end and serverless, or whatever you call it, full stack web dev, it doesn't really matter: there are millions of those out there who don't have any experience with AI.
And the real argument is that this skills gap between the ML engineer and the research scientist is crossing over into the software fields, and there will be a new class of software engineer, called the AI engineer, that will [00:16:00] specialize in this stack. Because keeping up to date, knowing all the latest techniques, knowing how to put stuff into production, and knowing how to advise companies on what is a bad idea to do because it's not ready yet: all of these are the domain of something that will probably be a specialist field. And so I've been calling it the AI engineer, and it looks like it's sticking.

Okay, so just so I understand: in the past, and I think even today, frankly, there would be people who hear the term AI engineer and, without further clarification, would consider it interchangeable with a machine learning (ML) engineer. But what I'm hearing is a distinction where the AI engineer is one who consumes foundation models that expose APIs and uses those foundation models as primitives for building apps that encompass them, plus more logic, to serve users. So it's different from the ML engineer in that it is not academic. You don't need a calculus background; as the crude version, all you need to be able to do is actually be a software engineer. And [00:17:00] then the specialization is a software engineer who knows how to work with these foundation models exposed over APIs. Is that accurate?

Yeah, I think so. And what's fun about this is that I think AI engineers will be low status for a while, because the machine learning engineer is a well-established role with a lot of hierarchy and a lot of syllabus and curriculum, most of which is completely unnecessary when it comes to foundation models. So it'll be a very fun, disruptive few years while we figure out what the right pay or career path of an AI engineer is, as it starts to separate from the ML engineer, enabled by foundation models. I'm pretty strongly convinced this will happen, just because this is not a prediction based on the tech. This is a prediction based on economics: on pure demand and supply, on the relative numbers of people and the skill sets applicable.

Wow, that's really an interesting perspective. I'm curious if you could speak more to the [00:18:00] economics of this. I'm an engineer, and the fact that you tell me I can query things over the network, or query things in general, and build apps and call myself an AI engineer, I'm really happy with. But could you speak a little bit more about the economics of it? Is it just that there's a ton of demand and...?

Yeah, it's purely demand and supply, right? There's a ton of demand, and not enough machine learning engineers and machine learning research scientists around to supply that demand. So an intermediate class will be created. And it will probably come from the software engineer side going down the stack, rather than the ML engineer side going up the stack, just because there are a lot more software engineers. So that's one thing that's guaranteed. I think, socioeconomically, software engineers also want a way to jump in on the hype, which is why a lot of people have issues with my usage of the term AI engineer, and a lot of people propose alternatives like LLM engineer or cognitive engineer. The thing is that those don't roll off the [00:19:00] tongue as easily as AI engineer.
People want to associate themselves with AI. And so I think the people who do a good job of it will be able to put AI into practice for any company that approaches them, and they'll be in very high demand. And that was a very core inspiration for why I did this: I noticed that a lot of companies were trying to hire this profile of software engineer, and a lot of software engineers wanted to pick up more skills, but they didn't have any way to find each other. And so I think once you have the industry collect and coalesce around a single term that identifies a skill set, an interest, maybe eventually a career path, and accordingly the tools they're competent with and the papers they might be familiar with, that becomes its own community and its own career and sub-industry. So I'm pretty interested in growing that; obviously, I've already made my bets. I will definitely be honest and admit that this is super [00:20:00] early, right? There are people walking around with the title of AI engineer, but it's definitely still a small minority. But I do think it will grow over time, and it will probably exceed ML engineer by a lot. This was backed up by Andrej Karpathy, who is one of the figureheads of AI (he was a co-founder of OpenAI, actually), when he read my piece on the rise of the AI engineer. And he said, yeah, it's probably true that there will be more AI engineers than ML engineers. So I think this is emerging as a category, and we'll basically have to figure out the tech tree. Right now it's very undefined, so it's just a bunch of people hacking on Twitter and Reddit and Hacker News. But over time the courses will come in, the degrees will come in, the boot camps will come in, and I'm very excited to see how that develops.

Just to be clear, since it's so young, there isn't currently a senior AI engineer title officially recognized, right? People just call themselves senior, whatever they want.

Yeah. Ultimately, if your definition of senior is "has been around for five years," well, the transformer architecture has only been [00:21:00] around for six years, so you can't be that senior on that front. But I will say, because I do say that it's still a lot of software engineering (maybe 90 percent of it is still software engineering), if you qualify for senior software engineer on that side, then you just need a bit more training on the AI side to match up.

Yeah, no, the reason I ask is because I'm pretty sure someone's receiving a recruiter email asking for a senior AI engineer with 20 years of experience or something. Your diagram of this, with the academics and the machine learning engineers on the left-hand side and the front end, serverless type of people on the right-hand side, I think is so awesome, and I think it can be generalized enough that you can really use it to reason about a lot of the ways the tech industry at large has developed. What do I mean by that? For example, if we think about personal computing: on the left-hand side, you had these mainframe folks, and that was super inaccessible to everyone else. And on the far right, you have us [00:22:00] today, who use personal computers. But at one point in time, computers weren't personal. They were largely in labs, and over time that line has expanded, and today we see it commoditized for everybody, mass-market commoditization, right? AI is experiencing something similar, where AI at this point is personal.
Like, I literally use ChatGPT every day, and that wasn't true before. I could never run that type of model locally, or maybe I could and I just didn't know how. Is that a fair way of reasoning about things? And if so, would it be appropriate to consider other things today, maybe for people wanting to come up with startup ideas: things that are currently gatekept behind, I don't know, academia or money or access to certain machinery, that over time might make it into the mass market, ergo to broader consumer-style people?

I think that's never really been a hurdle. If you were smart and motivated enough, you would have figured it out at some point; it's just that the bar is constantly getting lower. So, at the end of the day, [00:23:00] I would hesitate to offer startup advice, just because I'm not a VC or anything like that. I would say that what is still important for startups is to build things that people want, to know your customer better than anyone else, and to serve them and make them happier better than anyone else, and hopefully to pick a growing market with demand for that. I do think there are very good opportunities in effectively creating a poorer version of what a professional would do. That's effectively what some of these things offer, right? Like, hey, you can hire an SEO copywriting expert to work on your website copy for like $3,000 an hour, or you can pay ChatGPT $20 a month and do an 80 percent as good a job. And that's what automation is going to do: it's really going to take away the lower tier of all of these [00:24:00] specialist roles, because now we've got generative AI to do it. I think this is going to make sense, because obviously that's going to take away some people's jobs, but it's going to free them up to do higher-value-add jobs that machines cannot do yet.

Yeah. And if I were in such a situation, I'd probably try to use that time to learn AI engineering and get ahead of it. And that may be a prompt for some people listening. Okay, so the AI engineer is a new position, effectively encompassing folks consuming foundation models and using them to solve problems. You mentioned the stack, the modern data stack of the AI engineer. What is the stack? Is it outdated yet? Does it change as fast as JavaScript? Is Tailwind good?

It's actually changing slower than JavaScript. For a while I was saying the JavaScript framework wars are over, and now we have Svelte and Solid and Inferno. I don't know what the new thing is. HTMX is the new thing. Anyway. So, I would say it hasn't been that kind of journey. I can go through the stack. The [00:25:00] thing that I announced at the talk, and that we're doing a survey on, is the Software 3.0 stack, which is obviously an idea brought over from the modern data stack, which you referenced. The modern data stack is this nice composition of how a data engineer should view their tools of the trade, and I think it's a nice map of what you should learn and be familiar with, or at least consider whether you need within your company for your own needs. So in the Software 3.0 stack, instead of a data warehouse, I have what I call the system of reasoning, instead of a system of knowledge or system of record. And the system of reasoning is like the source of your foundation model, right?
Whether it's a foundation model lab like OpenAI or Anthropic, where it's closed source but best in class and they provide it to you through an API, or it's open source and you have to take care of a lot of the model hosting issues yourself, which is typically provided through Hugging Face, Replicate, Baseten, Modal, and Lambda Labs. And those would be the most valuable ones right now: OpenAI has a valuation of 40 billion, Anthropic now probably has a [00:26:00] valuation of 10 to 15 billion, Hugging Face has a valuation in the billions, and all the others are much smaller. Those are by far the biggest chunk of the value captured so far.

Then we can go on to the RAG stack, the retrieval-augmented generation stack, because that is the stack that basically personalizes and orchestrates the AI models. So what does that really mean? Retrieval-augmented generation means that in every natural language model, let's just say GPT-4, there's a certain amount of context that lets you put in extra information or examples, such that you can personalize your answer towards something that you specifically want. So GPT-4 is trained on a lot of general web knowledge, facts out there, but it's not going to know specific things about your company. It's not going to know specific things about your product or your person. So how are you going to pull in that information, paste it in there, and generate? All of that is the subject of the retrieval-[00:27:00]augmented generation stack, or the RAG stack, as they call it.

So in that bucket are the vector DB companies, most notably Pinecone, which is valued at $750 million, and then there's a long tail of others: Weaviate, Chroma. All in all, people really like to invest in database companies. These companies have raised hundreds of millions of dollars this year, which is a lot of money; that's more money than MongoDB ever raised in its entire lifetime leading up to IPO. So people are just investing very far ahead on the database side of things.

And then the piping: the orchestration and application frameworks. The two leaders here are LangChain and LlamaIndex. LangChain has raised 35 million and LlamaIndex has raised 9 million. Both of them will effectively connect, via an LLM adapter, whatever LLM you have, whether it's the closed source ones or the open source ones, to your data source, whether it's your Notion, your Slack, your Gmail, your Google Drive, doesn't matter, and embed it into a vector database like Pinecone, Chroma, Weaviate, or Milvus, and then [00:28:00] insert it into your context whenever you need to generate. And that will serve as your personalization stack. That's your RAG stack. There are other companies focused on eliminating that process, because it's a janky process that nobody really loves, but it is by far the best in class right now. So can you build RAG right into the model itself instead of stitching together all these tools? There's a company called Contextual AI that pursues that; the founder is the author of the RAG paper that founded this whole field. And then there are other open questions, like: can you fine-tune new knowledge into existing models? That is currently completely unknown. So those are the two most established parts of the stack: the system of reasoning, and then the RAG stack.
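Here's a minimal sketch of that RAG loop, with a toy bag-of-words embed() standing in for a real embedding model and a Python list standing in for a vector database like Pinecone, Chroma, or Weaviate. The documents and names are invented for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector. A real system
    # would call an embedding model and get back a dense float vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Index: embed your private documents (Notion pages, Slack threads, ...)
#    and store them: a plain list here, a vector DB in production.
documents = [
    "Acme's refund policy: refunds are issued within 14 days of purchase.",
    "Acme support hours are 9am to 5pm Pacific, Monday through Friday.",
]
index = [(doc, embed(doc)) for doc in documents]

# 2. Retrieve: find the stored documents most similar to the user's question.
question = "When are refunds issued?"
q = embed(question)
top_doc, _ = max(index, key=lambda pair: cosine(q, pair[1]))

# 3. Generate: paste the retrieved context into the model's prompt window.
prompt = f"Context:\n{top_doc}\n\nQuestion: {question}\nAnswer:"
print(prompt)  # this prompt is what you'd then send to GPT-4, Claude, etc.
```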
The part that is completely open water right now is how people interact with the models, which is what I've been calling AI UX. I held the very first AI UX meetup in San Francisco, and AI UX is a big portion of the conference that I'm holding [00:29:00] in October. And I think this is where front end engineers should really get excited. Basically, when ChatGPT was announced in November last year, it was mostly a UX innovation, right? Instead of the OpenAI playground, which was not very inspiring, just being able to thread together a chat and go back and forth seemed to unlock a lot of value for a lot of people, and that caught OpenAI by surprise. The reason they actually dropped it quietly, with a very small blog post, is that they didn't think it was going to be a big deal. But it was. So how do we break beyond the chat box? How do we unlock the capabilities of this reasoning and these large language models with more intuitive interfaces, apart from just the chat UX? One form of that is, for example, GitHub Copilot, where, instead of a separate chat box, Copilot just watches as you type and tries to autocomplete as you type. And that was something they consciously engineered for over six months to get that experience. [00:30:00] Because people find that if you have to context-switch back and forth between your code and your chat box, you're not really going to use it as much; but if it just auto-appears and you can look at it in the context of your code, then you're much more likely to use it. So I think there's a lot of innovation that's open there, and currently there's no company that's really owning that, except maybe for Vercel, which just recently announced v0.dev, their version of ChatGPT for UI generation: you can type in whatever UI you want to create, and it will create React and Tailwind code for you that you can just copy and paste.

Yeah, and this is where I guess the AI engineers also rise up and see what innovations we can drive. I'm inspired to think about the announcement from OpenAI today, if I'm not mistaken, where they announced voice and image inputs for ChatGPT as well. That's rolling out soon. I didn't know. You mentioned [00:31:00] ChatGPT was just a UX innovation on top of what already existed. Is that true? Could I have queried GPT-3.5 and built ChatGPT before OpenAI did? I guess that's my question.

So, yeah, the facts here are a little bit debatable. As far as OpenAI is concerned, all the public statements from everybody at OpenAI say that is what they've considered ChatGPT to be: a pure AI UX innovation. If you read between the lines, it is not exactly that, because they released GPT-3.5 one day before ChatGPT. So it was a slightly better model plus a new form of delivery. What I often say is that you want to bundle a new model with a new modality. And that's what ChatGPT became for OpenAI: the way they deliver their models to the broader consumer, right? So today they announced GPT-4 Vision, as well as baking in the conversational experience, which means they have also, by the way, entered the speech synthesis [00:32:00] business. OpenAI Whisper goes from speech to text; now they have it going the other way as well, text to speech.
And all of these are models that they just ship within ChatGPT, without releasing an API for them, without open sourcing any of the code, without even publishing any papers. And that really signals them shifting into much more of a product company, rather than an infrastructure company or research lab.

Yeah. I'm really excited about this audio thing, as you mentioned, because this literally is Iron Man's Jarvis in real life, more or less. I was actually looking at the UX of the audio input feature for ChatGPT, and of course it does this: when I saw this feature I was surprised, but shortly after I realized, wait a second, this actually makes sense. The feature I'm talking about is that when you're done speaking, it just [00:33:00] automatically submits the text, because it's an AI product: of course it recognizes when you're done speaking. I thought that was really novel. So you could literally just talk back and forth forever and say, hey, I'm struggling with this for loop. It's amazing to see how good it is.

We've had this before. We have Google Assistant, we have Alexa, we have Siri, from all the other companies. Obviously, OpenAI is probably going to do a very good job, but it's not like these things didn't exist before. And I think that's ultimately something I think a lot about in AI: if you just look at things on a feature-by-feature basis, you can clone a lot of these features, like before OpenAI does. But do you have the brand that people trust, that says this is probably one of the best products out there, and now I'll just trust it by default without even evaluating all the other options? That's ultimately where you want to get to as a startup or a company. You want to build a brand people trust, where you've done the work and it's mostly good, even though sometimes it's not that great. I would say Apple has tested their customers' faith quite a bit, but most Apple customers will trust that whenever something ships, it might be two or three years late at Apple, but it'll at least be good.

Yeah. All right, let's wrap up. I wanted to address one, I think, important point before we do. [00:34:00] There are a number of skeptics around, of course, and I think it's actually good to have a little bit of skepticism, just to keep us balanced. The term AI blame has been going around, and it's had some type of impact on a number of industries. So I'm curious if you could speak to, one, AI blame, but also, in contrast to some of the skepticism: what are some examples of AI that have been going really well? And I don't mean well in the sense that ChatGPT and Copilot helped me code faster, but well in the sense that it's made a more, I'd say, meaningful difference in the lives of people.

So yeah, AI blame is a term I made up. There are a lot of ethical and legal issues. Specifically, what I was thinking about with AI blame was Chegg, which is a publicly listed company that sells college textbooks in the U.S. College textbooks are a huge racket. Everybody knows that you seriously overpay for minorly updated editions of college texts, and you don't really have a choice because you're taking the course, right? So, obviously, Chegg has not been doing very well.
And in the [00:35:00] most recent results, they posted really poor performance and they blamed ChatGPT. They said everyone's just using ChatGPT instead of buying textbooks, and the stock went down 50 percent that day. But if you zoom out, the stock has been going down for a while; it has been going down since before ChatGPT. And so people find AI a convenient scapegoat for all of society's issues that were going to happen anyway. I think that's a very funny phenomenon that is just common, right? It's very natural to blame things that you don't fully understand. Or even if you fully understand it, it's just very convenient to have a scapegoat.

We did this kind of thing with Web3 and NFTs and things like that. And I'm not saying AI and Web3 are in the same area; what I'm saying is that it was new, and there were skeptics, and it was a scapegoat. So, is it fair to say...

Yeah, so in a way, this is the reverse of skepticism, right? This is saying that AI is so successful that it is killing us. That is the reverse of skepticism. Whereas there's another form of skepticism, which is the stochastic parrots [00:36:00] group. This would be the ethical research group that came out of Google Brain, that very famously got fired by Jeff Dean over some disagreements about the process by which the stochastic parrots paper was published. Basically, what these people are saying is that these are just language models, just giant matrices that simulate thinking; they don't actually think. They generate plausible-sounding text, but they don't have any real knowledge of the world, because they've never lived in the world. They're just trained on text corpuses that we collected off the internet. And that's obviously true to some extent. But this is ultimately the discussion of whether the map reflects the territory, a very age-old question in fundamental philosophy. Because if I can talk to you for 45 minutes on a podcast and sound like an AI expert without being an AI expert, how long do I have to talk to ultimately approach being an AI expert? So there's some amount of "this isn't real, this isn't actually [00:37:00] thinking, it can't be reasoning, we can't treat this as intelligence until it's like human intelligence." And there's some amount of "just look at the results": language models are now superhuman on most reasoning benchmarks, AP Bio, AP math, whatever, the SAT, the GMAT, the medical exams, the law exams. It is now superhuman at all those things. At some point, it is genuinely superhuman instead of faking being superhuman. And where that crossover is, from "not really intelligence" to "actually intelligence," is really up to you to decide. But I can say that, most of the time, the Turing test has been so conclusively passed that we are spending more time blocking humans from trying to do things than blocking machines from trying to do things.

Yeah, I wish there were some type of committee that would just make the decision for all of us on what, or when, intelligence is considered intelligence.

But would you trust that committee? [00:38:00] So I'll say, I think cynicism is warranted, though. I start off most of my talks by warning that
there are a lot of signs of overheatedness, of extremely high expectations, of people being very hyperbolic with their AI predictions. And that is just the result of what I always call the creator industrial complex: the news cycle needs everything to be at extremes in order for you to pay them attention and to pay them money. So nothing's ever as good as it seems, and nothing's ever as bad as the skeptics might make it out to be. I do think you get the most benefit just trying to build to solve for use cases that your customers have, and I think projects where you can actually make yourself more productive are really good. I built a project for myself called smol developer that generates Chrome extensions. Whenever I want, I just write a prompt for a Chrome extension and it mostly builds it out by itself. That's super useful for me, and it's not a product I would sell, [00:39:00] but I do think that you, as an engineer, have the ability to experiment like that. And I think it's a wonderful time to start building and exploring. If people are interested in the full list of project ideas, I have an email course on Latent Space that's completely free, where you get seven days of guided projects through all the major modalities of AI. That's what I've been building with a friend of mine, Noah.

Oh my gosh, and that's just a free resource anyone can go sign up for today?

Yeah. A lot of people ask me where to start, and so this is my answer to where to start. I think that people should just build, with projects in mind. And obviously, as a newsletter author, it's good to collect emails, so I just structured it as an email course.

Awesome. Yeah, and you're one of the people who pioneered building in public, so I guess they can also build in public, learn in public, with it.

Yeah. One of my biggest inspirations was Wes Bos's JavaScript30, where a lot of people learned JavaScript just by building projects on a day-to-day basis over 30 days. And I quite like that. So this is my attempt at building [00:40:00] JavaScript30 for AI.

We will definitely link to your newsletter in the show notes. So, I understand, swyx, you're organizing a conference. It's called the AI Engineer Summit, and I'm excited about it. I know there's a way for people to participate online for free. I'm curious if you could say a little bit about that and help us understand it. Also, there will be a link in the show notes for those who want to attend.

Yeah, well, you don't need a link, because the domain is something that we sprung for. It's ai.engineer, which is a very fun flex of the .engineer TLD. And yeah, we have a two-day conference. It's going to be livestreamed, because we're completely full in person. But we have people from OpenAI, Amazon, Microsoft, Vercel, Notion, LangChain, LlamaIndex, Fixie, AutoGPT (the biggest open source project of the year; this will be their first-ever conference talk), Supabase, Guardrails: everyone notable I could collect in that space, getting all of them in one room and presenting the state of AI engineering. So if you want to check it out, head to ai.engineer, put in your email, and see you [00:41:00] on the 8th to the 10th.

All right, let's wrap it here. So listen, I think I said this to you last week: you are literally, legitimately one of the smartest people, if not the smartest person, I know, and it's an absolute honor and privilege to be able to talk to you and poke your brain a little bit,
ask my silly questions, and get your highly insightful answers. Smol AI, building in public, the newsletter, all of the work: thank you for coming on this podcast and enlightening me and so many others listening.

Yeah, it's been a pleasure.