Video Editing in the Browser with Christopher "Vjeux" Chedeau
===

Noel: [00:00:00] Hello and welcome to PodRocket, a web development podcast brought to you by LogRocket. LogRocket helps software teams improve user experience with session replay, error tracking, and product analytics. You can try it for free at logrocket.com. I'm Noel, and today we have Christopher Chedeau joining us. He's known as Vjeux online. He's a front-end manager at Meta. He's done a ton of work on React Native and Prettier, he's done a bunch of work on Excalidraw, he started the CSS-in-JS movement, and so much more. And he's here to talk about his latest talk, Video Editing in the Browser. Welcome to the show, Christopher.

Christopher: Thank you. Welcome, everybody listening. I'm super excited about being on this podcast, and I'm a super big fan of LogRocket.

Noel: Awesome. Glad to hear that. I hope I got your handle right. Was that pretty close?

Christopher: It was close. It's "Vjeux," which basically means "video games" in reverse in French, but that's fine.

Noel: Oh, nice. Very cool. I spent a ton of the weekend delving into Diablo IV. So good. I'm in [00:01:00] the video game headspace right now. It's been good. Let's talk about your talk. I guess before we do, do you mind giving us a bit of your background? You have quite the resume.

Christopher: So yeah, my biggest thing is around making the experience better for people. I actually started way back playing video games, and editing and working on video games. The way I learned how to program was playing Warcraft III. There was a map editor where you could build maps and practice. The interesting thing is that there was a UI where you could say, hey, when a unit enters this area, make it move somewhere else, and this generated code.
And so I started using that visual editor, and then I was able to go into writing the actual programming language behind the scenes. So this is how I started. And yeah, you mentioned Diablo IV. I played a lot of Diablo II, and this is how I started reverse engineering the game assets and everything. I've had a long journey there. In practice, video games have been more for [00:02:00] fun, but for work I've been bringing the same ideas around improving the user experience into the front-end space. When React came out, I was like, oh my God, this is the best thing in the world. So I helped really create a community around it, made sure it was super easy to set up and install, and helped people understand what makes it React. Then I worked on React Native, on Prettier, on CSS-in-JS, on Excalidraw, a bunch of things, always with the intention of: how can we make people more effective and more efficient at their job?

Noel: Nice. I'm a huge fan of Excalidraw as well. I use it for sharing personal notes online, all kinds of stuff. We use it professionally here at LogRocket as well. It's awesome. Cool. So I guess frame your talk a little bit for us. Video editing in the browser: what does the talk focus on and explore?

Christopher: Yeah. So, as you can see from my resume, I like to work on really big ideas and big projects that are going to be groundbreaking. And one of the things I've been trying to figure out is, okay, after [00:03:00] React, after Excalidraw, what's next? One of the things I've been doing during the pandemic is watching a lot more videos, because there have been no actual in-person conferences. And then I've been doing video editing. So I got a green screen in my house, and I started playing with all of the video editors.
But one thing I was frustrated by is that it feels like being back 20 years ago, when there was no AI whatsoever. For example, I have a green screen, and in order to remove the green from the green screen, I have to choose a color range. And it always messes up, because I have some shadows and I cannot get it out perfectly. Same thing right now: we're doing a podcast, and probably when you're going to edit the podcast, you're going to want to see, hey, what did I say? When did I say it? But right now all I can see is sound waves, and this is not very useful. But we actually have really good speech-to-text available nowadays. So I would love to see [00:04:00] the actual text, and be able to say, hey, this is when we did cut number three and it finally went all the way. So now I'm able to cut this and use only these segments and none of that. Right now we can't actually use it inside of the editors. And this is where I was like, oh, maybe I can actually build it myself. I found out that all of the APIs needed to do it were there in the browser, so I embarked on this journey to work on it.

Noel: So what are the biggest problems you encountered? How does building a video editor online differ from something like Excalidraw?

Christopher: Yeah, so one of the biggest things was actually the APIs themselves. In practice, what you need to be able to do with a video is to take a video file, extract all of the images from the video, then be able to manipulate all of those images, and then put them back together in a video file. [00:05:00] And right now the APIs in the browser to extract all of these images are super low level. And there's also only half
of what you need implemented in the browser, and the rest is left as an exercise for the reader. There's no really good implementation right now in JavaScript. So far I've spent a month basically trying to cobble everything together, which I was able to do. So my talk is all about the things I've learned through the process that I wanted to share. And I wanted to share a working end-to-end version of decoding and re-encoding, with the hope that somebody is going to actually build the video editor that I set out to build in the first place.

Noel: When you started, did you expect to have to get as low level as you did in this project, or was that kind of unforeseen?

Christopher: Yeah, no, I didn't expect any of this. I saw all of the blog posts saying, [00:06:00] hey, you can do video editing, and all of the pieces that were needed are in the browser. And I was expecting to npm install something and just wire it up together. But yeah, it didn't work. And then I had to go one level deeper, and one level deeper, and one level deeper. So it's basically Inception, where I kept going deeper and didn't see the end. And finally I've seen the end. So now I'm going back up, and I'm trying to get everybody else in the web community to do this journey with me again, but basically shortcut all of the hard steps, so people can just start at a higher level.

Noel: So let's dig into the weeds a little bit. I don't want to go too far down, but what were some of these hard problems? What were some of the things that you had to implement that were trickier than you anticipated, or just more low level than you anticipated?

Christopher: Yeah.
So one of the things, for example, and the main idea of the talk, is what image compression is and what video compression is. In [00:07:00] practice, the way this works is that if you don't have compression, each image is basically 1.3 megabytes for just a 1000 by 1000 image. And now if you extrapolate to 60 frames per second, it is basically one gigabyte per second of video. This is not something you're going to load into memory. It's going to be out of memory really soon. So the whole image compression and video compression basically changes the way the file format is, and it imposes constraints. One of the key things that people in the video compression space realized is that many successive images in a video look very similar to the previous one. So what you can do is start predicting what the next image is going to be based on the image before. And then you only store the delta [00:08:00] between what was predicted and the actual image. Storing this delta is much smaller, and they do it both forward in time and backward in time. They try both, see which one is actually the smallest, and use that one. Now, one thing that happens with the API is that you can no longer just say, hey, I want this image, get it and decompress it. You need to have a notion of dependencies. If you want this image, you also need the previous image, and maybe the next image, and all of these dependencies. What this means is that the actual API for image decompression is stateful. And that imposes a lot of unnatural constraints, where the API is basically: you send in a bunch of compressed frames, and then at some point in the future you're going to get decompressed frames, but it's not one to [00:09:00] one.
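To make the delta idea above concrete, here is a minimal TypeScript sketch. It is a toy under stated assumptions, not how a real codec works: frames are tiny arrays of samples, prediction is simply "same as the previous frame" (forward only), and the names `rawBytesPerFrame`, `encode`, and `decodeFrameAt` are hypothetical. The raw-size helper takes the pixel format as a parameter because the conversation does not specify one.

```typescript
// Raw (uncompressed) cost of video for an assumed pixel format.
// e.g. 1000 x 1000 at an assumed 4 bytes per pixel is 4,000,000 bytes per frame.
function rawBytesPerFrame(width: number, height: number, bytesPerPixel: number): number {
  return width * height * bytesPerPixel;
}

function rawBytesPerSecond(width: number, height: number, bytesPerPixel: number, fps: number): number {
  return rawBytesPerFrame(width, height, bytesPerPixel) * fps;
}

// Toy encoded frame: a "key" frame stores every sample, a "delta" frame
// stores only the difference from the previous frame.
interface EncodedFrame {
  type: "key" | "delta";
  data: Int16Array;
}

// Emit a key frame every `keyInterval` frames and delta frames in between.
function encode(frames: Int16Array[], keyInterval: number): EncodedFrame[] {
  return frames.map((frame, i): EncodedFrame => {
    if (i % keyInterval === 0) {
      return { type: "key", data: Int16Array.from(frame) };
    }
    const prev = frames[i - 1];
    return { type: "delta", data: frame.map((v, j) => v - prev[j]) };
  });
}

// Decoding one frame is not a one-to-one lookup: we must walk back to the
// nearest key frame and replay every delta up to the requested index.
function decodeFrameAt(
  encoded: EncodedFrame[],
  index: number,
): { frame: Int16Array; decodedCount: number } {
  let start = index;
  while (encoded[start].type !== "key") start--;

  let current = Int16Array.from(encoded[start].data);
  for (let i = start + 1; i <= index; i++) {
    const delta = encoded[i].data;
    current = current.map((v, j) => v + delta[j]);
  }
  return { frame: current, decodedCount: index - start + 1 };
}

// Five tiny "frames" of four samples each; a key frame every three frames.
const sampleFrames = [
  Int16Array.of(10, 10, 10, 10),
  Int16Array.of(10, 11, 10, 10),
  Int16Array.of(10, 11, 12, 10),
  Int16Array.of(50, 50, 50, 50),
  Int16Array.of(50, 50, 51, 50),
];
const encodedSample = encode(sampleFrames, 3);
console.log(decodeFrameAt(encodedSample, 2).decodedCount);
```

Asking for the third frame forces the decoder to touch three encoded frames, which is the dependency problem described above in miniature. Real codecs also predict backward in time, as mentioned in the conversation, which this sketch leaves out.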
And so now it basically messes up the whole way we're naturally used to writing software.

Noel: Right, right. And I imagine that wouldn't be so hard to work around if one is just implementing a video player, right? Then it's not difficult. But if you're in an editor, where there's jumping around happening and changes happening, you probably have to be a lot more coupled to those problems.

Christopher: Yeah. And one other thing that's also interesting is that none of this is documented. So it's, oh, I expected it to behave one way and it didn't. And I didn't know if it was a user error, did I mess up using the API, or if the API actually has a bug, or if it's actually expected. This is only one example, but there are many more examples of things where I didn't know what the issue was.

Noel: Yeah. So I'm curious on a few points you mentioned there, just from a technical perspective. Are these different methods of video encoding? Again, I'm not [00:10:00] familiar with this realm at all, so some of these questions are probably pretty high level. Are they terms that we know? Does all video encoding work this way, or does it depend on the file type or the codec used?

Christopher: Yeah, so there are actually two different parts of the video encoding pipeline. One is the codec. The codec is the thing that is actually doing all of the hard work of doing the compression, doing the prediction, doing the delta, doing the encoding, and all of this stuff. This is so performance intensive that people have tried to optimize it, and right now even doing it on the CPU is not fast enough. So every single laptop or computer that you have actually has hardware that is dedicated to implementing the operations for the video
encoding, the pure encoding, and this is what the WebCodecs API in the browser lets us use. So with this, we have the absolute highest performance possible. Now, [00:11:00] the second part of this is, as I mentioned, there's a list of dependencies, and how long the video file is, and those kinds of things. This is called a video file format. The video file format is the one you mostly think about: MP4 or AVI or MKV, those kinds of things. And this is basically like a JSON. Speaking in web terms, it's like a JSON that defines, hey, what are the dependencies, where are all of the binary frames, and these kinds of things. For the file formats, this is called muxing and demuxing. This part is specific to each file format, and in practice there's no in-browser library to be able to do that. So you either need to re-implement it in JavaScript, reading this binary file format, this JSON-like thing, and sending [00:12:00] the frames to the codec, or you can transpile existing C or C++ libraries that are doing this into JavaScript or WebAssembly, and then you can use them from JavaScript. But right now there's no "oh, there's already a very solid library to be able to do all of this." So I had to figure all of this out myself. And the other thing is, debugging is really annoying, because you're actually debugging on bytes.

Noel: Oh.

Christopher: Like, data. And also, all of the video players tend to be very robust against exceptions. Their error handling is that they try their best, and if it doesn't work, they silently fail. This is the worst you can have when trying to manipulate this. You don't get "syntax error at this line on this thing." It's just going to silently drop something.
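As a rough illustration of what "debugging on bytes" looks like for MP4, here is a small TypeScript sketch that walks the top-level boxes of a file. It relies on the general layout of the MP4 container (each box starts with a 32-bit big-endian size followed by a four-character type), and it is only a first step of demuxing, not the parser from the talk. The function name `listTopLevelBoxes` and the hand-written sample bytes are hypothetical. Unlike the players described above, it throws on malformed input instead of silently dropping data.

```typescript
interface BoxHeader {
  type: string;       // four-character code such as "ftyp", "moov", "mdat"
  offset: number;     // byte offset of the box within the input
  size: number;       // total box size in bytes, header included
  headerSize: number; // 8 normally, 16 when a 64-bit size is used
}

function listTopLevelBoxes(bytes: Uint8Array): BoxHeader[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes: BoxHeader[] = [];
  let offset = 0;

  while (offset + 8 <= view.byteLength) {
    let size = view.getUint32(offset); // DataView reads big-endian by default
    const type = String.fromCharCode(
      view.getUint8(offset + 4),
      view.getUint8(offset + 5),
      view.getUint8(offset + 6),
      view.getUint8(offset + 7),
    );
    let headerSize = 8;

    if (size === 1) {
      // A size of 1 means a 64-bit size follows the type field.
      if (offset + 16 > view.byteLength) {
        throw new Error(`Truncated 64-bit size in box "${type}" at byte ${offset}`);
      }
      size = view.getUint32(offset + 8) * 4294967296 + view.getUint32(offset + 12);
      headerSize = 16;
    } else if (size === 0) {
      // A size of 0 means the box runs to the end of the file.
      size = view.byteLength - offset;
    }

    // Fail loudly rather than silently skipping bad data.
    if (size < headerSize || offset + size > view.byteLength) {
      throw new Error(`Malformed box "${type}" at byte ${offset}`);
    }

    boxes.push({ type, offset, size, headerSize });
    offset += size;
  }

  if (offset !== view.byteLength) {
    throw new Error(`Unexpected trailing bytes at byte ${offset}`);
  }
  return boxes;
}

// Hand-built sample: a 16-byte "ftyp" box followed by a 12-byte "mdat" box.
const sampleFile = Uint8Array.of(
  0, 0, 0, 16, 0x66, 0x74, 0x79, 0x70, 0x69, 0x73, 0x6f, 0x6d, 0, 0, 0, 0,
  0, 0, 0, 12, 0x6d, 0x64, 0x61, 0x74, 1, 2, 3, 4,
);
console.log(listTopLevelBoxes(sampleFile).map((b) => `${b.type}@${b.offset}`));
```

A real demuxer would go on to descend into the nested boxes to recover frame offsets, timing, and dependencies before handing compressed frames to the codec.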
And now, for example, I had an issue where the duration of the video was 10 [00:13:00] times longer than the actual video I had. Playback was working well and everything was working well, except the scrubbing went way farther into the future. And now I'm like, how do I even start debugging this? So I basically compared it with a video that worked, line by line, byte by byte, to see if it was the same. And I'd go, oh, this is different, so let me try to make it the same and see if it changes anything. I found the issue, but this is very laborious and error-prone.

Noel: Yeah. When you were going through this, because these libraries were swallowing all these exceptions and just failing gracefully, as it were, did you ever have to reach out to people, find people who were working on these things, and get assistance to debug these problems?

Christopher: Yeah, so I was able to get in touch with Dale Curtis, who is working on WebCodecs inside of the browser. I'm lucky to be known enough that I can have this kind of relationship. And I also had somebody at work who works on FFmpeg, whom I was able to ask questions and bounce ideas around with. I'm [00:14:00] lucky to be able to do that, and they were very helpful and helped me get unblocked. But this is not sustainable in the long term. So I'm really hoping, with this talk, to basically explain to the world: hey, we can do it, so please get started and start using it, so the whole space can get better.

Noel: Hey, just taking a quick pause here to read an ad for LogRocket. LogRocket offers session replay, issue tracking, and product analytics to help you quickly surface and solve impactful issues affecting your user experience. With LogRocket, you can find and solve issues faster, improve conversion and adoption, and spend more time building a better product. You can try it for free at logrocket.com.
Why do you think it is that this space hasn't been explored more? Video feels so fundamental to the web these days, right? I would guess that some percentage between 20 and 50% of network traffic on the internet is video data. Why is this [00:15:00] such an unexplored space for the smaller players and independent developers?

Christopher: Yeah, so I think there are two reasons. One is the pure performance aspect: before we had WebAssembly, we only had the JavaScript VM, and this is way, way, way too slow to be able to run any kind of video workload. And even with WebAssembly, as I mentioned, to get acceptable performance on today's hardware you need hardware-accelerated pieces. Right now people have translated FFmpeg to wasm, but it's very slow, slow to the point that you cannot actually use it in any production workload. So I think this is the biggest one. The second one is that there are actually workaround solutions. What most people are doing is actually running FFmpeg in the backend and exposing FFmpeg APIs [00:16:00] through a REST endpoint or something. So you can have the same thing, but it's actually running the code in native C, so you don't have to do any of the hard wiring. One of the challenges with this is that now you need to have all of the videos uploaded to the cloud in order to be able to manipulate them. And this may be specific to me personally, but currently at home my connection is really good for downloads, but my upload rate is really bad. So whenever I'm doing video editing, it would take hours to upload a normal-size video. So I have to go to work in order to actually get the infinite upload.
So this is why I'm personally frustrated by all of the video editing tools in the browser that are not doing it locally: they're basically unusable for me because I don't have the upload bandwidth.

Noel: Right, yeah, I bet that's pretty common. At least, just anecdotally, everywhere I've been, most people I talk to have got 10x download [00:17:00] compared to their upload speed. When I've explored this space I've felt the same. And I think that makes sense too, because image editors, we've seen those really come online, as it were, in the past five to 10 years, and they're really rocking now. But yeah, I think that's probably a good insight: there are technical limitations in how it works, and then it's hard to upload.

Christopher: But what I want people to get out of this is that now we have all of the APIs, actually fast, in the browser. This is why I'm saying now is the right time. And I also think it's the right time because there's such an explosion of video creators. There's a whole economy with Reels, TikTok, YouTube, all of this. I've been talking to many people in that space, and they're all semi-frustrated by the technology that exists. And the likes of Adobe Premiere and the other video editors, I'm not seeing them move as fast [00:18:00] to get on the bandwagon. So I think this is a really good inflection point, where investing in this now is going to yield some massive benefits.

Noel: Yeah. So what were the recent API changes in the browser that made this possible?

Christopher: So for this, WebCodecs is the biggest thing that makes it possible, which is able to talk natively. But the very interesting thing about WebCodecs is this.
Right now we have it for videos, but in order to get it for videos, we first needed it for images, and then we also needed it for sound. Those have been done over the past many years. And then, in order to be able to work with images and sound, we need to have some kind of binary container in the web and be able to work with it. So there's the, now I'm blanking on the name, but the ArrayBuffer, which was introduced like 10 years ago. And then there's the whole [00:19:00] WebAssembly thing that is manipulating binary data, so you can allocate memory and those kinds of things. I've also been pushing for this. So what we're seeing is basically the pinnacle of all of this work: we're now able to do it. And web workers have also been a big help in this space. I would say that without all of those steps, we would not have been able to come up with the video use case. But now that they're all here, I think we're able to do it. Oh, and one more thing around this is all of the shaders. For video editing, you want to work on images and transformations, and now we have shaders, and WebGPU, with AI, is also getting traction. So I see a really bright future for all of this and everything coming together.

Noel: Nice. So for technical people who are interested in getting into this space and starting to explore, how do you recommend they start doing it? Where should they start looking?

Christopher: Yeah, so I created a repo [00:20:00] to be able to re-encode an H.264 video. I would recommend taking this and then starting to build a video editor on top, or starting to build something else on top. And then at some point they're maybe going to want to use a different file format. Right now it only supports MP4.
Oh, and maybe they're going to want to support MKV, and now they need to start finding an MKV muxer and demuxer, or re-implement one themselves. So my recommendation for this is always: try to build something. Build something with it and see if it works or if it doesn't.

Noel: Yeah. I feel like that's often the best way to learn, especially in the dev space, where trial and error is very cheap. It's like, go try to make something, see if it works. Do you think at some point we'll settle on a more de facto set of tools for devs that want to be in this space, to help them get started rolling, so it's not quite so much DIY work?

Christopher: Yeah. And this is actually the [00:21:00] end of my talk: I want somebody to write the jQuery of video editing in the browser. I want somebody to build a library that is so simple to use and is going to abstract away all of those file format and codec details, so you can just start using it.

Noel: Nice. I feel like we covered a lot. Is there anything else on this journey, or in your talk, that you wanted to talk to us about and tell devs to give a look at?

Christopher: Yeah. So I think one of the things that's going to be interesting around this, that is not part of the talk but that I think is relevant to LogRocket, is how we actually put this into production. Because now we're talking about massive video files, and probably big errors and those kinds of things. So I think there's going to be a very interesting space around how you do error handling for these kinds of things. And for video it's, oh, can you get a reference to the video in the cloud? One of the other [00:22:00] things that we're thinking about is if you have editors.
Say you have somebody on YouTube or Twitch streaming, and then there's somebody editing the compressed version. You don't want to send the full quality to the editors. You probably want to send them a low-quality version. They do all of the edits, and then they send you the edit file.

Noel: Yeah, like another diff of some kind, right?

Christopher: Yeah. And now you can actually do the full video compression using the high-quality version. So these kinds of things around logs, and around defining what the schema of the scene is, and being able to pass this around, I think are going to be very important and something to explore.

Noel: Yeah, there's a whole interesting set of problems in there, it feels like, that we haven't really had to solve. I feel audio maybe has some of these problems, with the collaborative real-time editors and stuff that people are working on. But yeah, I think there probably is a lot there for [00:23:00] those kinds of workflows where you want to be able to do things quickly on the fly, but you only really care about the quality of the finished product. In the interim you can have just, I need to do this little tweak to this part, or encode subtitles into the stream in real time, or something like that. I'm sure there's all kinds of cool stuff. This has been super thought provoking, Christopher. Thank you.

Christopher: Cool. I hope people are going to be excited about getting into the video editing space in the browser. Thank you so much.

Noel: Yeah, of course. And we'll have a link to that GitHub repo you mentioned and other stuff in the show notes so people can check it out. Thank you again so much. It's been a pleasure.

Christopher: Thank you.