Edited Audio with Deepika Manglani === [00:00:00] Alright, Deepika, welcome to the show. Thanks for coming on. Thanks for having me. Looking forward to it. Yeah, this'll be a fun one. You've been in a slightly different spot, I think, than what people typically think of for product leadership. You've been at the LA Times now for quite a while, and in media and product leadership roles across media properties for like 15 years now, across some pretty noteworthy publications. Obviously the LA Times is known everywhere, but on top of that you have the interesting bit of not just product management, but also program management. So maybe can you give us the context of how you got to where we are right now, and then also what is so interestingly different about program and product management? Like, why is that an unusual thing? Absolutely. This is the first time I've seen them together, to be honest. So I don't mind being asked this question a couple times. I've never known the difference, so I'm [00:01:00] really interested to hear this. Yeah, people do use the terms interchangeably, PM being product manager and project manager or program manager as well. So you're right, I have spent 15-plus years in media and newspaper organizations between Tribune Publishing and the LA Times. I was at Tribune for six, seven years, living my dream product job, building products and managing multiple acquisitions. And then in 2018, the LA Times got sold. One thing that, at the time, me and a lot of others didn't realize is that IT wasn't part of the sale. So the business was sold, but not IT. So imagine what that means. No network, no infrastructure, no products, no platforms, no people. And yet the newsroom has to run 24/7 for a 140-year-old company. Right? So we were asked at the time, here you go, start from scratch. So initially I was helping size up what that migration, what that standing up, would look like for six [00:02:00] months or so.
What that six months looked like was a lot of travel from Chicago. My entire life became travel, work, go home to do laundry in Chicago, repeat. Very glamorous, right? So then I was asked to come join full-time and build a product and PMO from scratch. Mm-hmm. While it was being built from scratch, I was one of the first few IT hires. And my idea, actually, initially it was only program management. I wanted to add product to it as well because I had the product background, I had the product vision, and a lot of the products that we were going to bring over from Tribune were things that I had built over there. So my pitch to the leadership here was that product and program work in silos in so many companies. Product comes up with a dream of what needs to be built and why we need to build this, why there is demand or need for this product. Program management [00:03:00] comes in and tells how and when. So I said, these two usually work separately, and then they fight on Slack, so why not bring them together and have them work together on a single mission, considering the transition that we had to do while the newsroom continued to produce Pulitzer-winning stories? This was not optional, but needed for survival. So that's how we built it, and what came next is sleepless nights. We basically worked nights, weekends, hiring teams, building products, picking platforms and licenses for solutions and tools. Uh, starting from HRIS: we had to hire people, but we didn't have an HRIS. You don't think about it, like, you literally don't have the tools to hire the people so that you can have the tools. Exactly. And then you are getting licenses for things like email solutions, productivity tools, but you need to [00:04:00] be able to pay for them. You don't have accounts receivable systems for receivables; you have nothing. So we were using our seller's systems for a period of time under the transition services agreement.
But we had to figure out which systems to bring first. Can we set up our own network first? Mm-hmm. Shall we bring our own email system? Shall we bring HRIS first? Payables, receivables, ERP? So all of that followed along after I moved here, and it's been an incredible journey. We still can't believe we did it, if someone asks, but we know we did. And that stays the most remarkable work of my career so far. It reminds me of the very early days of a startup. I mean, LogRocket, when I joined really early on, like, we used office space from one of our investors, and then for our office space we sublet a room from another company. It was all begged, borrowed, and stolen. Except the difference here is the LA Times was a company with a 140-year legacy of award-winning journalism. Yeah. We couldn't [00:05:00] stop publishing. We couldn't stop putting the stories out, the news out. We have published every single day, so that wasn't a choice. So one thing I really want to dig into on that, and it's good that you brought up the 140-year legacy, because in a lot of cases there's just a huge mass of this stuff that is there, but it's not really accessible. And now, you know, we're talking a legacy that, if you're not careful, can be lost forever. You guys are taking an interesting approach to this. Can we dive into this a little bit? Because I think people will want to hear about this. It's actually very exciting and very fascinating. We have a whole warehouse here where we have the actual newspapers from the beginning of time, from our first edition, which was on December 4th, 1881, up until the time we started doing PDF archives, which was the early two thousands. So what we decided to do, and this was actually the year before last: we have all these archives, and that's when GPT, mm-hmm, models started coming out and ChatGPT broke the internet by [00:06:00] putting just the visual and the UI onto AI and generative AI.
So that's when we thought, how about we digitize all of these archives that we have, starting from the first edition, which are not very easily accessible today. Those were OCR'd back in the day by one of the vendors, but using that time's OCR technology. The vendor at the time OCR'd about 12-plus million articles, but they were not of the best quality, because the newspaper given to them, or the microfiche given to them, had ink spills. And sometimes, if you remember, they used to stamp on the front page of the newspaper, so it came with those stamps. What we realized is certain years were worse than others. Mm-hmm. So it made sense to re-digitize them, or re-scan them. Well, we partnered with some vendors that came to give us a quote, and some of those quotes were outrageous. It was in the millions of [00:07:00] dollars to take the microfiche and re-scan and create a fresh image of all those articles going back in time. We were trying to justify the use case and kind of go through some ideation of what kind of products we were gonna build out of it, because without ROI, we just didn't want to go down this path of excavation. Is it correct to think that the older ones, too, maybe were the least accurate? Like, I could see the printing technology not being as great, there's more ink spills, more problems, and all that kind of stuff. And the layout of the paper was also very different from what we see today. Yeah, it's very interesting. The ads would be very random, and the stories would have multiple jumps. We had stories that jump across four different pages, which we don't see today, and then you have to stitch the story together. Yeah. When you are scanning, you have to read the jump, then scan the next page where the jump [00:08:00] continues, and then stitch them together as one article. Mm. Otherwise it won't make sense to have three or four different cutouts. It's so funny when you brought this story up.
Because, I mean, it was like, well, that's not a hard problem, like, we have AI, we can just do that now. But that interesting, very short couple of years has been just such a transformational change that I'd already kind of forgotten that. And so in that world it was millions of dollars to do that. Exactly. So I wanna say this was 2024, maybe. Yeah. So recent, like, we're not even talking that long ago. Yes, 2023 is when ChatGPT started. Yeah. So this was the year after, and then we continued to talk to more vendors to see if we could get some better price. And in parallel, we were doing this ideation exercise and justifying the ROI. So then fast forward to last year, when some of these multimodal models came out, and the engineers on our team started exploring and said, we don't need anybody to go to the microfiche and do OCR. We have the cutouts of the page. So all the cutouts of the stories are what we have as images. [00:09:00] We can just pass these images to the new multimodal LLMs that we have, and they can read them, and they can connect the dots and stitch the story together. And we started experimenting, and we did a POC of a few articles, with pretty phenomenal results. So we started with the omni model, and then the multimodal models that came out, and then another challenge started: hallucination. Because, you know, with AI and with these multimodal models, what comes with them is that the model is determined to give you an answer. It is not trained to not give results. So when it does not understand or cannot interpret, it will make it up. So we started seeing hallucination. When we started validating the stories and the outputs, we started comparing, okay, what percent is the hallucination, and is the meaning changing? And sometimes the meaning was changing. People's names were changing. That is significant.
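The validation step Deepika describes, comparing the model's output against the source and asking "what percent is the hallucination", can be sketched as a simple text-similarity check. This is only an illustration: the function name, the sample strings, and the use of Python's `difflib` are assumptions, standing in for whatever diffing the LA Times team actually used.

```python
import difflib

def hallucination_rate(model_output: str, ground_truth: str) -> float:
    """Fraction of the output that does not match a hand-verified transcription.

    Uses difflib's sequence matching as a rough proxy; 0.0 means a perfect
    match, values near 1.0 mean the model largely made the text up.
    """
    matcher = difflib.SequenceMatcher(None, ground_truth, model_output)
    return 1.0 - matcher.ratio()

# Compare the model's reading of one cutout against a verified sample.
# Note the changed name ("Smyth") -- exactly the kind of drift that matters
# in fact-checked archival journalism.
truth = "Mayor John Smith opened the new bridge on Tuesday."
output = "Mayor John Smyth opened the new bridge on Tuesday."
rate = hallucination_rate(output, truth)
```

A character-level ratio like this catches small but significant changes (a swapped name scores as a small nonzero rate), though a production pipeline would likely also compare named entities directly, since a 2% character difference can still change who the story is about.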
[00:10:00] The value that this kind of product and these stories bring is the fact that these are the stories from the time of the 1880s to 1960 that nobody else had, because nobody else covered the West Coast as much as the LA Times did. People forget that newspapers, until, you know, quite historically recently, were way more regional. Even the New York Times and the big nationals, like, LA was not really covered by the New York Times. And I mean, the Boston Globe? No way. Exactly. So the value that this content brings is that it's written by professionals, it's edited by professionals, and it's fact-checked and verified. It's not a blog or some individual's opinions. These are the facts. So when those facts get modified with any hallucination, that was definitely not acceptable. So after a lot of contemplation and playing with the temperature, of what level of hallucination you want, or what level of generative AI you want to use versus not, mm-hmm, we [00:11:00] checked at the lowest temperature, and still that was not acceptable. So we explored a totally different route, which is a better, or newer, version of OCR that is introduced by one of the models, that has no generative model behind it. And it is simply converting the PDF and the images into text, and it preserves the raw text and the raw data and the layout of that data, and it may not clearly understand what it means. The problem is when these AI models start understanding the context, that's when they start thinking. And if you don't want them to think, you gotta eliminate that part of it. At least for our use case, we had to. It's funny, because everyone else wants a smart model, and you, in this use case, want a stupider model. No, no, just write exactly what you've been given. Don't think about it. It's like a scribe, a digital scribe. Exactly.
So we needed [00:12:00] exactly that, and we found one that did exactly that, and we got a 99-plus percent success rate with that, which is as close to accuracy as we can get. So we are very thrilled and satisfied with that. And now, as we speak today, we have digitized 300,000-plus articles. Our goal is to get to 500,000 in the next week, and that will be a good pilot that we can then work on: building different products, ideation, or training some of the internal models, or it could be used for RAG, it could be licensed to somebody. That's where we are right now with it. And then once that is successful, we have 11-plus million more to go. One, it is just amazing that we'll have access to that kind of historic primary data, because too many times this kind of stuff is lost, or it's not lost, but it's functionally lost, because only so many people can go access [00:13:00] the microfiche records of the LA Times. Back to, what, 140 years? That's, uh, I'm slow at the math, is it 1880 or 1880s? '81. That's wild. There's so much cool stuff you can do with that, right? I hope this is gonna be, like, a bunch of product stuff you guys wanna release on this? Yeah. All the vendors that came in at the time gave us a lot of ideas. We have a lot of ideas. Plus we have the LA 2028 Olympics coming up. LA has hosted the Olympics twice before, so this is gonna be incredibly useful, at least to our own newsroom team, who can reference back to those incidents and games that were hosted here and correlate any similarities and stitch them together. Everybody wants to hear, oh, this also happened in 1932. That'd be such an interesting ability, to give kind of color commentary on a huge global event, but like, at heart, this is a huge legacy that's going to be available to so many more people, potentially.
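The page-jump stitching described a little earlier, stories continuing across up to four pages and needing to be reassembled into one article, reduces to ordering the scanned fragments. A minimal sketch, assuming each cutout arrives with page and position metadata from the scanning step; the `Cutout` fields and the sample text are invented for illustration, not the team's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Cutout:
    page: int   # page of the paper this fragment appears on
    order: int  # position of the fragment within the page, top to bottom
    text: str   # raw OCR text, preserved as-is ("don't think, just transcribe")

def stitch_article(cutouts: list[Cutout]) -> str:
    """Join the fragments of one story in reading order across page jumps."""
    ordered = sorted(cutouts, key=lambda c: (c.page, c.order))
    return " ".join(c.text.strip() for c in ordered)

# Fragments arrive in scan order, not reading order.
fragments = [
    Cutout(page=4, order=1, text="continued from page one, the council voted"),
    Cutout(page=1, order=2, text="The measure, see JUMP page 4,"),
]
article = stitch_article(fragments)
```

The hard part in practice is not the join but deciding which cutouts belong to the same story, which is presumably where the "connect the dots" reading of the multimodal models, or the layout preserved by the OCR route, comes in.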
And not to kinda tie it to modern tech, [00:14:00] but it is interesting, 'cause that's an interesting use of AI. Generally everyone's kinda worried about, like, oh, it's gonna make a lot of content, it's gonna put some people out of jobs maybe, or it's just gonna make a lot of AI slop content and all that kind of stuff. But this kind of opening up a new way of data access has so many applications. Like, for the LA Times' side, you were able to use it to, how many articles did you say this was? Or pages of data? 12 million. 12 million? Yeah. I mean, that's absurd. No one was ever gonna be able to access a fraction of that. Similarly, a lot of the innovation on our side as a product: we've been able to take something like session replay and go from, you have to manually watch a few and good luck ever getting to a lot, to, we can just kinda understand it and tell you what's important, because of those gains in multimodal AI. So interesting to see how widely spread some of this stuff is. But it'd be hilarious if you guys took some of the hallucinated output and then had the editorial team do funny versions of it, and see, can people pick [00:15:00] which is the real hallucination and which is the fake, or something, you know? That would be fun, actually. Speaking of potential people who may be interested in this, someone recently suggested, what are you doing with the advertising, mm-hmm, from those articles on those pages? I said, nothing, we're stripping it out. We're only focusing on the stories and the content. And they were telling us, actually, advertisements can have a good market too. I said, really? They said, have you read them? I said, yeah, I read them, and they sounded funny in today's time. Yeah. They said, that's what it is, the fun element of the ads back then. People would want to see that. Some of the ads had so many impressions, and people would get a kick out of how ads were written there.
I'm like, oh, that's something I had never thought of. I mean, as a marketer by trade, I would a hundred percent look at an exposé on all of the ads over the years and how they evolved and all that kind of stuff. That'd be incredible. I mean, speaking of primary sources, right? That too is a reflection of society and what was going on at the time. In fact, the more we talk about this, there are just so many things you can do with so much [00:16:00] historical, kind of primary, data. What did that look like as a product initiative? Like, how do you organize, from a program management standpoint, I guess, set up pulling together that much? I mean, you guys are trying to hit 500,000 pages digitized shortly, but that's still only a small piece of the totality. It is a small piece. On the program management side of it, like I said, we were trying to work with a vendor using a different model in the past, and we had a whole tracking sheet of different versions being created, different temperature checks, different confidence scores, and different success rates, comparing the data against each other from every execution to see what the best combination of output is that we are getting. Then scaling the infrastructure. Is it the infrastructure, why it's not producing the results, or is it the model? And changing pretty much everything. It's like running a lab experiment. Yeah. And, you know, writing down your results every single time to see what change brought what result. That's when we realized, looking [00:17:00] backwards at everything that we had changed and everything that it had produced, that it was not going to get us where we wanted to be, for the amount of time that we had spent behind it, thinking that the entire world is using the generative models and that is the way to go. Mm-hmm. And that's when we realized, take a step back, that is not for us, and we gotta go back to basics. From an actual use-of-AI standpoint, right?
Thinking about how you would do that, probably the best way to do quality checking and to ensure accuracy is not to just run everything through once and go, well, we hope it's right. What was the system for review? You see all these kinda wild systems people talk about across the board, where, you know, we intake it three times and then we pass each interpretation through, like, a judge layer. And, you know, maybe there's a periodic sampling by a human or something to check accuracy. How did you actually go about measuring veracity, and was there any way that you guys were driving [00:18:00] higher accuracy? Or is this all trade secret at this point, of how you guys were able to do it so well? I can share some of it. So what we built was something called a confidence score. Mm-hmm. In the current model that we are using, there are two different levels of confidence score. One is the confidence score of the image itself, mm-hmm, that is being processed, which is the image read: how confident it is that it read the image correctly or accurately, and to what level. So you can set the threshold for that. Anything below that threshold, we send for manual review. Mm-hmm. So it gets processed, but it's not considered final. It's gonna be flagged for manual review. And then the second was the OCR score. So, the image read score and the OCR score, and the way we set up the threshold was, if either of them is below 70%, then kick it to manual review, and then somebody literally pulls up visually what the tool generated and the actual physical story page PDF, and compares [00:19:00] manually how accurate it is and what the differences are. Sometimes even a punctuation mark would show as a difference. Yeah. Like if there was an ink spill between a comma or a semicolon or something. But the number of stories that went to manual review after setting those scores to 70% was less than 1%. Oh, wow.
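The two-score threshold just described, an image-read confidence and an OCR confidence, with either one under 70% kicking the story to manual review, reduces to a small routing rule. A sketch under the assumption that both scores arrive normalized to the 0 to 1 range; the function and variable names are hypothetical.

```python
THRESHOLD = 0.70  # per the interview: either score below 70% triggers review

def route(image_confidence: float, ocr_confidence: float) -> str:
    """Route one digitized story: auto-accept, or flag for side-by-side review."""
    if image_confidence < THRESHOLD or ocr_confidence < THRESHOLD:
        return "manual_review"
    return "accepted"

# A small batch: (image-read confidence, OCR confidence) per article.
batch = [(0.95, 0.91), (0.99, 0.65), (0.72, 0.88), (0.97, 0.93)]
flagged = [scores for scores in batch if route(*scores) == "manual_review"]
review_fraction = len(flagged) / len(batch)
```

The interesting operational fact from the interview is that with the threshold at 70%, under 1% of stories landed in the manual queue, which is what made a human pulling up the PDF side by side feasible at 300,000-plus articles.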
That's one of the most interesting differences I've found with LLMs, 'cause we've all seen it. It's analogous to speech-to-text, right? I think the speech-to-text that you're using is also not just plain old speech-to-text; it's also thinking. Exactly. Because when you introduce thinking into it, think about it as literally a human standing there and thinking: they're gonna put context in it, they're gonna put background in it. They're not just gonna take it word for word. Yeah. And that's when it gets better and more useful for everyday use. So you also have the World Cup coming up, right? In addition to the Olympics, there's the World Cup coming up shortly. So [00:20:00] we have seen this in the past. You know, what was it, the year 2020, when Kobe Bryant passed away, and everybody was wanting to put the best Kobe story out there, dig up the best Kobe pictures they have. So anytime a historic figure or a historic incident is happening again, what makes the story interesting? Everybody's gonna put stories out there, obviously, but what we have about the FIFA World Cup that happened in LA, that the US hosted the last time, is data and stories about it. So if we can stitch that together and bring similar incidents from what happened back in 1994, when the US hosted, versus right now in 2026, that would make the story more interesting: if there is a generation of players that are now the coaches, that were there and are still there, or any correlation that you can get. Last time, I think it was in the Rose Bowl, and if it's happening there again, then it's the same [00:21:00] venue that connects the story from what happened last time versus this time. So for all these historic events, it's a great time to have this data available.
Because at the time of Kobe, or any such incident, there is a whole team of researchers and librarians that sit together, spend hours and hours, dig into the article archives that we have, flag it, pull it, file it for the editorial side and the newsroom to go through, and then they go through it, study it, and finally put a story out. This whole process will be cut down with this data being put behind semantic search. There's one other thing I'm curious to talk about, 'cause looking at this and thinking about what that can enable, you think, oh, someone who's already reading about the World Cup, being able to serve them up context on what went on last time, or, as you said, kind of related stuff, seems like something you would really want to do. But how do you do that in a more personalized sense? Where we're knowing what [00:22:00] each person has read, and how people have thought about what they want. When you look at a lot of newer media platforms, like Spotify or TikTok, a huge thing is the algorithm that just kinda serves you up the next item and the next thing. And it seems like there's no news company doing that right now. But does this kind of work start to put in place the infrastructure needed to do that kind of work? Or is that something people just don't want? That's something I definitely want. I want it. I mean, living in LA, and the amount of time we all spend in traffic, sure, there are news channels out there that you can tune into, but after a point in time, they're all repeating the same news. So I would love to have an app or a tool that reads news to me based on my history of reading news, on what content I'm interested in. So we are doing this personalized experience for our subscribers: when they go on the website or on the mobile app, we are using [00:23:00] bandits to show them, based on their interests, what they have subscribed to and what they have read before, when they're revisiting, to bring similar content to them, to continue engagement and have them stay on the site or the app for a longer time. A lot of companies are doing that for sites and apps. But for the audio form of content, there are podcasts that are recommended to you on Spotify, or wherever you get your podcasts, based on what you last heard. But live news? That is my dream: to be driving and just hit a button and hear the news that I'm interested in, or what's on top of my mind, based on what I physically read in the news versus what I heard last. And then, you know, it continues to learn based on whether I say skip, or continue reading, or I just continue listening, just like music. Yeah. You know, apps started that years ago. We have a feature where you can listen to the stories from the app while you're driving, [00:24:00] which I use. But the autoplay doesn't exist, and it's not personalized. So that is my personal desire as a consumer. You know, we missed an entire section that we had planned to talk about, around some of the really interesting, really focused, kinda more digital products that you guys are creating on the team over there, because the digitizing of the history, and just where news is going, is such an interesting topic right now. It's a really interesting vertical you get to play in every day there. I'm really jealous. But speaking of, you get to do it every day, and I'm sure they want you to do more of it today, so I should probably give you back to them at this point. So I guess I'll leave off with: if people wanna come pick your brain about digitizing history, or just where news is going, which I think is an interesting topic, is LinkedIn the best place to reach out? Is there somewhere better?
LinkedIn is a great place. Hopefully we can have you on again in a little bit, see how the progress is going, hear what other new innovations you've come up with. But in the meantime, thanks. This has been a real blast. I appreciate you coming on. This was a lot of fun. The pleasure is all mine, and I would love to talk to you again. Anytime you're here in LA, hit me up. Thank you so much, [00:25:00] Deepika. Good to see you. Good to see you too. Take care.