LaunchPod - Deepika Manglani ===

[00:00:00] We have digitized 300,000-plus articles. Our goal is to get to 500,000 in the next week. That will be a good pilot. We can then work on building different products, ideation, or training some of the internal models, or it could be used for RAG. We have a lot of ideas. The FIFA World Cup that happened in LA, when the US hosted the last time: we have data and stories about it. So if we can stitch that together and bring similar incidents from what happened back in 1994, when the US hosted, versus right now in 2026, that would make the story more interesting. All these historic events make it a great time to have this data available. Because there is a whole team of researchers that sit together, spend hours and hours, and then finally put a story out. This whole process will be cut down with this data being put behind semantic [00:01:00] search.

Welcome to LaunchPod, the show from LogRocket, where we sit down with top product and digital leaders. In this episode, we're joined by Deepika Manglani, VP of Product and Program Management at the LA Times. Deepika's career in media spans over 15 years, culminating in her current role, where she's bringing the 140-year-old institution into the future. In this episode, Deepika shares how her team is using AI to preserve a unique trove of historical data, over 12 million pages of news archives from as far back as the 1800s; what this digital archive and the maturation of AI enable for future storytelling, media innovation, and news personalization; and why combining product and program management was critical to navigating massive transformation at the LA Times through a period of heavy M&A activity. So here's our episode with Deepika Manglani.

Jeff: Alright, Deepika. Welcome to the show. Thanks for coming on.

Deepika: Thanks for having me. Looking forward to it.

Jeff: Yeah, this'll be a fun one.
You've been in a slightly different role, I think, than what people typically think of for product leadership. You've been at the LA Times now for quite a while, and in media and [00:02:00] product leadership roles across media properties for like 15 years now, across some pretty noteworthy publications. Obviously the LA Times is known everywhere. But on top of that, you have the kind of interesting bit of not just product management, but also program management. So maybe can you give us the context of how you got to where you are right now? And then also, what is so interestingly different about program and product management? Like, why is that an unusual thing?

Deepika: Absolutely. This is the first time I've seen them together, to be honest. So I don't mind being asked this question.

Jeff: I've asked this a couple of times and never known the difference, so I am really interested to hear this.

Deepika: Yeah, people do use the terms interchangeably, for PMs being product managers and project managers or program managers as well. So, you're right, I have spent 15-plus years in media and newspaper organizations between Tribune Publishing and the LA Times. I was at Tribune for six, seven years, you know, living my dream product job, building [00:03:00] products and managing multiple acquisitions. And then 2018 hit and the LA Times got sold. One thing that, at the time, me and a lot of others didn't realize is that IT wasn't part of the sale. So the business was sold, but not IT. So imagine what that means: no network, no infrastructure, no products, no platforms, no people. And yet the newsroom has to run 24/7 for a 140-year-old company. So we were asked at the time, here you go, start IT from scratch. So, initially I was helping size up what that migration, what that standing up looked like, for six months or so. What those six months looked like was a lot of travel from Chicago. My entire life became travel, work, go home to do laundry in Chicago.
Repeat. Very glamorous, right? So then I was asked to come join full-time and build a product org and PMO from scratch while IT was [00:04:00] being built from scratch. I was one of the first few IT hires, and initially the idea was actually only program management. I wanted to add product to it as well, because I had the product background, I had the product vision, and a lot of the products that we were going to bring over from Tribune were things that I had built over there. So my pitch to the leadership here was that product and program work in silos in so many companies. Product comes up with a dream of what needs to be built and why we need to build it: why there is demand for this product, or a need for this product. Program management comes in and tells how and when. So I said, these two usually work separately, and then they fight on Slack, so why not bring them together and have them work together on a single mission? Considering the transition that we had to do while the newsroom continued to produce Pulitzer-winning stories, [00:05:00] this is not optional; it's needed for survival. So that's how we built it, and what came next is sleepless nights. We basically worked nights, weekends, hiring teams, building products, picking platforms and licenses for solutions and tools. Starting from HRIS: we had to hire people, but we don't have an HRIS. You know, for billing, we are buying licenses.

Jeff: You don't think about that. You literally don't have the tools to hire the people so that you can have the tools.

Deepika: Exactly. And then you are getting licenses for things like, you know, email solutions, productivity tools, but you need to be able to pay for them. You don't have accounts payable systems, you don't have receivables, you have nothing. So we were using our seller's systems for a period of time under the transition services agreement, but we had to figure out which systems to bring first. Shall we set up our own network first?
Shall we bring our own email system? Shall we bring [00:06:00] HRIS first, or payables, receivables, ERP? So all that followed along after I moved here, and it's been an incredible journey. We still can't believe we did it, looking back, as someone asked, but we know we did, and that stays the most remarkable work of my career so far.

Jeff: It reminds me of the very early days of a startup. I mean, LogRocket, when I joined really early on: we used office space from one of our investors, and then before we had our own office space, we sublet a room from another company. It was all, you know, begged, borrowed, and stolen. Except the difference here is, the LA Times was a company with a 140-year legacy of award-winning journalism.

Deepika: Yeah. We couldn't stop. We couldn't stop publishing. We couldn't stop putting the stories out, the news out. We have published every single day, so that wasn't a choice.

Jeff: So one thing I really wanted to dig into, and it's good that you brought up the 140-year legacy, is because in a lot of [00:07:00] cases there's just a huge mass of this stuff that is there, but it's not really accessible. And now, you know, we're talking about a legacy that, if you're not careful, can be, you know, lost forever. You guys are taking an interesting approach to this. Can we dive into this a little bit? 'Cause I think people will want to hear about this.

Deepika: It's actually very exciting and very fascinating. We have a whole warehouse here where we have the actual newspapers from the beginning of time, from our first edition, which was on December 4th, 1881, until the time we started doing PDF archives, which was the early 2000s. So what we decided to do, and this was actually the year before last year: we have all these archives, and that's when GPT models started coming out, and ChatGPT broke the internet by putting just a visual and a UI on AI and generative AI.
So that's when we thought, how about we digitize all of these [00:08:00] archives that we have, starting from the first edition, which are not very easily accessible today. Those were OCR'd back in the day by one of the vendors, but using that time's OCR technology. So the vendor at the time OCR'd about 12-plus million articles, but they were not of the best quality, because the newspaper given to them, or the microfiche given to them, had ink spills, and sometimes, if you remember, they used to stamp on the front page of the newspaper, so it came with those stamps. What we realized is certain years were worse than others, so it made sense to re-digitize them or re-scan them. Well, we partnered with some vendors that came to give us a quote, and some of those quotes were outrageous. It was in millions of dollars to take the microfiche and [00:09:00] re-scan and create a fresh image of all those articles going back in time. We were trying to justify the use case and would kind of go through some ideation of what kind of products we were gonna build out of it, because without ROI, we just didn't want to go down this path of, you know, excavation.

Jeff: Is it correct to think that the older ones, too, were maybe the least accurate, likely? Because I could see the printing technology wasn't as great; there are more ink spills, more problems, and all that kind of stuff.

Deepika: Yeah, and the layout of the paper was also very different from what we see today. It's very interesting. The ads would be very random, and the stories would have multiple jumps. We have stories that go across four different pages of jumps, which we don't see today. And then you have to stitch the story together. When you are scanning, you have to read the jump, then scan the next page where the jump continues, and then stitch them together as one article.

Jeff: Mm.
Deepika: Otherwise, it won't make sense to have three or four different cutouts.

Jeff: It's so funny when you brought this story up, [00:10:00] because, I mean, it was like, well, that's not a hard problem, we have AI, we can just do that now. But it's interesting that a very short couple of years has been just such a transformational change that I'd already kind of forgotten that. And so, in this world, it was millions of dollars to do that.

Deepika: Exactly. So I wanna say this was 2024, maybe. And then...

Jeff: So recent. Like, we're not even talking that long ago.

Deepika: Oh, it was '23 when ChatGPT came out.

Jeff: Yeah.

Deepika: So this was the year after, and then we continued to talk to more vendors to see if we could get some better prices. And in parallel we were doing this ideation exercise and justifying the ROI. So then, fast forward to last year, when some of these multimodal models came out, and some of the engineers on our team started exploring and said, we don't need anybody to go to the microfiche and do OCR. We have the cutouts of the page. So all the cutouts of the stories is what we have as an image. We can just pass these images to the new multimodal [00:11:00] LLMs that we have, and they can read, and they can connect the dots and stitch the story together. And we started experimenting, and we did a POC of a few articles, and the results were pretty phenomenal. So we started with the omni model and then the multimodal models that came out. And then started another challenge: hallucination. Because, you know, with AI and with these multimodal models, what comes is that the model is determined to give you an answer. It is not trained to not give results. So when it does not understand or cannot interpret, it will make it up. So we started seeing hallucination when we started validating the stories and the outputs. Then we started comparing, okay, what percent is hallucination, and is the meaning changing? And sometimes the meaning was changing. People's names were changing.
That is significant. The value that this kind of product and these stories [00:12:00] bring is the fact that these are the stories from the time, the 1880s to 1960, that nobody else had, because nobody else covered the West Coast as much as the LA Times did.

Jeff: People forget that newspapers, until quite historically recently, were way more regional. Even the New York Times and the big nationals: LA was not really covered by the New York Times, and I mean, the Boston Globe? No way. No way.

Deepika: Exactly. So the value that this content brings is that it's written by professionals, it's edited by professionals, and it's fact-checked and verified. It's not a blog or some individual's opinions. These are the facts. So when those facts get modified by any hallucination, that was definitely not acceptable. So after a lot of contemplation and playing with the temperature, of what level of hallucination you want, or what level of generative AI you want to use versus not, we tried the lowest [00:13:00] temperature, and still that was not acceptable. So we explored a totally different route, which is a better, or newer, version of OCR that is introduced by one of the models and has no generative model behind it. It is simply converting the PDF and the images into text, and it preserves the raw text and the raw data and the layout of that data. It may not clearly understand what it means. The problem is, when these AI models start understanding the context, that's when they start thinking. And if you don't want them to think, you gotta eliminate that part out of it, at least for our use case we had to.

Jeff: It's funny, because everyone else wants a smart model, and in this use case you almost want a stupider model.

Deepika: Yes.

Jeff: No, no, just write exactly what you're given. Don't think about it. It's like a scribe, a digital scribe.

Deepika: Exactly.
So we [00:14:00] needed exactly that, and we found one that did exactly that, and we got a 99-plus percent success rate with it, which is as close to accuracy as we can get. So we are very thrilled and satisfied with that. And now, as we speak today, we have digitized 300,000-plus articles. Our goal is to get to 500,000 in the next week. And that will be a good pilot that we can then work on: you know, building different products, ideation, or training some of the internal models, or it could be used for RAG, or it could be licensed to somebody. That's where we are right now with it. And then once that is successful, we have 11-plus million more to go.

Jeff: It is just amazing that we'll have access to that kind of historic primary data, because too many times this kind of stuff is lost, or it's not lost, but it is [00:15:00] functionally lost, because only so many people can go access the microfiche records of the LA Times going back, what, 140 years? I'm doing some slow mental math right now. Is it 1880, or the 1880s?

Deepika: '81.

Jeff: That's wild. But there's so much cool stuff you can do with that, right? I hope there's gonna be a bunch of product stuff you guys are gonna release on this.

Deepika: All the vendors that came in at the time gave us a lot of ideas. We have a lot of ideas. Plus we have the LA 2028 Olympics coming up; LA has hosted the Olympics twice before. So this is gonna be incredibly useful, at least to our own newsroom, which can reference back to those incidents and the games that were hosted here, and correlate any similarities and stitch them together. Everybody wants to hear, oh, this also happened in 1932.

Jeff: Yeah, I mean, that'd be such an interesting ability to give kind of color commentary on a huge global event. But at heart, [00:16:00] this is a huge legacy that's going to be available to so many more people, potentially.
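As an aside for readers: one of the uses Deepika mentions, putting the digitized archive behind semantic search or RAG, at its core means ranking articles by embedding similarity. Here is a minimal sketch in Python, with toy 3-dimensional vectors and hypothetical article IDs standing in for real embeddings and real archive entries:

```python
import math

def cosine(u, v):
    # cosine similarity between two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy "embeddings" standing in for real model embeddings of
# digitized archive articles (hypothetical IDs and vectors).
archive = {
    "1932-olympics-opening-ceremony": [0.9, 0.1, 0.1],
    "1994-world-cup-at-the-rose-bowl": [0.1, 0.9, 0.1],
    "1881-first-edition-city-news": [0.1, 0.1, 0.9],
}

def semantic_search(query_vec, index, top_k=2):
    # rank archive articles by similarity to the query embedding
    ranked = sorted(index, key=lambda doc: cosine(query_vec, index[doc]),
                    reverse=True)
    return ranked[:top_k]

# A query embedding "about the World Cup" should surface the 1994 story first.
results = semantic_search([0.2, 0.8, 0.0], archive)
```

In a real system the vectors would come from an embedding model and live in a vector database, but the ranking step is the same idea.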
Not to kinda tie it to modern tech, but it is interesting, 'cause that's an interesting use of AI. Generally, you know, everyone's kinda worried about, oh, it's gonna make a lot of content, it's gonna put some people out of jobs maybe, or it's just gonna make a lot of, you know, AI slop content and all that kind of stuff. But this kind of opening up of a new way of data access has so many applications. On the LA Times side, you were able to use it to, you know, pull, again, how many articles did you say this was, or pages of data? 12 million. Yeah, I mean, that's absurd. No one was ever gonna be able to access a fraction of that. Similarly, a lot of this innovation on our side is in the product. We've been able to take something like session replay and go from "you have to manually watch a few, and good luck to ever getting to a lot" to "we can just kinda understand it and tell you what's important," because of those kinds of gains in [00:17:00] multimodal AI. So it's interesting to see how widely spread some of this stuff is. But it'd be hilarious if you guys took some of the hallucinated output and then had the editorial team do funny versions of it, and see, can people pick which is the real hallucination and which is the fake, or something, you know?

Deepika: That would be fun, actually. We were talking to some of the potential people who may be interested in this, and someone recently suggested, what are you doing with the advertising from those articles, on those pages? I said, nothing. We are stripping it out. We are only focusing on the stories and the content. And they were telling us, actually, advertisements can have a good market too. I said, really? Have you read them? They said, yeah, I read them, and they sounded funny in today's time. I said, that's what it is, the fun element of the ads back then.
And people would want to see that. Some of the ads had so many impressions, and people would get a kick out of how ads were written then. I'm like, that's something I had never [00:18:00] thought of.

Jeff: I mean, as a marketer by trade, I would a hundred percent look at an exposé on all of the ads over the years and how they evolved and all that. That'd be incredible. 'Cause, I mean, speaking of primary sources, right? That too is a reflection of society and what was going on at the time. It's funny, the more we talk about this, there are just so many things you can do with so much historical, kind of, primary data. What did that look like as a product initiative? Like, how do you organize, and, kind of, from a program management standpoint, I guess, set up pulling together that much? I mean, you guys are trying to hit 500,000 pages digitized shortly, but that's still only a small piece of the totality.

Deepika: It is a small piece. The program management side of it: like I said, we were trying to work with a vendor using a different model in the past, and we had a whole tracking sheet of different versions being created, different temperature checks, different confidence scores, and different success rates, comparing the data against each other from every execution to see what [00:19:00] is the best combination of output that we are getting. And then scaling the infrastructure: is it the infrastructure why it's not producing the results, or is it the model? And changing pretty much everything. It's like running a lab experiment and, you know, writing down your results every single time, to see what change brought what result. That's when we realized, looking back at everything that we had changed and everything that it had produced, that it was not going to get us where we wanted to go.
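For readers, the "lab experiment" tracking Deepika describes, logging each run's model, settings, and measured accuracy and then comparing them, can be sketched as a simple run log. All figures below are illustrative, except the roughly 99% plain-OCR result mentioned in the episode:

```python
# One row per digitization run: which model, what settings, and what
# accuracy the manual comparison produced. Numbers are illustrative.
runs = [
    {"model": "generative-multimodal", "temperature": 0.7, "accuracy": 0.81},
    {"model": "generative-multimodal", "temperature": 0.2, "accuracy": 0.88},
    {"model": "generative-multimodal", "temperature": 0.0, "accuracy": 0.92},
    {"model": "plain-ocr", "temperature": None, "accuracy": 0.991},
]

def best_run(log):
    # pick the configuration with the highest measured accuracy
    return max(log, key=lambda r: r["accuracy"])

winner = best_run(runs)
```

The point of keeping a log like this is exactly what Deepika describes: only by comparing runs side by side did it become clear that lowering the temperature of a generative model was plateauing, and a non-generative OCR route won.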
And we had spent all that time behind it thinking that, you know, the entire world is using the generative models and that is the way to go. That's when we realized we had to take a step back: that is not for us, and we gotta go back to basics.

Jeff: From an actual use-of-AI standpoint, right, thinking about how you would do that, probably the best way to do quality checking, and just to ensure accuracy, is not to just run everything through once and go, well, we hope it's right. What was the system [00:20:00] for review? You see all these kinda wild systems people talk about, where, you know, "we intake it three times and then we pass each interpretation through a judge layer," and maybe there's, like, periodic sampling by a human to check accuracy. How did you actually go about measuring veracity? And was there any way that you guys were driving higher accuracy? Or is this all trade secret at this point, of how you guys were able to do it so well?

Deepika: I can share some of it. So what we built was something called a confidence score. In the current model that we are using, there are two different levels of confidence score. One is the confidence score of the image itself that is being processed: the image read, how confident it is that it read the image correctly or accurately, and to what level. So you can set the threshold for that, and anything [00:21:00] below that threshold, we send for manual review. So it gets processed, but it's not considered final; it's gonna be flagged for manual review. And then the second was the OCR score.
So the image read score and the OCR score. And the way we set up the threshold was: if either of them is below 70%, then kick it out for manual review, and then somebody literally pulls up, visually, what the tool generated and the actual physical story page PDF, and compares manually how accurate it is and what the differences are. Sometimes even a punctuation mark would show as a difference, like if there was an ink spill between a comma or a semicolon or something. But the number of stories that went to manual review after setting those scores to 70% was less than 1%. Okay. Yeah.

Jeff: That's one of the most interesting [00:22:00] differences I've found with LLMs. 'Cause we've all seen, I guess, it's similarly analogous to speech to text, right?

Deepika: I think the speech to text that you're using is also not just plain old speech to text. It's also thinking. Because when you introduce thinking into it, think about it as literally a human standing there and thinking: they're gonna put context in it, they're gonna put background in it. They're not just gonna take it word for word, and that's when it gets better and more useful for everyday use.

Jeff: So, I mean, you also have the World Cup coming up, right? Yeah, in addition to the Olympics, there's the World Cup coming up shortly.

Deepika: We have seen this in the past, you know. What was it, the year 2020, when Kobe Bryant passed away, and everybody was wanting to put the best Kobe story out there, dig up the best Kobe pictures they have. So anytime a historic figure or a historic incident is happening again, that's what makes the story interesting. Everybody's gonna put stories out there, obviously. What we have about the FIFA [00:23:00] World Cup that happened in LA, that the US hosted the last time: we have data and stories about it. So if we can stitch that together and bring similar incidents from what happened back in 1994, when the US hosted, versus right now in 2026?
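Stepping back for readers: the two-score review gate Deepika describes above, an image-read confidence and an OCR confidence, each checked against a 70% threshold, amounts to a simple per-article test. A minimal sketch, with hypothetical article IDs and scores:

```python
def needs_manual_review(image_score, ocr_score, threshold=0.70):
    # Flag the article if either confidence score falls below the threshold.
    # Flagged articles are still processed; they're just not considered final.
    return image_score < threshold or ocr_score < threshold

# Hypothetical batch: the second article has a low OCR score
# (say, an ink spill made a comma look like a semicolon).
batch = [
    {"id": "a-001", "image_score": 0.94, "ocr_score": 0.91},
    {"id": "a-002", "image_score": 0.88, "ocr_score": 0.63},
]
flagged = [a["id"] for a in batch
           if needs_manual_review(a["image_score"], a["ocr_score"])]
```

With the threshold at 0.70, only `a-002` would be routed to a human for the side-by-side comparison against the original page PDF, matching the "either score below 70%" rule from the conversation.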
That would make the story more interesting: if there is a generation of players, or coaches, that were there and are still there, or any correlation that you can get. Last time, I think it was in the Rose Bowl, and if it's happening there again, then it's the same venue. That connects the story from what happened last time to this time. So all these historic events make it a great time to have this data available. Because at the time of Kobe, or any such incident, there is a whole team of researchers, librarians, that sit together, spend hours and hours, dig into the article archives [00:24:00] that we have, flag it, punch it, file it for the editorial team in the newsroom to go through. And then they go through it, study it, and finally put a story out. This whole process will be cut down with, you know, this data being put behind semantic search.

Jeff: There's one other thing I'm curious to talk about. Looking at this and thinking about how that can enable you: you would think, oh, someone who's already reading about the World Cup, being able to serve them up context on what went on last time, or, like you said, related stuff, seems like something you would really want to do. But how do you do that in a more personalized sense, where you know what each person has read and what they want? When you look at a lot of newer media platforms, like Spotify or TikTok, a huge thing is the algorithm that just kinda serves you up the next item and the next thing. And it seems like there's no news company doing that hugely right [00:25:00] now. Does this kind of work start to put in place the infrastructure needed to do that kind of work? Or is that something people just don't want?
Deepika: Well, that's something I definitely want.

Jeff: I want it, I mean...

Deepika: Living in LA, with the amount of time we all spend in traffic, sure, there are news channels out there that you can tune into, but after a point in time they're all repeating the same news. So I would love to have an app or a tool that reads news to me based on my history of reading news, on what content I'm interested in. We are doing this personalized experience for our subscribers when they go on the website or on the mobile app. We are using bandits to show them, based on their interest, what they have subscribed to, and what they have read before when they're revisiting, to bring similar content to them, to continue engagement and have them stay on the site or the app for a longer time. A lot of companies are doing that for sites and [00:26:00] apps, but for the audio form of content, there are podcasts that are recommended to you on Spotify, or wherever you get your podcasts, based on what you last heard. But live news? That is my dream: to be driving and just hitting a button and hearing the news that I'm interested in, or what's on top of my mind, based on what I physically read in the news versus what I heard last. And then, you know, it continues to learn based on whether I say skip, or continue reading, or I just continue listening. Just like music, you know; music apps started that years ago. We have a feature where you can listen to the stories from the app while you're driving, which I use, but the autoplay doesn't exist and it's not personalized. So that is my personal desire as a consumer.

Jeff: We missed an entire section that we had planned to talk about, around some of the really interesting, really focused, kinda more digital products that you guys are creating on the team over there, [00:27:00] because the digitizing of the history, and just where news is going, is such an interesting topic right now.
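For readers, the "bandits" Deepika mentions for on-site personalization are typically multi-armed bandit algorithms: mostly show the content category with the best observed engagement, occasionally explore others. A minimal epsilon-greedy sketch over hypothetical content sections (the LA Times' actual implementation is not described in this detail, so treat this as a generic illustration):

```python
import random

class EpsilonGreedyRecommender:
    # Minimal epsilon-greedy bandit over content sections: usually show
    # the section with the best observed click rate, occasionally explore.
    def __init__(self, sections, epsilon=0.1, seed=42):
        self.sections = list(sections)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {s: 0 for s in self.sections}
        self.clicks = {s: 0 for s in self.sections}

    def _rate(self, section):
        # observed click-through rate; 0.0 for never-shown sections
        return self.clicks[section] / self.shows[section] if self.shows[section] else 0.0

    def pick(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.sections)  # explore
        return max(self.sections, key=self._rate)  # exploit

    def record(self, section, clicked):
        self.shows[section] += 1
        self.clicks[section] += int(clicked)

# Hypothetical sections; with epsilon=0 the bandit purely exploits.
rec = EpsilonGreedyRecommender(["sports", "politics", "food"], epsilon=0.0)
rec.record("sports", True)
rec.record("politics", False)
rec.record("food", False)
```

Production systems usually use contextual bandits (reader features drive the choice) rather than this global version, but the explore/exploit trade-off is the same.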
It's a really interesting vertical you get to play in every day there. I'm really jealous. But speaking of, you get to do it every day, and I'm sure they want you to do more of it today, so I probably have to give you back to them at this point. So I guess I'll leave off with: if people wanna come pick your brain about digitizing history, or just where the news is going, which I think is an infinitely interesting topic, is LinkedIn the best place to reach out? Is there somewhere better, or...?

Deepika: LinkedIn is a great place.

Jeff: Hopefully we can have you on again in a little bit, see how the progress is going here, what other new innovation you guys come up with. But in the meantime, thanks. This has been a real blast. I appreciate you coming on. This was a lot of fun.

Deepika: The pleasure is all mine, and I would love to talk to you again. Anytime you're here in LA, hit me up.

Jeff: Thank you so much, Deepika. Good to see you.

Deepika: Good to see you too. Take care. Bye.