The following is a rough transcript which has not been revised by High Signal or the guest. Please check with us before using any quotations from this transcript. Thank you. === sergey: [00:00:00] Our content is available for at least 250 to 300 million people worldwide that want to watch it. When I first joined, the goal really was about data unification, standardization, and then building out some core infrastructure that would allow us to serve our digital audience. We did that by building out what we now call the TelevisaUnivision household graph, a representation of our best guess of who we think is interested in Spanish language media across the United States. We take what data is available and make either infrastructural products or data products that will either make our own lives easier, our internal clients' lives easier, or make the company more money. At this point, the digital business has grown at least 10x in the four years that we've been a part of it. hugo: That was Serge Fogelson, VP of Data Science at TelevisaUnivision, the world's largest Spanish language [00:01:00] media company, whose shows, news, and sports reach more than 300 million people worldwide. In this episode of High Signal, I talk with Serge about what it means to build data science inside the world's largest Spanish language media company. His team isn't just unifying data for the sake of it. They're using it to help shape cultural experiences, fuel a tenfold expansion in digital streaming, and rethink how audiences connect with content across borders. We explore the role of data science in serving bilingual and multicultural communities, the trade-offs between internal data products and revenue-driving ones, and how to bring modern machine learning practices into a company where creativity and content still lead. If you enjoy these conversations, please leave us a review, give us five stars, and share the podcast with your friends. Links are in the show notes. Let's now check in with Duncan Gilchrist from Delphina before we jump into the interview. Hey, Duncan. duncan: Hey Hugo, how are you? hugo: So before we jump into the conversation with Serge, [00:02:00] I'd love for you to tell us a bit about what you're up to at Delphina and why we make High Signal. duncan: At Delphina, we're building AI agents for data science, and through the nature of our work, we speak with the very best in the field. And so with the podcast we're sharing that high signal. hugo: We covered a lot of ground with Serge, so I was wondering, Duncan, if you would just let us know what resonated with you the most. duncan: Serge's data modernization and innovation story at a massive traditional business. TelevisaUnivision is chock-full of real examples of practical wisdom: focusing on landing small, clear improvements, no single points of failure, making sure you know what problem to solve, and not just bringing an LLM hammer to every text nail. And in so doing, moving the needle for 300 million people who watch. It's an episode touching on such a different and important part of the data ecosystem. Let's get into it. hugo: Hey there, Serge, and welcome to the show. sergey: Thank you so much for having me, Hugo. hugo: A real pleasure, [00:03:00] man. So, TelevisaUnivision is the largest Spanish language media company in the Americas. Give us the quick origin story and what you're up to with data there at the moment. sergey: Absolutely. Sure. So.
TelevisaUnivision is actually the combination of Grupo Televisa in Mexico and the Univision company, which is based out of the United States. The combination happened a couple of years ago, and it was really the joining of two core components of, I would say, a modern media company, right? Televisa is actually the company that creates the vast majority of the content that Univision would then license and put on all of its broadcast channels. So it really didn't make sense, I think, economically for them both to be separate entities, right? One basically is the largest customer of the other. So I think it made a lot of sense, especially in the current media environment, for them to join forces. And it's really a meeting of equal partners. On the Televisa side, Televisa [00:04:00] is originally the largest media company in Mexico. They have a lot of TV channels, I think there's a cable network, and there's a lot of legacy media there. And then on the Univision side, Univision is the largest Spanish language company in the United States, which has television stations, obviously a radio channel, and I think they own some other ancillary properties. But this combined company really is a force to be reckoned with from the perspective of Spanish language media content, both distribution and creation. And actually, you say it's the largest in the Americas; I think it's actually the largest in the world, which I guess makes sense given that the Americas are where the vast majority of the world's Spanish speakers live anyway. hugo: How many people do you serve? sergey: I would say our content is available for at least 250 to 300 million people worldwide that want to watch it. And then obviously, at least in the United States, we regularly out-deliver [00:05:00] or have larger audiences for soccer matches than the English language channels, right, even if you look at just the English-speaking demographic. And I like to watch soccer on our channels, because I think it's a very, very different experience. hugo: Okay, I can imagine, and to be honest, I think maybe I've told you this before: I went to a soccer match in Argentina, saw Boca Juniors, and that was one of the wildest experiences of my life. I think they've even stopped allowing the away team's supporters to come now, because of how wild it gets. sergey: Because of how, yeah, because of how out of control it can sometimes get. Yeah, yeah. A soccer match is definitely an interesting experience in South America. hugo: So I am interested, you are VP of Data Science, and I'm wondering if you can just tell us a bit about the data function there, the history of it, and where we're at now. sergey: Yeah. So I joined the company just before the official merger of the two entities. I joined in 2021, and there was a very [00:06:00] limited, I would say, at least on the Univision side of things, a very small original digital business that had very basic analytics built on top of it. There was also a legacy, and still is a legacy, research function, which basically means analyzing traditional linear television content that serves Spanish language media, right?
So basically, we're talking about, at least now it's a little different, but back then it was basically just analyses of Nielsen ratings and who was watching what on TV. So it was more of a legacy research function, and then a very minimal digital business where we were serving basically just video on the univision.com website and on a few of our legacy applications. Univision back then had, for example, apps that were hyper-local. So if you were in Los Angeles, there was a Univision Los Angeles local app for [00:07:00] your mobile phone, for your Android or your iPhone. They also had, and still have, a digital radio application called Uforia, and then a sports app, TUDN, and a news app as well. But all of those applications served a very small, hyper-passionate core base of users. It didn't really have the scale of what TelevisaUnivision now has from the perspective of digital content and delivery. Anyway, so when I joined, we were basically in the middle of a re-imagining of what the data function could look like, specifically anchored within the digital ecosystem, because we got lots of signal there. The signal was rich, and it allowed us to start creating personalized experiences for our consumers and also to just really understand our consumers really well. So when I first joined, the goal really [00:08:00] was about data unification, standardization, then building out some core infrastructure that would allow us not only to serve our digital audience, but, at the end of the day, where the business really was: to monetize that digital audience. And we did that by building out what we now call the TelevisaUnivision household graph, which is a representation of our best guess of who we think is interested in Spanish language media across the United States. So it captures primarily Spanish language media consumers, but not necessarily, right? Ethnicity is a protected class, so we're not actually looking for people of Hispanic ethnicity. We're really just looking for people that have engaged with that content, and that allows us to cast a slightly wider net, but at the same time it makes our job a little bit harder, I think, than your standard, broad-based, traditional media company, where your TAM, your total addressable market, [00:09:00] at least in the United States, is literally every person that lives in the United States, right? For us, it's not every person. It's just those people that we think are likely to consume Spanish language media at some point in the future. So it could be those hardcore soccer fans that don't speak any Spanish but like to watch it on our platforms, or it could be anyone from first generation Spanish speakers to fully acculturated Americans, or people that live in the United States that came from a Spanish speaking country but still want to have some kind of connection to their culture, or what have you. But yes, so let me answer your question. Initially, like I said, it was all about data unification, building out data infrastructure, and building out core components. So that's the graph, plus some ancillary components on top of the graph: things like audience profiling, lookalike modeling, and then some basic attribution analyses around campaign delivery, things like that.
And then afterwards, after we had gotten our legs underneath us and we had some kind of [00:10:00] stable ground to work on, we dove in and actually worked on, I would say, classic data science kinds of things around personalization, and really your standard modeling capabilities around trying to understand who's going to come back, who's going to churn, who's going to upgrade. But that really happened after the second big evolution after I came, which was the launch of our fully owned, fully created Spanish language direct-to-consumer streaming service called ViX. That service was launched about seven or eight months after I came, and it's been incredibly successful. We now have millions and millions, I think over 10 million, streaming subscribers for the paid version of the service. We also have a free version of the service, a totally free tier: you can just watch whatever you want that's not behind the paywall, and then you get ads every 10 or 15 minutes. This is similar to other, what are called advertising supported streaming services [00:11:00] that exist in the marketplace. So, yeah, that happened then, and at that point what was really important for us was to build out infrastructure for personalization. And so to that end, the team has built some algorithms for personalizing the experience on ViX, as well as infrastructure to tailor the messaging that we send to people either within the app or outside of the app, into their inboxes or in push notifications, et cetera. We built a CDP, an in-house customer data platform, that allows us to basically manage those relationships and to be able to message them effectively. Yeah. And now we're kind of in maintenance mode with some of these things, like the graph, but on the flip side we're also doing some new and more interesting work with deeper personalization on the application, as well as deeper and more interesting work around messaging: tailoring personalized messaging experiences to users when they're not on the app so we can bring them back. hugo: Incredible. And you've been there just [00:12:00] under four years, right? To have gone from the basics of data unification and infrastructure to building out all these incredible products that deliver so much value, while continuing to make sure the infrastructure and all the data keep working, is an amazing story. I am interested, and I'm not sure how much you can talk about this, but previously you were VP of Data Science and Modeling at ViacomCBS, now Paramount, and I'm sure there are so many things you learned and built there which inform what you do now, so you were able to come in and really effect change. sergey: Yeah, yeah, absolutely. So I think the biggest opportunity I had when I was there was to get really into the weeds with what the now-Paramount digital ecosystem looked like, and really before the launch of Paramount+. Back then, that ecosystem was basically very similar to what existed at Univision. It was just larger, right? Because the user base was significantly larger, and the variety of data that was available and the [00:13:00] variety of systems that were capturing that data was also very varied.
There was data that was stored in basically Adobe Omniture logs. There was data that was stored in ad servers, in various states of, I would say, cleanliness. And so I really got to understand how to take disparate data sources, put them into one kind of general schema that we could then use and leverage, and ultimately connect to external data sources that we would license, right? So for example, we ourselves, whether it was Paramount or Univision, didn't directly have access to demographic information about our users other than what was explicitly asked about. But we could license data from various third party vendors to be able to, in some cases deterministically, in other cases probabilistically, link our first party data and get some sense of who it was that was consuming our applications. So [00:14:00] being able to do that at Paramount really meant that, even though we started from zero at Univision, conceptually I wasn't starting from zero when I joined, which was for me one of the more attractive reasons why I was excited about joining the company here, right? It was the ability to take what I had learned at Paramount and actually build things really from scratch. And I want to take a moment and say it wasn't just me and my team. We have a really incredible data engineering organization at Univision, and we have really great product folks that I've been working with: one set the overall vision, the other really got data to me and shipped data out for me, things that I didn't necessarily want to have to do, and that allowed me and my team to really just focus on the stuff that we can do well, which is take what data is available and make either [00:15:00] infrastructural products or data products that will either make our own lives easier, our internal clients' lives easier, or make the company more money. And we've done a reasonably good job of that. In terms of revenue, I think at this point the digital business has grown at least 10x in the four years that we've been a part of it, and my team can take, you know, some amount of credit for that success. hugo: Yeah. I really appreciate you mentioning all the other moving parts that go into data being a successful function, because it's easy to speak about it in a vacuum and not realize how many other teams are super important and necessary here, and how much top-down support and making it part of the culture is key. I'm also so glad that we've been able to have a conversation about data delivering value without even talking about generative AI yet, or anything along those lines. And something I love about the work you've done: a lot of generative AI applications these days rest upon, you know, the [00:16:00] supreme importance of embeddings, and we don't talk enough about embeddings in the space at the moment, and your work and your team lean heavily on embeddings across projects. So maybe you could just give us a 30-second refresher on embeddings, and then how they fit into your machine learning stack and what quick wins they enable. sergey: Sure. Yeah. So an embedding is a low or high dimensional representation of some object.
That object can really be anything, placed in a mathematical space that allows you to do interesting things with it. Specifically, you can do vector math with those embeddings, right? So, as an example, an embedding can be a string of, let's say, 16 or 20, or eight, or 32 numbers, right? And those numbers are real valued, so they can stretch between negative infinity and positive infinity, and any two embeddings [00:17:00] that you have in the same space can be subtracted, added, or combined in interesting ways to generate new embeddings, right? So that's what an embedding is. Now, the kind of secret sauce is how you get to an embedding: how do you go from data that exists, whether it's text on a webpage, or a picture, or an audio file, or even just tabular data in a spreadsheet, and convert entities in those things, whether that entity is a person's voice, or a word of text, or perhaps a video title in our library, into one of these things? And what we've done is we've basically taken some of the algorithms that, I would say, at this point are ancient news from the perspective of LLMs and how those models are trained, but that are still incredibly effective. So things like BERT or [00:18:00] GloVe or word2vec. People talk about them as infrastructural or architectural components, but they're really just algorithms that you can take, along with what I would say is unstructured text or unstructured data, or what you think is unstructured data, and then create embeddings out of it, right? And what's interesting about all of these algorithms is that they were originally created for text mining and for generating embeddings of text, right? So you take a bunch of text, you parse it in some way, you generate tokens, and then you pass those tokens through these algorithms and you generate these embeddings out of them. What's interesting about these algorithms is that they don't require that the text itself be words, right? It can actually be anything. What's critical is that it be sequential, meaning that there's some meaning behind the fact that token A, where "token" is a broad abstraction, comes before or after token B. And so what we've done [00:19:00] is we treat our, if you will, sentences or documents as user histories, or really histories of anything: it could be histories of actions, it could be histories of consumption behavior, it can be histories of really anything. And we order those histories, right? So in a very simple case, it could be: user X first watched show A, then show B, then show C, then show D. The tokens in this case become the shows themselves, and the histories become the documents or sentences, however you want to think about them, right? You can pass those things through the same exact algorithms, BERT or GloVe or word2vec, or, if you have enough data, you can even pass them through these large scale transformer architectures if you wanted to. And then, to step back a little bit, embeddings really are just the intermediate components. They're intermediate, sometimes fixed but sometimes variable, components within these [00:20:00] architectures that are buried in various layers, right? You can think of word2vec or GloVe as very, very small neural networks where the embedding component just sits right in the middle.
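[Editor's note: a minimal sketch of the "user histories as sentences, shows as tokens" idea described above, using gensim's word2vec implementation. The show IDs, hyperparameters, and choice of library are illustrative assumptions, not details from the conversation.]

```python
# Each "sentence" is one user's ordered watch history; each "word" is a show ID.
from gensim.models import Word2Vec

watch_histories = [
    ["show_A", "show_B", "show_C", "show_D"],
    ["show_B", "show_C", "show_E"],
    ["show_A", "show_D", "show_E", "show_F"],
]

model = Word2Vec(
    sentences=watch_histories,
    vector_size=32,  # dimensionality of the show embeddings
    window=3,        # how much surrounding history counts as context
    min_count=1,
    sg=1,            # skip-gram, as in the original word2vec family
)

show_vector = model.wv["show_A"]                   # the 32-number embedding itself
similar = model.wv.most_similar("show_A", topn=3)  # shows watched in similar contexts
```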
And then you have BERT, which is a slightly larger neural network architecture where those embedding components might sit slightly higher up. And then you have LLMs, where you have incredibly huge, incredibly deep architectures that can get incredibly complex, and you can actually take embeddings from different components, look at them, investigate them, do interesting things with them, and they represent something useful about how that architecture represents the entities that you're embedding, whatever those entities are. And I know Anthropic has done some really fascinating work doing that with text, using sparse autoencoders to tease that structure out. We're not doing anything like that. We are a small but mighty team, we have these much smaller architectures, and we don't have the resources to do those kinds of things. But what we do is take these embeddings, which [00:21:00] now represent, in our case, for example, shows, or regions of the country, or demographic groups, and we do interesting things with them. We find similar demographic groups. We find similar shows. We find similar geographic regions, meaning regions that behave similarly from the perspective of our content consumption. And that allows us to scale small audiences. So, for example, we can take a group of users who perform some behavior that we find interesting, for example subscribing to our premium content, and we take the embeddings that represent those users, and we find other users that have very similar embeddings but who are not yet subscribers. We can then try to find them in the wider world and try to get them to come to our platforms and subscribe, and we've found some success there. Where we find the most success ultimately for these embeddings is in our recommendation systems, right? We have embeddings of all of our shows, of all of our content. We're also [00:22:00] exploring getting embeddings of things even deeper than that, so, for example, embeddings of core actors that exist across our library, finding interesting similarities or differences among them, and then using that to our advantage. We can do the same thing with sports teams. The space of possibilities here is pretty broad, and you can do interesting work with it. So I don't know if that answered your question. hugo: It does. And I don't want you to give away any secret sauce or trade secrets, but when you do embeddings of shows, do you do it around titles or scripts or even videos? Because you can embed any of those. sergey: Yeah, yeah. So right now what we do is a little bit of, I would say, synopsis level embedding. The issue here is, I don't know how familiar you are with Spanish language entertainment content, but when it comes to novelas, these are very long running, right? They're like soap operas in Spanish. They have hundreds, potentially thousands of [00:23:00] distinct episodes, and their themes vary greatly over the course of the novela. So what we try to do is just take the core components of what they represent, what their general themes are, and we embed those things, basically from synopses or summaries.
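[Editor's note: a hedged sketch of embedding content synopses and finding similar titles. The conversation does not name a specific model or library; the sentence-transformers package, the model, and the toy synopses below are assumptions purely for illustration.]

```python
from sentence_transformers import SentenceTransformer
import numpy as np

synopses = {
    "title_1": "A long-running family drama about rivalry, buried secrets and redemption.",
    "title_2": "Two families feud over an inheritance while old secrets resurface.",
    "title_3": "Weekly highlights and analysis from the soccer season.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
titles = list(synopses)
vectors = model.encode([synopses[t] for t in titles], normalize_embeddings=True)

# With unit-normalized vectors, cosine similarity is just a dot product.
similarity = vectors @ vectors.T
for i, title in enumerate(titles):
    nearest = titles[int(np.argsort(-similarity[i])[1])]  # index 0 is the title itself
    print(f"{title} is closest to {nearest}")
```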
The other thing, like I said, that is critical for us, and the way to think about it is the parallel between content-based recommendation systems and collaborative filtering systems, where you're using user consumption behavior to recommend things, right? We care a lot about not necessarily what the content is about, but how it's being consumed. At the end of the day, if there is a random sports league that, for whatever reason, has a really high affinity or a high embedding-based similarity, from the perspective of content consumption behavior, to a telenovela or to an original ViX movie, [00:24:00] we would rather leverage that than say, oh, from a content-based perspective this movie is an action movie, and so based on our embedding it's similar to these other action movies. If we don't actually see any concomitant similarity in their consumption, it's probably not, and very rarely is it actually, going to drive increases for us in terms of overall engagement on our platforms. But that interesting connection that you would think is serendipitous, between that sports league, or maybe that specific team, and this piece of content, is really interesting, and we can use that to our advantage to personalize the experience on our application. hugo: Fantastic. Thank you so much for that deep dive. I'm wondering if we could just tie it back to the household graph. Could you remind us what this flagship asset, the household graph, is and how embeddings play into it? sergey: Yes, absolutely. So the graph, like I said, is a representation of the households in the United States [00:25:00] that we think are very likely to interact with Spanish language media, broadly construed, meaning basically that if it's media that appeared on Spanish language properties, we want to know who's consuming it. We don't necessarily care about why these people are consuming it. We just want to know that they are, because it means they are in our, basically, I would say, addressable universe, if you will, right? They might not always be in our addressable universe, but we know that at some point in the past, and then potentially in the future, they will also be in that universe. The graph itself is actually fully anonymized, so there's nothing that ties a specific person's PII, their personally identifiable information, to any component of our graph. And that's really critical, because the whole point of this is to make it so that we don't have to deal with a lot of the issues around privacy that exist in the United States, and at the moment, really, they exist [00:26:00] worldwide, but specifically in the US, right? And so the building blocks of the graph are just a few components that are reasonably persistent and stable. They're persistent in the sense that they exist from session A to session B to session C, and they're stable, meaning they usually refer to the same entity, in our case either a household or a device, over a long enough period of time, where long enough usually means weeks or months. And those components are IP addresses, where there's basically usually a single IP address associated with a single household, if you will, in the United States.
And then digital identifiers that exist in the digital ecosystem: device IDs, advertising identifiers on mobile phones, and then CTV identifiers, which are, again, advertising-based identifiers that exist on smart TVs. At this point a majority of households in the United States have at least one smart-enabled [00:27:00] television in their house, whether that's their television directly or a Roku device or some other streaming device that they plug into their TV. What we do is we have a proprietary algorithm that takes sightings of these IP addresses and device identifiers from all corners of wherever we can get this data, whether it's our own first party data or data that we license from elsewhere, and we household them. And this is something that I myself have been doing for the bulk of my career. One of my first gigs when I started as a data scientist was working for an ad tech company that only did this, that only built what were called graphs. They built it for the entire United States. For us, again, we don't care about all of the US; we just care about the Spanish language media component. But regardless, [00:28:00] we standardize that data and we find basically clusters of stable identifiers that appear to hover around, or coexist on, a single stable IP address. We call that a household. We do some filtering on it, we do some thresholding on it, and we get what we call our graph. And now that we have these stable identifiers that are persistent, long living or reasonably long living, and tied to an individual household, we can then take various other identifiers, map them back into these households, and do all sorts of interesting things with that. Now, let me answer the second part of your question, which is the embeddings. How do we get to embeddings? Now that we have this framework for saying, okay, identifiers A, B, C are part of household X, we can start building and creating embeddings at various different levels, across various different entity types, [00:29:00] against the graph. We can generate household level embeddings, which are going to be coarse, but they are going to give you some understanding of the geography and the household level demography of that household, right? So you can embed the household in a space that tells you, okay, these are higher income households, middle income households, lower income households, living in these specific parts of the United States. They're coarser, but they give you the general, I would say first, level of signal when it comes to targeting or activation or personalization, right? And then you can also do device level embeddings, although the device level embedding doesn't really matter as much at the household level, because devices, again, are themselves persistent. The other big thing that you can do, I would say, is context level expansion within the household to inform device level embeddings.
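[Editor's note: a simplified sketch of the householding step described above: cluster device identifiers that keep showing up on the same stable IP address, then filter and threshold. The real graph uses a proprietary algorithm; the column names, time window, and thresholds here are illustrative assumptions only.]

```python
import pandas as pd

# One row per (ip_address, device_id, seen_date) sighting, pooled from
# first-party and licensed data sources.
sightings = pd.DataFrame({
    "ip_address": ["1.2.3.4", "1.2.3.4", "1.2.3.4", "5.6.7.8", "5.6.7.8"],
    "device_id":  ["ios_tablet_1", "android_1", "ctv_1", "android_2", "android_2"],
    "seen_date":  pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-20",
                                  "2024-01-02", "2024-02-15"]),
})

# Keep only "stable" IPs: observed over a long enough span (here, >= 14 days).
ip_span_days = sightings.groupby("ip_address")["seen_date"].agg(
    lambda s: (s.max() - s.min()).days
)
stable_ips = ip_span_days[ip_span_days >= 14].index

households = (
    sightings[sightings["ip_address"].isin(stable_ips)]
    .groupby("ip_address")["device_id"]
    .apply(lambda devices: sorted(set(devices)))
    .reset_index()
    .rename(columns={"ip_address": "household_key", "device_id": "devices"})
)

# Threshold out implausible clusters (e.g., public Wi-Fi with hundreds of devices).
households = households[households["devices"].str.len().between(1, 15)]
print(households)
```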
For example, to continue with that context level expansion: take device A, device B, and device C, where device A [00:30:00] is an iOS tablet, device B is an Android mobile phone, and device C is a smart TV. As part of your embedding process for those devices, you can say, hey, this iOS device belongs to a household with at least one smart-enabled television. hugo: Mm-hmm. sergey: And that's really important for us, because what we ultimately want, in general, when you're at your house, is for people to watch our content on the largest possible screen that we can find, right? Because ultimately, larger screens almost always lead to more engagement, and then they lead to a better consumer experience, and for us they lead to increases in revenue. The CPMs that you can get for delivering content on a smart TV, and a CPM is basically the [00:31:00] revenue that you can generate from a thousand impressions, a thousand ads served, are significantly higher than what you can get from a mobile device. So those are the ways that we can use the graph to inform and enrich the kinds of embeddings that we can generate, either against our user base or against our content. hugo: Amazing. So I'd love to switch gears, because you're working on so many fascinating things, to ViX, which you mentioned earlier in our chat, because recommendation and personalization are such important parts of the landscape, given how much media there is generally, but also with everything you're working on. I'm wondering how the first version of your recommender system worked and how it's evolved since. sergey: Sure, yeah. So our recommendation system has definitely undergone some evolutions, because of the really great work that our personalization team is doing. Again, small but mighty, and they're really wonderful full-stack data scientists. They can do basically everything: they can create [00:32:00] features, they can build models, and they can even ship the predictions, either by standing up real-time endpoints or by pushing them in batch into various databases that our client apps can interact with. I want to say that they've really been wonderful in taking us as far as we've gotten. So, for our initial recommender system, we wanted to stand something up reasonably quickly that was reasonably performant, and that we could also explain, or at least interrogate a little bit, because as these systems become more and more complicated, the interpretability around what's driving certain recommendations for certain users basically becomes impossible. It becomes very difficult to say why these specific titles in this specific row were recommended to this specific user. So our initial system was really, I would say, a hybrid system. We used a factorization machine, which was a [00:33:00] combination of what most people would call collaborative and content-based filtering, where we used user features and item features, but we also used basically matrix factorization, I would say, on steroids to generate our recommendations. There were good things about this. It was reasonably performant.
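[Editor's note: a minimal sketch of a hybrid, factorization-machine-style recommender that combines interactions with user and item side features, here using the LightFM library. The library, toy data, and hyperparameters are assumptions for illustration; the in-house system is not described at this level of detail in the conversation.]

```python
import numpy as np
from scipy.sparse import coo_matrix, identity
from lightfm import LightFM

n_users, n_items = 1_000, 500

# Implicit feedback: which user watched which title (toy random data).
rng = np.random.default_rng(0)
rows = rng.integers(0, n_users, 5_000)
cols = rng.integers(0, n_items, 5_000)
interactions = coo_matrix((np.ones(5_000), (rows, cols)), shape=(n_users, n_items))

# Side information: identity features stand in for real user/item attributes
# (device types, genre tags, demographics, and so on).
user_features = identity(n_users, format="csr")
item_features = identity(n_items, format="csr")

model = LightFM(no_components=32, loss="warp")  # WARP suits implicit feedback
model.fit(interactions, user_features=user_features,
          item_features=item_features, epochs=10)

# Score every title for user 0 and keep the top 10 as candidate recommendations.
scores = model.predict(0, np.arange(n_items),
                       user_features=user_features, item_features=item_features)
top_10 = np.argsort(-scores)[:10]
```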
And in fact, when ViX first launched, we actually had a third party generating our recommendations. So our first test was really just about whether the homegrown system we'd built could outperform the third party model that we were paying quite a bit of money to support. And thankfully it did, which was great, and I think validating for our data scientists, but it also showed that we could in-house the intellectual property around the system, because in many cases that can be the secret sauce for the continued growth of your streaming [00:34:00] service. There were some drawbacks with the system, though, specifically around cold start recommendations: recommendations for users that weren't in our original training set, the set of data that we used to build the model against. You can imagine, if our model gets updated, let's say, every six hours, and a user has had some amount of consumption between those six hour windows but isn't in our training data, we can't do an upsert or a real time prediction against that user, because that user's key, the thing that represents that user, wasn't in the model's training set, so the model couldn't really do anything with it. So we had this issue of more recent users basically getting default recommendations, where a default, cold start recommendation is basically a trend that we've [00:35:00] noticed over some period of time, and we're just recommending that trend to all new users. We needed to get away from that. That was the first thing, and as a result it also meant that we couldn't really do anything interesting in real time with those users. We couldn't send their data to some real time endpoint that could then give a fresh prediction. So we moved away from that factorization machine model and moved to something that was a little more modern, more flexible, and didn't require that the user be in the training set: a sequential recommendation architecture. These are reasonably new systems. Again, we're not using the really hardcore next-gen transformer architectures that are being leveraged out there, but something that's still reasonably recent and performant: sequential architectures, where you can inject user and item features, but there's no real entity that represents a [00:36:00] specific user. Users are really just streams of tokens, with some added side information either against the token or against the stream itself. The stream itself is the user, and the tokens are the items, the pieces of content that they interacted with in some meaningful or interesting way. Now, the complexity here, though, is that, okay, great, the user themselves doesn't have to appear in the training set in order for us to generate predictions against them, but now we're using a much more modern architecture. We're using deep neural networks, and we have to train them at scale, which means you have to start using GPUs, which are not fun to work with at all. They are really challenging, especially when you have to string multiples of them together.
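[Editor's note: a minimal sketch of the sequential, "next item" idea described above: the user is just an ordered stream of item tokens, so a brand-new user can be scored without appearing in the training set. The layer choices and hyperparameters below are illustrative assumptions, not the production architecture.]

```python
import torch
import torch.nn as nn

class SequentialRecommender(nn.Module):
    def __init__(self, num_items: int, emb_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.item_emb = nn.Embedding(num_items + 1, emb_dim, padding_idx=0)  # 0 = padding
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_items + 1)  # scores over the whole catalog

    def forward(self, item_seqs: torch.Tensor) -> torch.Tensor:
        # item_seqs: (batch, seq_len) of item IDs, most recent interaction last
        x = self.item_emb(item_seqs)
        _, hidden = self.encoder(x)          # final hidden state summarizes the history
        return self.head(hidden.squeeze(0))  # logits for the next item

# A user who appeared after the last training run is just their short history of tokens.
model = SequentialRecommender(num_items=5_000)
history = torch.tensor([[12, 87, 431]])      # three recently watched title IDs
next_item_scores = model(history)
top_5 = next_item_scores.topk(5).indices     # candidate recommendations
```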
There's just a lot more, I would say, MLOps involved with these kinds of infrastructure than with the simpler, more classic factorization-based models. So we had to stand all of that up, and that's [00:37:00] where we are today: we're using these sequential recommendation models to generate the recommendations that power various components of our application. And we're now starting to do some additional, more interesting work. We just recently, in the past couple of months, launched a clips product for our mobile customers, so we're going to start doing some interesting personalization work there. The clips product is basically short form video content, a combination of news, sports, and entertainment, all in a single, basically swipeable feed that you'd be familiar with if you consume any TikTok content, or content on Instagram, or any of these other large scale video applications. But none of this content is user generated. It's all basically created in house, and so we are building the infrastructure to support [00:38:00] personalization there as well, because we know that ultimately not everyone wants to see the news and sports content. Some people just want to see the racy scenes or the funny scenes that we have in our deep entertainment content library, and some people really just care about breaking news, or want to watch a bunch of reels of highlights of different goals from the Champions League or wherever. So we're starting to do some more interesting work there. And what's interesting about that is you have way more feedback. As I'm sure you're aware, the reason that TikTok and these other short form video platforms are such a boon for personalization is that the amount of interactivity and the amount of signal you get is much, much higher, as is the frequency of the signal, right? From the perspective of the long form content that we have, the shows or the movies or whatever, all we get is positive feedback: what people clicked. The positive feedback is explicit, and the negative feedback is more implicit. [00:39:00] But with the short form content, you have equivalent positive and negative feedback, right? A user swipes away a video; it means they didn't want it, they didn't like it. Whereas if they actually watch a video for an extended period of time, or they click into it and go to the series page, or go to, for example, a live match that's happening, it's just a much, much higher level of signal for both positive and negative intent. I'm very excited about the work that we're doing there. hugo: Fascinating. So thanks for that whirlwind tour of everything you've been doing on the recommender side. I'm really excited about seeing where the clips work lands as well, because, as you've hinted at and stated implicitly, a lot of this kind of work has been done with user generated content, so doing it from this perspective is fascinating. And I'll actually link in the show notes to an episode that we have with Roberto EDRi, VP of Data at Instagram, who launched Reels, talking about the process of launching Reels and how it was really existential for him
and for Meta as well, of course, because of TikTok, and how the metrics weren't great for the first several months, maybe [00:40:00] three to six months, but they made a huge bet there that then paid off. Serge, it's so wonderful how we've been able to talk about the amount of value you and your team and the organization have been able to deliver using data and not generative AI, but I would have to fire myself if we didn't talk a bit about generative AI. So, last time we spoke, you told me how you tested LLM-generated metadata against existing features, and I'm wondering if you could explain what the motivation was here, what lift you saw, and what surprised you. sergey: For sure. Yeah, so our content catalog is very deep and very old, and from the perspective of metadata attached to that content, it was, I would say, in many cases for newer content well curated, but for legacy content it really was lacking, in the sense that we have shows or pieces of content that are 20, 30, 40 years old that don't really have good [00:41:00] metadata attached to them. We might not actually know what actors were in there, or we know what actors were in there but the metadata we had didn't attach the actors' names to it. In some cases, the genre or sub-genre classifications for some of this content were very broad, to the level where things were just classified as entertainment or drama, which didn't provide the kind of richness, and I would say subtlety, in tags that would allow, regardless of how robust the recommendation engine or system was, the depth of ability to see secondary and tertiary interesting interactions among our pieces of content. It basically just said, hey, token A and token B are related to each other, but we don't really know why or how. And we made a bet that if we could extract useful data basically automatically, [00:42:00] by passing the content through, initially, single-modal LLMs, and now multimodal models, models that take both video and audio, and basically prompt them to generate high level moods, themes, things of that nature, and tag our content with them, we would be able to see some incremental lift in the performance of our models. And so that's what we did. A team within the company, the R&D team that's been exploring a lot of this new infrastructure, basically passed the majority of our catalog through some of these models and extracted metadata tags from all of our content. And we actually wound up having to do some cleanup afterwards; this is where, I would say, some of the reservations around generative AI came from for us. We had to actually do some post-cleaning against that output. For example, one thing we found was that this [00:43:00] automated approach did really well for content with actual people in it, so when it was non-animated, but it was actually really not great with animated content, typically content that might be more geared towards children. It would flag things that were clearly comedic as being suspenseful or as horror or something like that, which was really not helpful when you're trying to recommend.
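[Editor's note: a rough sketch of the enrichment-plus-cleaning idea described above: prompt a model for high-level moods and themes, then post-clean the tags before they feed the recommender. `call_llm` is a hypothetical stand-in for whichever single- or multi-modal model is used; the tag vocabulary, prompt, and cleaning rule are illustrative assumptions, not the team's actual pipeline.]

```python
import json

ALLOWED_TAGS = {"suspense", "horror", "comedy", "romance", "family", "sports", "news"}
UNRELIABLE_ON_ANIMATION = {"suspense", "horror"}  # tags that proved untrustworthy on animated/kids titles

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in whichever provider or model you actually use."""
    raise NotImplementedError

def tag_title(title: str, synopsis: str, is_animated: bool) -> list[str]:
    prompt = (
        "Return a JSON list of high-level moods/themes for this show, chosen only from "
        f"{sorted(ALLOWED_TAGS)}.\nTitle: {title}\nSynopsis: {synopsis}"
    )
    raw_tags = json.loads(call_llm(prompt))
    tags = {t.lower() for t in raw_tags} & ALLOWED_TAGS  # drop anything off-vocabulary
    if is_animated:
        tags -= UNRELIABLE_ON_ANIMATION                  # the post-cleaning step described above
    return sorted(tags)
```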
To put it concretely: if it's very likely that the person consuming a piece of content is an adult who has no interest in children's content, you don't want to be using metadata that flags children's content as suspenseful or whatever. So what we wound up doing is we cleaned the metadata tags a little bit, applied them, and ran an A/B test between the classic tagging solution that we had and the [00:44:00] generative AI based solution, and we actually found some significant improvement. It was on the order of 10 to 15% increases in overall engagement against the original metadata that we had. So we did find some success there. And I would say, in this case, was this a good place to use generative AI? Yes, because there was nothing at the end of the day that was directly surfaced to a user. It was part of the infrastructure for our algorithm, as opposed to something that was directly being surfaced to users. We weren't, for example, putting an AI system or an LLM directly behind, say, a search bar on our application, where you could type whatever you wanted and then all of a sudden get content that fit that, because there I think you have to be much more careful in leveraging something like this. Here, I would say, the barrier to success was significantly lower, because basically it was like, hey, [00:45:00] either the metrics go up or they don't. And in our case, they went up, so we kept at it. hugo: Absolutely. And I love that you spoke to how you can have significant success when you don't serve the output directly to users. And to be clear, a lot of the successes in generative AI I've seen, for example in people building customer service chatbots, have been conversational AI customer service agents that the human customer service reps use to get information, which they then serve to a customer, as opposed to serving it directly. And in the past couple of years we have seen a lot of excitement and a lot of possibility around LLMs and generative AI, possibility as opposed to capability is how I frame it, and we have seen them fall short in a variety of cases. This was a positive data point for you, but I'm wondering if there are any horror stories you can share about LLMs? sergey: From my perspective, I wouldn't say they were horror stories. I would just say you have to be careful. I am currently much more in the camp that LLMs, [00:46:00] just in general, are really, really good at making their outputs appear incredibly confident. The problem is, as a person that has been doing what I would call classic data science and machine learning for the vast majority of my career, and I'm sure it's the same with you, Hugo, we really like understanding the level of uncertainty in a piece of output. I really like seeing
And so what winds up happening is it's a hundred percent certain, about a hundred percent of every, and that is very, very, and and it's not only from the perspective of, I mean it's from the perspective of the way that, that [00:47:00] the LLM actually communicates. The output is just. Incredibly. Like this is how it is, this is what you do. Oh, it's super easy. Here you go and it's conversational and you're like, oh man, this sounds like I'm just talking to my friend about whatever. And the problem is, you know, and as I think everyone that's actually a practitioner in the field knows is you have. To check almost everything as an output, especially when it comes to code generation documentation, really any kind of output that you have there when it comes to text and you get cases where, so in our case, we were, we, I had, uh, data scientists on the team was. Trying to interact with an LLM to speed up development of a, of an actual, of a specific, uh, basically workflow where we were trying to automate automatically tag something and it gave us very clear instructions, uh, that looked great on paper, except between steps five and six and it just completely invented something and didn't [00:48:00] invent it, right? Because again, they really. If LLM spit out a blend of whatever it is that they were trained on, it basically gave us documentation from a completely different cloud provider. So we told it explicitly that we were on cloud provider A doing X, Y, Z, and it basically gave us a step. So it was cloud provider one for steps one through four and then part of step five, and then for whatever reason, for steps five and the back half of step five and step six, it just took stuff verbatim. From a completely different cloud provider and popped it in. And so we were going, and then my, my, my colleague was like, I don't understand. I can't find any of this. And then I dug around and was like, that's actually because the LLM told you something that. Just wasn't for us at all. So, so I would just say that, that at this point, I think the way to really, to really take LLMs into the, I would say like the next level of usability. There has to be some way that the, for specific use cases, that there's some [00:49:00] kind of a. Confidence assigned to the output that the LLM makes. There has to be some way to provide probabilities against the outputs. I don't know what they are, but the companies themselves, they have hundreds of people working on this stuff, and I think that there has to be a way to be able to do that because ultimately if you can't, if you. Assign those. You basically have to have your end users check everything, and that makes it so that it actually takes, in some cases, longer leveraging the outputs of the LLMs than it would've been for the user to just do the thing themselves, which frankly was the whole point of why everyone's so excited about generative ai. Right? It's supposed to. Shrink our time to ship. It's supposed to shrink. All of these things, it's supposed to, it's supposed to minimize friction and yet in this case, it was actually increasing friction. hugo: Absolutely. And I even have a personal example that happened to me the other day with check GBT, where I said, can you provide a link to, to verify what you just told me? And it said, [00:50:00] here's a link. And it was a link that gave me a 4 0 4 error. And I said, that link, I gave it a screenshot. It was like that link didn't work and it said, oh, sorry, here's the actual link. 
And I said, that didn't work either, and it said, okay, this is the final link, and I was like, come on. Come on, dude. What I'll do is link in the show notes to an episode, it was our first episode of this podcast, with Michael I. Jordan, the Michael Jordan of machine learning, where we had an extended conversation around the lack of ability of these models to reason under uncertainty. And I think part of the key won't be LLM based; it will actually be information retrieval based, like RAG or otherwise, or BM25 or whatever it is, but ways to extract and ground information in documentation and then reference it. We are going to have to wrap up soon, sadly. There are so many great things we've talked about that I want to dig deeper into, but a couple of last questions. In terms of managing expectations, I'd be interested in your experience where you are, Serge, because many execs feel a very understandable pressure to bolt LLMs and generative AI onto everything. So I wonder how you [00:51:00] manage expectations and help people understand when simple models can actually be more performant, and even cost less and have lower latency, all of those things, right? sergey: Yeah. So I think the big thing to do is, and I know execs are, if there's anything they're short on, it's time, but ultimately you have to have a conversation around, basically, operationalizing, right? The first thing, I think, that great data scientists learn when they start out in the field and want to become better, higher level, expert practitioners is about operationalizing the problem, right? And operationalizing the problem is basically converting some diaphanous, vague thing that they want to do into something that actually achieves something akin to the target outcome that they want, via some mechanistic [00:52:00] approach, whatever that approach is. And so when you start having a conversation with them about what the problem is, what that core problem is, that's when you can quickly figure out whether, yes, it requires maybe something that is generative AI or an LLM, or it requires something simpler. Maybe it's just a machine learning model. At the end of the day, you can get very far, if you have the right data, even with a linear model, right? You can do a lot with a linear model. I think it's really just about understanding and operationalizing what it is that they're trying to do, and maybe the way you LLM-ify the thing is that you use it as part of, for example, your feature extraction process, or as part of your feature engineering process, so that the exec at least can truthfully and rightfully claim that, yes, we're using LLMs and AI in our workflows, in our systems, or whatever, but you're not using it in a way that could ultimately be [00:53:00] harmful to the long-term viability of your business. I mean, we've now seen many cases where businesses have had reputational damage as a result of using these systems without any kind of barriers. And so I think really that's where you do it. We've also established at Univision, in collaboration with some colleagues, what we call an AI fit framework, where we basically have a bunch of questions that we ask people, like: how frequently do you want this thing to be done? How much time is it taking?
How okay are you, for this specific task, with it occasionally making errors? Is it critical that there are no errors, or is it fine if there are occasional errors, and how comfortable are you with those? And then we basically score their responses and tell them: look, we think this is, in fact, a good place where we might be able to inject some generative AI capabilities somewhere in the workflow, versus, no, this really isn't that. I think what this really is, is that you should just take the Excel spreadsheets you've always worked [00:54:00] on and put them into a database, we can give you a dashboard, and that'll answer whatever questions you might have. It's more of, I would say, almost a classic, I would even say an industrial engineering problem, where you have a workflow, you have a bottleneck in the workflow, and we replace that bottleneck with some simpler, more automated system, but that system itself doesn't necessarily have to be generative AI based. hugo: Yeah, I really like the way you're thinking about managing expectations there. And so, to wrap up, I'd love to know: if you had to give executives one piece of advice on responsibly investing in AI over the next 12 months, what would it be? sergey: I would say do it slowly, I would say do it carefully, and I would say do not over-commit with any specific partner. I am of the belief that, because the field is changing very rapidly, you don't want to put all of your eggs into a single basket, because [00:55:00] you might ultimately pick a loser, frankly, right? Even though there are, whatever, four, five, six companies that are at the very forefront of this stuff, I myself would frankly be surprised if every one of them is still a viable company in 12 months, just because of how resource intensive, how capital intensive, LLMs are. So I would say do that, and then ultimately what you want to do is be very clear and have very small, clear initial ways to embed them. If you just throw them willy-nilly into your business, you could really either cause harm to your business long term, or you could just pick the wrong horse and ultimately wind up having to invest all over again. You might have a working, functioning LLM or generative AI based solution, but 12 months from now you might have to re-architect it with some other provider. hugo: Love it. Thank you so much, Sergey, for such a wonderful conversation. sergey: Yep. Thank you so much, Hugo. This was really great. [00:56:00] hugo: Thanks so much for listening to High Signal, brought to you by Delphina. If you enjoyed this episode, don't forget to sign up for our newsletter, follow us on YouTube, and share the podcast with your friends and colleagues. Like and subscribe on YouTube, and give us five stars and a review on iTunes and Spotify. This will help us bring you more of the conversations you love. All the links are in the show notes. We'll catch you next time.