The following is a rough transcript which has not been revised by High Signal or the guest. Please check with us before using any quotations from this transcript. Thank you. ===
eric: [00:00:00] The main challenge is that there are a lot of companies that treat their data scientists as a support function, right? Their role is to help the various business teams, like product management and marketing. And the challenge with that is that the ideas are coming from the business teams to the data scientists. That can have some value, but it leaves a lot on the table, because there are a lot of things we can only get the other way, from the data scientists.
hugo: That's Eric Colson, data science and machine learning advisor, former chief algorithms officer at Stitch Fix, and former VP of data science and machine learning at Netflix. He's seen firsthand how companies fail to use data science properly, and what it takes to get it right. Let's get into it with Eric. Hey there, Eric, and welcome to the show!
eric: Hello, Hugo. Great to be here.
hugo: It's so much fun to be doing another podcast with you. It's been maybe five or seven years since we recorded one?
eric: I was going to say, I don't know how long it's been. I [00:01:00] would guess probably about that. It was definitely not in the last three years.
hugo: It definitely was in the before times, before COVID as well. Before COVID. And at the time, you were chief algorithms officer at Stitch Fix.
eric: Okay. That makes sense.
hugo: And we talked a lot about what you were doing at Stitch Fix and what you'd done at Netflix before, both in roles that really elevated the role of data and the data function at organizations that were already data centric. So much has changed in the space since we last did a podcast, and so much has stayed the same as well. I'm just wondering, to start off, how you see the role of data science evolving in organizations today, and what are you most excited about?
eric: Well, let's see. I do think data is becoming more and more central to companies. Companies are finally understanding this: man, it's not really the network, it's the data, right? It's the data that we can learn so much from. So I think we'll continue to see that it [00:02:00] becomes more and more central to companies, and that companies continue to organize around the data functions. In fact, we'll see even more representation at the C-level: more chief data officers or chief algorithms officers, that kind of thing. And what excites me about it is that it's actually not as relevant to the businesses as it is to society. The learning that comes out of studying complex systems can be really valuable, right? And businesses are like the lab rats of the market. So at the macro level, we have all these startups that are trying different ideas, and some of them manage to find product market fit, and these are for things that you would have never even imagined. If we didn't have so many startups trying these things, we wouldn't have found the successes that we had. So that's great learning. At the micro level, and this is what's even nearer to my heart, in terms of any one company, you have some of these companies that have these high-frequency events, like millions or billions of customer interactions, that are so good to learn from. [00:03:00] And again, these are things that we probably would have never dreamed of, right?
Some of the learnings that we can pull out of this data, I think, are applicable outside of business, to society at large. And I think this could have big implications for how we see the world and how we even make policy decisions and so forth. So it could be some really valuable stuff that can be applied to society at large. And it's exciting that this field called data science is really the medium for this type of research.
hugo: I love that framing. And I think in a lot of ways we forget that we're still in the early days of data science, because the space has moved so quickly. We've seen the data science function leveraged in a lot of different ways, and it's something you've written about. I'll actually link to your wonderful essay, Beyond Skills: Unlocking the Full Potential of Data Scientists. You've written about the untapped potential of data scientists and [00:04:00] particularly their ideas. So why do you think so many organizations fail to leverage data scientists' ideas effectively? And what are the most significant missed opportunities?
eric: Well, I think the main challenge is that there are a lot of companies that treat their data scientists as a support function, right? Their role is to help the various business teams, like product management and marketers and merchants and finance and so forth, in their efforts. Of course they should be working with those teams, but it should be more of a collaboration; the way they work together is important. And what happens in a lot of companies is they're treated as a support team. They're valued for their skills, things like Python and SQL and statistics and so forth, as well as just being a resource, a body that they can dump work on. So people in finance or marketing have some data-related tasks they need done, and they ask the data scientists to do it for them. And the challenge with that is that the ideas are coming [00:05:00] from the business teams to the data scientists. That can have some value, but it leaves a lot on the table, because there are a lot of things that we can only get the other way, from data scientists to the business. So you need to at least establish a bidirectional flow of ideas, and that's what doesn't seem to happen when you treat them as support staff and just value them for their skills. Even well-intentioned companies, a lot of companies say, oh, we don't do that, we want our data scientists' ideas. They may say that, but sometimes they behave in ways that suggest otherwise. One of the symptoms of this is when the data science team is handed down requirements. There's some new initiative, it's clear it's going to be a data science initiative, but it came from one of those business teams, and the problem has already been framed and spec'd out, and they just need the data scientists to execute it. That will limit the ways in which the data scientists contribute. That's one bad symptom: if you're handing off requirements, that's probably a warning sign. The other one [00:06:00] is simply overwhelming data scientists with tasks. This happens in most companies. I talk to a lot of people, and a very common symptom is that the demand for data, the ad hoc requests, dashboards, data pulls, all that kind of stuff, is vast. Everybody in the company needs some information.
And if you have to ask a data scientist for any bit of it, they can easily get overwhelmed. Unfortunately, it creates a vicious cycle, because any data-related question tends to invoke more questions than it answers. So then there's a lot of follow-up work, and next thing you know, you've consumed 100 percent of your data science resources. So those are some of the symptoms: we're just not allowing their ideas to come to the table. In my experience, this is the missed opportunity, because I think the value of data scientists does not reside in their skills to answer questions or [00:07:00] complete ad hoc requests, but rather in the ideas they can bring forward. By ideas, I really mean capabilities, new things that can move the company in better or new directions. These are usually manifest as algorithms. You might think of a new recommendation algorithm, that's a good example, but it doesn't have to be a recommendation. Even in inventory management systems, oftentimes there's a place where you can plug in new algorithms to figure out how to buy merchandise more effectively, in the optimal number of units. So these are all examples of business ideas, but they're very unlikely to come from the business teams, because they rely on certain qualities that only the data scientists have. Specifically, the ideas from data scientists are uniquely valuable because they have these two things: different cognitive repertoires and different sets of information. We should go [00:08:00] into each of those. Cognitive repertoires: this is a term I got from the complexity researcher Scott Page, and he describes it as a set of tools or ways of thinking that an individual can draw upon to frame a problem. So it's not the skills they have, but rather a way of thinking about a problem. And every function has its own set of cognitive repertoires: marketers have theirs, merchants have theirs, finance people, et cetera. And data scientists certainly have their own set, and theirs tend to be really relevant to business. The cognitive repertoire includes things like knowledge of the various machine learning techniques: boosting and regression and deep neural networks, all these types of techniques for training models. And I don't mean that as a skill. The ability to execute on it is one thing, but it's the ability to recognize when it's applicable to a business problem.
hugo: Yeah, a way of thinking, right?
eric: Yes, exactly. I mean, hey, we should frame this problem as a [00:09:00] Markov chain or something like that. That is a cognitive repertoire. And these cognitive repertoires also include knowledge of classic papers or other framings that can be reused. These are the classic papers like the secretary problem, the newsvendor model, the traveling salesman problem. Most of them were written, you know, fifty years ago, but they're wonderfully relevant to a whole myriad of business problems. And the data scientists have had those in their education or their training, and many other functions have not. So those, again, are wonderfully relevant to businesses, right?
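As one concrete instance of the classic framings Eric mentions, here is a minimal newsvendor calculation, a rough sketch with made-up demand and cost figures rather than any company's actual buying logic:

```python
# Newsvendor model: choose an order quantity that balances the cost of
# overstocking against the cost of understocking, given uncertain demand.
from scipy.stats import norm

# Illustrative assumptions (not real figures): seasonal demand for an item is
# roughly normal with mean 1000 units and standard deviation 250.
demand_mean, demand_std = 1000, 250

unit_cost = 20.0      # what we pay per unit
sale_price = 50.0     # what we sell it for
salvage_value = 5.0   # what we recover per unsold unit

underage_cost = sale_price - unit_cost     # profit lost per unit of unmet demand
overage_cost = unit_cost - salvage_value   # loss per unsold unit

# Critical fractile: the optimal probability of covering demand.
critical_fractile = underage_cost / (underage_cost + overage_cost)

# Optimal order quantity is the demand quantile at that fractile.
optimal_order = norm.ppf(critical_fractile, loc=demand_mean, scale=demand_std)
print(f"critical fractile = {critical_fractile:.2f}, order ~= {optimal_order:.0f} units")
```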
And so they can bring that framing to an idea. The framing is really important, right? The way you frame a problem can determine its robustness, its scalability, and so forth. Just as a really trivial example, I remember the early days of Stitch Fix. The founder [00:10:00] had maybe a handful of employees, and she had hired some contractors to build some things for her, one of which was a styling algorithm, like a recommendation engine. And even she would put air quotes around the word algorithm, because she knew it wasn't the greatest thing. It was akin to what they used to call expert systems in the eighties and nineties. It was a lot of if-then statements, like: if this dress is blue and the customer said she likes blue, then that's a decent match, so let's add 130 points to the score. The score was this arbitrary unit that had really no meaning other than higher was better, and that was about it. There was no way to interpret 130 or 1,500. It was just some number, and it was generated by all these if-then statements that came out of either conventional wisdom or, frankly, were just kind of made up using intuition. So when we were first able to put a data scientist on this, we didn't just say, hey, can you come up with some new rules or new amounts to [00:11:00] assign to those conditions? We didn't ask them to do that. Instead we said, can you reframe this problem? What would you do? And the data scientist came up with a fairly basic, fairly obvious, but effective framing. They framed it as logistic regression. They said, well, we want to see which of these pieces of merchandise are the best selections for this customer, so let's run them each through a logistic regression algorithm. And let's still use a score, but let's assign it some meaning. We'll scale it so it's bounded between zero and one, representing the probability of purchase. So a number like 0.42 would mean a 42 percent chance that the customer is going to buy this thing. So it gave the score some meaning. It also forced adherence to a logistic curve, which fairly well represents consumer decision making, where it's a little more elastic in the middle and less so at the ends. And even the fact that you can never get to 100 percent certainty: you can't ever say with 100 percent [00:12:00] certainty she's going to buy this, and likewise you can't say with 0 percent certainty she won't buy this. So that logistic curve enforced good principles. And then, of course, instead of making up numbers to assign to these rules, a logistic regression algorithm can learn its parameters, the weighting of those parameters, from data. And so this is going to be a much better solution. It's going to be easier to interpret: first of all, the score has meaning, and it's bounded. It's going to be more accurate, because its parameters are learned from data. And it's going to be easier to extend: as you add more and more parameters, more features in the model, the same basic algorithm remains and just gets retrained, so it's much easier to scale and extend. So while it's very basic, it still requires at least undergraduate statistics.
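A rough sketch of the reframing Eric describes, replacing hand-assigned if-then points with a logistic regression whose output is a bounded, interpretable probability of purchase. The features and data here are hypothetical, not Stitch Fix's actual model:

```python
# Logistic regression reframing: instead of hand-assigned points, learn weights
# from historical purchase data and output a probability of purchase in [0, 1].
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: each row is (item, customer) features such as
# "color matches stated preference", "price vs. typical spend", "season fit".
X_train = np.array([
    [1, 0.2, 3],
    [0, 0.8, 1],
    [1, 0.5, 2],
    [0, 0.1, 4],
])
y_train = np.array([1, 0, 1, 0])  # 1 = purchased, 0 = not purchased / returned

model = LogisticRegression()
model.fit(X_train, y_train)

# Score a candidate item for a customer: a 0.42 here literally means
# "an estimated 42% chance of purchase", unlike the old arbitrary point totals.
candidate = np.array([[1, 0.4, 2]])
prob_purchase = model.predict_proba(candidate)[0, 1]
print(f"estimated purchase probability: {prob_purchase:.2f}")
```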
And that may not be something a product manager or merchant or marketer has in their repertoire. And so that's why, when you come to the data scientists, don't tell them exactly what to do. Give them [00:13:00] the problem and let them frame it. That way you'll leverage their cognitive repertoires and bring that to the table. And we've seen this a lot; there are so many other ideas. That was logistic regression, fairly simple, but it's also things like using embeddings to figure out customer preferences, or even using a genetic algorithm to design clothes. These are all ideas that were born out of those cognitive repertoires of the data scientists and then brought forth to the business.
hugo: I love it. And I love that example in particular as well, because logistic regression is an algorithm that people who work in data science have strong intuition for now. It's also something we can explain, and show a figure of, to non-technical stakeholders, and it may be a relatively good first model. It's something we may get significant lift on if we move to deep learning techniques or whatever it is, or we may not, but it is a baseline model that we can then build on to develop something more sophisticated, if we think more lift is [00:14:00] important, right?
eric: Absolutely. I think that is a great way to start: start with something that's very transparent, like logistic regression, where it's very interpretable. You can actually poke around and get sort of the first derivative of each of those parameters, and it makes sense. Then what you can do is add complexity. You can move to neural networks or something that may improve your accuracy, but you might diminish your ability to understand it, and you should make that trade-off consciously. Now, Stitch Fix has moved on from logistic regression. They're on these...
hugo: Crazy transformers. So it may increase costs significantly, like compute costs, but also headcount costs for maintenance. It may increase latency concerns as well, right? There are all types of trade-offs we're talking about here. But data scientists, once again, have an intuition for these things.
eric: Right, and you should be conscious about those trade-offs. If it's just a tiny bit more accurate, it may not be worth all the complexity or opacity that it brings. You may say, well, it's good, [00:15:00] but not good enough to justify that additional burden; we'd rather have algorithms that are transparent. So it's a great thing to be cautious of. I do advocate starting with the simplest thing and then increasing your complexity as it's justified. So this is a good example of a cognitive repertoire: those things that the data scientists have knowledge of. You want to bring those out. You don't want to just impose your will on them; you want to get those ideas from them. So that's one reason. The other reason you want to get ideas from data scientists is because they have different information than you may have. All the teams in a company, marketing, merchandising, product managers, they all have different sets of information, or let's call it deeper knowledge of different areas, right?
For example, product managers may better know how customers are using the service, or marketers may be more in tune with the addressable market and how people perceive the brand. But data scientists have information too, [00:16:00] that they're more privy to than the others. And that's because they are deeply immersed in the data all day, all night, playing with things. And in so doing, they see a lot of patterns and distributions and relationships that nobody else has seen. A lot of these are completely unintuitive. A lot of these are stumbled upon by accident, and the ones that are particularly unintuitive can sometimes be very valuable. So oftentimes these new insights can be brought forth by data scientists, and they can lead to really startling, novel ideas. I think the one example I gave in the article was about a data scientist tinkering with some data and noticing that the customer segments the company used were not meaningful, at least for explaining customer behavior. These were segments or personas that the marketing team created and customers opted into, so they self-identified with them. These are groups like edgy and casual and bohemian chic. On [00:17:00] the signup flow there were pictures of these personas, and the customers would opt into one. They would say, yeah, this one's me right here. A customer saying, yep, I'm preppy, that's me. And so out of curiosity, the data scientist was digging through the data to see, well, what do these different personas buy? What are the top products the preppy people are buying, the top products the edgy people are buying, and so forth? And to her surprise, they were basically the same list. All groups bought all products at about the same rate. So the groups were completely undiscerning of behavior, and that was really curious to her. So she dug further; that was the inspiration: oh, that's interesting, I've got to look into this. Nobody was asking her to do it, but what she did was very clever. Instead of using these marketing-defined labels, she decided to let the data do the talking. She created features out of customer behaviors rather than what they say: things like what they click on, what they view, what they like and dislike, that kind of stuff. [00:18:00] And she used unsupervised methods to find the directions in the data that explain that variation. I think it was principal components plus matrix factorization, if I remember correctly. And the outcome of this little curiosity digging was that she was able to create this multidimensional space. There's a vast number of behavioral signals, and she was able to collapse them down to relatively few dimensions, like 12, I think it was, or 12 or 14. And within those dimensions you can place the customers in that space, so groups of adjacent customers became like a cluster that you could actually put a label to after the fact. You didn't create the label first; you found the cluster of customers and then assigned a label after the fact. She had to come up with some clever words, like femininity, or I think there's one now like rustic something. So she had to come up with labels after the fact, but the beauty is these groups were now very meaningful. They [00:19:00] had different behaviors. They bought different products.
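A rough sketch of the kind of pipeline that curiosity project implies, as Eric recounts it: behavioral features rather than self-reported labels, dimensionality reduction, then clustering and labeling the groups after the fact. The data, feature construction, and cluster count here are illustrative stand-ins, not the actual Stitch Fix method:

```python
# Behavior-based segmentation: reduce a wide matrix of customer behaviors to a
# handful of latent dimensions, then cluster and label the groups afterwards.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical customer-by-behavior matrix: clicks, views, likes, dislikes,
# purchases per product category, etc. (random stand-in data here).
n_customers, n_behaviors = 5000, 400
behavior_matrix = rng.poisson(1.0, size=(n_customers, n_behaviors)).astype(float)

# Collapse hundreds of behavioral signals into a small latent space
# (Eric recalls something like 12 dimensions, via PCA / matrix factorization).
svd = TruncatedSVD(n_components=12, random_state=0)
latent = svd.fit_transform(behavior_matrix)

# Cluster customers in the latent space; names like "rustic" or "feminine"
# get assigned by a human *after* inspecting what each cluster actually buys.
clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(latent)
print(np.bincount(clusters))  # cluster sizes; next step: inspect each cluster's top products
```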
They were not all behaving the same anymore. And this was a major breakthrough, because we had believed those marketing-defined segments were real. We were managing inventory towards them, and our marketing messages were geared towards them. To find out they were all buying the same stuff, well, that's not good. So with these new segments that were more data driven, we were able to make major improvements to the recommendation engine, that was the first obvious application, but also to the way we managed inventory, like inventory buying algorithms, and also marketing messaging. All of that got changed to adopt this new way of defining style that was data driven. It wasn't what the customer said; it was more like what they did. And again, that started from a curiosity project that no one was asking for. The observation drove the idea, not the other way around. We don't say, I have an idea, now let me go find the data to support that idea. Instead, the data [00:20:00] presents the opportunity, and then we start saying, well, what ideas can leverage that bit of data? So this is what I mean when I say data scientists have access to more or different information than the rest of the company. And again, that's why you want to leverage their ideas. These are things that couldn't possibly have been asked for by a product manager or marketer or merchant. Not even a data science manager could have asked for these things, because they weren't so much conceived of as they were revealed by the data. So really it comes down to those two things, the cognitive repertoires and their observations in the data, that allow data scientists to bring really valuable ideas to the table.
hugo: I just want to say, I love the idea of the cognitive repertoire allowing data scientists to reveal patterns in the data, and that then being able to impact business decisions. I like this example a lot as well, because not only does it tell us that some teams, even with domain [00:21:00] expertise, find things challenging without the data scientists' cognitive skills, or parts of the data scientists' cognitive repertoire, it also highlights how self-reporting bias is something we very much need to consider. On top of that, this idea of clustering to let the patterns reveal themselves and impact the business, I think, is so useful. So I am interested in, we started this conversation talking about how a lot of data functions are treated like a service. This seems like a challenge of incentives, among other things, and a tension between short-term incentives, which are like, we need dashboards now, and medium and long-term incentives. I'm wondering, just culturally, how you think about managing these incentives and creating an organization where data scientists can be leveraged for their ideas.
eric: Yeah, it's a great point, and it's not easy. This is where all the soft skills come [00:22:00] out, right? How do you get motivations aligned, so that data scientists want to do this type of work and the company wants them to do it? How do you let it come together? There are a few tactics you can use. I remember at Netflix we had, I don't know if it still is, but years ago it was, context, not control.
And I've adopted that even after leaving Netflix; I brought it over to Stitch Fix. I changed it a little bit to say, give them context, not tasks. My counterpart actually had an even better phrase to make the difference between the two clear. He kept saying, tell me the problem you're trying to solve. I'm sure he wasn't the original author of that phrase, but he said it very effectively, over and over again. I think everyone in the company could recite it. So anytime somebody came to him or one of his engineers with a task, hey, I need you to change this to this, they said, hold on, tell me the problem we're trying to solve. Give me the context and maybe I can come up with something better than what was being suggested. [00:23:00] Likewise, I followed suit and did the same thing with my algorithms team, and I would encourage people: if you are going to come to them with some new opportunity, let them do the framing. Tell them the general thing, the context, maybe even some constraints, and then let them come up with a solution, because that leverages that cognitive repertoire and lets it come to the table. Oftentimes you don't even have to ask explicitly. You don't have to go around to data scientists and say, hey, what ideas do you have? All you have to do is invite them to some meetings, anywhere where context is shared. That context will collide with those cognitive repertoires, or with the extra information that they have, and it spawns new ideas. In the article I give that narrative about a data scientist attending an operations meeting, and somebody, I think from merchandising, is saying, gosh, we really need to figure this out. We need to know how to buy enough inventory, but not too much. And that just triggers the data scientist: oh my God, this is the newsvendor model. I know this one. I could solve this. And she almost can't help herself but start writing down the solution immediately. [00:24:00] There are just millions of examples of this type of thing. If you receive the context, and you have that kind of cognitive repertoire or that extra information, you can often solve these problems or come up with a new way of solving them, a new idea for something. So just exposing them to context is another way to do it. I also mentioned in the article getting rid of the JIRA queue. In my opinion, JIRA is just not the way to engage with data scientists, for several reasons. Whatever it is you're entering into JIRA, it tends to get stripped of all context. It just becomes a pithy little command almost, so there's no context conveyed. And then I also think they're frankly too easy to submit. If something is important enough, then take the time to set up a meeting with the data scientist and meet face to face, where context and ideas can be shared. That will be much more effective. The other, I mean the simplest, thing to do is just make them accountable for impact. You can just say, hey, you're in charge [00:25:00] of making some revenue or improving retention. That will almost immediately flush out some ideas from them. It'll also help them reprioritize, and if they don't have the context, they'll go out and seek it, right?
It has an amazing impact. Just saddle them with a little bit of accountability.
hugo: I love the idea of everyone being responsible for impact and value. I do wonder, and I'm going to commit the cardinal sin of asking two questions at once, I think, but they're so coupled. For this to be the case, we really require buy-in from leadership on the data function, to let the data function be free, in a lot of respects, because it isn't cheap, among other things. But on the other hand, do we need to decouple the building of algorithms and analytics from infrastructure and deployment? Because in my mind, if you build a recommendation model or system, your ability to deliver [00:26:00] impact is actually intricately tied into how it's deployed, and even the front end for it. So how do you think about this matrix of considerations?
eric: So I think what you're asking about is how you decouple the near-term things, like implementing this specific algorithm and maybe getting a win, versus building the infrastructure to run not just that algorithm but any future algorithm. And that is a tricky balancing act. Again, that's part of the judgment of the data scientists, and engineers do the same thing on their side. Whether they're building a transaction processing system or not, they're usually asked for front-end things, not the backend, and they're using their judgment: wow, they're asking for something very specific, but let's build something more general, because there'll be more stuff coming. The same thing happens on the data science side: you need to really invest in a platform that is anticipating future needs. Right now we're just going to be running algorithms that are linear [00:27:00] combinations of things, so we can build that now, but let's also lean into the fact that that probably won't always be the case, that we're going to get into ensemble methods and other things that are more complex, so we'll need to build that stuff too. So it's a bit of a juggling act of how much resource to put on near-term things that could have a very visible win, versus the behind-the-scenes stuff, the infrastructure, which is less visible to the business people but absolutely necessary. And that's where you rely on really good judgment from the leaders on your data science team to make those decisions.
hugo: Fantastic. Something we're speaking around, I think, is experimentation and trial and error, because you've cited two successful examples, but data scientists will try a lot of things that don't work, and we want to do that very rapidly as well. I want to seed this and lead the witness slightly [00:28:00] by giving an example: we hear things like 80 or 90 percent of ML models don't make it to production, and that's bad. My response always is, maybe that's great. If you're testing enough good models, and 90 percent don't make it into production, but the 10 percent that do work fantastically well, and not only that, the wins from them outweigh the losses from the ones that didn't work, then we won, right?
So I'm interested in your thoughts, from your experience, on the role of trial and error and experimentation in data science and in business at large.
eric: Yeah, well, I've heard similar statistics, that lots of data science related projects fail. And I haven't dug in enough to know whether it was a successful trial, maybe you tried something and the idea didn't work out, or whether it was the execution itself that failed; I'm not sure which one it is. But I will say that I believe for some companies [00:29:00] trial and error is the way to go. What do I mean by trial and error? It's hard to define, so let's go with the opposite. The opposite, I think, is what I'll call overly rigorous planning and execution. That's where there are always a lot of ideas, and instead of trying a lot of them, the company takes just a few, one or two. They put all their eggs in a few baskets. They do endless research to plan them and convince themselves that these things have a hundred percent chance of success. They plan them out to the nth degree. They've done their marketing research, their customer surveys. These are for sure going to work; the only thing that could possibly go wrong is execution error. Well, the problem is that a lot of companies haven't done a lot of experimentation, because one of the first things you learn when you embark on experimentation is that you're wrong a lot. A lot of your brilliant ideas do not pan out. Almost all business ideas are intended [00:30:00] to improve things: improve revenue, improve retention, customer experience, something like that. But the reality is, when you properly measure them with a randomized controlled trial, like you were saying, 80 to 90 percent of them fail. They either do nothing at all, there's no statistically significant result, they didn't move the needle either way, or they actually fail, they do the opposite: you intended to improve revenue and you actually hurt revenue. And that is a startling thing to learn. It's very sobering. And I believe most companies do not know this; it's only the ones that do a lot of experimentation that have learned it. Once you learn that, you realize, oh, I've got to do something different. I can't bet on just one or two ideas. And so the opposite of overly rigorous planning and execution is trial and error. This is where, instead of trying just a few ideas, you try more. The assumption is that it's hard to change your success rate. So even at the same success rate, if you just try more [00:31:00] ideas, due to sheer volume alone you're going to find more wins. You're going to get more failures too, but if you can figure out how to make the failures as cheap as possible, then it can be very lucrative. I think it's a general philosophy that could apply to a lot of companies. A lot of functions, marketing, merchandising, all the functions, could benefit from doing this. That said, I think data science has it better in most cases. Data science ideas have these three properties that I think make them super amenable to trial and error. I distilled it down to these three; sometimes I describe it as four or five, but hear me out.
The first reason why data science ideas, and I usually mean algorithm ideas, are more amenable to trial and error is because they're cheaper to explore and try. Let's talk about that, and let's separate exploring an idea from trying an idea. Exploring really refers to doing enough research to flesh it out, to know if it's viable, to see if it's a promising idea. And I [00:32:00] think these data science ideas can be explored really cheaply. In fact, I think it's happening all the time with your data scientists. They're a curious group. As I described in the earlier case, nobody was asking her to look into finding new customer embeddings. She did that out of curiosity, and I think that happens more often than people know. And the reason for it is that the data is right there at their fingertips, so it makes exploring an idea really easy. For example, let's say a new data set came to the company, maybe social media posts are now included in the data warehouse, and all your algorithm developers have access to it. You don't even have to ask. They're going to hit that really quickly. They're going to jump on it. They'll simply open a new tab in their Jupyter notebook and they're off to the races, sifting through that data. And it's amazing, because they can explore it really quickly. Even within a few hours they can get a sense of the distribution of values in the data, and if they have the right infrastructure, in just a [00:33:00] few hours they can even try it as new features in an existing algorithm. This is against the historical data, not a real A/B test or anything, but they can add the new features and try them out just to see if there's signal there. And if it's promising, then maybe they can take it to the next step, a real trial. So this exploration of ideas is constantly happening with your data scientists, even if you're not asking them to do it. They don't have to ask permission, because the data is right there at their fingertips. By contrast, I like to tell this story because it's an apples-to-oranges comparison. I remember a colleague, she was in marketing, and she was exploring the idea of a new loyalty program. Now, a loyalty program versus a new feature in an algorithm, very different, but the reason they were somewhat apples to apples was that the expected impact was about the same. But for her to explore this loyalty program idea, she didn't have the data at her fingertips. She has to [00:34:00] get time to dedicate to this. She does have to ask permission. She has to go outside the company sometimes to look around, do some research on what other companies have done. She may have to engage with consultants. She at least has to chat with engineers and ask, how hard would it be to build something like this, et cetera. It's a lot of information that she doesn't have at her fingertips and has to gather, so it's much more expensive to do. And I think the key is that she's going to ask permission to do that: I have this idea, I'm going to go look into it, can I get some time? Versus the data scientists, they're not asking. They're just going, because the data is right there at their fingertips. So that's the difference with low-cost exploration.
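To make that low-cost exploration concrete, here is a hedged sketch of what a few hours in a notebook might look like: sanity-check a new data source, then see whether a candidate feature adds any offline signal on historical outcomes. The file, column names, and baseline features are invented for illustration:

```python
# Low-cost exploration: a new data source lands in the warehouse, and a data
# scientist can sanity-check it and test for signal within a few hours.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical extract: social-media engagement joined to historical purchases.
df = pd.read_csv("social_engagement_with_purchases.csv")  # made-up file name

# Step 1: get a feel for the distributions. Is the new field even populated?
print(df["weekly_post_count"].describe())
print(df["weekly_post_count"].isna().mean(), "fraction missing")

# Step 2: does the new feature add signal against historical outcomes?
# (Offline check only, not an A/B test; just enough evidence to keep going.)
baseline_features = ["past_purchase_count", "days_since_signup"]
candidate_features = baseline_features + ["weekly_post_count"]

y = df["purchased_next_item"]
base_auc = cross_val_score(LogisticRegression(max_iter=1000), df[baseline_features], y,
                           scoring="roc_auc", cv=5).mean()
new_auc = cross_val_score(LogisticRegression(max_iter=1000), df[candidate_features], y,
                          scoring="roc_auc", cv=5).mean()
print(f"AUC without new feature: {base_auc:.3f}, with: {new_auc:.3f}")
```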
Now, that's just to gather enough information to say, should I keep going? And in the case of the data scientist exploring new features for an algorithm, they may have fleshed it out enough that they're confident enough to try it in production. This is where infrastructure assumptions come into play, but if you have a good infrastructure, [00:35:00] that code can be improved just enough to drop it onto a data platform, and the platform can abstract the data scientists from all the complexities of distributed processing, automatic failover, containerization. I know Metaflow does a lot of this stuff; Metaflow came after my time at Netflix, and we built our own type of solution at Stitch Fix. But this can really enable data scientists to try something in production for remarkably cheap, nearly free. They may still not have even asked permission. They can allocate themselves a small sample, some customers to try their new version of the algorithm on. So the main difference is that no capital outlay is asked for. There's no money, no funding needed from finance, to try out this new algorithm idea. By contrast, if we go back to that loyalty program idea, that's a great idea, but you're going to need funding, because you may have to have engineers build the thing, and you have to get some designers and so forth. Or, [00:36:00] depending on the nature of the loyalty program, there may be prizes and rewards that you need to be able to purchase.
hugo: And also an expectation to keep it running from the demand side as well, right? From users.
eric: Oh, absolutely. Yeah, I'll get into that when I talk about optionality. So there's a big difference in the cost profile to explore and try ideas. The algorithm ideas can be done relatively cheaply, with some infrastructure assumptions, versus a lot of the other ideas, which do need capital outlay. That's a big difference, and that's the first reason that makes them really amenable to trial and error. The second is what I call evidence. Data science ideas like algorithms typically come with some evidence as to their merit. During the exploration phase, for example, you can use some historical data, add your new features to an existing algorithm, run it, and get some feedback in the form of, how does it improve the accuracy or AUC? [00:37:00] And this is good feedback for the data scientist to have, because if there is no signal, if it's not doing anything, they might just put it away and move on to their next idea, or go back to whatever they were working on. But if there is signal, if there are big increases, that gives them more confidence to go forward to the next stage, the trial stage: all right, well, I'd better fix up my code a little bit, because I'm going to actually try this in production. So that little bit of evidence
It can also compel you to keep going and it is hard because for other functions, like again, I use that loyalty program example, you know, she may explore the idea, you know, gathering all that information we talked about by the end of the day, she may be like, well, all I have is a bunch of assumptions. I don't have the actual empirical data to this decision on whether or not it's a good thing. I have some assumptions and that's about it. So thank you. That's a big difference, right? Uh, on the amount of evidence you get, but that's just from exploring. Now, again, [00:38:00] data scientists, they can try this out in production. They can allocate themselves some sample and try it on a real AB test. And that's trial feedback. Now that is really solid evidence, right? So the exploration stuff, I think it's better than nothing that feedback you get in there, but there's no guarantee just because you have. AUC is through the roof. There's no guarantee that it's going to manifest in production. It's also fairly cheap to try out too. Again, if you have the infrastructure set up, you can easily allocate yourself an AB test and try it out on real customers. By contrast for these other functions, marketing and merchandising product. Oftentimes it's just not the case. A lot of these things just simply can't be AB tested. Like imagine a brand campaign. You can't AB test that or opening a new store, a physical store. You can't AB test it. New partnerships can't be AB tested. Right? So you can't really get. That evidence that you need in the case of a loyalty program, actually, probably you could, most companies won't, but you could feasibly. Right. You could try it out on a couple hundred thousand customers and you only, you only expose it to them. Right. And so they get the, [00:39:00] you know, the perks of, you know, every purchase gets some points or something like that and you can try it out and you've read it for several months, you know, enough to get you the feedback you need, but it's going to be costly. You have to get the engineers, you have to get the designers, you have to get a customer service on board, even though it's only going to be as an experiment, it's, it's. You know, it's still quite a investment to roll that out to get that evidence. So by contrast, I think that algorithm ideas are generally fairly cheap to get the evidence you need, whether it's during the explore phase or the trial phase, it will produce some pretty solid evidence that can give you confidence. The last reason, the third reason is that data science ideas are more amenable to trial and error is optionality. What I mean by this is. When you try an idea and it doesn't work, you don't have the obligation to keep it going. You can generally pull it down pretty easy. Now, some ideas are hard to back out of, right? You know, like that loyalty program, right? Suppose you did try it out as, even as an experiment, and you rolled it out to [00:40:00] several hundred thousand customers, and you let it bake for several months, six months, um, and you get your results back and you find it, it's just not doing what you thought it was, it's not moving, you know, maybe actually even hurting. Retention, it's actually down, right? So you might have to make the very difficult decision to dismantle it, to pull it down. And it's not so easy. You have to send out, you know, notifications. So the customers that have been exposed to it, right. 
It's, hey, I know you've been using this new feature, but we're going to be getting rid of it. Maybe some of them have earned a lot of points; they'll be disappointed, and you have to figure out some way to compensate them. And also, when you send out notices like that, usually the press picks up on them. You can imagine the article: e-commerce company pulls back experimental loyalty program. And it can be embarrassing even internally. It merits pretty broad communication; you have to send a note to employees to say, hey, this thing we were trying, it's coming down, it didn't work. In my mind, there should be no shame in doing that. [00:41:00] It was a well-reasoned hypothesis, it was a great idea, and it didn't have the outcome we wanted, so we're pulling it down. Great, that was awesome, you tried it. But I do remember a certain case, it wasn't this loyalty program, but it was something similar, where that email went out saying, hey, after a year of trying, we've decided to abandon this idea. And again, I had that reaction: oh, that's good, they tried something. But a different exec replied all and said, we need to do a postmortem on this, we need to figure out why this didn't work, and this and that. And I thought to myself, oh God, that's going to quell future innovation so much. Nobody's going to want to try things after that. So, by contrast, algorithms are usually very easy to pull down. They're usually baked into the system behind a feature flag, meaning we're going to try it out for this many customers for this long, and it will automatically shut off. They're behind the scenes. Customers don't even know that we're swapping out and trying a new algorithm. So there are no notifications needed. If it doesn't work out, you just revert back to what they were using before. No messages go out, which means there's no press to deal with. And even [00:42:00] internally, I don't know that you need to send out broad communication to say that didn't work. Maybe locally, within the algorithms team, you might want to share that knowledge, that we tried this and it didn't work, but there's no need for broad communication. So that's optionality. I think algorithms are really easy to pull back; it's kind of baked into the way we deploy them anyway, versus other things that are highly visible to customers. So those are the properties that I came up with: the low-cost exploration and trial, the evidence, and the optionality. These make algorithm ideas very amenable to trial and error. And I came to that upon reflection, after sitting for years and years in these executive meetings with my peers, like the CFO and CMO, where we all had similar pressure to deliver business value. Quarterly or half-yearly, we would all present to each other our big ideas that we were going to try out. And I remember thinking, boy, their ideas [00:43:00] all have these big capital outlays. They had to ask for funding from finance, and they really had no evidence; they just had strong conviction or opinion to lean on. And then almost all of them were going to be very public, so if they failed, they'd have to make this very public apology, either internally or externally or both, to pull them down. So I felt like I always had it a little bit easier, and it made me really appreciate my peers.
God, they deal with far more uncertainty and far more risk taking than I do. And so there was actually a lot of gratitude. I'm like, wow, they're doing the heavy lifting in the company. I'm not taking that kind of risk; that's them. I mean, I remember even the facilities manager would have to take more risk than I do. We were always running out of office space, and this facilities manager, she had to sign like a five-year lease. Again, that's a big capital outlay. There's no evidence she can use to say where we're going to be in five years, whether there's going to be enough space or not enough space. She doesn't have any data that's really going to help her with that, and it's going to be very hard to undo if she's wrong. [00:44:00] So that was just the facilities manager. I appreciated her because she deals with more uncertainty. Here I am, where my basic expense beyond employees was the AWS bill, and at the time I didn't even do reserved instances. I just used on-demand instances, which were much more expensive, because I didn't want to commit to one to three years of reserved instances, even though the cost was cheaper. So it really gave me a fond appreciation for the business teams and how much they bring to the table, their risk taking and their comfort with uncertainty.
hugo: I love that. And I do love that you also mentioned the ability to roll out new features or new algorithms, challenger algorithms, to small parts of the user base, even internally first, and that type of thing, to test things. Because I've always said, with some of the best products out there, I used to joke about Google search, for example: there's no single search product that we all use. Each of us is using a slightly different version, depending on a lot of different things, because of the constant rapid iteration [00:45:00] that being in the software game affords us as well. So something we've danced around is the asymmetry of wins and losses in experimentation, and how data enables rapid iteration. I'm wondering how you recommend companies scale experimentation while mitigating the risks of failure as much as possible.
eric: Yeah, it's a fascinating thing to look into. Generally speaking, I do advocate that companies should try more ideas rather than fewer; they should scale their experimentation. And this isn't obvious when we discuss the outcomes. What I share with people is that most of our ideas fail. You can get some statistics: if you look at Ronny Kohavi's book, I think it's called Trustworthy Online Controlled Experiments, he has some great stats in there. Things like, at Netflix, something like 90 percent of their ideas fail, at Google I think it was like 96 percent of their ideas fail, and at Airbnb like 80 to 85 percent of their ideas fail. So the success rate for ideas is very low. And [00:46:00] while those might be indicative of online experimentation, each of those companies was probably testing things online, things like marketing messages and buttons, whether you do rounded corners or square-cornered buttons, these kinds of smaller things, I actually believe it to be a more general phenomenon. At Netflix and Stitch Fix we were able to experiment more broadly in the company, in different areas like operations and finance and merchandising and content. We would try these large-scale experiments on things like inventory decisions, or even entire product lines were experimented on. And we found the success rate to be similar.
And we would try these large scale experiments on things like inventory decisions or even entire product lines were experimented on. Right? And we found the success rate to be similar, right? It's a You know, they mostly fail, even when you're testing on these other things, right? I, it led me to do like my own kind of research calling around different colleagues in entirely different domains, like pharmaceuticals and even government policy. I have a friend who does large scale AB testing on government policy, where like they could take an entire. Region and try an experiment on. And [00:47:00] sadly in those domains too, most ideas fail. So there's at least this anecdotal evidence that this is a more general phenomenon, but most of our ideas fail. And so when I explain this to people, they ask, well, if, if most of your ideas are failing, aren't you just eroding business value? Like, shouldn't you stop trying? And the answer is no, even with very low success rate. Trying new ideas and decisions can still be very lucrative. And this is unintuitive to people. The key to understanding is that you got to understand that experimentation along with. Optionality creates this asymmetry between, between the winds and losses, right? Simply put, the failures are mitigated and the winds are amplified. So I need to explain this. Uh, we have to go a little bit into AB testing 1 0 1 for, and this would be relevant for the companies who don't currently do any AB testing. So just the simple messages before you, you roll out your idea [00:48:00] to all the customers. You should try it first as an experiment, so you know, allocate yourself. A sample of customer, a random sample of customers that will get your new idea or our new decision and a random sample of customers that will not. They'll experience the absence of your idea, right? And then you let that bake for, it could take a few months and then you compare their behaviors and that will get you to causality. Did this intervention really make a difference? And when you do that, you will find that more often than not, you're wrong. You're great idea. Didn't do what you thought it was going to be right. It either did nothing at all, or actually hurt the very metric you were trying to improve. Now this is sobering, but the good news is you didn't roll it out to all customers. You tried it on just a small sample. Uh, and that is part of what creates this asymmetry. So the key is you should try it as in as small a sample as possible. There's power calculations you can do to. to kind of inform you on how small the sample you can get away with. So [00:49:00] such that you can still get a good read on a result. And it varies dramatically depending on the company. Like, so at Stitch Fix, it was, you know, roughly like 50, 000 customers is what we typically use that Netflix released. When I was here, it was like 300, 000 customers, but I've read, uh, papers on, you know, the big search engines, Google and Bing that use like tens of millions in their sample. And of course, what matters is, you know, the, the size of the effect you're trying to detect. As well as the variance of that metric, but you can let the power analysis tell you what size sample do I need to detect effective, at least what X and you should really make it no bigger than what you absolutely need, right? As small as possible, because it's likely that your idea is not going to do anything. And in the event, it actually hurts things. You only hurt a few. It, the exposure was quite small. 
It was just maybe 50, 000 customers or something. And so, and of course, as soon as you know that you can terminate the experiment, right? You can stop it right there, which further mitigates any downside. [00:50:00] But on the other hand, if you get one that wins, right? So that sample of 50, 000 customers, all of a sudden. Is higher retention, higher revenue, higher customer satisfaction. Everything's going great. You are not limited to that 50, 000. You can now roll it out to all 5 million or however many other customers you have. You can roll it out to all the relevant customers now, right? This is going to greatly amplify that outcome. Not only that, it's not just a one time thing, but for all future periods, let's say you ran it for a quarter. And you got your results. Now you can roll out to all customers for all future periods. So your upside is nearly unbounded, right? It can be greatly amplified at the same time. You're greatly mitigating the downside. So that creates this asymmetry. And what it means is that even with just a few successes, they can great, they can outweigh the cost of all those failures. Many failures, right? VCs. Know this in and out, right? This is the whole VC community in a nutshell, right? The VCs they'll invest in like a few dozen companies just to find one or [00:51:00] two winners. And it's been the same for hugo: business for a long time, movie studios and record labels historically have done exactly like mostly losses and then. You get Elvis or whatever it is, right? eric: Exactly. I mean, Bill Gurley, he's, he's one of the most well respected VCs in Silicon Valley. He was our VC at Stitch Fix and he used to say, I can't do his Texas accent, but he used to say, if I invest a dollar in a company and it fails, I lose a dollar. But if I invest a dollar in a company and it wins, I win like a hundred dollars. Right. So meaning his downside is limited to what he put in, but the upside is unbounded, right? It could be many orders of magnitude bigger than the downside. And so they've learned this quite well. And the same thing I think applies to our ideas. Yep. Most of them are going to fail. Uh, but the few that succeed can greatly outweigh the cost of the failures. Uh, it does depend on some assumptions by the way, right? Like what is the distribution of outcomes, right? And most companies won't know this until they've done enough experimentation to build up [00:52:00] a database of these things that they can analyze, but you can even run simulations and even if you assume a Gaussian distribution, it may have a. A negative mean, meaning your average idea is gonna be negative and it's Gaussian, meaning the chances of a really big event aren't very high, but even running with the assumption of a Gaussian, you can find that, you know, even with 80 to 90 percent of your ideas failing, it's still worth it to keep trying because of that. Asymmetry, right? And there's good news out there is that, you know, there's a paper out, it's a few years old now, but it's called AB testing with fat tails, and they studied, I think it was Bing, which does thousands of experiments, and they got a good sense of the distribution of the outcomes for Bing, and they suggested it was more fat tailed, right? So this makes it even rosier picture with the fat tails, that means there is some likelihood that you might find some really big win. Right? Some crazy win that pays for hundreds or thousands of losses. And that resonates with me. 
I mean, that matches, at least anecdotally, my experience where you try things, [00:53:00] many fail, you get some moderate wins, and then every now and then you get this outsized win that really lights things up, the kind that gets you kudos for years to come. And so all this is to say it pays to keep trying, even in the face of a high probability of loss, because that exposes you to some chance that you hit the big one. And that's really the justification for scaling up experimentation. hugo: Amazing. And I love that you referenced the "A/B Testing with Fat Tails" paper, which we'll link to in the show notes. Funnily, this is the second time we've discussed that paper on this podcast. Ramesh Johari, who's at Stanford but, as you're probably aware, does a lot of advising on online experimentation for Airbnb, Bumble, and Uber, is very adamant about the importance of that paper, particularly when running large online experiments at massive tech companies. Anyone interested in this type of stuff, please do check out that paper. Um, Eric, we're going to have to wrap up soon, sadly. There's so much more fantastic stuff to talk about, [00:54:00] but particularly as data teams aren't cheap and I want all leadership teams to have as much buy-in as possible, I love your ideas about rethinking data science teams as revenue generators and impact generators. I'm just wondering specifically what structural, organizational, and cultural changes are required to enable this shift, and what impacts it can have. eric: Yeah, it's one that I think is more available to companies than they know. I think if they were to make some tweaks to how they're organizing, as well as maybe some technical changes, they could enable their data scientists to be revenue generators, or whatever their objective function is, maybe improving retention or profit, whatever it might be. Because it's so clear to me that it's the algorithms that are driving a lot of this impact. Again, they're very easy to test; we can do that and get the causal impact that they're making, and it can be tremendous. [00:55:00] Not only that, but it can be done independently of other functions. And if that's the case, then why shouldn't you saddle them with a little bit of accountability and say, well, you've got to put your money where your mouth is: you can try your ideas, but you've got to return something at the end of the year. It's hard to know which ideas will hit, but you can try a portfolio of ideas. The likelihood is you'll get some wins, and you should make the data scientists accountable for that. So what I like to do is decouple the algorithms from the applications that house them. You can think of a recommender system for an e-commerce company. It may be engineers and product managers that own the website and the app, and even the page that contains the "suggestions for you" section. So they build that page, but that little space in the middle where the recommendations actually go, that is the result of a call to an API from the algorithms team. The engineers own the page, they call the API, it returns product IDs, and they have to do final assembly, like [00:56:00] getting all the content, the images, and
product descriptions, to render the page. But that decision of what to put there, that was the algorithms team. That's not engineering's responsibility; their job is just to render the page. The products and their ordering should come from the algorithms team. Likewise, as I mentioned earlier, there are inventory management systems, also owned by engineering. They built it. It's a transaction processing machine, and its job is mostly to manage the state of inventory: something gets shipped, you mark that item as out to the customer; maybe it gets returned, you mark it as in stock. That's the primary goal of the system, and engineering builds it. But there are decision points, such as when to buy more of a product and how much to buy, and those could come from outside the system, from an algorithm. Outside the system, it has more information and more time to process data, and then just the results can be inserted back into this transaction processing system. So this is how we [00:57:00] set things up at Stitch Fix, and we had many such systems. There were recommendation engines. There was a matching algorithm: a customer would come in and we'd have to match them to a stylist. That was again built by engineering, but there was a call to an algorithm API that would insert that logic. Even things like, a visitor hits the webpage, what landing page should we show them? Algorithmically determined. The engineers are building the scaffolding, but they're going to make a call to the algorithms team, and we'll tell them, show them this page. And if we show them the wrong page, that's on us; if the page was ineffective at converting, that's our fault. So it's a separation of duties. And I should explain that at Stitch Fix and Netflix, I was never part of engineering. It was separate. It's easiest to talk about Stitch Fix, where we actually had a CTO, a chief technical officer, who ran engineering, and we also had me, a CAO, chief algorithms officer. I ran algorithms and we were peers; both reported to the CEO. [00:58:00] Engineering owned a lot of the applications, but they left these spaces for the algorithms team to insert their logic. And so that was good decoupling. It actually even leads to good coding practices and so forth. Now, it does take some trust. Creating those spaces in those applications takes cross-functional work. You have partners to work with: the engineers have to remove whatever logic was there before and instead call your API. And this is where the trust comes in. The engineers are like, this is an API written by the algorithms team, what about our SLAs and all that stuff? And we'd say, well, we'll meet them. Sometimes it took us a little time to meet their SLAs, even though these were not very stringent, like 500 milliseconds, or even a full second in some cases. They were not low latency things, but we had to get good at that. And thank God we had some good people to do that, to earn the trust of the engineers. But once you have it, the magical thing that happens is you're no longer reliant on engineering for trial and error. [00:59:00] Engineers have a very different workflow. That's why I recommend them being in separate departments.
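A minimal sketch of the ownership boundary Eric describes: engineering owns the page and calls a recommendations API owned by the algorithms team, which returns only product IDs. The endpoint URL, response shape, 500 ms timeout, and fallback list are illustrative assumptions, not Stitch Fix's actual API:

```python
# A sketch of the ownership boundary described above: engineering owns the
# page and calls a recommendations API owned by the algorithms team, which
# returns only product IDs. The endpoint URL, response shape, 500 ms
# timeout, and fallback list are illustrative assumptions.
import requests

RECS_API = "https://algo-team.internal/recommendations"   # hypothetical endpoint
FALLBACK_PRODUCT_IDS = ["sku-101", "sku-102", "sku-103"]   # engineering-owned default

def fetch_recommendations(customer_id: str) -> list[str]:
    """Engineering's entire contract with the algorithms team:
    send a customer ID, get back an ordered list of product IDs."""
    try:
        resp = requests.get(
            RECS_API, params={"customer_id": customer_id}, timeout=0.5
        )  # ~500 ms budget, in the spirit of the SLAs mentioned
        resp.raise_for_status()
        return resp.json()["product_ids"]
    except requests.RequestException:
        return FALLBACK_PRODUCT_IDS  # the page still renders if the API is down

def render_suggestions_page(customer_id: str) -> str:
    """Final assembly stays with engineering: images, copy, prices,
    and layout built around whatever IDs the algorithm chose."""
    product_ids = fetch_recommendations(customer_id)
    return "\n".join(f"<product id='{pid}'/>" for pid in product_ids)
```

The design choice is the narrow contract: engineering never sees models or experiments, only IDs, so either side can change its internals without coordinating with the other.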
They like to work with a lot of upfront design, and then they build their code very robustly from the beginning, and then hopefully they can move on to another thing. Versus algorithms people: we like to learn as we go, we need to iterate on things. A lot of times even that first implementation is just a start, and our best ideas come soon after that, whereas engineers are ready to roll off to something else and don't want to be burdened with a lot of changes. So this way we're able to decouple, if we take some time together cross-functionally to build the space. But now we're decoupled, and that means that, as an algorithms tribe, we can iterate as much as we want. We can try all of our experiments and we don't even have to burden engineering with anything. They're just calling the same API; they don't even know that we're running experiments behind the scenes. And that was a huge unlock, because at a lot of companies the [01:00:00] engineers are the hot commodity. You need them to make all your changes, and everybody's competing for engineering resources, whether it's marketing or merchandising or operations; they all want the engineers' time. By asking them to do this for us, to create the space so we can plug in our output, we are freed up now. We don't need to burden them with our changes. We can work at our own pace and try dozens of different versions of algorithms, and there's not even any coordination needed. We can just try, on our own accord. And when you have that, that really sets up the case: okay, algorithms, you're autonomous now, and that means you should be kind of on the hook. You can't just be trying anything you want, and you can't be satisfied with no results at the end of the year. You have to come up with something. So you should saddle them with some accountability. You challenge them: all right, we want to see, at the end of a year or two years, so much improvement from you in revenue or retention, whatever the metric is. But now you've granted them the space to do this. They can work far [01:01:00] more effectively in their own way with this trial and error. And I think it really lends itself to better roles. There's nothing more satisfying than actually creating some business impact that was not just a narrative but causally detected through an A/B test. That's something you go home and tell your spouse about: wow, I had a great impact at work today. So by creating these spaces for algorithms to insert their output, you set that up, knowing that you're no longer reliant on somebody else for success, where you have to wait to get the engineers' time to try that crazy idea of yours. You can actually just try it on your own. It can also give data scientists justification to say no to all the constant ad hoc requests. It gives them an opportunity to say, well, actually, I would like to work on that ad hoc request, it sounds really interesting, but I have to hit my revenue quota, so I have to focus on that algorithm I was working on. It makes the trade-offs much more clear. [01:02:00] They can even make those decisions that we talked about earlier on when to work on infrastructure or refactoring code.
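One way the "experiments behind the scenes" part can work, sketched here as an assumption rather than a description of Stitch Fix's system: deterministic, hash-based assignment of customers to algorithm variants inside the algorithms service, so the API engineering calls never changes. Variant names, traffic split, and the experiment label are hypothetical:

```python
# A sketch of one way to run experiments behind an unchanged API:
# deterministic, hash-based assignment of customers to algorithm variants
# inside the algorithms service. Variant names, traffic split, and the
# experiment label are hypothetical.
import hashlib

def rank_with_current_model(customer_id: str) -> list[str]:
    return ["sku-101", "sku-102", "sku-103"]   # stand-in for the live model

def rank_with_candidate_model(customer_id: str) -> list[str]:
    return ["sku-205", "sku-101", "sku-330"]   # stand-in for the idea under test

VARIANTS = {
    "control":        rank_with_current_model,
    "new_embeddings": rank_with_candidate_model,
}
TRAFFIC_SPLIT = {"control": 0.90, "new_embeddings": 0.10}  # small exposure for the new idea

def assign_variant(customer_id: str, experiment: str = "recs-test-1") -> str:
    """Hash customer + experiment name into [0, 1) so assignment is stable
    across requests and independent across experiments."""
    digest = hashlib.sha256(f"{experiment}:{customer_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    cumulative = 0.0
    for name, share in TRAFFIC_SPLIT.items():
        cumulative += share
        if bucket < cumulative:
            return name
    return "control"

def recommendations(customer_id: str) -> list[str]:
    """The endpoint engineering calls; its signature never changes,
    whatever is being tried behind it."""
    return VARIANTS[assign_variant(customer_id)](customer_id)
```

Because assignment is a pure function of the customer and the experiment name, the algorithms team can start, stop, and analyze experiments entirely on its own side of the API.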
I'm a big fan of refactoring code, but it's not something any business person would ever ask for; it's not visible to them. And oftentimes the algorithm developer will know best when it's time: okay, now that I know what I know, I should rewrite this. And they can prioritize that appropriately: when is it time to lean in on delivering this year's revenue goal, versus I need to worry about next year's revenue goal as well, and if I don't fix up this infrastructure, we're not going to get there. I love that it puts that in their hands. It's like Dan Pink's book, Drive. He says autonomy, mastery, purpose. I like to modify it to autonomy, mastery, impact. This gives data scientists the autonomy they need. It allows them mastery, to get really good at doing these things. And then impact: they can measure their impact and there's nothing stopping them. [01:03:00] I also think it makes for happier engineering roles too. They don't like to be burdened with these handoffs, with data scientists saying, hey, are you going to make this change? And we're just as guilty of stripping out the context as anyone else; we were just handing it off, and they don't like that constant iteration. And certainly for a wild idea on a hunch, if you're an engineer, you're not going to be thrilled: you want me to make this change on a hunch? You don't even have any evidence, just, oh, I feel pretty good about it? No, I'm not doing it. But if all you're wasting is your own time, then you'll try your wild ideas as well. And I think that's a healthy thing to do, as we talked about. There may be huge upside there. You don't know; you've got to keep trying ideas. So I think it's a healthy thing to do that decoupling: decouple the algorithms from the applications that house them. That grants the autonomy, and then that makes it appropriate to saddle your data scientists with revenue commitments, or improving retention, whatever it is you're optimizing for. hugo: For sure. And this also [01:04:00] avoids engineering and platform and infrastructure people feeling like a service center. It gives them more autonomy around how they build what they build. And I haven't thought about this a lot, but I have thought about how to reduce or change the incentives for marketers or people on commercial teams treating data scientists like a service function. I do wonder whether introducing some sort of cost for them would help: if they make a request that a data scientist takes on, there should be a cost for them, because there's a potential benefit as well. I don't know what that would be. eric: Yeah, I don't know either. It's a tricky one, because like I said, some amount of ad hoc requests is healthy. It forces exploration of the data in ways you might not have done otherwise. But it can be absolutely incessant; you can just get buried. And I know a lot of companies try this: well, let's get more data scientists. And that's not going to help, because like I said, it's a vicious cycle. Asking more questions begets more questions, and you can add 10x the data scientists and still bury [01:05:00] them.
hugo: It's kind of like when you increase the number of lanes on a highway, it doesn't improve anything. We've seen what Los Angeles turned into when they tried that. They're still doing it. So, to wrap up, there are so many practical takeaways in here for people working in data, for ICs and team leads and chief algorithms officers alike. I'm wondering, for senior data leaders listening, what are the top lessons or changes you hope they'll take away from our discussion on how to better leverage their functions? eric: Well, if I could say one thing, it would be to not do merely what's asked, but rather bring some ideas to the table. I think the worst thing you can do is just do what people are asking of you. That's not going to bring out the best in your data scientists; it doesn't leverage their cognitive repertoires or the extra information that they have. You've got to bring those ideas to the table. The other thing is to really embrace trial and error. Again, you have to have the right conditions. I wouldn't recommend it in medicine or manufacturing, where [01:06:00] the cost of iteration is exorbitant, but in a lot of domains, e-commerce and streaming media and that kind of thing, it can be really effective. And to embrace trial and error means a few things. You've got to lower the cost of trying: get that infrastructure in there, develop the muscle to do experimentation and abandon the failures, and all that kind of stuff. This way you can leverage that asymmetry I mentioned, where the wins are huge and the losses are not so bad. The other thing is to create those spaces so that algorithms can play more autonomously. So you're going to work with your engineering team to create all those different places where you can plug in the results of algorithms. The last thing I'll say is to be a good partner. These are other teams, as I mentioned: marketing, merchandising, product. I believe they face far more uncertainty than you do, and they take far more risks than you do, so appreciate what they bring to the table as well. I have a great fondness for all my former peers at Stitch Fix and Netflix who did those things. I'm just glad they did it [01:07:00] and I didn't have to. Those are some of the big rocks that really move the company forward, so I was very appreciative of them. And it is amazing what can happen when all these different, disparate teams are working together, combining their repertoires to create something that's better than any one of them could have done on their own. hugo: Fantastic. So: don't do merely what you're asked, bring your ideas to the table, orient the team to make real business impact through trial and error, and be a good partner. I think those are wonderful takeaways for everyone. Eric, as always, I love speaking with you and appreciate all the wisdom you bring from the work you've done and continue to do in advisory roles. Also, and we didn't discuss this today, but I'm super excited that you're finding more time to write and think and contribute to the space by having these types of conversations, in ways that you perhaps weren't able to when you were leading those functions. So that's super cool. eric: Yeah, thank you. It is a fun thing. It's hard to let it go, right? I'm no longer working at a company, but [01:08:00] I still tinker.
And I love to talk to anybody who will listen to me about this kind of stuff. So I do find myself picking up the phone to talk to different people at very different companies, and it's so amazing to hear their stories. And yet there's a certain commonality across all the different companies that leverage data scientists: they all seem to have the same problems. So it's great to have some space to be able to think about, well, how do we solve these things? How do we get data scientists out of those ruts that they're in at these companies? And how do we spread that knowledge around so that we can create better roles? hugo: Without a doubt. And to your point, coming back to the start of our conversation, we've always thought that data would be a valuable resource for organizations. But as it turns out, it's becoming more and more the basis of a lot of defensible moats, even with the rise of LLMs, which may make you think, oh, everyone has access to the same models. Lo and behold, if you can fine-tune them or use prompt engineering with your own data, that's where your moat becomes far more [01:09:00] defensible. So it is a very exciting space, and so are the conversations around the principles of data, AI stuff aside, just the general principles of how to leverage data to improve products, customer experiences, all of these types of things. Yes, absolutely. Cool. Thanks once again, Eric. Thanks so much for listening to High Signal, brought to you by Delphina. If you enjoyed this episode, don't forget to sign up for our newsletter, follow us on YouTube, and share the podcast with your friends and colleagues. Like and subscribe on YouTube and give us five stars and a review on iTunes and Spotify. This will help us bring you more of the conversations you love. All the links are in the show notes. We'll catch you next time.