The following is a rough transcript which has not been revised by High Signal or the guest. Please check with us before using any quotations from this transcript. Thank you. === hugo: [00:00:00] Hey there, Chris, and welcome to the show. chris: Thank you for having me, Hugo. hugo: Such a pleasure. So you're a professor, author, an applied mathematician. a systems biologist, a physicist, chief data scientist at the New York Times. Is there anything you don't do? Yeah. chris: pretty bad at piano. And, I don't really understand cryptocurrency valuations. So there's a couple of things I definitely have not mastered. hugo: the former understandable. competency in piano is such a multi dimensional space and I'm it's heartening to know that someone so deep in tech, doesn't get crypto because I feel similarly. so with all these things. You work on and your work spanning academia and industry. I'm just wondering if you can share a bit about your career journey and what brought you to the times. chris: Sure, I mean it's, [00:01:00] one thing I'll say is that it's a long journey. It's easier to have all of those things in your cV if you've been at this for a couple of decades, which I have. so in the 90s, like really in the early 90s, when I finished my undergraduate in education, I was very, interested in the mindset of physics, but also in complexity. So when I went to graduate school at Princeton, I was very interested in how you apply the mindset and skill set of physics to something as complex as biology, which was a, a great emerging topic at the time, particularly around, single molecule biophysics and thereafter sort of data driven biology, particularly with, sequencing of whole freely living organisms. Suddenly you have a lot of data. And then a third revolution in biology of the nineties was systems biology. and trying to think about emergent phenomena of many components working together.all three of those are very much of interest to me. My, my PhD work was very much involved in single molecule biophysics. And as I started my faculty position, I started publishing in data driven biology and systems [00:02:00] biology as well. And I would say data driven biology, like the challenge of dealing with whole genomes or large data sets of microarray expression data, which was an emerging technology of the late 90s, really introduced me to the field of machine learning, because, as, , maybe I can use language from John Tukey. You, it's not, it was not so much a confirmatory statistics problem. It was more an exploratory statistics problem. There was not one model that it was clearly the right model to fit genomes to or to fit expression data to. It was more a challenge to see how do you explore these data, realize, what signal or what sort of patterns might be in the data? And then crucially, for a natural scientist to ask, how can questions from natural science be reframed? in this,unorthodox statistical language, and then answer those questions in a way that's meaningful to and useful to a natural scientist. So that particular challenge, I think, is typical of data science as we understood it 10 or 20 years ago. these opportunities where you have complex real world data sets, [00:03:00] and you have normal people from domains that are not necessarily statistics that have questions that they want to answer, and then there's a challenge for a statistician or a machine learner to figure out it. 
How do you take those complex data sets, learn something, that's statistically meaningful, and then also put it to work in a way that's useful to a human, usually some collaborator. And that could be a biologist or a chemist or it could be somebody from product or marketing or advertising or editorial. somebody who's got a real world problem. They may or may not have lots of data instincts. They may or may not be technical. but there's a challenge there to figure out how can we put machine learning to work to answer those questions from the domain. hugo: And so how did that lead you to the times in the end? chris: That started from a sabbatical. spring of 2013, I decided it was time to take a sabbatical. And I went to, several, researchers whose, who I thought had experience interacting with the world. which I hadn't really interacted with the world very much at that point. and one of them was Mark Hansen, who is at the journalism school at Columbia now. And he said to [00:04:00] me, don't go to a tech company because that would just be like your day job. Instead, you should go to the New York Times. It would be weird. And I did, and it has been. I took a sabbatical from Columbia in the fall of 2013 and put my head down and wrote some really bad code and then, rapidly illustrated how you could use ideas from machine learning to try to answer questions that are of interest to for example, a digital subscription company, or a company that at the time was really undergoing that transition to see if they could pivot from advertising to subscription as a driver,as a potential, sustainable, repeatable business future. And so,I had shown how you can use machine learning to try to illustrate that in ways that are useful to people. For example, predicting which subscribers are going to cancel, which is an obvious thing you would do if you were . A digital subscription service. And at the end of that semester, I went back to Columbia, but said to the New York Times, it'd be great if I could help you build up a data [00:05:00] science team and start hiring some people who would, really be implementing that work. And so at that point I transitioned from writing code to writing emails, and I've just been trying to help them, build up a great team. Which they have. hugo: Do you still, I remember once where the, what's the cafe up near Columbia where they have the string box, chris: The Hungarian pastry hugo: pastry cafe. Yep, exactly. I remember once we were sitting there and. You sent me an email from the command line and you just reminded me, do you, sending emails, do you still send emails from the command chris: Everything I can do from the command line, I will do. And I absolutely said, I sent, I've sent literally hundreds of emails today from the command line because I sent a bunch of emails to all the applied math students about my upcoming class. yes, I still do everything I can do from the command line. hugo: tools and companies will come and go, but Unix is sticking around. Absolutely. chris: I try to stay at the command line if I can. hugo: I couldn't agree more. And I actually tell everyone that the more people who can know a bit of command line, bash scripting, that type of stuff, the better off we'll probably all be. so I'm interested that [00:06:00] you took a sabbatical, at that point, My understanding is the New York Times didn't have a data function. 
Maybe they had, people working with data in a variety of ways, but then you came on board to, to shape and lead the data function and have been there now for, wow, 11 years, right? chris: it's crazy. So at the time there was a marketing analyst group and a sort of a market research group, which is like focus groups or surveys. and the New York Times had just created a tracking solution to track different events. And then associated with that effort they made a business intelligence team by essentially taking the database analysts from a bunch of different parts of the org and putting them in one part of the org chart. so The state of data was not as advanced at that time in 2013. It's gotten much better. so January 1st of 2016, the New York Times created a data insights group, which is a centralized team, which at the time reported directly to CEO. And really to try to focus on, [00:07:00] okay, let's try to think about how we can organize the data for the New York Times and make it useful for a variety of problems. Mostly, to be clear, on the business side, right? On the business side, there are more problems. That can be reframed as stochastic optimization, where there's a clear key performance indicator that we can track, diagnose, and possibly drive up. So, yes, so the state of data at the New York Times was much better than it was 11 years ago. hugo: So can you tell us a bit about building This data function and perhaps take us through some of the stages of it. And I'm thinking just for, it's a sophisticated data function now. So I'd like us to hear about that journey and hopefully in a way that could be useful to people who are building out data functions currently. chris: Yeah, so there were some emerging efforts, clearly, 11 years ago. Like I said, there was a team of marketing analysts that were more and more being asked to do product analytics in addition to marketing analytics. We want to make sure they actually do the right things, meaning they were working with product teams that wanted to have some sort of metrics for how their products were doing. And then separately, there were, a lot of people that were [00:08:00] database analysts, people who were in charge of making sure there was a relational database for keeping track of important data, like subscription data for example. You need to have a good set of database analysts to make sure that you actually could report on those numbers, particularly if you're a publicly traded company. By being a publicly traded company, it really encourages, forces everybody really to up their game.I would say around 11 years ago there was a realization by many people that, we probably could do, be doing more with our data, in particular we could be learning customer trends, we could be learning what are the risky behaviors, we could certainly be identifying the at risk individuals. And then the question is not just how you can predict what's going to happen in the absence of treatment. The question is how can you prescribe. the optimal treatment in order to drive some, some event that you want. hugo: This is actually an incredibly key point. When you mentioned predicting customer churn earlier, that's a very, well known example now in data and ML, but, you Your ability to predict is only as effective as [00:09:00] your ability Sorry. The reason for predicting isn't to predict in a vacuum. It's to prescribe some intervention, right? chris: we actually want causality involved. Like, why are people churning? And then how do we intervene? 
a real interest of mine, because it was a painful lesson for me. So when, so the research I was doing, let's say 20, 2000 through 2010, was largely leveraging tools from supervised learning. And, for example, work on, Analyzing genetic networks often was framed as can we predict which genes are going to go up or which genes are going to go down in terms of their up or down regulation based on sequence data or based on the abundance of transcription factor proteins or something like that. And reframing things as statistical, as supervised learning rather, predictive modeling is very useful for a variety of reasons. One, the tools are really good. They work in high dimensions. And the other is that it's clear when you're wrong because you can take a data set, you can leave out some of the data, you can try to predict the data that you didn't use when you trained your model, and then you can see how bad you did on the held up data. And that style of modeling, which now is called cross validation, or, [00:10:00] empirical estimates of generalization error, really goes back to the pattern, recognition community in the, the 1960s, but is now a central, way of understanding complexity control, how not to overfit data, And how to compare different methods and really put them in the ring against each other. if you have a method or I have two methods and I have to decide which method is the better. One of the things I might ask is, which one is the more predictive?and then when I got here, I realized that actually there's many things in the world where that's not what people want. It's not that they actually want to know which genes are gonna go up and which genes are gonna go down. and to be fair, some of my experimentalists. collaborators had tried to point this out to me earlier. experimentalists would say to me, that's nice that you can predict, but what I really want to know is what experiment should I do next? What they really wanted to know was a prescription, like what is the next intervention I should do? And certainly in a company, it's, it helps you sleep at night if you can predict accurately which customers are going to cancel their subscription. But ultimately, you'd like to know, well, what lever do I pull? what is the treatment I should deliver in order to [00:11:00] optimize the outcome rather than merely predicting the outcome in the absence of treatment in the context of health, which I think is very, useful as an analogy because we all have some intuition about health. imagine a hospital that just never gave anybody any medicine, right? like what is the point of hiring statisticians or doctors for that hospital? What you really want to know is not predicting what's going to happen in the absence of treatment. What you really want to know is what is the treatment that's going to drive some outcome that you want. So that's true for figuring out what's the right marketing message, figuring out what's the product intervention. Somehow there's a prescriptive problem that you'd like to get to. And that touches on causal inference in the statistical literature. Or it touches on reinforcement learning in the machine learning literature. And those are literatures that you can draw in. I would say, around five or six years ago, those two communities started talking to each other more and more. There's much more interaction now. between reinforcement learning and causal inference people than there used to be. 
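As a concrete illustration of the held-out-data idea described above, here is a minimal sketch in Python: fit a model on part of the data, score it on the folds it never saw, and use that held-out score to compare methods. The churn framing, the features, and the numbers are simulated stand-ins invented for illustration, not anything drawn from the Times.

```python
# A minimal sketch of the held-out-data idea: train on part of the data,
# score on the part the model never saw. The churn data here are simulated
# stand-ins, not anything from the New York Times.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5_000
# Hypothetical behavioral features: articles read per week, days since last visit.
X = np.column_stack([rng.poisson(5, n), rng.exponential(10, n)])
# Simulated churn labels loosely tied to low engagement.
p_churn = 1 / (1 + np.exp(0.4 * X[:, 0] - 0.05 * X[:, 1]))
y = rng.binomial(1, p_churn)

model = LogisticRegression()
# 5-fold cross-validation: each fold is held out once and scored by AUC,
# giving an empirical estimate of generalization error.
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"held-out AUC per fold: {np.round(scores, 3)}")
print(f"mean held-out AUC: {scores.mean():.3f}")
```

The same held-out scores are what let two candidate methods be put in the ring against each other: whichever generalizes better on data it did not train on wins.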
But they're interested in the same goal, which is: based on some data set, and in particular, is it an observational or an [00:12:00] interventional data set that I have, can I make predictions? Can I make prescriptions, rather? Can I make statements about what is the optimal treatment in order to drive some outcome? Yeah, a decision. Exactly right. And using the language of decision makes clear that in the 1940s people were thinking about that question too; Wald and decision theory were asking it. You're looking at, let's say, the efficacy or the quality of some product that you're making, and you've tested many of them, and then you have to decide: okay, does this bucket of product pass the quality test and we can ship it? Or do we need to say that there's a defect in this manufacturing process and we should throw it out, right? These are old questions from the 1940s that are really about making decisions. Some of it actually is in the language of Jerzy Neyman. Anyways, now I'm talking about my book on the history of data. So some of this is in the language of Jerzy Neyman and R. A. Fisher and how they were fighting with each other about what statistics is for. Is it for science? Or is it for optimizing a cost function and making decisions? So in some ways it's very old, but in the present day the thing is, companies are software [00:13:00] companies, right? As Marc Andreessen told us more than a decade ago, software is going to eat the world. And it's true that for a lot of companies, the decisions they make are in software, which means they're making decisions at scale, and it means that the product interventions they can do are ones they can instantiate as code, right? hugo: And so product changes can be written as stochastic optimization problems. In the spirit of mentioning your book again and talking about the relationship between these types of techniques and making decisions, I'll see your Jerzy Neyman and raise you one Gosset. We know that the development of Student's t-test was actually driven by figuring out which crop yields to double down on for the Guinness brewery. So maybe you could tell us a bit about your book, How Data Happened, and how these types of decisions co-evolved with statistical techniques. chris: Sure. So there are sort of two questions there. One, how the book happened. And the other is, like, how prescriptive [00:14:00] questions happened. The book, I would say, happened from my own impostor syndrome. So I started being chief data scientist at the New York Times 11 years ago, but around that time I also was very cognizant of the fact that although I knew something about the history of physics, because many physics instructors teach the history of physics, or at least a pseudo-history of physics, as the way they teach the subject, I knew very little about the history of statistics. And in particular I didn't have a real good understanding of how it came to pass that statistics and machine learning both existed but seemed to be done by people who didn't talk to each other or publish in the same journals or any of that, which I found very odd. Really around the summer of 2012, I started looking into the history of statistics and machine learning, even before I arrived at the Times, clearly.
and then in the spring of 2013, I met Matt Jones, who was giving a lecture at Columbia on the history of machine learning. And that started a collaboration with Matt that lasted for ten years thereafter. Matt and I [00:15:00] collaborated on a summer short course for data journalists, again with Mark Hansen and Cathy O'Neil, and thereafter we wrote a proposal to create this new class on the history of data science. We pictured it as the history of data science, and then in 2017, when we first taught it, I would say the Columbia students really pushed us to address questions that were even more interesting than the history of data science, really about, as we put it later, how data happened: how did it come to pass that data empowered algorithms, and what is the relationship between that mindset and society? hugo: How did that all come to pass? And how are we to think about the other possible futures that exist for that reality, and also how data rearranges power? And you make an explicit point about this. Tell us about the etymology of the word statistics, and then we'll get back to statistics itself. chris: Statistics enters the English language in 1770 in translation from the German, and it has nothing to do with mathematics, nor in fact with [00:16:00] data. Which is weird to us, because we use the word statistics as a synonym for numbers, but statistics in the 18th century meant the craft of making sense of the greatness of states. And almost immediately, in the early 1800s, you get these fights between high statistics and vulgar statistics, gemein statistics, where the high statisticians who were studying the greatness of the rulers derided the table makers: people who would make little tables where every row was a country and the columns might be the country's population, or size, or number of troops, or number of animals, or something like that. And that seemed ridiculous to the people who were studying the greatness of the rulers. And it's real fun to go back and look at that literature, because that tension, of how could it possibly be the case that quantifying everything, just a bunch of numbers, can capture something for which we have our own intuition and for which there's a community of people who have craft, happens again and again in different communities for centuries, right? And, to the present day, there's all sorts of things in our [00:17:00] world where we have a craft, and we have a way of understanding it, and then somebody shows up with abundant data, often through some sort of change in instrumentation, and then we have to ask ourselves how, or whether, we should be using these data to make sense of this thing for which we have all this domain expertise. So that could be picking movies, or picking books, or picking baseball players, or picking stocks, like automated trading, or technical analysis and statistical arbitrage in stock trading, right? There's a bunch of people who think you do a bunch of research about the world and how it's going to work, and then there's a bunch of other people who are doing very small supervised learning problems on exactly the right data, and some of them, obviously, are doing very well. And anyway, that was the relationship between statistics as it was then and statistics as it is now. Part of what's useful about that story, I think, is to see how all of these words are drifting targets.
So clearly, artificial intelligence now, in 2024, does not mean the same thing that artificial intelligence did in 2020, which is very different from what AI [00:18:00] meant in the 1990s, which is different from what it meant in the 1960s. We've accepted that. But it's fun to look at the historical records and see how statistics itself, as a word, means different things in the early 1800s than it does in 1930, or even these days, when statistics departments are relabeling themselves departments of statistics and data science. I would say that's also true of machine learning and other fields that we think of as being immovable. Data science itself is used differently, in part because there's no long-standing academic tradition where there are buildings with data science chiseled in stone the way there are for chemistry or mathematics or philosophy. So these fields are all drifting targets. They all move. And part of how they move is in response to power, right? And power in academia is often simply who's funding different things. But as these fields intersect with the real world, there's all sorts of ways in which they intersect power, including state power as well as market power. hugo: I appreciate all of that context, and I do want to get back to thinking about how [00:19:00] data can deliver value at the Times. So I love that you mentioned Tukey earlier, because when building out a data function, of course, we need to be able to count first and explore data, then start doing descriptive analytics in the form of reports and dashboards, then analytics and machine learning, then, as you mentioned earlier, prescriptive analytics, so decision theory. So I'm wondering if there's some graded approach to building out a data function where you try to deliver the low-hanging fruit of value first, to get people on board. So maybe you can tell us about the journey at the Times through those lenses. chris: Yeah, that's a good point. So as I was saying earlier, I'd shown that we could use machine learning to do things that are obvious if you already know how to predict genes going up and down using sequence data or something. For example, we could show which users are likely to cancel their subscription, and show that we could do that with statistical significance. That's useful as a provocation, but then the question is, okay, how [00:20:00] do I integrate that into process? So there's two related ideas there. One is this idea which I think most people trace to Monica Rogati, from a blog post of hers in, I think, 2017. Maybe even earlier. Yeah. hugo: The AI hierarchy of needs. chris: So the AI hierarchy of needs is this lovely sort of play on Maslow's hierarchy of needs, or maybe on Bloom's taxonomy: at the fundamental base layer you really need to have good data logging, like you need to have good data engineering first. Then you can think about dashboards and A/B testing, then you can think about machine learning and stochastic optimization, then you can think about fancy AI. So for all of these things, if you really want to integrate them into process, there's a need for reliability, for an engineering mindset. From an engineer's perspective, that's obvious: you don't want to have artificial intelligence that you don't understand, that's not reliable, where if one of the inputs to the AI falls [00:21:00] down you don't have any observability, and you don't have any traceability.
Like from an engineer's perspective, that sort of hierarchy of needs is clear. You need to have good fundamentals. But to a data scientist or to a mathematician, it may not be clear that you can't really impact things unless you have those fundamentals correct, right? Like you have good data engineering, good pipelines that are reliable, monitored, etc. That's The other way to think about it is just in terms of, people, in terms of an organization. How do you drive change management in an organization? So doing one off fancy artificial intelligence can be very useful as a provocation. you can show to people, look, it is actually statistically possible to predict who's going to cancel their subscription. Now, what are we going to do about that? So that's a provocation, but it's not the same thing as a prototype. So then you need to think about, okay, how are we going to prototype the code, build out the, data engineering, build out the output of that algorithm in such a way that we, to use the language of Agile software, we build our code in slices rather than in layers. So we build one [00:22:00] complete solution. Once we've built one complete solution, then we can start hard work around systems integration, process integration, organizational integration. So there's a software engineering, there's a concept of the, of integration, which doesn't mean calculus. It means how are you going to integrate that piece of software into the larger system, right? So once you solve the machine learning nugget, the little tech nugget, then you can think about, okay, how am I going to integrate that into a product? So upstream, you need to make sure that the data are available downstream. You need to think about how the machine learning is going to be instantiated as some sort of product change where it's going to drive something. Then you can think about all of the engineering that needs to be there. The observability, you need to have some monitoring so that if things fall down there's some sort of alert. that there's a fallback plan. All of that basic good engineering needs to be in place. And then you can think about all the integrations, which include process organization. You're working with real people. How are they going to change their process in such a way that they can put your [00:23:00] fancy machine learning to work. And organizational integration. How is the organization going to understand and, and continue to function in such a way that this machine learning is part of everybody's process and part of the way they understand their work. And ideally, everybody gets some data instinct and they start to think, Oh, we could use machine learning for that, or we could use that particular statistical technique for that. hugo: I also think important is the interaction effects between these different aspects. So thinking about the fact that you need, To integrate the data function with actual software speaks to organizational challenges as well. how do data scientists and machine learning engineers interact with, with the classical software engineers within an organization? Who actually ships it? Are data scientists empowered to ship or do they throw things across the fence as well? So how did you think as you built out your team about these kind of interaction effects between organizational and technical concerns? chris: Yeah, I think the right answer really depends on the stage that a [00:24:00] company is in. 
SoI think the right answer depends on whether you're like a small company or a large company. Can you specialize or not? And it also just depends on where a company is in its sort of data journey. is the company at the point where it has good data engineering, good pipelines? Has it, gone through the lower levels of Monica Rogati's hierarchy of needs? Do you have dashboards for things that need to be observed? Do you have A B tests to try to start thinking about how we can use data to help make decisions? Depending on the answers to those questions, you can imagine different organizations that work better or worse. In our case, we've tried sort of everything and as the state of the company has changed and as the state of the team has changed, we've changed. the first code I wrote at the New York Times was in R and the output of that code was a PDF and that was useful as a provocation, right? And then at some point, rapidly, it became clear that if I wanted to collaborate with, software engineers, I would not be coding in R, I would be coding in Python. and then, there was no, nobody on my team, to the [00:25:00] extent that there was a data science team, who was shipping to prod, so to speak, who was writing production code. So then it was a question about how to collaborate with software engineers, which were, initially coming from different teams. So we would collaborate with one set of software engineers to build out something that might be useful, for their particular product. And eventually, it became clear to me that if we wanted to have impact, The people that I hired would need to become good enough software engineers that we could ship a lot of our own code. Either upstream in order to doing the data pipeline building, or downstream in order to deploying things. eventually we started working, more closely with a dedicated software team. so that, depending on the project, data scientists or software engineers might be touching different, github repos, github repositories. The data scientists might be coding in different repositories. Python or SQL or Go and same with the software engineers. and as that sort of matured, the company matured to create a machine learning platform team, of software engineers so [00:26:00] that data scientists now might be collaborating with future engineers, meaning software engineers that are associated with a particular feature or data scientists might be collaborating with machine learning platform engineers who are building out platform tools that we think will be useful for a wide variety of applications. applications. But, there's no one right answer, unfortunately. and it really depends on the state of the company in terms of its scale and in terms of the maturity and in terms of, everybody's willingness to incorporate, machine learning into different solutions. hugo: I also, as we've discussed before, many organizations treat data science as a support function. And that's an understandable, I think first, first pass, right? Like we need to know about these particular things, customer churn, these types of things. But as you and I know, That isn't necessarily the best way to structure a data function so that they can deliver as much value and impact as possible. So I'm wondering how you structure and manage your team to ensure that they can drive as much real impact for the times as possible. chris: It, [00:27:00] yeah, it can mean, again, it can mean different things at different companies. 
So when we started the data science team at the New York Times 11 years ago, data science was intimately associated with machine learning. And one of the things we said in the book is that the term data science has come to mean different things in different places. We actually quoted a Reddit thread in our book that said, are data scientists at Facebook data analysts? Because at some point Facebook just relabeled all of the data analysts as data scientists, which you can see because then they had a data science team which relabeled itself Core Data Science. In any event, the terms can mean different things at different companies. In terms of the way that we set up for success, I would say one thing that I said to people very early is: make sure that anything you're doing, you could explain to the CEO while you're doing it. I didn't want anybody to do something that was so researchy that it wasn't clear why it was actually useful, like useful to the org. So we've tried to work on things that are very useful, but that also means that it's easier to find partner teams such that we're aligned that their goals are [00:28:00] also our goals, right? It's not like they're trying to reduce churn or increase engagement and we're trying to do some research project. Their KPIs become our KPIs, and we're interested in how we take their KPIs and make those key performance indicators go up. And if we can find something where we're aligned in terms of KPI mindset, it's a lot easier to collaborate. Similarly, there are times when we're working with teams that don't have spare software engineers or product owners or project managers to give us. And so we have to decide: either we don't have impact on that important part of the org, or we learn enough product mindset, or collaboration skills in the form of project management or software engineering, to have impact. And so the data science team has at times built its own feature pipelines and data engineering pipelines. Before there was such good MLOps, the data science team built its own tooling for keeping track of models and monitoring models and things like that. At this point, [00:29:00] there's great MLOps tooling from a variety of vendors for doing that. But over the... hugo: Arguably way too many to choose from. chris: Hard to choose, right? hugo: Matt Turck has what he calls the MAD landscape, right? And it gets thicker and denser every year. chris: Right, people just keep making new companies and funding them, like Matt. Yes, so there's no one right answer. Instead you really need to think about principles, which are: how am I going to recruit and retain really talented people who know that they have the right balance of autonomy, mastery, and purpose? How are we going to make sure that we're doing things that are meeting the needs of the organization right now? And those needs change. How are we leveraging the right technology tooling? We should not be doing MapReduce, right? That is a solved problem, which, at this point, is somebody else's problem. We're using BigQuery, so the MapReduce is happening, but it's somebody else's problem. So we need always to be conversant in the state of tooling, so that we're leveraging the tools effectively to meet those higher-level goals. hugo: How much specialization is there in your team? And what do [00:30:00] you look to hire for? So let's say a listener wanted to apply for a job at the Times.
Would they be more well suited if they were full stack? or specializing in online experimentation? Or what type of things do you look for these days? chris: So the team is big enough that, different roles actually do have some amount of, specialization. So there are roles which, even though they're called data science, might be more with an eye towards engineering skills or more with an eye towards causal inference or more with an eye towards having done randomized control trials from which you learn a targeting policy or something like that. Or experience with that. contextual bandits or something like that. Different teams are,we still meet as a group, for sure, but, different roles do have a wee bit of specialization in them. In general, in terms of technical and collaboration skills, there's a lot of similarities. In terms of technical skills, we're coding in Python, we code with each other, so basic software carpentry and knowing how to work with GitHub, knowing how to work with Python, It's very useful. [00:31:00] You're going to be getting your data from SQL, so you better either know or be willing to learn rapidly enough SQL to actually go to work. and on the other side, in terms of collaboration skills, we really look for people who are good collaborators because you won't always have, a product person who stands between you and the end user that you're working with. Often it's the case that you will need to go make friends with the end users and try to listen to their worldview and their demands and their pain and try to figure out, okay, how am I going to build something that's going to be useful to them and try to meet their goals? so we look for good collaboration skills, both for communication and also for listening. so those are the techie skills and the collaboration skills. And in terms of machine learning, we need people to know enough machine learning that they know the right tool for the right job. Sometimes the right tool for the right job is contextual bandits, and sometimes the right tool for the right job is a histogram, right? so people need to know, what is the right tool for the right job rather than have somebody come in and there's one particular method that they think is the hammer, that they'll apply to [00:32:00] everything. hugo: the histogram's more sophisticated sibling, the empirical cumulative distribution function, you don't get sampling bias. chris: there may come a time where that's a good thing to use, particularly if you can try to, explain it to your partners, right? If you hugo: the chris: if so, so sometimes the right tool for the right job also is set by the fact that you want to collaborate with somebody, right? And you need to build out something that your collaborators can, can understand works, right? and they interpret it enough that, they're gaining some additional insight and some sort of confidence that machine learning is working for them. hugo: that all makes a lot of sense. I am interested, two things we've talked around a bit, are causal inference and reinforcement learning, which I think my provocation, or spicy take, it's not that spicy, is that, There's like a bimodal distribution of the ability of organizations to use these methods. you've got a few people who you do really sophisticated causal inference. You've got a few people who do really sophisticated, impactful reinforcement learning. but is that your sense that these are things that [00:33:00] probably we would like to be more widespread, but aren't yet. 
chris: Yeah, so in the book of why,Judea Pearl talks about rungs of the ladder of sophistication. I would say that, I don't know that I would say that it's bimodal, but I do think there are sort of step changes in, communities, readiness for different methods. one is simply, are you willing to do interventions? are you trying to learn causality merely from observational data, which could be, for example,there's a hospital that, for example, never gives anybody medicine and then you predict who's going to get better and who's not going to get better. That's a perfectly well posed, predictive learning problem, but it's not what a doctor really wants to solve, right? A doctor wants to give different medicines to different people. Are you willing to actually do some interventions, which might mean different doctors are allowed to give different drugs, in which case you still have observational data, and so it's going to be very difficult to infer causality, because it may be that, this doctor likes to give. the expensive drugs to the, to the rich patients and non expensive drugs to the not rich patients. But you don't really know, how the drugs are [00:34:00] working. Or, do you have a community where they are actually willing to do randomized control trials? Because once you're willing to do randomized control trials, then, as R. A. Fisher,taught us, or Gosset before him, then you can really say a lot about causality. It's not that this field was better than this field. It's that by randomizing whether you put, cow bones, or pig manure, or whatever else Gossett and R. A. Fisher were testing out, you really know, this is the treatment that actually drives some, result that you want. Again, in the context of a contemporary company, where the product is software, it's so easy to do an A B test. And, part of that is, you don't necessarily need to rely on the field of causal inference as we understand it in economics, right? in economics, Causal inference often means that you have a natural experiment. There was, for some reason, there was an experiment that was too technically difficult to do, or it's not ethical to, to do something, but you may be able to infer causality from some effective, natural experiment that was done, or maybe that you have,luckily enough you have [00:35:00] something like an instrumental variable, so there was some randomness where you don't actually get to see directly the, drugs delivered at random, but there was a randomness like people getting assigned to the draft. and then that at least provides some randomness that allows you to infer causal inference. All that field is awesome, but more complicated than if you were in a company where they're just willing to do randomized control trials. Because once you're able to do randomized control trials, where it's a simple, simplest manifestation, an A B test between two different variants, then you can directly start learning targeting policies, you can learn who should have gotten what treatment, you can start learning, recommendation procedures, like which type of content, and drives particular engagement, or which type of content shown to which type of user drives a particular engagement. And then eventually you get closer to the land of real time explore exploit, which is the land of contextual bandits, and more generally, reinforcement learning. hugo: Great. I love that we've covered a bunch of use cases of the data function at the times. 
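To make the point about randomization concrete, here is a minimal sketch, on simulated data, of the simplest case mentioned above: an A/B test between two variants, where the difference in outcome means is a direct estimate of the causal effect, and a per-segment breakdown is the first step toward a targeting policy. The segment, the rates, and the lift are all invented for illustration.

```python
# Minimal sketch: with randomized assignment, the difference in outcome means
# between variants estimates the causal effect directly. Data are simulated.
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
treated = rng.integers(0, 2, n)              # randomized assignment: A (0) vs B (1)
segment = rng.integers(0, 2, n)              # hypothetical context, e.g. new vs tenured reader
# Simulated engagement outcome: variant B helps, but only in segment 1.
base_rate = 0.10 + 0.05 * segment
lift = 0.03 * treated * segment
engaged = rng.binomial(1, base_rate + lift)

# Overall average treatment effect (valid because assignment was randomized).
ate = engaged[treated == 1].mean() - engaged[treated == 0].mean()
print(f"estimated overall lift of B over A: {ate:.4f}")

# Per-segment effects: the beginning of a targeting policy
# ("show B to the segment where it actually helps").
for s in (0, 1):
    mask = segment == s
    effect = (engaged[mask & (treated == 1)].mean()
              - engaged[mask & (treated == 0)].mean())
    print(f"segment {s}: estimated lift {effect:.4f}")
```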
I'm wondering if we can do some sort of [00:36:00] principal component analysis with respect to... well, not PCA, because as Joe Howard, who introduced us, once said to me, I don't like PCA because there's no physics in PCA. But some sort of rank ordering of what the most important things that the data function can deliver to the Times currently are. So, given where the Times is in its journey, the most important problems, and here I'll just quote Gartner, which is a consulting company, the problems that are most difficult and most valuable are the prescriptive problems, right? There's an old chart from Gartner which is, quadrants on quadrants, probably. chris: Better, it's like a diagram with most difficult on the x axis and most valuable on the y axis. And down here in the lower left, I think, is description, then somewhere in the middle is prediction, and then up at the top is prescription. If you are at the point in your journey where you have KPI mindset, which means as a community you've aligned on what numbers you want to go up, and if you have invested in good software such that you can reliably [00:37:00] drive different treatments to different people, then you have the opportunity to drive those decisions using some sort of exploration-exploitation trade-off, as they say in the business of bandits. Which is to say: do random things in random cases, rapidly learn from the data what is the thing that drives the KPI, and then upweight the probability that you do the thing that drives up that KPI. That's it, right? And a century of prescriptive modeling of that sort has made clear that it's very hard to beat that, right? It's very hard to do better than: explore different treatments, use the right mathematics to learn as rapidly as possible what is the right thing to do, and then keep doing that, and upweight the probability you do that. This also works for personalization, where you do different things in different cases. It might be that the right thing to do depends on the type of content, or the type of user, or what type of day it is, or where in the country you are, or something like that. There's all sorts of ways to incorporate context into this idea. [00:38:00] But, again, in some ways this goes back to the ideas evangelized by Gosset and Fisher a hundred years ago. It's just that now, instantiated as software, they can be extremely powerful if you're working in an organization that has aligned on how you are going to quantify success, and has invested in the data and the software in order to get that done.
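The explore/exploit loop Chris describes above, do random things, learn quickly which variant drives the KPI, then upweight the winner, can be sketched in a few lines. Below is one hedged way to do it, Thompson sampling over two hypothetical variants with a binary KPI; the conversion rates are invented.

```python
# Thompson sampling over two variants of a hypothetical product change:
# sample a plausible rate for each arm from its posterior, play the larger one,
# update with the observed outcome. Exploration fades as evidence accumulates.
import numpy as np

rng = np.random.default_rng(2)
true_rates = {"A": 0.10, "B": 0.12}          # unknown in practice; invented here
arms = list(true_rates)
successes = {a: 1 for a in arms}             # Beta(1, 1) priors
failures = {a: 1 for a in arms}

for _ in range(10_000):
    # Sample a belief about each arm's rate, then act greedily on the samples.
    sampled = {a: rng.beta(successes[a], failures[a]) for a in arms}
    choice = max(sampled, key=sampled.get)
    reward = rng.random() < true_rates[choice]
    successes[choice] += reward
    failures[choice] += 1 - reward

for a in arms:
    plays = successes[a] + failures[a] - 2
    print(f"arm {a}: played {plays} times, "
          f"posterior mean {successes[a] / (successes[a] + failures[a]):.3f}")
```

Early on the two arms are played roughly equally; as evidence accumulates, the probability of playing the better arm rises toward one, which is exactly the upweighting described above.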
yeah, part of that,part of that I would say is, empathy, like actually understanding what other people value and understanding what other people, what are other people's hopes and dreams and fears. and you don't come in as a technologist and say, okay, well that's not the way you should work. When you come in as a technologist to a community and think, okay, these are the values, the epistemology of this community. How am I going to be able to communicate in a language that resonates with their values, the way they understand the world? When there is KPI mindset, that's very useful for aligning people. If everybody's aligned that like this one number quantifies success and we would like that number to go up, then everything becomes a lot easier because that number becomes the currency, right? It becomes, it reduces all the complexity to a metric, right? And that can be very useful for aligning a community. Provided that you've chosen a good [00:40:00] metric, which people in Silicon Valley call the alignment problem these days, you want to make sure that if there's one number, and we're all going to work to make that number go up, choose that number wisely.in terms of leadership, I think it's important to, to speak the language of the community that you're in, and to make clear that,you don't get to come in and say, the success metric is AUC. Right? There's already some success metric in the community. You should think about how what you're doing does or does not contribute to that success metric. I would say the other thing I can do as a leader is get out of the way. And hire really good people. Hire them that, hire people that are really good at, empathy and communication. Meaning that they listen well and they communicate well. And then stay out of the way. and don't go to those meetings. Let them go to the meeting and let them represent their own work. which I think has been really useful. is to work in a company where individual contributors, present their own work and, are recognized for that. so we have many cases over the last decade where the individual [00:41:00] contributors who built out the innovation presented that innovation to the CEO or to a wide variety of other people at the C suite. And I'm fortunate that I work at a place where the C suite is pretty real smart and pretty real curious, and that works out well. hugo: You actually told me a story about, former president Barack Obama, which I think illustrates this very well. Would you care to chris: Yeah,I'm fortunate to know somebody who, was in the room when the team that had, executed this mission against Osama bin Laden was presenting at the White House and the way my friend told it was that the team showed up and met, the President and President Obama said, Okay, tell me about how you executed this mission. And the team leader said, Mr. President, my team will now tell you how they executed the mission. And then the team went around and said about what they did. Meaning that the leader of the team didn't come in and say, I architected this plan, it was great, I'm the bee's knees. The person who led the team got out of the way and let the contributors present directly. I thought it was a great story about how [00:42:00] valuable it is to, to grant autonomy to the people that work for you. provided that you've, built up a good team,it's really empowering to the people that work with you for them to know that they're going to do good work and that good work is going to be recognized. hugo: fantastic. 
And I do love that you framed it earlier in terms of being best served if you speak the language of the community that you're trying to interact with. But by virtue of your role, there are at least two communities, right? There's the executive leadership community, and then there's the data community, which is your team. But what I'm also hearing in there is that there's a role of translator between these two, but also enabling your team to speak the language of business as much as possible as well. chris: Yeah, I mean, the bigger the company, the more teams there are within that company. We're at the point where the separate teams that the data science team might be collaborating with include the data analysts, software engineers from the machine learning platform team, software engineers from the various feature [00:43:00] teams, which might be software engineers who just work on marketing or advertising or something like that, as well as the newsroom, which can mean editors, as well as business leaders, people who are in charge of subscription or something like that. So there's a diverse set of communities to try to understand how they think. And that's, I think, enjoyable. I actually do really think that it is useful to come from the natural sciences, particularly if you're coming from a multidisciplinary collaborative environment, because biologists and mathematicians and chemists all have different assumptions about what good science looks like and what the goal of a paper is. Evelyn Fox Keller has this famous book where she opens it up asking, what do biologists want? She opens this book, Making Sense of Life, about the schism between mathematicians and biologists trying to work with each other. I think there are very similar elements to product people, marketing people, editorial people, data people, machine learning people, software engineers, all trying to [00:44:00] collaborate and add value, even though they all have slightly different understandings of what really matters and what is best in life, and collaborating together. hugo: Yeah, absolutely. And you and I even discussed the other night, both having worked in biophysics, the relationship of biologists and physicists trying to collaborate together. And I love that there's a tome, D'Arcy Thompson's On Growth and Form, which must be a century old now or something like that, but the types of issues we have in interdisciplinary science and business he elucidates incredibly well, in terms of the relationships between mathematicians, physicists, and biologists studying morphology. Chris, I'd have to fire myself if I didn't ask you about generative AI. But we've been talking a lot about your work at the Times, so I'd like to take a slightly different approach here. I'd like to return to your role as someone who professes, a professor and an educator. I'm an educator as well, right? And I'm just generally interested in [00:45:00] how we even teach people about generative AI when it's moving so quickly. chris: Yeah, well, the tools are not going away, in part because trillions of dollars are being invested in them. With that amount of capital, they're being instantiated as products, and they're also integrating into our norms. Meaning, part of this is: what are our normative expectations about whether or not it's appropriate to be using those different technologies?
Is it okay to use a large language model to write your friend a cover letter for a job, or the eulogy at a funeral? We have very rapidly evolving norms about the appropriateness of all of these pieces of technology in our day-to-day interactions. It's really amplified around AI because it is artificial intelligence: the goal of the machine learning community and the product designers is to make tooling that reminds us of the things we think of when we think of intelligence. So that's slightly different than a really performant PCA, to use your earlier example. As an [00:46:00] educator, I think I best serve my students by recognizing that the tools are not going away. Rather than saying, here's the way I taught this class ten years ago, I'm going to teach it the same way now, I think I should encourage the students to understand the pros and cons, the abilities and the shortcomings, of the tools as they exist today. This is slightly different when I teach technologists and non-technologists. So when I'm teaching young applied math majors, I want them to understand the methods, and I want them to take advantage of the tools that are available to them, many of which are great for amplifying and expediting their research, right? Writing code that executes, debugging their own code, finding out about new papers, getting a simple summary of a difficult technical subject, generating pseudocode to try to explain their code to their peers in a way that's comprehensible. Those are all things for which contemporary tools of the last two years are really helpful, and I want students to know it. That said, part of the way that I teach technologists is to [00:47:00] make them present what they've done. And so if they don't actually understand what they're doing, it's going to be extremely awkward when I ask them to get up and present what they've done to me and to the other technologists. So they really have to understand enough about it that they can say: okay, this is the line in which a quadratic program is executed, and I don't actually understand how that optimization happens, but at least I know what the quadratic program is that's being optimized. And they can walk through the pseudocode at the level of control loops and the mathematics, and they can derive some of the algorithms. So the long and short of how I teach technologists is: don't pretend that the technology doesn't exist, just make sure that you have some understanding of how it works. That includes when it's not doing what you think it does. The other day, some technologists gave a presentation on something where they were doing reinforcement learning to optimize a card game that happened to be their favorite card game. They built out the game engine and then they called pre-written reinforcement learning code to optimize it. And we realized as we dug into the code that the [00:48:00] RL code worked, but it was actually optimizing some other problem entirely. Right? So the students need to actually read the code, not just take code from somewhere else and assume that it does what they think it does. That was true five years ago with taking code from GitHub or from Stack Overflow, right? Those rules haven't changed. Now, for non-technologists, I want them to be fearless about interacting with the tools. But again, I want them to understand the limits of those tools, right? And the more you use some new piece of equipment, the more you realize what its limitations are.
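As one toy version of the quadratic-program moment described above, here is a minimal sketch a student could point to, saying: this is the objective, these are the constraints, and this is the line where the optimization actually happens. The matrices and constraints are arbitrary illustration, not anything from Chris's course.

```python
# A toy quadratic program of the kind a student should be able to point to:
# minimize 0.5 * x'Qx + c'x subject to x >= 0 and sum(x) == 1.
# Understanding *this* objective matters more than understanding the solver.
import numpy as np
from scipy.optimize import minimize

Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])                   # positive definite, so the QP is convex
c = np.array([-1.0, -1.0])

def objective(x):
    return 0.5 * x @ Q @ x + c @ x

constraints = [{"type": "eq", "fun": lambda x: x.sum() - 1.0}]
bounds = [(0.0, None), (0.0, None)]

# This is "the line in which the quadratic program is executed":
result = minimize(objective, x0=np.array([0.5, 0.5]),
                  bounds=bounds, constraints=constraints, method="SLSQP")
print("optimal x:", np.round(result.x, 4))
```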
You have to be willing to do some real critical thinking about the output of any of these methods in order to understand what it is and what it is not. I'd like for everybody to get past the top of the hype cycle and get past the trough of despair in the hype cycle and eventually get to the efficient part of the hype cycle, where you have rational exuberance rather than... hugo: The plateau of productivity. chris: Yeah. And unfortunately that keeps changing, right? Because the tech keeps changing, right? So it's not like we can all just say, okay, I understand what PCA is [00:49:00] good and bad for. The tech itself is constantly changing. Trillions of dollars are being invested in this field. The tech is going to keep changing. It would be good for people to continue to be fearless, but also critical. hugo: Yeah. And something I'm hearing there: firstly, I love the hype cycle as a model. I think one of the most important things it misses is the fact that we have many peaks of inflated expectations as time goes on with most technologies. I also think something I'm hearing there that I want to tease apart a bit more is, once again, first thinking: how do we interact with technology? What do we want to build? What type of business metrics do we want to move? How do we evaluate them? How do we think about the data? All of these things. And yeah, stress test the code, right? From GitHub. And if it's in an open source repository that's had umpteen issues and pull requests, perhaps it's more stress tested. chris: And these are all things where we could have had that part of the conversation a decade ago. We probably did, actually. Yes, that is an old phrase in the software community: many eyes make all bugs shallow. hugo: Yes, exactly. So we've talked about your work at the Times and what you're up to at Columbia. I do want to delve a bit more into some of the really exciting courses you're about to start teaching, but I'm just wondering how your work in academia and industry have cross-pollinated each other. chris: Yeah, in terms of the way that it's benefited my industrial engagements, I do think that research mindset is useful, and there's an amount of self-critical reflection, because when something works, it doesn't always work because you understood it, right? Just because you've got a plot that looks like how you thought it was going to work doesn't mean it actually is working, and you need to dig a little bit deeper and make sure that it's really doing what you thought it was doing. So I think there's a research mindset which is useful in terms of real critical introspection about the tools you're building. There's also a bravery about research mindset, which is: you may not know how to solve something, but that doesn't mean that you can't find out. You can find the right website, the right book, the right papers, the right people, somehow find your way to the current envelope of human understanding, and either get to that level of understanding or somehow learn to benefit from the fruits of that community. That sort of research mindset I think is absolutely useful in industry, in addition to the collaboration skills aspect of it, particularly for multidisciplinary researchers.
When you're doing solo work and your paper is only going to be read by people whose brain is shaped just like yours, that's not so useful for moving into a world in which you are being judged not by your peers but by people who are complementary to you. The skill set of working with people who share goals with you but have very different mindsets and tool sets, which to my mind comes from multidisciplinary research, is very useful in industry. Now, how does industry help in academia? In years past I was speculating on what people in industry actually want. I didn't [00:52:00] have nearly as good a sense as I do now for what makes a student employable; what collaboration skills students should try to learn in their projects that will be useful to them in the workplace; how they make themselves hireable when they're trying to get jobs; what the trends in industry are in terms of methods that matter; and what the role is of a brand new innovation that's extremely publishable versus an open source repo that's been deeply stress tested, where you want to build on top of that rather than on the fancy new thing published six months ago. There are a lot of ways in which engaging with industry gave me a perspective that's useful for teaching the right stuff, and for knowing that I'm teaching the right stuff. hugo: A lot of that resonates. I did part of my postdoc in New Haven and part in Germany, but looking at the funnel from grad school to postdoc to [00:53:00] academic position, clearly a lot of people leave that funnel and go to industry. And I think one of the big challenges is that academic education doesn't, on average, prepare you for industry at all. So it's really heartening to hear that you're able to take that perspective back. chris: I am, but that's by virtue of taking a really weird path. Specializing in one piece of academic research in which your work is only vetted by other people who are just like you creates a real ability to expand the envelope in that field, but it doesn't necessarily produce a bunch of people who are really good at leaving that community and going into other communities, or having a fruitful life there. It's not the road well traveled, I would say. hugo: Very much so. So if biophysics and systems biology and applied math weren't enough, then you went to the Times. And in terms of interdisciplinary work, all of that clearly isn't enough for you, because you collaborated with Matt Jones on your book, How Data Happened, which we've talked about. Matt's a wonderful historian who's now [00:54:00] at Princeton, and we did a podcast with him about all of that, which I'll link to in the show notes. But clearly all of that still isn't enough, because now you're working with media experts and political scientists on a new course called Persuasion at Scale. Maybe you could tell us a bit about this. chris: I do think imposter syndrome has been really useful to me, because there are things I know well about machine learning, and there are other things I don't really know about society and people. Recently I was at a meeting where a bunch of technologists were talking about machine learning and how it can be useful for understanding information.
And then a political scientist got up and said: actually, none of you has any idea what you're talking about; here's the way people actually engage with news and entertainment content. Just some basic facts from survey data made me realize there's a whole other field out there that I don't know about. hugo: The way they don't engage with news as well, which I found [00:55:00] fascinating. That must be very interesting for you in your role at the Times as well. chris: Yeah. Fortunately for me, at the Times there are other people who are much more cognizant of that. People who work in product, and people in the C-suite, have known this for years. That's why the New York Times is a very diverse set of experiences: it's international news, it's also a lifestyle site, it's recipes, it's games, it's opinion, it's entertainment. It's actually a pretty broad set of content, clearly not just news in a newspaper. And the newspaper, of course, doesn't capture the diversity of the digital experiences. As somebody who's been quite focused on the data science aspects of it, I wasn't seeing that, but fortunately other people were well aware of it. In the academic context, I had met somebody from political science who was much more conversant than I am in the actual data about how people engage with entertainment, sports, politics, news, and everything else, and who also knew the literature on causal inference, because, getting back to earlier statements about causal inference, [00:56:00] not everything is an RCT where different people see different types of messaging and then report how they feel about those messages. The provost at Columbia announced funding to encourage faculty from completely different schools within the university to teach together. Her name is Eunji Kim; she's a professor of political science. I asked her if she would be willing to teach with me, and we're going to teach a class together in the spring, which in some ways I think is going to be similar to my class with Matt, and in some ways different. It'll be two people from two different schools, and we're going to talk about data and how it impacts the real world. Again, in code it's very easy to introduce people to computational techniques. It would be very difficult to derive these techniques or to code them from scratch, but it's very easy to walk somebody through a notebook or a Google Colab environment where you can present the [00:57:00] code directly, walk through it together, and look at the code and see what it does. You can break the code, change the code, and use the code as an experimental apparatus to understand the methods, at least as well as simply showing them algebraic expressions and pseudocode. Actually interacting with the code as an experimental apparatus, I have found over the last ten years of teaching non-technologists, is a great entry point for getting them to develop an intuition for how these methods work. hugo: Without a doubt. People who've listened to me before will probably know I wax lyrical about this a bit too often: the central limit theorem is one of my favorite examples here.
When I was coming up, I was taught the central limit theorem through the calculus, and then I was told to teach biologists the central limit theorem through the calculus, and then I realized you could take a data set and use bootstrapping chris: Mm hmm. hugo: and see the central limit theorem emerge as you resample chris: Mm hmm. hugo: and the intuition that biologists get there is far superior to the lack of intuition [00:58:00] given through the calculus. chris: Love it. hugo: I think... chris: So with that class, I'm hoping we can introduce people who understand society to computational methods and give them a sense for the fact that these methods are actually used in industry and in all manner of persuasion, whether it's marketing, advertising, or political persuasion. And I'd like for the technologists to see that there actually are data available, so that you don't just think about the method irrespective of what society does in response to it, and to get those communities to talk to each other more. Also, a lot of the discussion of these topics is dominated by anecdotes and speculation. There's a scholarly community that does, I think, some very rigorous statistical work on it, but that rigorous work doesn't necessarily drive the conversation as much as the more exciting speculations about how things are working. So I think it would be useful to spread awareness of the fact that there are careful, rigorous, logical [00:59:00] ways of thinking about the topic. hugo: That's fantastic. You're right, there are way too many anecdotes; it's not even really anecdata yet. So I'm excited for these more quantitative approaches. One thing I love about this course as well is that you enable students to download their own data from social media, or whatever it may be, and interact with it. Not only does that keep it incredibly relevant for the students, it's empowering, in a world where I think a lot of us aren't empowered to think about our own data, what is known about us, how we're persuaded, and these types of things. And I think the lack of agency a lot of us feel is something that can be shifted with this type of approach as well. chris: It's a nerdy thing to download your own, I don't know, mail data or your own browsing history and then do a statistical analysis of it. Who would do it? But if you do it in the context of a class, then you say, okay, I'm doing this for a class, and you learn a lot from it. I think many people engage with data better when they have some [01:00:00] intuition for it, and no more so than with a data set that contains themselves. I noticed this when I started reading machine learning papers: often the data set they would use would be the conference at which they were presenting the work. And I realized nobody's data set is more beloved than a data set that contains themselves; people really love reading a data set and thinking, ooh, I'm in this data set somehow. Doing that statistical analysis, you start to learn about statistical analysis and exploratory data analysis. But you also learn how limited the representation of you in a quantified world is.
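As a concrete version of the resampling demonstration Hugo describes above, here is a minimal sketch in Python; the sample size, number of resamples, and the skewed exponential source distribution are arbitrary choices for the example, not anything taken from either speaker's course materials.

```python
import numpy as np

rng = np.random.default_rng(0)

# Any decidedly non-normal data set will do; here we fake one by drawing
# from a skewed exponential distribution.
data = rng.exponential(scale=2.0, size=500)

# Bootstrap: resample the data with replacement many times and record the
# mean of each resample.
n_resamples = 10_000
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(n_resamples)
])

# The distribution of the resampled means is approximately normal, centered
# on the sample mean, with a spread close to the usual standard error:
# the central limit theorem emerging empirically rather than through calculus.
print("sample mean:          ", data.mean())
print("mean of boot means:   ", boot_means.mean())
print("std of boot means:    ", boot_means.std())
print("theoretical std error:", data.std(ddof=1) / np.sqrt(data.size))
```

Plotting a histogram of `boot_means` makes the bell curve visible even though the underlying data are strongly skewed, which is the intuition Hugo is pointing at.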
chris: So you can look at your search history, and you can start to see, okay, it over-indexes on this thing that happened to be captured, or maybe I looked at this page too many times because the page wasn't loading well and I couldn't find my information. It's somehow a distorted funhouse mirror of who you are. Good! We want students to recognize that there's a limit to the things you can quantify, and that there are ways you are not fully captured by the way you are represented in a marketing database or something like that. The digital breadcrumbs you [01:01:00] leave are a pretty poor representation of the complexity of any human being. That's one of several things we'd like students to engage with, in addition to engaging with the mathematics and causal inference and machine learning and understanding how these methods work. hugo: Great. And there are a couple of wonderful pages I can send people to, including the Columbia Engineering blog, and we'll link to the course when it's live. With your course with Matt Jones, all the notebooks were available for me to execute as well. Do you plan on socializing these? chris: I hadn't thought about it, but I don't see why not. One thing Matt and I never did was make videos of the class, which in retrospect would have been nice to have. But with Matt we put the code on GitHub from the get-go, which included all the notebooks, which eventually became Colabs, and those Colabs are available. The PDFs of all the lecture notes are available too, so I don't see any good reason not to do that with this class as well. hugo: Cool. And although there are no videos of your class with Matt, there is, and I'll include this in the show notes as well, a [01:02:00] video on YouTube of you giving an hour-long lecture about it at Princeton, which I used to send to everyone before I started sending people the book. Do you remember that talk? chris: It was probably the Complex Systems Summer School; I think I was hosted by Matt Salganik. It was probably one of the first times I gave a talk called What Should Statisticians, CEOs, and Senators Know About the History and Ethics of Data? hugo: Wonderful. Look, we're going to have to wrap up soon, sadly, Chris, because I love picking your brain. I do want to move back to the New York Times and hear what you think the future of data science at the Times looks like. What are you most excited about when you think about the future of data science and ML at the Times? chris: Part of it is not about tooling; it's about people and their process. More and more groups that we collaborate with at the New York Times have gotten to the point where they see that there are aspects of what they do that could be quantified in [01:03:00] terms of success. So if we think carefully about what makes one policy more successful than another, one way of thinking about it is a set of success metrics. Often there's a balance between different success metrics, or there's a whole set of guardrails. For example, you might want to get a lot of subscriptions, but you also want people to engage with the content. That's a good example of how you can have one metric that matters, but also a guardrail.
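To make the "one metric that matters plus a guardrail" framing concrete (Chris picks up the Pareto version of it just below), here is a minimal synthetic sketch in Python; the policy scores, the guardrail threshold, and all variable names are invented for illustration and do not reflect actual New York Times metrics or tooling.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: each candidate policy has an estimated effect on two
# metrics: new subscriptions (the metric that matters) and engagement
# (the guardrail). The numbers are synthetic.
n_policies = 200
subs = rng.normal(loc=100, scale=20, size=n_policies)          # predicted new subscriptions
engagement = rng.normal(loc=5.0, scale=1.0, size=n_policies)   # predicted minutes per user

def pareto_front(xs, ys):
    """Indices of policies not dominated on (xs, ys), both to be maximized."""
    order = np.argsort(-xs)            # scan from best to worst on the first metric
    front, best_y = [], -np.inf
    for i in order:
        if ys[i] > best_y:             # better on the second metric than every
            front.append(i)            # policy already seen with higher xs
            best_y = ys[i]
    return np.array(front)

front = pareto_front(subs, engagement)

# One metric that matters, plus a guardrail: maximize subscriptions subject
# to a minimum engagement threshold, chosen arbitrarily for the example.
guardrail = 5.5
feasible = front[engagement[front] >= guardrail]
best = feasible[np.argmax(subs[feasible])] if feasible.size else None

print("policies on the Pareto front:", front.size)
print("chosen policy:", best, "subs =", None if best is None else round(subs[best], 1))
```

The same structure, a frontier of trade-offs plus a constraint, is what lets you "defy gravity" by improving one metric without giving up the other.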
chris: And you can learn a Pareto optimization between those two metrics, and defy gravity by driving subscriptions while also driving engagement at the same time. Different groups that we collaborate with at the New York Times are at different stages of developing a KPI mindset, where there's a clear set of metrics they're trying to drive and it's clear how those KPIs ladder up to the kind of KPIs you share as a publicly traded company. That's what I'm excited about: more and more groups are developing a KPI mindset. And [01:04:00] once they have that, if they've also invested in the data infrastructure and the software engineering infrastructure, then there are opportunities to show how data can be used to drive things programmatically to make those KPIs go up. And of course we often learn something along the way: as we get to an optimal front, we learn what other metrics we should be monitoring, or new products are developed, or we effectively learn that the rules of physics have changed because user behavior has changed. I think there are just more groups we could be working with, to try to use data to help them meet their goals. hugo: All of that is, I think, really relevant to our audience. And I've been trying to bite my tongue this whole conversation on one thing, which you've put in my head again by saying the physics changes. At the very start of this conversation you talked about your background as a physicist, and then about how physics can be applicable to a lot of different things. But we know there's a rich history [01:05:00] of physics being useful and not so useful. So is there a theory of social physics? Does the world of our social interactions have the same type of rules? Or do we need to be very careful when we apply these types of tools and ways of thinking? chris: That is a goal from the 19th century, if not earlier, and that's exactly where I was going. One of the things we write about in the book is Adolphe Quetelet, a Belgian astronomer who was thinking: how can I take the success of celestial mechanics and use it to build a social mechanics, or more generally a social physics? So yes, there is that dream: physics is so good at what it does, we should just be able to take those powerful methods and use them for something even more important, like society. The limitations include, as I said, that in terms of the way people use a product, it's like the laws of physics are changing; the way people use a product can itself change over time. The other is something I said earlier, which is that there's only so much we can quantify: the things that [01:06:00] count may not be countable, and the things that are countable may not be the things that count. There's a limit to the extent to which we can capture the things that matter to us in any sort of instrument. But there's also just the complexity of the thing. Years ago, somebody said to me: the hydrogen atom is physics, the helium atom is chemistry. Meaning, if you have something very simple, we declare it to be physics, and as soon as you get to something complicated, like fluid mechanics, we'll just say, you know what, that's engineering, that's some other field. So part of what makes physics a successful field is defining problems such that the methods of physics are useful on that problem.
And if the methods of physics are not the right tool for the job, we'll just say, okay, that's engineering or some other field. With social interactions, the world has done the experiment without consulting you first. It's not as if the world of people's social interactions is some sort of grid world designed to make things simple for you. hugo: Without a doubt. And of course you can turn mechanics into physics by just taking the second-order Taylor polynomial and [01:07:00] truncating it. I do want to close out by getting back to data science. You've done such inspiring work at the Times, among other places. For data science leaders listening, what's one takeaway or piece of advice for building effective data science teams and driving impact in organizations? chris: In terms of the way you interact with other teams: empathy and communication, and not treating other teams like they're your adversary. Other teams are the keys to getting things done at scale, so you need to understand their values. In terms of the people who report up to you: autonomy, mastery, and purpose. How are you recruiting and retaining people? It's finding people who are good at what they do and giving them some amount of autonomy, giving them problems that are really hard so they feel some sense of mastery. And I think it helps if you work at a company where the people who work there feel there's some sort of purpose. The third thing is change. I've been at the New York Times for ten [01:08:00] years, and the state of data is very different, the set of leaders has changed quite a bit, and the role of KPIs and stochastic optimization and data and everything else has changed all the time. So it's useful to think about things at the level of principles, because the products and the rules and even the standards might change, but high-level principles will still be useful even when we're all using some new set of tools and some new set of programming languages. hugo: And which have been the most useful? chris: That's a toughie. That is a tough one. Some general ones, for me, would be: like I said, making sure that everything you're working on, you could explain to anybody in the organization why it's an important thing to work on; and understanding the interests, values, and terms of art of the people you're collaborating with. Many data scientists have written this before: as a data [01:09:00] scientist, you don't go in and expect somebody else to learn your technical vocabulary; the idea is to look at the world in terms of the values and language of the person you're collaborating with. Closely related are basic ideas that go by many names, agile, design thinking, but that I would just call empathy: making sure you're building something people want and solving a problem people have. In order to do that, you actually have to talk to people, be willing to see the world from somebody else's vantage point, and make sure you're iteratively co-creating success with somebody else, rather than disappearing into the basement, building something out, and then thinking you're going to produce a piece of tech that somebody can integrate into their process easily.
hugo: I couldn't agree more. Chris, thank you so much for your time and generosity in sharing everything you've done. Personally, you're a big inspiration to me. You do so much it scares me; I feel I do too much, and I've got a joke with a friend that we engage in what we call surface-area reduction therapy, where we try to remove touch [01:10:00] points between us and the external world. But the amount you do, and the variance of things you do, is inspiring. I don't know how you do it, man. Well done. chris: Thank you, Hugo. Very kind of you. hugo: And thanks for coming back and bringing all of your hard-earned wisdom to share with everyone. chris: Very kind of you, Hugo. Thanks.