The following is a rough transcript which has not been revised by High Signal or the guest. Please check with us before using any quotations from this transcript. Thank you.

===

hugo: [00:00:00] I wanted to find out more about data science in the age of LLMs, so I thought: who better to ask than Hilary Mason? One of the questions I asked her was, has data science ended?

hilary: No, it hasn't, I think it's safe to say. But it has changed, right? When things are now branded data science, you don't get a line out the door for free. I love the data science community for so many reasons, but a little bit of the shine has gone away, and I think that's actually really good. And at the same time, we now have tools that take a lot of what a junior data analyst might do and automate a bunch of that process. So we are in a moment of change for the data science community.

hugo: From there, we started to talk about what makes someone great at data science today.

hilary: I've always thought that what made a data scientist great is the ability to sit next [00:01:00] to somebody else and understand their problems: what are they really trying to accomplish? What are their ambitions? I would call that empathy, but also creativity, and no model will tell you how to do this. It's the hardest one, I think, but coupled with the technical skills, you're unstoppable.

hugo: I also wanted to know how leaders can navigate big questions like AI strategy and AGI.

hilary: When you have someone show up and ask a question like, when do you think we're going to get AGI, or, what's your AI strategy, I always start by asking questions back, because again, it's empathy. It's understanding their worldview and why they're asking the questions. You have to approach it as a game of collaboration, but one where you're trying to bring someone through your understanding as well, and you do that by being relentlessly pragmatic. And if you're a team leader, it's really about what you reward and what you celebrate.

hugo: We then got into how large language models fit into modern workflows [00:02:00] and why Hilary finds prompting so challenging.

hilary: This is one of my hot takes: I think prompts are a terrible way to interact with LLMs, and they're an artificial side effect of the way these models are trained and created. What we end up doing is spellcasting, not engineering, because what we're doing is trying to frame the input language to map to a particular set of weights and biases in that model in a way that's going to make it complete our task. And the second reason is that many of the tasks where we want an LLM to make a decision or do something would be much better served by rich context: multimodal context, time series context, video and audio, even other forms of sensors, of an environment, of a temperature.

hugo: And she also pointed out that even if machine learning ended today, we'd have a decade of work to do in figuring out the interfaces, the processes, [00:03:00] and how to really use what we already have.

hilary: Even if we have zero forward progress in machine learning algorithms from today, I think it'll still take a decade for us to invent the products and the spaces and the interfaces around what already exists. So we are in this incredible moment.

hugo: Finally, I asked Hilary how automation is changing the landscape of work and what skills will always matter.
hilary: It's not just data scientists. I think it's software engineers, I think it's anyone, paralegals, folks in a role where the job is primarily taking a well-formed problem statement and then executing a repeatable process to answer it. Any job like that, I think, is vulnerable to change. So then you think, okay, what part of that is not vulnerable to change? It's the good judgment, the ability to write the problem statement in the first place. And so how do you learn that stuff?

hugo: Hilary Mason is a renowned data scientist and entrepreneur, best known for her pioneering [00:04:00] work in the field of data science. She's co-founder of Hidden Door, a startup exploring narrative AI, and was previously the founder of Fast Forward Labs, which focused on applied machine learning research. Hilary also served as chief scientist at Bitly and is a sought-after advisor, speaker, and thought leader in data, ML, AI, and technology. Hilary's work has consistently been at the intersection of data, ethics, and innovation, making her one of the most influential voices in the space today. This episode covers everything from the evolution of data science to practical advice for navigating careers in this ever-changing field. I hope you get as much out of it as I did. Don't forget to like and subscribe and give us a review and five stars on your app of choice.

So before we jump into the interview with Hilary Mason, I just wanted to check in with Duncan Gilchrist, the president of Delfina, who makes High Signal possible. Hey there, Duncan.

duncan: Hey, Hugo. Thanks for having me here.

hugo: I'd love for you to [00:05:00] just tell us a bit about what Delfina does.

duncan: Awesome. At Delfina, we're building AI agents for data science, and by nature of our work, we speak with lots of data experts in the space. With the podcast, we wanted to share the high signal.

hugo: Absolutely. And I assume that a bunch of the things I spoke about with Hilary in the podcast resonate with you.

duncan: There is so much meat on the bone in this episode, especially given Delfina's focus on helping data scientists be more effective and taking a bunch of the busywork off their hands to give the humans more time for deep thinking.

hugo: And I also think there are all sorts of elements around the data science role, how it's evolving, and the role of judgment in data science as well.

duncan: Judgment has actually been a theme in my own career, both as something I really look for in the folks I might hire and as something I want to develop myself. I think judgment is [00:06:00] one of the things you can always be getting better at, and it will enable you to do more in the future. And I think judgment is super interesting in the context of LLMs, in that they often have decent judgment. But because they don't really understand the world (the word "often" I just used is really important), they can sometimes have colossal failures in judgment, even on simple math problems. I think that's actually one of the biggest gaps for AI today: making high-judgment decisions.

hugo: Absolutely. And that dovetails very nicely with a lot of the episodes we've had so far, including our episode with Michael Jordan. So without further ado, let's jump into the interview with Hilary. Hey there, Hilary, and welcome to the show.

hilary: Good morning. Thank you for having me.

hugo: Such a pleasure. Good morning, indeed.
I'm so excited to have you here to talk about data science in the age of LLMs, and LLMs in [00:07:00] the age of data science, in a lot of ways as well. In particular, you've played a significant role in shaping the field of data science through your leadership at Bitly and your work at Fast Forward Labs, among many other things. I'm wondering if you could open by walking us through your career journey and how it led to what you're up to now, founding Hidden Door.

hilary: I'm happy to, though when you look at these things from a retrospective point of view, it always looks like there was some grand plan or vision. I'm always the first to say that my career, and even calling it that feels aspirational, has always been the result of following whatever was most interesting to me at the moment and building on whatever came before. I studied machine learning as a computer science academic, became a professor, realized I was mediocre at it and really didn't like it, and then ended up joining a startup that [00:08:00] failed within nine months. We were doing statistical models of career progressions using a whole bunch of data from the internet. This was in the mid-2000s, so a long time ago. But it was a problem space I resonated with, as someone who really didn't know what I wanted to be when I grew up and happened to have collected a grab bag of skills: I was a software engineer right out of college before I went to grad school, I'm interested in a variety of different areas, and I had built a few things, but nothing really serious. That startup taught me a lot about empathy and building products for people: thinking about how a product fits into somebody's life, who's really going to encounter it, and how they're going to feel about it.

After that, I was lucky enough to join up with Bitly when it was becoming a thing. For those who don't know what Bitly is, it's the short links across social media, from back in [00:09:00] 2008. I was the chief scientist there, and that was my favorite job title I've ever had. As a CEO, you have tremendous responsibility to guide an organization, responsibilities to people and to your customers. As a CTO, you have technical product responsibilities. But as chief scientist, my job was really to open up the potential future products or businesses for the company through the data assets we were collecting. So it was thinking all the way down to the systems level, up to the mathematics, up to the product design, up to the business models: what might we do with this that could be interesting in a bunch of different ways? It was tremendously fun, with an amazing team, and a great opportunity to be in the mess of social data before any of us really knew anything about it.

It was right around the time data [00:10:00] science became a job title, or something people actually were doing, because it was being used as a label for folks doing interesting things, mostly around social data, at a variety of companies, work that was distinct from software engineering or systems engineering. So I got to be part of that very early community, and that was a whole adventure in itself. I'm based in New York; it's where I grew up, and as soon as I could, I moved back here as an adult. I've been here a long time now. And I remember sitting around a table with a physicist, an economist, a political scientist, a sociologist,
a couple of computer scientists, and a few mathematicians, and all of us realizing we were doing more or less the same thing. The math was the same. We were doing the same things; we had different words for it and different goals. And we said, okay, this is actually [00:11:00] one practice. This is data science. That's what it is at the core. I co-authored a short essay with Chris Wiggins, who's at Columbia, where we tried to write down a taxonomy of data science, and now you look at it with the benefit of the knowledge of 2024 and you think, wow, this is the dumbest, most obvious thing anyone has ever said.

hugo: Was that the OSEMN taxonomy?

hilary: Yeah, the obtain, scrub... yeah. Because I always overuse the word awesome, it was a joke. But yes, it was that one. And I think it's worth mentioning here, because we work on things now, especially in this moment of AI (I'm hand-waving), that will seem completely, stupidly obvious to us a decade from now. So if we're going to be retrospective, it's worth at least calling that one out: here's something that now seems totally, completely obvious, and yet we worked hard for it. It was [00:12:00] looking across a variety of people's practices, understanding the way we talked about it and what people were doing, and saying, hey, there is a common framework here, let's try to write it down and see if people agree. So I got to be a big part of that community as it was starting up, which was a lot of fun.

And then in 2014 I founded a company called Fast Forward Labs, which was an applied machine learning research and product design company. We were a team of, I always said it was a little bit of a halfway house for wayward academic types: folks who had a wide variety of expertise and backgrounds but really liked to be in the midst of things. We ran our own program of research, and about half of our time was spent sitting with our customers and clients, supporting them in a variety of ways with their product development, all the way from hands-on-keyboard work (we did a little bit of that) up to corporate [00:13:00] strategy, or helping to hire in, say, a chief data officer role. So really from business strategy, to team and organizational structures, to the ethics and principles of how you operate with data in an organization, across a variety of different industry contexts, which was super fascinating, and then all the way down to: what is the right algorithm to solve this problem? Is this even the right problem to be solving?

And then, of course, bringing people through the practice of, and this is really what a lot of applied data science is, at least in my experience: we have a problem we want to solve, and it's useful or valuable for whatever reasons to try to understand it, because it turns out that when we try to solve it, we can't actually solve it robustly enough, but we can solve a related problem. You go through three or four iterations of rewriting your problem statement until you come to something that's still useful and valuable but actually achievable given the resources you have and your [00:14:00] goals. I did that for several years, then sold the company to Cloudera, the data platform vendor, and I was the general manager of their data science, machine learning, and AI business unit, a global business unit with software, services, research, everything data science, for a few years.
And then, about four years ago, I founded my current company, which is called Hidden Door, which is doing something entirely different.

hugo: I'm really excited to jump into what you're up to at Hidden Door, but there are a few things that came to mind there. Firstly, I will link in the show notes to your post with Chris. I'd forgotten: it's Obtain, Scrub, Explore, Model, Interpret, and it's the N from Interpret, so it's "awesome" spelt OSEMN. It's still incredibly relevant to think through what we do. And there's a great quotation from both of you, which is that pointing and clicking doesn't scale. A huge part of machine learning, and AI more generally, is involved with exactly that: [00:15:00] if humans can't scale doing something, how can we get computation to do it?

I'm also glad you mentioned the role of New York City. In the cultural consciousness, and I don't want to get too sociological, we always think that all of this machine learning stuff and data science is a bit of a West Coast thing. But look at the people you mentioned: Chris Wiggins, of course, who's at Columbia but also chief scientist at the New York Times, Drew Conway, yourself, Cathy O'Neil, these sorts of people in the late noughties and early 2010s, getting together in the city to talk about these things and frame the problem space. I think that actually was incredibly impactful on everything we're all still doing. And I'm glad you mentioned ethics as well, because I will link to one of my favorite pieces that O'Reilly Radar has ever published, on oaths and checklists. I don't need to remind you, but for everyone watching and listening: you, Mike Loukides, and DJ Patil wrote a series thinking through whether [00:16:00] the space needs the equivalent of a Hippocratic oath or something along those lines, and whether checklists might be something more robust and more important as well.

hilary: Can I share a little bit of color? I love that you'll link to that. I think the ideas in it are still incredibly relevant. That piece came out of DJ and Mike and I disagreeing. It was one of those things where I have so much respect for both of them that when we found something where we didn't immediately have the same mental model, or actually agree on the substance of it, we talked it through, and that's where the seed of all of our posts came from: what were the things we got stuck on, where we were saying, okay, we agree the problem's really important, but how do we even think about it? How do we talk through the problem? Everything we wrote was designed to be a tool for people to have additional ways of talking through the problems, and also to point out, hey, these other people [00:17:00] who at least seem to know what they're talking about have a point of view on this; maybe we can start there. So yeah, please do share that.

hugo: Amazing. And Mike will hate me saying this, because, oh, this is a horrible pun, Mike enjoys flying under the radar, so to speak. But I do want to say that Mike Loukides, who edits O'Reilly Radar and has been at O'Reilly since the late 80s or early 90s, I think, is one of the most unsung heroes of the space, and I write a lot with him because of his ability to disagree and push back. So the fact that this came out of a disagreement, or let's say a clash of ideas, I think is incredibly important.
I do want to get onto what you're up to at Hidden Door, but last time we did a podcast, which was, I can't even remember, probably seven or eight years ago.

hilary: It was quite a while ago.

hugo: Yeah. I definitely had more hair here and less hair here. You look exactly the same.

hilary: Thank you.

hugo: But we talked about whether data science maybe [00:18:00] was a 2010s phenomenon. So maybe we can think about where data science is now, given everything that's happening in the AI space. Has data science ended?

hilary: No, it hasn't, I think it's safe to say. But it has changed, right? I was at the R user conference this year and the auditorium was not full. When things are now branded data science, you don't get a line out the door for free. I think the community remains incredibly strong, diverse, creative, and welcoming. I love the data science community for so many reasons, but a little bit of the shine has gone away, and I think that's actually really good. And at the same time, we now have tools that take a lot of what a junior data analyst might do and automate a bunch of that process. [00:19:00] So we are in a moment of change for the data science community.

It is still a distinct role from, say, software engineer, or even someone doing something like operations research. But that's not a given. When we make something a job role, we do it because there is a set of skills, experience, and capabilities that can fit in one person's head and that is complementary to the way organizations want to work. I'm saying this now as a CEO, as somebody who's hired and designed a whole bunch of team structures. So that data science remains a job role is not a given over the next 10 to 20 to 30 years, but that set of capabilities will exist in every organization. I think it does today more than it ever has. I remember, whatever it was, 8 or 10 years ago, going [00:20:00] on about how we expect pretty much anyone on a leadership team to be able to open an Excel spreadsheet and actually understand things, and that we would come to expect that level of basic data understanding. I think we're not quite there, but the core value of it has been generally accepted in a way that would be really surprising if we could go back in time 10 years, when I would have conversations where people would say, I don't need data. Now we take for granted that every company of course uses data. It's not the only thing we use to make decisions, but the level of fluency broadly has gone up. The level of expectation has gone up. The tooling and process, and the common knowledge about what good looks like, at least for certain segments, are there. [00:21:00] And yeah, data science is still a really lovely career and a really great job role. It's still not the same everywhere, which I thought would have happened by now, so I was definitely wrong on that one. But we're in an exciting place with it.

hugo: I think so. And something we may get to, and I don't want to lead the witness too much here, but if we do look at what happens to data and data science in the age of LLMs, arguably what we're seeing is that data becomes even more of a moat for businesses. On the skills side, working with data, and being able to evaluate the impact of data and models on how they contribute to business value and ROI, becomes even more important, right?
The ability to monitor your models, to look at your traces, to look at your conversations, all of these things. So arguably, and hopefully we'll get to this, the skill set that data scientists have [00:22:00] is even more important, whether we call them data scientists or they become machine learning engineers or AI engineers, whatever that actually is; we'll see. But I don't want to have a conversation about generative AI in a vacuum, and luckily, you work with a lot of generative AI. So I want to hear about what you're up to at Hidden Door, which blends data with generative AI and gaming and storytelling. I also want to make clear, and I said this to you last time we spoke, I think: you said you do what you're interested in, and historically there hasn't necessarily been a grand vision or a grand plan. I'm actually interested that you built Hidden Door and were incredibly bullish on generative AI not only before our ChatGPT moment...

hilary: Oh, yeah.

hugo: ...but before the Stable Diffusion moment, which I think is arguably as important: not as big in the cultural consciousness, but as important as the ChatGPT moment. So [00:23:00] you've been incredibly bullish on AI. Even when GPT-2 was out, I think, you were one of the people saying, hey, this is going to be big. So maybe you can tell us how generative AI has played a role in your thinking and what led to Hidden Door.

hilary: I can actually do better than either of those moments. The very first published Fast Forward Labs report and working prototype was on natural language generation, and we released that in 2014. That particular prototype did not use deep learning at all. We then did one on probabilistic real-time systems, so a completely different bucket of mathematics and a different kind of application. We did one on deep learning for image object recognition, followed by one on deep learning for extractive summarization. This has been a long-time interest of mine. I also studied [00:24:00] creative writing and computer science in my undergrad, and I did an independent study project in undergrad on hypertext fiction, because it was the 90s, and that was the shit, if I'm allowed to say that. It was really fun. So yeah, what we now call generative AI is an area I've been interested in and working in for a very long time. And honestly, I think I underappreciated how quickly the rest of the world would get into it, because a lot of the capabilities that are useful right now existed in some form before and were underutilized.

hugo: And let's be clear that one of the reasons it blew up in the cultural consciousness is because of a product wrapper, right? Around OpenAI models. And I don't even think OpenAI expected this small wrapper to be something that would explode and have such a huge impact.

hilary: [00:25:00] I completely agree. If you have to write code to do something, you already introduce so much friction to even understanding what the thing is that it cannot reach the people who can do the most interesting stuff with it.

hugo: Yeah, it's Excel, right?

hilary: Yes.

hugo: Exactly that, with all of the perils of that too. A lot of amazing data science work has been done in Excel, and a lot of stuff has gone wrong because of Excel. Funny, I've got to say this.
I was back recently at my old institute, where I did my postdoc in Germany. I was walking down the corridor and I heard someone say, I've got so much work to do, Excel changed all my genome data into datetimes. And I was like, oh yeah, that's what happens.

hilary: I want to give that person a hug, because we've all been there. I feel like even today, the majority of our time somehow ends up going to the weirdest little edge cases of making the data actually work to go [00:26:00] into the thing you wanted it to go into, and not to the part you spend all the time at the whiteboard thinking about. That's pretty funny.

hugo: So you were incredibly interested in all of this technology. You have been for a decade.

hilary: Yes. And one of the things we did at Fast Forward Labs, we always said, was making the recently possible useful. We really tried to make everything comprehensible and understandable for people who are not themselves machine learning PhDs, so they could say, okay, I get what this is, I get conceptually how it works. They have a whole set of problems they're interested in solving or understanding, and they could map it into their worldview and into their mental models. I've been thinking a lot about that, because we were out there in 2014 saying: what this does is start to make language computable. And what does that do? It opens up interfaces for us that are the same interfaces we use to speak with each [00:27:00] other. What does that imply? It implies a different way we relate to computing in our lives and in our society. And I think even now a lot of that is still relevant at the principled level. I always tend to predict things will happen faster than they do, but we can look ahead at where generative AI broadly might go in the next couple of years and still see it. Things like: if we can run a model with the kind of reasoning you can get out of the latest Anthropic or ChatGPT models, but run it on a microchip small enough to fit in a ring, what happens? What can you build then? And even if we have zero forward progress in machine learning algorithms from today, I think it'll still take a decade for us to invent the products and the spaces and the interfaces around what already exists. So we are in this incredible moment.

I'm going to tie this back to Hidden [00:28:00] Door, because I do have this lifelong love. I'm a huge sci-fi fan; I love to read. And it occurred to me and my co-founder, Matt Brandwein, four and a half years ago, after we had built several deployed production products with generative AI through Fast Forward Labs and Cloudera, working together, and then left and decided we wanted to start a company together, that a lot of the liabilities were actually assets if you turn to fiction. You think, okay, this is really about giving people an imaginative and creative experience together. It's not about facts. It's not about customer service, which is one of the application areas we had worked in, or news summaries for better decision making, which is another area we'd done a bunch of work in. The same math applies, but what if you're trying to let people imagine, basically trying to give people that feeling of a magic [00:29:00] pen, where they can direct things and see what would happen if they're playing out a story in a world? And I've been a long-time tabletop gamer.
I played all through college, I GMed through grad school, and I played a variety of different games. A lot of my favorites are the James Bond-style action games, but I did play a bunch of D&D, and all the Shadowrun, one of my favorites, cyberpunky with a side dish of magic. We were really trying to capture that creative feeling of sitting around the table with friends, riffing off each other's ideas, in a framework that makes it really easy to have a positive experience where you can feel more awesome than you are. We were thinking about how, on the one hand, we have this kind of experience, which is only possible right now with people. [00:30:00] And this was the beginning of the pandemic too, so there were a lot of people who were not in the same place who all wanted to have this kind of fun together. And on the other hand, we have this technology that is not going to write a good novel. This was back at the moment when GPT-2 was out, and it was also racist AF, super problematic. My co-founder trained it to make Thomas the Tank Engine stories, because his little one was super into that, and it was immediately like, girls don't like trains. But that makes sense algorithmically, because what it's doing is compressing what's in the underlying data, and if there's even a tiny bias in the underlying data, it gets magnified. Of course, we still see it.

All of this said, we were thinking, okay, we have this vision on the one hand and this technology on the other hand. And I really thought we were right at the beginning of the window where it's possible to invent this. It's possible to build it; all the pieces are here. We have to design [00:31:00] an approach to the machine learning work, and we have to design a product experience, because we're inspired by tabletop gaming, but that's not it completely; we're not trying to recreate that experience. And here we are four years later, and we've built something I think is pretty fun at Hidden Door. What we do, because I haven't said that yet, is work directly with writers, people who create films, people who create TV shows, and we give fans an interactive storytelling experience where they can drive the adventure however they want, a story experience together in those worlds. And we rigorously enforce the rules of the world, which is a capability of our architecture that came out of our desire to control for safety and bias in the very early days. But we realized later on, and this is back to how you can [00:32:00] build these businesses and products from the math to the engineering to the product to the business layer, that if we can make something safe, we can also make something where an author can be confident their character is always going to behave in a way that is true to that character, or that the laws of physics of their world will be enforced the way they envisioned.

So now we are a platform. We work with creators of various types, and we let fans have these role-playing experiences in those worlds. It runs on the web. It is text and a little bit of art. We use these little cards that you collect, so as you play in a world, you are creating, inspiring, and interacting with different characters. You are changing them, you are changing the world, and you see that reflected in the deck of cards that represents all of the people, places, and things in your particular world. It's been incredibly fun to build.

hugo: Super cool.
And I'll [00:33:00] link to it in the show notes; it's hiddendoor.co. Even looking at the types of things you have: I'm a huge fan of The Crow, and we talked about The Crow last time. You've got The Wizard of Oz, and you've got Pride and Prejudice coming soon. So having players or users able to jump into their favorite narratives, engage, and see something new, I think, is incredible. I also want to say, with all the work you did at Fast Forward Labs, which I've always been a huge fan of, I'm so glad we've seen some of the fruit of how bullish you were on this type of natural language generation and generative AI. One of my favorite Fast Forward Labs reports was the one on probabilistic programming. That's one of the things which is still future music in the industry: I think probabilistic programming has so much potential, but maybe we're not quite ready for it yet.

hilary: I don't know. I agree, I think, because it requires a different kind of [00:34:00] mathematical thinking and a different kind of engineering thinking. I have thought for 15 years that it was poised to take off at any moment. Instead, there are vendors out there who have built really interesting databases, basically, using these techniques; they just wrap it in what you're already used to using, and that's been a way to get it in there. But I do feel like it is a superpower. You even see it, I've been on Bluesky for a long time, I think since last April or something, but in the last week it's taken off and exploded, and you see the Bluesky engineers sharing, oh, we never thought we'd see production Bloom filters controlling our growth. There is a small set of problems where there is really no other way to solve it, but it could be useful in so many more ways.

hugo: I think so. And we need the interfaces. I love PyMC and all the work the people at [00:35:00] PyMC Labs are doing. Also, I feel like the generative, or deep learning, space stole the term generative from probabilistic programming, which is the real generative modeling, and also stole the term inference. But I think that's for another conversation.
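For anyone who hasn't touched probabilistic programming, here is a minimal sketch in PyMC of the kind of generative model being discussed; the scenario and numbers are invented for illustration. The point is that you write down how the data could have been produced, and inference then runs backwards from the observed data to a posterior over parameters:

```python
import pymc as pm

# Toy example: infer an unknown success rate from 200 trials with 23 successes.
# The model is generative: it states how data like this could arise.
with pm.Model():
    rate = pm.Beta("rate", alpha=1, beta=1)              # flat prior on the rate
    obs = pm.Binomial("obs", n=200, p=rate, observed=23)  # the observed data
    idata = pm.sample(1000, tune=1000)                    # MCMC posterior sampling

# Posterior mean of the rate; the full posterior quantifies uncertainty too.
print(float(idata.posterior["rate"].mean()))
```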
hugo: You mentioned that everything you're working on at Hidden Door turns the liabilities of generative AI systems into superpowers, for lack of a better term. Having said that, there are certain things, such as the biases introduced, which I know you still have challenges with. Last time, you told me an anecdote about trying to create, let's say, and I'll get it wrong so you can correct me, a criminal in Brooklyn or something. So maybe you could tell us that story, and then we can use it to approach the types of challenges you have with generative AI.

hilary: Yeah, it's a great example, which is why, well, it also happened to be the thing I was working on that day. The example is that I have created a fictional Brooklyn world on our system, [00:36:00] used only for development. The reason I did this is that if we are testing stories in The Wizard of Oz, or in a modern-day take on eldritch horror through Call of Cthulhu, and something happens that might be allowed within the rules of the world but is a little bit off, you are not, or at least I am not, as likely to pick up on it as when I am literally recreating this office here in Brooklyn, New York, in the game world and then setting up scenarios. I was playing with our combat experience. Obviously combat is a super important part of every role-playing experience, and sometimes you really do just want to punch that bad guy in the face. This is a place where you should be able to do that; many of our worlds embrace the comic book violence aesthetic, and we want that to be a really satisfying experience where you're not just typing, I [00:37:00] punched the guy, I punched the guy, I punched the guy. This is a game design challenge: how do we make that narratively suspenseful and fun? So we do a lot of testing of different narrative approaches to manage things like combat. Also things like flirting, so it's not all action-oriented, or I guess that's action too, but some of it is really about relationships and social status.

So anyway, I set up the scenario. I had recreated this office, a startup office in Brooklyn with brick walls, and I said, give me a bad guy who's really, really mad, and I let the LLM fill in all of the attributes of the character. And he's always named Dominic. He's Italian American, because Brooklyn, right? He always has a tattoo up a bulging bicep, usually a dragon or a snake, something reptilian. I was making probably a hundred Dominics. And we give our players the ability to twist [00:38:00] their world, so you can say, I want Brooklyn, but, my favorite twist right now is our pumpkin spice twist, where everything is autumnal and smells of cloves. But if you make it stylish, you get a Dominique: the same Italian American sort of mobster, but now it's a she, her, a woman. And it's really funny, because this is the distilled essence of "give me a tough bad guy in Brooklyn"; this is who you get out of an LLM over and over and over again. Maybe I'll have to share the gallery of Dominics for folks listening, because it's really funny.

hugo: Cool.

hilary: But it is also, to your point, a problem, because how does that feel? For one thing, they're almost all men. They're almost all described in a certain way. One [00:39:00] of the principles with which I approach everything I do is that you should generally try to build things that make the world more like the one you want to live in. The way that applies at Hidden Door is that we want our players to feel included. So we let you decide who you are, and we respect that. That means that if you decide the world you're in should look a certain way, we respect that too. And by default, we do not want to be propagating these biases.

So we do a few things. Because we know the kinds of things we're looking for ahead of time, we do not let an LLM assign pronouns; we assign that before asking, potentially, an LLM in some form to generate a piece of a character. We have also designed, I'm trying to think of a fast way to say this, an architecture where [00:40:00] we mostly use LLMs for translation and for combining a few ideas into one coherent sentence. We do not use them to generate,
other than in this test scenario, and that's because you get these kinds of biased stereotypes, but also because an LLM likes to do very samey things, and it's boring. What we've designed instead is a bunch of control around things like character avatars: they are never assigned or generated live. We work with artists to create bits of art that get pulled together, so it's never just up to an LLM, or any form of model, to create art. We do that for pronouns. We do that for any description of a character's physical appearance. And then we have designed the entire [00:41:00] system around the idea that LLMs do not write stories; people do. So every story you play on the Hidden Door platform is a collaboration between the player, who has made up a character and is saying, my character punches the bad guy in the face, or my character drinks her coffee, or, I try to swing off that light fixture and do a flip and land on the floor, and the system. In that system are a tremendous number of handwritten tropes that my team has put together, along with the original world author's rules. So we'll say, okay, you're in the setting of The Wizard of Oz: it is storybook fantasy, it has this kind of magic, these sorts of physics, this kind of violence, and it has special kinds of characters and special rules. By the way, in Oz, animals can talk. In most worlds they can't. Again, in Brooklyn, dogs don't talk to you, but if we take that same character and put them in Oz, they will [00:42:00] talk, if they want to. Then we come to the system, which has all of these tropes pre-written. Tropes are things like a bar brawl, a startup office, even a tough guy is a trope in the system, and we say, here's the gender distribution, here's how we want them to be described, and here are things that might be true about them. And then we use LLMs in that moment to say, okay, take these tropes, this thing the player is doing, they succeeded this way, here's the game state change (what's actually happening in the game engine under the hood), and now make a sentence that translates that into language. I think there's a really interesting design pattern here, which is the pairing of a structured database and structured data controls with language that is also load-bearing. I'll stop there.
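To illustrate the shape of that pattern, here is a hypothetical sketch, not Hidden Door's actual code: structured, human-authored trope data and ordinary game logic decide everything sensitive (pronouns, traits, outcomes), and the LLM-shaped step is confined to rendering one sentence. All names are invented, and the renderer is stubbed with a template standing in for the narrow LLM call:

```python
import random

# Hypothetical trope entry: structured, human-authored control data.
TROPES = {
    "tough_guy": {
        "pronouns": ["he/him", "she/her", "they/them"],  # sampled by the engine, never by the LLM
        "traits": ["quick temper", "old grudge", "soft spot for dogs"],
        "style": "hard-boiled, comic-book violence",
    }
}

def resolve_action(trope_id: str, player_action: str, outcome: str) -> dict:
    """The game engine decides state changes with ordinary code, up front."""
    trope = TROPES[trope_id]
    return {
        "pronouns": random.choice(trope["pronouns"]),  # assigned before any model call
        "trait": random.choice(trope["traits"]),
        "action": player_action,
        "outcome": outcome,
        "style": trope["style"],
    }

def render_sentence(state: dict) -> str:
    """The only LLM-shaped step: translate structured state into one sentence.
    Stubbed with a template here; in a real system this is the narrow LLM call."""
    return (f"Your punch {state['outcome']}; the tough guy ({state['pronouns']}, "
            f"{state['trait']}) staggers back against the brick wall.")

state = resolve_action("tough_guy", "punch the bad guy", "lands squarely")
print(render_sentence(state))
```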
hugo: Yeah, that makes a lot of sense. What I'm hearing is [00:43:00] essentially combining generative models with structured systems, such as metadata-rich databases, in order to maintain as much control and reliability as you can. This is a really nice example for me, and I do want to tie it into what we're seeing more broadly. I don't want to talk about AI agents too much; there are enough conversations about agents. But a lot of what we're seeing is people giving one big prompt to LLMs, YOLOing it, and hoping it will do the things they'd like it to do, and we've seen so many failure modes of that emerge of late. So something we're seeing more of is, let's say you do have a customer service AI assistant. To your point about prototyping something in a Brooklyn office, the analog is internal people, human customer service agents, [00:44:00] testing out the AI systems before launching to the public, which I really like. But you could imagine, if you've got an AI customer service agent whose job is to help people change their flights or something like that, you want to put guardrails in and have maybe just three things it can do, expressed in business logic. You try to translate anything a user says into one of those things, but it can't actually do anything else. So you do have LLMs, but with very strict guardrails, in that sense. I think that's something people working with this in business can take away, but I'm interested in your thoughts.

hilary: No, absolutely. And again, we're making fictional stories where, if the system goes wrong, it's okay: at worst it tells a bit of a goofy or incoherent story. Nobody gets hurt, nobody's flight gets cancelled, the stakes are very low. Even still, we have a Postgres database of [00:45:00] tens of thousands of English-language words and phrases, enriched with metadata, that we check everything against. So if a player says, I pick up this glass of water, we'll say, cool, what is a glass? It is a small object; it often contains a liquid. Is water reasonable to have in that cup, in this world, in this moment? All of that happens programmatically, because what if the player says, I pick up the elephant? Obviously, no, you don't, right? Unless you're in a world in which gravity is weird, or you have superpowers, or something like that. Those kinds of robust metadata checks go a very long way towards ensuring compliance with the intended action.

It is also a lot cheaper to run. This is something I think is underappreciated about these combined systems. If you are making, like you said, agents, which I'm going to roughly define as ten LLM calls in [00:46:00] a trench coat to solve a problem, that sometimes is the right way to solve a problem. And I've seen one version work really well: if you're doing something like generating code for a system, you try to run that code on an emulator, you get an error message, and you say, cool, can you fix it? Because I got this error. That's a really good use of an agent architecture. But even those are tremendously computationally expensive to run, and a lot of it is really just relying on the LLM to have a world model that mimics the world model you need to make these business logic decisions. So if you actually have that world model, or you have some way of representing it in structured data, and you do those computations before you go to an LLM, you will probably dramatically cut your cost of operations. And when things do go weird, you can actually look at why, fix your database, and run the thing again, which is much better than dealing with these [00:47:00] cranky, occasionally grumpy black-box systems that sometimes just don't want to give you the data format you asked for.
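Here is a rough sketch of what such a programmatic plausibility check before any LLM call could look like; the schema and entries are invented for illustration, and the LLM fallback is a stub:

```python
# Hypothetical metadata table; in Hilary's description this lives in Postgres
# with tens of thousands of enriched words and phrases.
OBJECTS = {
    "glass":    {"size": "small", "liftable": True,  "may_contain": ["water", "wine"]},
    "elephant": {"size": "huge",  "liftable": False, "may_contain": []},
}

def ask_llm_fallback(obj: str) -> bool:
    """Stub for the expensive path; logged so the database can be extended later."""
    print(f"no metadata for {obj!r}, falling back to an LLM")
    return False

def check_pickup(obj: str, world_rules: dict) -> bool:
    """Cheap, inspectable business logic that runs before (or instead of) an LLM."""
    meta = OBJECTS.get(obj)
    if meta is None:
        return ask_llm_fallback(obj)   # no coverage: only now pay for a model call
    if meta["liftable"]:
        return True
    # "I pick up the elephant" only passes in worlds whose rules allow it.
    return world_rules.get("weird_gravity", False) or world_rules.get("superpowers", False)

print(check_pickup("glass", {}))                        # True
print(check_pickup("elephant", {}))                     # False
print(check_pickup("elephant", {"superpowers": True}))  # True
```

When a check like this fails, you can see exactly which row and rule fired, fix the data, and rerun, which is the debuggability point Hilary makes above.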
hugo: Yeah, that makes perfect sense. Now I do want to jump into thinking about data science in the age of AI explicitly, but maybe with a lens through how you've built your organization. You work with a lot of generative AI, but what types of skills and roles do you hire for? What skill sets do you look for? And how does this relate to the skill sets we all developed over the past 20 years of data science?

hilary: For us at Hidden Door, it's an interesting product to work on because there's no quantifiable objective function for correctness. So I cannot hire people who are coming up as only machine learning or only data science. I need people who are creative, who understand and have empathy for an experience, and who can create the data science work to support that with excellent [00:48:00] judgment.

hugo: To push back on that just slightly, and sorry, I don't mean to cut you off, but is there an argument that some of the best data scientists and machine learning engineers historically already have this?

hilary: Oh, I think...

hugo: Like, they can see through the metric they're asked to optimize for, figure out exactly what the business problem is, and understand how Goodhart's law, or whichever law tells us that all metrics are gameable, applies.

hilary: Exactly that. And this is why I think so many really great data people make excellent founders or CEOs or leaders. If you have that analytical ability to know what's real and what isn't, to know what problem is worth solving, to be able to look at the world and infer where you can use the tools in your tool belt to change something, that's it. That's the essence of it. That is great data science. It is also being a really good founder of a company, and it is also what we look for in the people doing 100 percent hands-on-keyboard work at Hidden Door: that [00:49:00] judgment. And in a world where people are encouraged to outsource parts of their judgment, or their systems' judgment, to an AI system, broadly and conceptually, good judgment and creativity are all we have left. That's really the core of what everyone will need to be thinking about.

But if we think about data science roles broadly: I've never really thought that machine learning, data science, and analytics are that different, because the math is the same. There are different assumptions about what you bring into the role and about the kinds of work you're going to do. But it's data science work if you're trying to write something for a slide presentation that's going to go to somebody who's going to make a decision. It's also data science if you're writing a function that's going to automate something at scale, [00:50:00] like a spam classifier. It's also data science if you're creating content recommendation or search algorithms, something where there is no one objective function; still data science, and you're using data to create that experience. All of this is the same messy set of skills, packaged or branded slightly differently.

And now the entire world's attention is on one form of generative AI, which is a term that has lost its, to your point earlier, specificity. That's led to a couple of things. I have met people who only know transformer architectures but have never actually trained a model from scratch. That doesn't work for us, because a lot of what we do is decide, okay, what is the right model for this particular subset of the problem, and how does it plug into a fairly complex set of other models or other systems that are [00:51:00] running?
So a sole concentration on that may be helpful for somebody in one part of the market, but it's not helpful for us. I also think, if we're being honest, that the market is branding AI roles as more valuable, with more status and probably more money attached, than data science roles. But it's the same thing under the hood. So for any individual who wants to jump on that branding bandwagon: go for it, go do awesome work. I'd probably do that if I were at the beginning of my career. I do worry about an over-rotation on LLMs as the one true way or the only way. It's not usually, or even mostly, the best way to solve most practical problems, and that leads to a ton of issues in production that I think a lot of people are about to run into. The skills to see through that are not always the ones that are really sexy right [00:52:00] now, so there is an opportunity space in the market too.

And if we look 10 or 20 years down the road, I still hope that data science fluency is something we expect of most professionals, and that we have tools that make it easy and robust, the Excel-style stuff. We will always need people who are capable of asking the right questions and getting to answers. Even today, some answers are trivial to get to and some are impossible, so most of data science spends its time in the messy middle, where we have something tractable to work with: we have data, and we have a way to do the work. I don't see that messy middle going away, but I do think it'll change.

hugo: Yeah, agreed. And I do agree that all of these different roles, for lack of a better term, are slightly different [00:53:00] perspectives on similar skill sets, branded in different ways. I'm glad you mentioned the potential of salaries to make a difference, because my cynical, provocative take is that really the only difference between data analysts, data scientists, machine learning engineers, and AI engineers is the absurd delta in the salaries they can command these days. And we've seen it historically in organizations, some that I won't name, here in Brooklyn over the years, where there have been gatekeepers for data science who don't allow data analysts to get that title even though they're doing similar work, and the salaries commanded have had a similar issue.

hilary: I'm not a fan of gatekeeping at all, if that hasn't come through.

hugo: I want to make it very clear that it definitely has come through. Some organizations historically, and once again not to mention any names, and this has changed a lot, have said only PhDs can be data scientists; everyone else is a data analyst.

hilary: That's just lazy, though.

hugo: Yeah. And arguably, I half [00:54:00] joke, but not really: I'm totally overeducated. I've got a PhD, and someone who doesn't have a PhD could, at least early on, have delivered significantly more value to an organization than I could; I needed to be detrained, particularly in terms of thinking about timelines of deliverables, to be clear. I also love that you mentioned that LLMs may not be, and clearly aren't, the solution to everything. We do have a giant jackhammer currently, and we're seeing a lot of nails everywhere. So I love that we've talked around data science, analytics, even the tens of thousands of rows in your Postgres database.
So knowing that generative AI can open up new possibilities, while recognizing it's not always the right tool for the job: how do you approach the decision of when to use generative models versus other statistical and data science techniques? And what advice would you offer to other teams navigating these challenges?

hilary: I think it's to be relentlessly pragmatic. And if you're a team leader, it's really about what you reward. [00:55:00] What do you celebrate? What do you consider to have status? If you just inherit that from the market at large, you're going to get a bunch of people plugging stuff into LLMs, making cool demos, and then good luck to you in whatever happens next. The way I like to think about it is: there is no glory here for fancy math. There is no glory for a cooler algorithm. The glory is in the simplest possible thing that will suffice, be easy to maintain, be easy to understand, and fit into our fairly complex system to create this really magical experience. And that means I wrote a regex parser yesterday. I have no shame; I even put it on the internet. It's thinking about what gets us to the quality we need. What's sufficient? What gets us the controllability we need? What about the safety or bias [00:56:00] concerns we have? How are we going to understand this thing? And where is it going to fit in?

So we do use LLMs, and I think they are great for, let's say, prototyping where a classifier might fit in a flow. By that I mean the old-school kind of classifier: here are 20 labels, apply one of them to this example. Once you are satisfied that that hunk of code is in the right place and doing the right thing, and you've got the right labels, then, because you've presumably run it a bunch of times, you've got a really nice training data set, even if you don't have another one. You can go train a simple model that will be held in memory and probably run in a couple of milliseconds, and once you have trained it, you no longer have to do LLM-style inference for that task. This is an example, and a distillation, of a process we use quite often, which is to ask: is there a heuristic that'll [00:57:00] work here? Is there a heuristic with a fallback to an LLM, for the case where we don't have coverage in our database, or we haven't pre-generated or pre-written something sufficient for the purpose? If not, let's try it with an LLM, and then, if we start to use that at scale, let's take that task and consider it as its own potential problem. There are a variety of ways to solve for those things, and that's more or less the thought process. I'm not afraid of LLMs, but I feel a little bit grumpy every time I have to use one. Part of it is that I actually just don't like writing prompts. I think prompts are going to go away anyway, but I find it kind of annoying; it's not the kind of programming that gives me a lot of joy. And part of it is that it's a black box: you can't rely on it, and you have performance constraints. So, why?
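A minimal sketch of that prototype-then-distill loop: the LLM's harvested labels are stubbed here as a hardcoded list, and scikit-learn stands in for the small in-memory model. The task and labels are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Step 1 (prototype): an LLM labels examples while you settle on the task.
# Stubbed here; in practice these pairs accumulate from logged LLM calls.
llm_labeled = [
    ("I punch the guard",       "combat"),
    ("I swing my sword at him", "combat"),
    ("I wink at the bartender", "flirting"),
    ("I compliment her hat",    "flirting"),
    ("I sip my coffee quietly", "neutral"),
    ("I look out the window",   "neutral"),
]

# Step 2 (distill): train a small in-memory model on the harvested labels.
texts, labels = zip(*llm_labeled)
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

# Step 3: serve the cheap model; inference runs in milliseconds, no LLM call.
print(clf.predict(["I throw a punch at Dominic"]))  # most likely ['combat']
```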
hugo: Yeah. And I think it may be generous to refer to prompting as programming. I understand the intention, but I think we actually need to reframe our mental model of what's happening. I'm still surprised, for some reason, when an LLM can't count the number of Rs in strawberry or can't act like a calculator. I expect computers to be good calculators, but of course LLMs are not, and I know it's because of their probabilistic nature. We really need to reframe what our expectations are. I also love the idea of starting with more basic tools. For example, I always tell people, before even using machine learning, if you've got binary classification, run the majority classifier and use that as a baseline. If you then get machine learning models that only slightly outperform it, maybe you don't need them, considering the cost of putting them into prod. And before building any form of RAG system, people should use classic information retrieval, maybe with generative aspects to create the response. So it seems like a conversation, [00:59:00] but start with BM25 or something, a bag-of-words retrieval function, and then have an LLM make that generative in some sense.

hilary: I forgot to mention: we use embeddings extensively to make bits of language map to the right other bits of language without LLMs. So that's another approach.

hugo: I love that you mentioned embeddings. In all the conversations happening currently, we've actually forgotten about embeddings and how important they are, including how important they are in LLMs: your LLMs are only as good as the embeddings you've used, right? Embeddings actually create a mental, and perhaps semantic, model of everything you're using in your model.

hilary: Absolutely. And it can also be a very useful shorthand. Take the classifier example I was running through a little earlier: what if you don't know the labels of the thing you want to classify, and they're dynamic? Embeddings are great for that, because you [01:00:00] can say, hey, here's this new input; where does it fall in this space of other language that is dynamically interesting in this moment, without a preset list? And does that work well enough? You can also create purposeful embedding models off a certain vocabulary that can be useful for a specific task. You don't need to solve the all-language-for-all-people-for-all-tasks problem when you have one specific thing you're trying to do. You can get much greater quality, and it can be much more efficient.
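A small sketch of that kind of dynamic, preset-list-free matching, using the sentence-transformers library; the model named here is one common small open embedding model, and the candidate phrases are invented:

```python
from sentence_transformers import SentenceTransformer, util

# Any sentence-embedding model works; this is a common small open one.
model = SentenceTransformer("all-MiniLM-L6-v2")

# The "labels" are just whatever language is dynamically interesting right now;
# they can change on every request without retraining anything.
candidates = ["start a bar brawl", "order a drink", "flirt with the bartender"]
query = "I grab a stool and swing it at the nearest guy"

cand_emb = model.encode(candidates, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, cand_emb)[0]  # cosine similarity per candidate
print(candidates[int(scores.argmax())])        # likely "start a bar brawl"
```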
hugo: Absolutely. I'm very interested in the fact that you said you think prompting may disappear. Can you tell me a bit about that?

hilary: This is one of my hot takes. I think prompts are a terrible way to interact with LLMs, and they're an artificial side effect of the way these models are trained and created. You [01:01:00] essentially have the trained LLM as a set of weights and relationships, and then how do we as human beings say, okay, thing, do a task for me? What we end up doing is spellcasting; it's not engineering. Because what we're doing is trying to frame the input language to map to a particular set of weights and biases in that model in a way that's going to make it complete our task. And there's a loose correlation with how human beings talk about the thing on the internet, or wherever the training data comes from. So we start with the way we might ask, say, an intern, or sometimes a pet, to do a task: very clear language. But then you learn that, oh, actually, if I'm a jerk, it might do a better job, which is something we found in one of our systems a little bit ago. Or you learn that it's refusing to do violence when you ask one way, but if you give it this [01:02:00] little checkbox where it can mark that something actually is inappropriately violent, it'll totally do violence for you. So you learn these tricks, and none of this is how we actually communicate. Unless you spend your day repeating yourself in a particular performing tone of voice to a bunch of people who are very slow to catch on, and maybe that is a set of experiences I don't have, it is not a natural human interface, exactly.

hugo: It pretends to be, though. It's uncanny valley, for sure.

hilary: Yeah. And the second reason is that many of the tasks for which we want an LLM to make a decision, or to do something, would be much better served by rich context: multimodal context, time-series context, video and audio, even [01:03:00] other forms of sensors, of an environment, of a temperature, whatever it is. So when I say I think prompting is going to go away: for a large majority of tasks that are actually useful, context is what we need in that model, and that context will be rich and come from, perhaps, an environment. Certainly there will be ways to define a task. But I don't think we will be writing prompts in code systems the way we are speaking to a ChatGPT product. Right now it's all mixed up. Is it a product? Is it an API? It's a mess. I think we will see a lot of clarity around models. And I think they're also going to run on our phones and other embedded devices, and be able to use the context of our environment, plus some signals from us as people as to what tasks we want them to perform in the moment. But the majority of the prompt will no longer be us saying, good morning, [01:04:00] pretend you are a professional writer who is setting up a scenario for a podcast. Did I do my schoolteacher voice? I was a professor for four years, so I've still got it.

hugo: Nailed it. It took me back. And just to be clear, I don't know whether you've played around with an app called MLC Chat. The day Meta released Llama 3.2, which I just think is such a wonderful release, it sounds incremental to 3.1, but they released a 3B language model and an 11B vision model as part of it. The day it came out, using MLC Chat, I could put the 3B model on my phone and speak with it.

hilary: It is amazing.

hugo: It's absolutely future music. It is so wild. So I do think it's a very exciting time and space, but to your point, yeah: less prompting and more drawing on context, like in the work you're doing, creating the context using the rich [01:05:00] metadata from your Postgres database, right?
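MLC Chat itself is a phone app, but the same small-model-on-local-hardware idea can be sketched on a laptop; this assumes the llama-cpp-python bindings and a quantized Llama 3.2 3B file you have already downloaded, with the file name below as a placeholder rather than a real artifact:

```python
# Rough laptop analogue of running a small model on-device.
from llama_cpp import Llama

llm = Llama(model_path="llama-3.2-3b-instruct-q4_k_m.gguf")  # placeholder path

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in five words."}],
    max_tokens=32,
)
print(out["choices"][0]["message"]["content"])
```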
hugo: We're going to have to wrap up soon, sadly, because I feel like once again we could keep going for hours. But looking ahead, what do you think the future holds for data science in this era of generative AI, and what should data professionals, quote unquote, very broadly speaking, focus on to stay ahead?

hilary: It's a great question. I think this is a very exciting moment to be in data science and to be interested in data science. And I think the principles are ever valuable: we can use data to learn about the world so that we can build things in the world that are useful, valuable, and move us more toward the world we all want to live in. We should continue doing that. No matter how much attention or hype goes to AI versus data science versus whatever, there [01:06:00] is a correlation between that attention and the opportunity spaces, particularly for folks who are younger in their careers. If you can identify a space where there's an opportunity, and there are so many, and come up with an approach to find that value, you'll be great. Or at least you'll have done a series of interesting things, of which some will be more successful than others. And that's true of many of us who have been doing this for a while.

But the one thing, and this goes back to an earlier part of our discussion: I think the thing we all need now is good judgment. No matter what a model spits out, you need to know what question to ask and how to evaluate the output to make it useful. I've always thought that what made a data scientist great is the ability to sit next to somebody else and understand their problems. What are they really trying to accomplish? What are their ambitions? What resources do we have? And then to go [01:07:00] away and do some work that can help that person make better decisions, help that person build better product. I would call that empathy, but also creativity. No model will tell you how to do this. No textbook will tell you how to do this. It's really: how do you put yourself in the space of what other people need, whether you're building a consumer app with millions of users, or really trying to help one professional in your office? That skill has always been the hardest one. As someone who has probably hired over a hundred people in data-related roles at this point, I can't train that from nothing. That respect, that desire, that empathy, it has to be there. It's the hardest one, I think, but coupled with the technical skills, then you're unstoppable. The world is yours.

hugo: I couldn't agree more. And I'm actually going [01:08:00] to link to a talk from a friend of mine, JD Long, who works in R and Python, and in reinsurance, actually. He's incredible. He gave a talk at RStudioConf 2018 in San Diego, which I saw, called The Unreasonable Effectiveness of Empathy, about empathy in building data science flows and data science products. I love that so much.

I also want your help, Hilary, in helping business leaders and data leaders think about how to manage expectations. Everyone has a lot of pressure to adopt AI, from shareholders to the board to the competitive landscape. So how can people in leadership positions be productive using AI, but temper expectations with all these forces coming in?

hilary: When someone shows up and asks a question like, when do you think we're going to use AGI, or, what's your AI strategy, I always start by asking questions back, because again, it's [01:09:00] empathy. It's understanding their worldview and why they're asking the questions. So I would say: can you define agents for me in this context? What do you think AI is? Where do you think it's going to be useful for us?
And then you say: look, here's what it is conceptually right now, and here's how it works. We take all the stuff from the internet and we put it in a big model, where we compress the relationships and biases between words and tokens, and we can use that to generate things that look like what went into it. I'm hand-waving extensively here, but these are executives we're talking about, so that's more or less the level you're going to be at.

hugo: You have to hand-wave.

hilary: Right? And then you say: okay, so what is that actually really useful for? Generally, at that point, we're thinking about: where is this business? What data is unique to this business that's interesting? What is something we would love to understand that we don't? Let's say we're looking at retail. Now [01:10:00] we have a tool to understand a volume of language, people talking about our products, that we previously could not really comprehend. You could read one in every thousand customer support emails; now you can play around with them and ask questions of them as a whole. Is that a product? No, but it's a place to start.

So this is all to say you're having a conversation. If you approach it as a game of defense, you've lost. You have to approach it as a game of collaboration, where you're trying to bring someone through your understanding as well, and you do that by being relentlessly pragmatic and focused. And don't be afraid to push back. If they say, Sam Altman says it's going to be intelligent and going to replace all our employees, what's your plan for that? You can say: I have yet to see the evidence of that. Here's the kind of thing I'm seeing that I think is really useful and interesting, and where we can find value. It is [01:11:00] holding to that. Again, data people have that superpower. We have that fundamental understanding we can bring into those conversations, while also being empathetic, open-minded, and open to the idea that somebody might come to this from a weird place. Maybe they read in Harvard Business Review that everybody's got to do AI now, and they don't come to it from a perspective you might value, but they might still have some good ideas. So you try to pull those out and say: cool, we're a team. I'm going to try to understand your point of view so that I can help you understand mine.

And at the end of the day, maybe they just want to check the box. They don't actually care what you're saying; they just want you to say, yes, got it, we're doing AI [01:12:00] everywhere. Because then you can call your spam filter AI, and your email summary AI, and you're done, right? So it's understanding what they actually need, and then engaging in that very earnest way.

I will say also, it takes a lot of power and privilege to do that, so it is not something a junior team member can do. If an executive shows up saying, AI everything, I'm not saying you should put your hand up and say, actually, no, what do you mean? That's probably not going to go well. This is more for the folks who have a seat at that table already. And then again, for folks in leadership: listen to the people who do not have the power you have, and bring their viewpoints forward. Put them in your own head.
Again, empathy, but in the other direction, so that you can be as smart as possible when you're sitting at that table, because everyone is depending on you to do that.

hugo: Yeah, absolutely. And something I'm hearing in there is: try to find the meaning behind their words. Try to understand what they really want, what their incentives are. Also, maybe tell them Sam Altman probably has his own [01:13:00] incentives to promote generative AI as well. But to your point, and to my point earlier, if you can show them that the business does better when you use BM25 or a logistic regression, let's go down that path. And you and I have lived through an age where people were calling logistic regression AI.

hilary: Oh, yeah, it's AI. We do AI all the time. We've got this.

I'll say one more thing, which is that organizations are inertially risk-averse, and this has always been a friction for data. Part of dealing with that risk aversion, and this is the opposite of the person who shows up and says, what's our AI strategy, is the person you show up to and say, hey, can we do this project? And they say, oh no, that's something different, I don't want to deal with that. In that case, it is finding the people who have the curiosity and the appetite for that risk. First find your friends, find your people in the organization, then try to come to [01:14:00] that common understanding, and then make it look good, and then make it look inevitable. Before you know it, it's everybody's strategy, and it always was. That's another way forward.

hugo: Totally. Okay, I have two more questions; I had one more, but another just cropped up. We've been talking about how AI and data science can automate a bunch of what junior analysts would be doing. I think about this a lot: my mum is a lawyer, a barrister in Australia, and what we're seeing is that AI may be able to automate a lot of what paralegals do as well. This has been a concern in data science, actually: career paths, how to get your first, junior position. What implications does this have for career paths, and for early-stage data scientists being able to learn on the job?

hilary: It's not just data scientists. I think it's software engineers, I think it's anyone, paralegals included: folks in a role where the job is primarily taking a well-formed problem statement and then doing a repeatable [01:15:00] process to answer that problem. Any job like that, I think, is vulnerable to change. So then you think about, okay, what part of that is not vulnerable to change? It's the good judgment, the ability to write the problem statement in the first place. And how do you learn that stuff? You learn it from working with people who have that judgment, who can teach good taste, refining your own sense of taste. This does not solve the problem of how you get a job when there aren't any. I can't solve that problem, and it's something I worry about. But if you are lucky enough to have options, go work with the smartest, wisest people you can find, because that experience is the thing you're going to learn from them. It's not the technical skills; it's that judgment, and that's going to matter a lot, especially earlier in a career.
And I think we will see [01:16:00] certain roles change completely over the next several years. Data scientists and data analysts already exist in this very fortunate space where there's a lot of surface area to explore. So even if nobody needs you to hand-write the regression, because we can just have something generate it, there are plenty of other interesting problems to solve and things you can start to do and build. It shifts where the human effort goes, and I think it's going to be painful, but at the end of the day it means that many more people can use the skills of data science for their own purposes. And then our role as data scientists is to elevate those tools, help people understand when and how to use them well, solve the rare problems that actually are solvable but still complicated, and pull people together to agree on what's even worth doing. That becomes what we [01:17:00] do.

hugo: I love this, because, and we're closing now, this isn't too different from how we wrapped up our podcast eight or nine years ago, which I think is actually very heartening. We've had a wide-ranging conversation with respect to data in the age of AI and LLMs. I'm wondering: if there's just one takeaway you'd love data practitioners to leave with, what would it be?

hilary: Just that this is about human beings and respect and values much more than it is about being a good programmer or being really good at math. That has been my experience my whole career, and as you say, it's not that different from eight-ish years ago.

hugo: Fantastic. Thank you, as always. I always enjoy chatting with you, but thank you for spending the time and sharing your wisdom and expertise, and what you've been doing on the front lines of data and generative AI for years now.

hilary: Thank you for having me, and come play with us!

hugo: [01:18:00] Definitely.