The following is a rough transcript which has not been revised by High Signal or the guest. Please check with us before using any quotations from this transcript. Thank you.

===

[00:00:00] That's really what we care about: trying to get people to graduate from that thinking and think about using data in a way that is not a cost center, but is directly driving profitability for their organization.

That was Noah Broman, president of Data CRT, on the existential question of moving the data function from a cost center to a strategic value center. How do Noah and his team make this happen? They turn analysts into Jack Ryan.

So think about other industries and analysts, like Jack Ryan at the CIA, right, or one of these financial analysts at a big bank who comes on TV and tells you, oh, I think a recession is coming, or whatever. What they're fundamentally there to do is not the mechanics of how you crunch the numbers. They're there to think about: how do I make the situation really clear, and then what do I think is gonna happen based on that, and what should we do about it? And [00:01:00] that's a cool job. That's the whole promise of going to work in data, right? That's what everybody wants to do. And all the number crunching is stuff that we've had to crawl through, and unfortunately it has become such a blocker in many cases that it's become almost the job description, right?

Noah is a veteran data consultant with over 15 years of experience spanning trading floors and Silicon Valley startups. In this episode of High Signal, we explore the architectural and psychological shifts required to move data teams away from being ticket takers and towards the Jack Ryan model, acting as high-level strategic intelligence analysts. We discuss how LLMs and agentic tools are automating the technical drudgery of data preparation and more, allowing professionals to focus on problem framing, the ability to reduce complex business ambiguities into actionable insights. Noah also explains why AI amplifies your existing data culture, the importance of no-assistance reporting as a kind of hallucination [00:02:00] filter, and how rebranding documentation as context can finally secure executive buy-in. It's a thoughtful and actionable conversation about moving the data team from a service center to an essential left-brain partner for organizational decision making in the era of AI.

If you enjoy these conversations, please leave us a review, give us five stars, subscribe to the newsletter, and share it with your friends. I'm Hugo Bowne-Anderson, and welcome to High Signal, the Delphina podcast.

Hey there, Noah, and welcome to the show.

Hey, thank you. Nice to be here.

It's so great to have you here, particularly as you've been working in data for 15 or so years now, from trading to Silicon Valley startups to seven years or so in consulting. I'm wondering if you can talk us through the through line you've seen in how companies think about data, and maybe even what hasn't changed over that time that should have.

Oh, "should have," interesting. From [00:03:00] the beginning of my career, which, as you noted, has bopped around a fair bit: trading, then more on the engineering side, then more on the data science and analytics side. I think there have been several constants. One is that most of the work is the iceberg thing underneath the surface.
It's getting the data prepared and understanding the data and all of that up front. And another is that data's always kind of an odd duck, the weird one in the room, and getting the relationships right, making sure that the folks at the top who need to be utilizing the data understand where you're coming from, what value you can provide to them, and that kind of thing, is really one of the big levers for making it so that what you're doing actually has an impact. There's this little pocket of visible work that everybody tends to think of and talk about if you're talking with people outside of data, and then there's the 80% of everything that goes on around it.

That [00:04:00] makes a lot of sense. And in terms of what hasn't changed that should have, there are two things I'm hearing in there. One that I find very relatable is that when that happens, people can start to think of the data function as a service center, or something along those lines. It could also seem to the organization as a whole like something that incurs cost, as opposed to something that delivers a lot of value itself. So I'm wondering if these are things you've seen play out.

Yeah, definitely. We see that cost center thinking a lot, and it's one of the things we try really hard to counter in our consulting practice, both because it helps sell the consulting practice, that's obvious, but also because our whole shtick is that we want people to build more effective data practices. That's really what we care about: trying to get people to graduate from that thinking and think about using data in a way that is not a cost center, but is directly driving [00:05:00] profitability for their organization.

And I think you tend to see this in two modes. One mode, which Duncan, I know you're super familiar with, is you build some algorithms, you test them in production, and you can demonstrate: hey, look, this is doing great things for you, right? That's your classic, easy-for-people-to-internalize mode, because it's very much the same product mindset that a lot of Silicon Valley is in already. And then the other mode is when people really shift and start thinking of analysts and data scientists as the people who are enhancing the left-brain thinking of the leaders in the organization. That's the harder one to conceptualize and get people to wrap their heads around. But I think it's really important, because we have this long history of data people describing themselves as, well, this Venn diagram of statistics and software engineering and other stuff. And [00:06:00] you mentioned something about Venn diagrams within Venn diagrams; it gets very convoluted, and it doesn't tell anybody what the value is. It just says, I'm weird. And they're like, yeah, I know, and I don't know what to do with you 'cause you're weird. If you can graduate people from that and get them to really start using you as that leverage point, that's where the value really starts to come from.

You are right, I did mention this last time, and I'll find it for the show notes: I once saw a Venn diagram for data science which had Drew Conway's Venn diagram as one circle, or maybe even an intersection, in the Venn diagram. Duncan, I am really interested in your thoughts on everything we've just been talking around as well, particularly 'cause,
as we've discussed before, at Uber, data and ML were a core function and part of the foundation of the company, so people had bought in already. But I'm wondering what challenges you had there with respect to building out the data function, compared with what you'd done [00:07:00] previously.

I think one of the interesting problems with data and the value of data is that it's actually so heterogeneous across groups and companies. It is for sure the case, especially in large technology companies, that production data models and machine learning models can drive extraordinary value. Those, as Noah called out, are super easy to quantify, are pretty transparent, and can literally drive billions of dollars of revenue for big businesses. That's clean. What isn't clean is everything else, especially as the data analytics or data science becomes further from real time, more strategic in orientation, and in businesses that are maybe slower moving and just don't have the quick-twitch muscle reaction to the fast decisions you can make with data and the hyper-targeting you [00:08:00] can make with data. There, the ROI of data becomes way harder to evaluate. That isn't to say it isn't there, right? I actually think of parts of management consulting as doing data analytics and data science at a different scale, charging for it differently and talking about it differently. But it's definitely more foreign to the Silicon Valley mindset, without a doubt.

I am also interested, and I know you've got questions for Noah as well, but I do think AI can help us with this in a variety of ways. So I'm interested in how you think about this with what you're building at Delphina as well, and then we'll get back to questions for Noah.

Yeah, I think the exciting thing about AI is that so many of our customers, when they get started with Delphina, don't even really know what they're gonna get out of it and what the value of using AI on data could be. It's really interesting and fun to see the discovery over time: oh, we could use this for the obvious [00:09:00] text-to-SQL kinds of questions; oh, we could use it for stuff that used to require a bit more judgment, maybe financial analyses or really deeper operational analyses. That's kind of the magic of AI: it partly opens Pandora's box of what you can do with data. That said, I still think the value of data is pretty hard to actually pin down, and that's something our discipline will probably struggle with for quite some time.

I'm quite curious, you just said "value of data." One thing that pops up periodically in this world is people saying: okay, analytics team, data science team, what are your KPIs this quarter, that kind of thing. Have you seen that, and have you found ways of going about it that feel really healthy? Or are you in the camp of, if you have to ask that, we're probably doing it wrong?

Yeah, I'm a bit on both sides of that. I'm in the Andy Grove mentality to a degree: I think you can define [00:10:00] sound operational processes. You should think of this as a process, and you should think about, okay, how are we going to manage this process, and what are our indications that
this is going well or not going well? That can be some numbers; it can also be subjective stuff. There have definitely been times in my career where I've been in organizations where we tried to religiously measure how many different kinds of experiments we were running at any point in time and how many data pipelines were being run. I've been in promotion reviews where we've discussed, okay, this person built this many data pipelines. And I am allergic to that kind of strict quantification in general. That said, you do need some way of knowing that stuff's working, and so at scale it's not crazy to try to look at some of those numbers.

I find that very painful. I also find it even worse in today's world with software engineering, right? Where all of a sudden we're all trying to measure how many lines of code the AI is writing for us. And that's a terrible metric, as we all know, but it's still the metric talked about the most on the internet, right? That's the thing bubbling up all the time. So it's a hard world of productivity measurement.

[00:11:00] Yeah, I was hedging a little there, 'cause I feel like I'm very much on the side of, if you have to ask that question, you're probably doing it wrong, for the reasons that I already said. We shouldn't be measuring, call it, the productivity of the data science group or whatever, because that gets you into this cost center model of how many dashboards am I getting for the thousands of dollars that I'm paying you in salary, and that's never going to feel great. But if the metrics that you're actually measuring yourself against are the operational metrics, like, did we actually help the marketing group optimize their campaign mix this quarter, or did we actually help the executive make the right strategic decision here, then it's, oh great, the ROI is amazing. That's harder to measure, and it's also really hard because it's never just within the data person's control, right? But to me that's always the framework you have to have if you don't want to end up back in this bad place of basically [00:12:00] measuring some proxy of lines of code, or how late you stayed up at night, versus dollars that are getting put into it.

I love those examples, Noah. And one question I do have around them is: they still frame data as something which can help existing functions in the organization, which I think is incredibly important. But I ask this question about AI as well. Just imagine if, with the computer revolution and then the internet revolution, we'd merely said, how can this help the sales team, or how can this help the executive, right? We would've lost out on so many things. So I'm wondering, what are we missing with data and AI, and how can we start thinking and dreaming bigger about what's possible?

Huge question. I'll speak to a tiny piece of it, which is how it works within the analytics and data science space, and that's not to discount that there are probably bigger things outside of that. You look at most organizations: they have [00:13:00] these terrible problems with actually gathering the data that they want, and I think LLMs have a lot of leverage there. There's a bunch of things out there that I think are super high leverage. But I'll point to this: the more we have really good tools to back up the data work, the more people can spend time thinking in the way that an analyst, in a really traditional sense, thinks. So,
think about other industries and analysts, like Jack Ryan at the CIA, right, or one of these financial analysts at a big bank who comes on TV and tells you, oh, I think a recession is coming, or whatever. What they're fundamentally there to do is not the mechanics of how you crunch the numbers. They're there to think about: how do I make the situation really clear, and then what do I think is gonna happen based on that, and what should we do about it? And that's a cool job. That's the whole promise of [00:14:00] going to work in data, right? That's what everybody wants to do. And all the number crunching is stuff that we've had to crawl through, and unfortunately it has become such a blocker in many cases that it's become almost the job description, right? And so I see this wonderful world in the future, hopefully, where a lot of that gets solved and we're spending less and less time on it. And yes, there are still technical skills, and you're still thinking very carefully about how the numbers are coming about, 'cause otherwise you get the wrong answers at the other end. But the rest is thinking in that capacity of: what's happening, what's gonna happen, what do we do about it?

I love that, Noah. How do we get there? How do we actually operationalize the data analyst, the data scientist, as Jack Ryan? That's my t-shirt; I want one.

So I think there are a couple of things. There's removing that technical blocker where everybody's spending all of their time preparing data, basically; that's my iceberg thing from the beginning. If you've got [00:15:00] 80% of the work happening just to get stuff stitched together... I think everybody who's in data has felt this, where the visible piece of the analysis you do is actually really simple and easy, and it's the preparation and just understanding, hey, where did this row actually come from? What spawns a new row in this table? What does it mean, so that you can reason about it? That's hard, and it takes a lot of time. I think there's a lot that LLMs can do to accelerate that.

There's one thing that we see in SaaS organizations in particular. This is true across the board, but especially in SaaS organizations and financial institutions: there's a huge disconnect between your contracts and your database, and it causes all sorts of downstream problems for analysts, right? If your process becomes, you write the contract and an LLM parses it in a really high-fidelity way, and it ends up in the database looking like what it should, you've gotten rid of a huge surface area of errors. [00:16:00] One of the things we have to do over and over again when we start contracts is go into people's CRMs, put together a bunch of rules that basically disposition everything in there, give them a dashboard, and say: okay, you need to designate somebody as your deal desk person, and every day their job is to go through here and fix all this stuff that looks fishy. Once you do that, all the rest of this data work that people have been doing, like, we can clear out hundreds of lines of SQL here in your dbt pipeline, because we aren't going back and trying to fix things like, oh, this person always books things incrementally, and that person just wipes it out and gives you the full value. All this stuff goes away.
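To make that concrete, here is a minimal sketch, not from the conversation itself, of what rule-based dispositioning of CRM records might look like. The schema, field names, and thresholds are all hypothetical:

    # A minimal sketch of rule-based CRM dispositioning (hypothetical schema).
    # Each rule flags records that "look fishy" for a deal desk person to review.
    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class Deal:
        deal_id: str
        amount: float          # booked contract value
        start_date: date
        end_date: date
        booking_type: str      # e.g. "new", "renewal", "incremental"

    def disposition(deal: Deal) -> list[str]:
        """Return human-readable flags; an empty list means the record looks clean."""
        flags = []
        if deal.amount <= 0:
            flags.append("non-positive amount")
        if deal.end_date <= deal.start_date:
            flags.append("end date precedes start date")
        if deal.booking_type == "incremental" and deal.amount > 1_000_000:
            flags.append("large incremental booking; check for full-value rebooking")
        return flags

    # Flagged records feed a daily review dashboard for the deal desk person.
    deals = [Deal("D-1", -500.0, date(2025, 1, 1), date(2026, 1, 1), "new")]
    review_queue = []
    for d in deals:
        if (flags := disposition(d)):
            review_queue.append((d.deal_id, flags))
    print(review_queue)

The specific rules don't matter; the point is that fixing records at the source replaces hundreds of lines of compensating SQL downstream.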
I think one piece of operationalizing this is doing all of the right things as you go along, some of which are amazing and new in the LLM world. Scraping, for instance: we went from covered wagons to jets all of a sudden, 'cause you don't have all this fragility around finding the right anchors and all that. So there's a bunch of stuff that is just amazing [00:17:00] now with LLMs. There's also a bunch of stuff that we should have been doing all along that now maybe we're gonna get to do because of LLMs, like documentation. Nobody ever paid any attention to documentation, but if you call it context and say, now your LLM's gonna be able to work with this, they're like, yeah, great, spend time on that. So there's a bunch of this sort of stuff for getting out of the business of thinking of yourself as a data manipulator.

And then you get to the hard part, which is how you actually convince people to use you this way. There, part of it is having the story to tell people. One of the reasons that I say Jack Ryan is not just to be cute; it's something that'll stick for people, right? So having the stories to tell people about how you think you should be functioning and what you want to be doing here, that seems important. And then also just actually doing it: carving out the time and resources to do it once or twice, so that people feel, oh wow, you actually got the business question, you actually came to [00:18:00] me with something that's actionable and interesting, and I want that now, over and over again. That's the great thing about all of this: I think once people get it, it's what everybody wants ultimately anyway. Nobody wants to be going and interacting with dashboards all the time. They want to be interacting with real intelligence.

That makes a lot of sense, and it's really exciting to hear you talk that through. I guess the probably cynical take is that an organization that's already in cost center mode is so hard to move past it. If you have the people who are used to having tickets cut for them and just working through those tickets, if you have a leadership or executive team who's used to just throwing things across the fence and hoping they'll come back in some reasonable state, how do you go from A to B? And have you seen that happen in practice, that an organization in cost center mode becomes the value center of [00:19:00] the Jack Ryan operation?

Over the course of the last 15 years I've been in the industry, and beyond that too, data has been in this kind of magical place where, even though it's this weird redheaded stepchild of the business, it also, because it's sort of weird, periodically gets these huge opportunities to do more and different things, right? So you had the big data hype and the data science hype, and now the LLM hype, or AI hype, whatever you wanna label it. A lot of the times when there are shifts like that, it comes from the fact that, because the technology is maturing quickly and because there's a pipeline of really quite smart people coming through the data world, there are these spikes where everybody actually pays attention. Those are wonderful turning points for people to be able to say: oh, well, yeah, analysts, the ones we treat this way, sure, fine, make your tickets, whatever.
But I'm a [00:20:00] data scientist, now let's do this a little bit differently. And that's why you saw this whole thing where a bunch of people who were labeled analysts got relabeled as data scientists. Sometimes that's a healthy thing, because it's actually changing the paradigm, and sometimes it's a terrible thing, 'cause it's just saying, oh, we're gonna keep doing the same thing we were doing, but with a different title, and then you don't get that opportunity anymore. But that's one place where this can happen: there's an external disruption, and people can take advantage of it. And I see that as right now. If you're in data, this is an amazing time, 'cause there's this huge external disruption. Think carefully about what you want to be doing and take advantage of it to try to shape that.

And then the other, non-external version of this is just a lot of lead bullets. It's not, oh, I have a magic thing to sell you, just do this and all of a sudden it's better. It's, well, you gotta keep making traction, you gotta keep talking to people, you gotta keep talking about the end state, you gotta keep trying to show what that end state would look like. If you just keep [00:21:00] repeating that, hopefully it comes around, and if it doesn't, then maybe you go find somewhere else where that's part of the initial conversation of defining the role.

I like that you mentioned this rebranding movement from analyst to data scientist, 'cause we're seeing something analogous play out in some ways with ML engineers becoming AI engineers, and perhaps soon context engineers, harness engineers, whatever the term du jour is. These things are very real as well, right? 'Cause they impact salary, among other things. Also, last time we spoke, you said something that has been itching my brain, so to speak: you said that AI gives you more of whatever you are already doing. I'd love to know what you mean by that, and first, to think through what "more" looks like when the fundamentals are shaky.

Yeah. So my basic assertion here is that yes, AI is an amazing tool, and yes, it changes the paradigm in a lot of ways. But fundamentally, [00:22:00] if you've got a data warehouse that's a complete mess and is not documented and is not structured well, and you go to even a really good LLM tool and say, hey, I want you to do this analysis, and then you don't bother to look at what it's actually done, you're probably gonna be vulnerable to a lot of the same problems you're vulnerable to if you do that same thing with a human. And we've seen forever, in companies that are data literate and do a lot of data work where it matters, I'm thinking back to my trading days: when a junior analyst comes to you with some data output, the very first thing you do is look at it really carefully. The senior executives who are really good at this, you can watch it happening; they're looking back and forth and back and forth. And the next thing they're gonna say is: why is this number a little different from what I'd expect from that number? Or, I'd expect these two things to be correlated, and they don't look correlated at all; [00:23:00] go take it back, check it. And so you have to have that. And maybe even, someday, a lot of this gets built into your LLM harness or whatever: there's an adversarial agent. I think the recent Claude release shows us that this is happening; there's some adversarial stuff happening too.
That's great, but ultimately the organization needs to be responsible for doing that, for making sure all of these quality bars are built in. If you don't have that culture, and it's not something you're paying attention to, you're just gonna churn through more analysis and keep getting more bad answers. And that's no good.

I like something you've mentioned in there from a variety of different perspectives, and it's the ability to stress test our systems. 'Cause when you talk about AI giving you more of whatever you're doing, I do think what we're seeing in software, right, is the ability to write and reason through good tests for the software that we build. One way that seems to be playing [00:24:00] out today is in building AI-assisted software. I don't even think we should call it AI-assisted coding anymore, because it isn't like the AI is assisting at coding; it's really assisting at building software. We're writing it with one framework, such as Claude Code with Opus 4.6, whatever it may be, and then building tests or stress testing it with Codex. And you spoke to that adversarial nature as well, which is like the executive or the domain expert coming in and asking, what's happening here, what's happening there, and making sure we've built that muscle before going in and getting on the wonderful dopamine cycle of using Claude Code.

Yeah. And to me, this is the wonderful promise of it, right? One of the reasons you want smart people doing your data work is that, as you're going through it, you have to hold so many things in your head at once. You have to keep zooming in, [00:25:00] zooming out, zooming in, zooming out, looking at it through two different lenses. Does this make sense from the perspective of just the rules of the database, like, are all of my joins gonna be okay? Does this make sense from the perspective of, I know I have a hundred thousand customers and there should be a subset? There's a bunch of stuff happening, and the more you can codify that and have it just happening in the background as you go through things, the higher quality your output's gonna be, 'cause nobody has time to do everything they want to do to check their work on their own. So if you can give people a bunch of intelligence to assist them, that's pretty amazing.

Do you think there's a kind of really bad outcome here, where folks overuse AI to make a really important decision, and we have some kind of bad AI-driven disaster coming? And if so, what does that look like?

Again, maybe this is just my people lens, but I keep comparing to things that I've seen with humans, and this has [00:26:00] happened; this does happen. People pull data and they think it's right, but there was a fan-out in their join or something else happened, and you make a decision on it. You go, oh my gosh, we're gonna invest a lot of money in making it so that this target population really loves our product, and two years down the road you realize: oh, those people don't spend any money, and we could have known that. So these things happen.
They've been happening for a long time. And I think, Hugo, as you were saying, there's this dopamine effect of using the LLMs, where it's like, wow, I can just get to the answer so fast here. I think there's going to be a psychological retraining that all of us have to do. I'm finding that I have to do this in my day-to-day work now too, saying: whoa, whoa, slow down. This is fundamentally the same analysis that we're trying to do. We have to treat it with the same care, even if it feels like I'm interacting with somebody who's pretty smart and is taking care of all of this [00:27:00] for me. So I think that's the real risk: there's this additional layer of psychological ease that can lull you into a false sense of security.

I'll add quickly that what I find in my own work, which is both exciting and tricky to manage, is that the AI is getting better so quickly that it's hard to baseline what you actually should expect from it. I think your analogy to humans is good: with a human, you learn, okay, this person is a junior data scientist, this person's a staff data scientist, this is the quality of work and the level of depth to expect. With AI, six months ago it didn't work very well for data stuff at all, and now it can work pretty well for data stuff; amazingly well, in fact. And so knowing where the rough edges are, and where you really need to dig, is not as intuitive as it can be with a human.

Yeah. Also, to your point, Noah, I do wonder whether, given how mindful we've gotta be in interacting with systems like Claude Code, [00:28:00] Cursor, Amp, whatever agents we happen to be using when building, we've internalized the lessons from using Instagram and these types of things, and whether we can be better. I don't feel it's a coincidence, right, that this is a technology that's taken off now, given everything we've seen in the attention economy. And, Duncan and I have joked about this before, if the future is managing a hundred agents at once, I do wonder whether Instagram has actually helped us have that fractured attention, if channeled correctly. The other, perhaps slightly wild, thing I'm interested in, Duncan: you talked about these models and agents and capabilities changing so quickly, and it isn't like having a colleague or even a new hire, whose rate of change is far slower. It's not only about skills, it's about meta-skills in a variety of ways, like knowing how to be flexible yourself, how to change. And I know you have two young children, and I wonder if there's anything to learn from the rate of growth of having [00:29:00] kids with respect to these technologies.

Yeah, that's actually a good analogy too, Hugo. There's the sense in which you want to put reasonable challenges in front of these technologies and see what happens. You shouldn't go too crazy, but you also should be a little bit ambitious, to make sure there are challenges and opportunities to problem solve. And I guess that's the state most organizations and companies are in today, where they're trying to figure out: what can this stuff do? What can it do now that it couldn't do six weeks ago? What happened with the Claude Code moment in December '25?
What's changed? And I think the pace of change seems even faster than that of kids. But that's an interesting analogy that we should probably think more about.

Yeah, and as a meta-skill for oneself, just knowing that you'll need to be flexible: you'll have to have a different mindset in six months to the one you have now. I am interested, Noah, we've been talking around agents and LLMs, and of course I don't want any secret sauce from you or your clients, but [00:30:00] what can you tell us about your setup, and what the people you're working with use, in terms of using AI to power data science and organizational work?

We're in one of these fun moments where it's pretty clear that you can get a long way without having to go and spend a year doing in-depth work just to get everything hooked up. But I don't think there's a completely canonical path yet, right? We've seen people reasonably successfully using setups where everything is connected through MCP. We've also seen people who have said, no, we don't like that, it's not the right paradigm for us; we want everything to be CLI, and we give it skills for how to run the CLI. And there's still some question around how much you want agents to actually be calling tools at all, versus writing code and having the code be the lab-notebook [00:31:00] log of what you did and how you're approaching this. So it's a really exciting time, because there's not a single answer right now. And because we have a bunch of clients, we operate across a bunch of these different environments, and it looks different at every one.

I'll say some of the things that I've noticed working really well. One is making sure there are clear instructions to just continuously ground yourself back in the data, because so often we're still sitting on top of things that were built for writing code. If you've got, for instance, a dbt project and you're trying to do work in it, the agent will do all these things where it starts reasoning: well, this model feeds that model, and there's a constraint there, so I think this is what's gonna happen in the data. And you're like, great, I think that too, but what I think doesn't really matter; I wanna know, does that happen in the data? So that's one area we've [00:32:00] seen be a fairly good success lever. And then the other one, which I don't think is gonna surprise anybody, is just more documentation. And I don't mean more words, I mean more clarity. If you have more clarity in your documentation about how things work, what they really mean, and what the pitfalls are, hey, make sure you always filter for this thing before you do your analysis, because there are two pretty heterogeneous groups in this table, then that's also super high leverage if you want to be able to utilize it in a reasonably independent way to do any data work.
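A hedged sketch of what "ground yourself back in the data" can mean in practice: rather than letting the agent reason purely from model definitions, have it run a cheap verification query first. The table, columns, and documented assumption here are invented for illustration:

    # A sketch of grounding a claim in the data rather than in lineage reasoning.
    # Uses an in-memory SQLite database with an invented table for illustration.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER, segment TEXT)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [(1, "self_serve"), (2, "enterprise"), (3, "self_serve")])

    # Documented assumption: this table mixes two heterogeneous groups, and most
    # analyses should filter to one of them. Verify that before reasoning further.
    rows = conn.execute(
        "SELECT segment, COUNT(*) FROM accounts GROUP BY segment"
    ).fetchall()
    print(rows)  # e.g. [('enterprise', 1), ('self_serve', 2)]: the mix really is there

    assert len(rows) > 1, "expected heterogeneous segments per the docs"

The check costs one query and turns "I think this is what's gonna happen in the data" into "this is what's in the data."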
That makes a lot of sense. However, I'm curious. There are kind of two ways to approach data work. One way is to say, we need these [00:33:00] fundamental modules that are completely auditable, reproducible, always exactly the same, and have those be the Legos we stack up for any given analysis. A different way is to say, okay, we have a bunch of content; let's let this coding agent get pretty crazy with it and see what it can do, and prompt it to be careful, but really give the keys to the coding agent. I will say, for us, we actually started in the former camp, saying, okay, we gotta build these libraries to make it all work. And every time we built the library, we found the coding agent just did better. So now we're very much anchored in the latter category: give it some reasonable instructions and a bunch of high-quality content, but actually let the coding agent have a lot of latitude. What are your reflections on that? And when you look at your customers adopting AI, where are they in between those two poles?

Yeah, I don't know if anybody's fully out on one side of the spectrum or the other. I haven't seen anybody lock things down so that you can only use these five fully baked models with [00:34:00] all the documentation for these three metrics; I haven't seen anything quite like that.

But that, to be clear, is actually how old-school BI kind of works, right? You had the semantic layer, you had your metrics defined, you could drag and drop the specific things, the building blocks.

Well, exactly. And the reason we don't see that is that once you get there, it's just some dashboards, and so why add the LLM? It's just a more expensive version of the same thing. And then it's quite clear that just chucking a bunch of data at it and saying, read my Slacks, try to interpret this data, doesn't work either. The totally raw side of your data progression is also not accessible. At least not yet; maybe someday, but not yet. And so we see stuff in the middle: your staging or prep or silver layer, whatever you wanna call that piece, and your data mart or gold layer, whatever you wanna call that piece, with varying levels of control.

I'll say, for me, the things that really matter are, if you're gonna work with the less [00:35:00] structured data, that kind of thing: again, having that institutional knowledge of how to check that what's coming out actually makes sense is really important. It's the same as if you shove something over the fence to an analyst and say, hey, I need this information. Well, one, give them the context for what you're gonna do with it, so they can reason as they go about whether they're making reasonable assumptions and estimates and all that kind of thing. You gotta give that to the LLM too. And then, when it comes back over, do you have some metrics that you either know or go and look at to make sure this fits the overall picture? There's a lot of baselining that I think you need to have already built up first.

And there's a story here that, if any of my former colleagues see this, will maybe resonate. We did this thing that was, I don't know, maybe a little mean, at my old job when I was running an analytics team. Once a week we had an all-hands meeting, [00:36:00] and for a short segment of that, every week, somebody from my team had to get up and, without any assistance, tell the company the top-line metrics for the area they were involved in. So, you know, for marketing: what's our CAC? What's roughly the spread between our best and our worst channel? That kind of thing. And the point there was basically to really enforce that, hey, it's super important that you have some numbers in your head, so that every time you're doing an analysis, if you come out with something that's just garbage, there's no long loop of, I go present this to somebody who says, ah, that doesn't make sense, or I go and look at the dashboard or whatever. It's just there, right? And I don't know what the canonical version of that's gonna be in LLM land, whether it's a markdown file that lists some stuff out, or pointing it back at your semantic layer, or what. But I think it's really important that either a human or some process has that kind of [00:37:00] situational awareness to apply to any answers that come out.
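One possible LLM-land version of that drill, purely a sketch with invented metrics and numbers: keep the top-line baselines in a small machine-readable form and automatically gut-check any fresh answer against them.

    # A sketch of "numbers in your head" as a machine-checkable baseline file.
    # The metrics, values, and tolerance are invented; the point is the gut check.
    BASELINES = {
        "monthly_active_users": 120_000,
        "marketing_cac_usd": 85.0,
    }
    TOLERANCE = 0.30  # anything more than 30% off baseline deserves a second look

    def gut_check(metric: str, value: float) -> str:
        base = BASELINES[metric]
        drift = abs(value - base) / base
        if drift > TOLERANCE:
            return f"{metric}={value} is {drift:.0%} off baseline {base}; re-check the query"
        return f"{metric}={value} is within {drift:.0%} of baseline; plausible"

    # An LLM-produced answer claiming CAC has tripled should trip the alarm:
    print(gut_check("marketing_cac_usd", 260.0))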
I'm interested in what we're seeing LLMs and agents capable of now. If we think about data team capabilities, and I don't mean activities, I mean cognitive capabilities, we've got judgment, which is super important. There's intelligence, right, and by intelligence I mean it the way we talk about it in the space now: the ability to reason, to mimic analytical thinking, to do pattern matching, these types of things. Then there's communication, and then there's procedural stuff, like munging data, using pandas or Polars or whatever it is. Clearly LLMs and agents are very good at the latter now, though of course we need to make sure they're doing the right thing. A lot of the intelligence stuff they're getting better at. But the judgment stuff still remains a serious challenge. So I'm wondering, and this of course is an act of prediction, but we both work in [00:38:00] machine learning: where do you see the space going, in terms of what we hand off to agents, what we trust but verify, and what we keep for ourselves?

I love this paradigm. I have this thought experiment that I've been trying to run with myself for a long time. For a long time it was: what if the internet went away? What if there was some catastrophic event and I couldn't use a computer anymore, the internet's gone, what would I do for a living? And now there's a much better version of that, which is basically, LLMs are coming for your job, right? People are worried about this in this space. What are the things that we still do? And I think you're absolutely right to think of it this way: your capabilities aren't defined by just manipulating data, or using some set of tools, or whatever else, 'cause that is going away. What are the things that are the equivalent of: okay, so I am a swimmer, and now there's no pool at my school [00:39:00] anymore, what am I gonna do? Well, I am strong and I've got decent cardiovascular fitness and all of this, right? Your buckets are similar to that. And I think that's a really important thing to keep in mind as we try to navigate this space. So that's a long way of saying I love that and I agree with you. To get to the actual question: yes, I think
Uh, I think the intricacies of the syntax of this or that, or trying to optimize this or that is, you know, it's, it's basically like how do you interact with the computer, uh, which is just getting in the way of this bigger question of how do I do the thing that I actually want to do? And so I think a lot of it is going to be shifting more and more towards the, the problem framing, which is part of this analytical thinking bubble that, that you talked about. Okay, so my [00:40:00] organization wants to blah, blah, blah. How do I reduce that into, uh, set of actionable steps? What's the primary driver of that versus probably when you first see the question, there's five different things that pop into your head, and four of them are gonna be secondary, right? Some of the smartest people I know when they talk about things, it always feels incredibly simple. Like, you're so smart and everything you say is so simple. Well, it's not an accident, right? It's like they, they know how to zero in on the things that really matter. And I think that's going to become the, the driving force. Like if you are thinking about how do I interview a candidate for an analyst position, and that analyst position is like, really well set up to utilize all these technologies. And you're thinking about it in this, you know, value driving rather than cost center way. To me, that's the thing that you're gonna be focused on in the future. How do you, how do you screen for that, Noah? How do you find people who can actually have that skill filtering [00:41:00] or sourcing? Filtering? Filtering. Usually for me, what I do now, just procedurally is in an interview setting. I make sure that people understand that I am asking very wide questions. Because that's how they come in the real world. And I want them to go back and forth with me to filter those down and come up with like, okay, now what are you actually going to write S two L to solve? If you just throw super wide questions at people, sometimes they're worried that you're fishing, oh shoot, that guy's got some version of this stuck in his head that he thinks is the best way to do it. And gosh, I'm gonna throw a dart and hope I hit it right. So it's gotta be, it's gotta be framed right? But if you frame it right, I, I think the best way to do it really is just to have them do the thing. Give them a, a super wide question. Something that's not well formed, but that is usually the way that it comes to you in the wild, which is this seed of like, okay, so we know we have a problem here, or an [00:42:00] opportunity here. What do we need to do? And you work together and you start breaking it down and you watch and see whether that leads to a simple answer. And if that simple answer actually feels like it's going to be high leverage thing to solve it, I just love that there's less and less lead code happening these days due to the AI revolution as well. I am in interested, this is something we touched upon last time we spoke, Noah, in what happens to teams and organizations as we adopt more and more of these technologies? Will data teams shrink, will organizations. Shrink as we kind of, as everyone, you know, engineers can get closer to business, business gets closer to engineers. Do we collapse? Do we expand? What, what do you see happening right now? What do you, where, where do you think we're going? Yeah. I think there's gonna be a recomposition for sure. 
We definitely see right now some orgs where things that traditionally were covered by dedicated data engineering teams are [00:43:00] starting to be, oh yeah, maybe our engineering team can just handle this from here. But I think on a broad scale, my hope at least is that what ends up happening is we ask more questions. Today there's this thing where everybody's very constrained. Usually every data team has a huge backlog, and that backlog is waiting on people to plug away at it. Oh yeah, that's an obvious question, but I can't tell you the answer until I've figured out how to connect this system to that system, and the way the IDs are set up, I gotta unpack them from the JSON, and maybe they're in two different places, and it's gonna take four days of dedicated work to answer this very simple question, right? My hope is that that work starts to go away. Maybe it'll get worse, because maybe there'll be more slop code generating data in messy ways, I don't know. But my hope is that it starts to go away, and that that [00:44:00] backlog of questions starts to clear out, so that you can step back and look at it and say: okay, what did we learn? What's gonna happen? What do we need to do? Where are we gonna invest? What's the thing I need to put into dashboard form, where somebody's gonna look at it every day and say, oh, okay, so obviously the next account I need to go look at is this one? Those kinds of processes, I think, are gonna be the things data people focus on more, rather than all of that in-the-middle data prep and understanding work.

That's the optimistic view, Noah. And, coming up on time here, one of our final questions we love to ask: if one message from this conversation could really land with executives, what would make the biggest difference?

Hopefully I'm beating a dead horse here: don't treat your data team as a cost center. Think of it as: what are all of these things I need to make decisions about? What conversations would I like to have in order to make those better? [00:45:00] Let's make sure that the data team can have those conversations with me.

I love it. So it's not Office Space, it's Jack Ryan.

Exactly. More Jack Ryan.

Love it. Noah, thank you for coming and sharing your work and your insight and everything you've seen in the space. I'm really excited to see what happens next with all your work in data and all the AI stuff.

Thank you so much for having me on.

Thanks so much for listening to High Signal, brought to you by Delphina. If you enjoyed this episode, don't forget to sign up for our newsletter, follow us on YouTube, and share the podcast with your friends and colleagues. Like and subscribe on YouTube, and give us five stars and a review on iTunes and Spotify; this will help us bring you more of the conversations you love. All the links are in the show notes. We'll catch you next time.