Pete Hunt: Really, with Dagster, I don't think of it as a DAG-versus-not-DAG way of thinking. I think of it more as: are you going to treat this as a real software engineering discipline, with all the benefits that you get from that?

Eric Anderson: This is Contributor, a podcast telling the stories behind the best open source projects and the communities that make them. I'm Eric Anderson. We're joined today again by Pete Hunt, CEO of Dagster. Pete, actually, this is your first time on the show, right?

Pete Hunt: This is my first time on the pod, but you have had the Dagster CEO on the pod before.

Eric Anderson: That's right, yes. Make sense of that, audience. Pete's exceptional. I'm glad he's going to spend some time with us today. The best place to start maybe is the punchline on Dagster. What's your elevator pitch?

Pete Hunt: So we are an open source data orchestration framework. That is something that a lot of people understand, but maybe some people don't. But think of it as a toolkit for building data platforms. Every company in 2025 operates on data. They're making decisions with dashboards. They're leveraging modern AI techniques. And you need a set of pipelines to take data from one place, process it, transform it, enrich it, and then activate it in some other place. Dagster is that toolkit for building those pipelines, operating those pipelines, understanding what they're doing, and optimizing and maintaining them over time. We tend to see developers giving us a lot of love, and organizations transformed after they adopt Dagster.

Eric Anderson: So I'm going to come back to Dagster and how it works. I have some questions there. I interviewed Nick in 2021, I just looked, so four or five years ago. What's happened since then, and where do you come on the scene, Pete?

Pete Hunt: 2021 was like a completely different era. I mean, back then, I think we were probably still socially distancing, at least in San Francisco, and NFTs were really hot. It was a completely different world.

Eric Anderson: Democracy was alive. Just kidding.

Pete Hunt: Pre-ChatGPT.

Eric Anderson: Yeah, it's true.

Pete Hunt: It's a pretty broad question, what's happened since then? In 2021, pre-1.0, we were pre-selling. In 2022, we got to 1.0 and started selling. And over the last couple of years we've really done a couple of things. We've focused on building a fast-growing, venture-backed business. And from an open source and technology perspective, we've made a lot of investments in our integrations, and a lot of investments in our core abstractions. We've really pushed this notion of software-defined data assets, and that's become really mainstream. And I think one of the most exciting things is we've recently started to roll out our approach to low-code, and we like to think of it as low-code without the mess. It's been very interesting to see the reception from the community. But there's a ton of things we've been doing, and it's very hard to think of them all and answer quickly. So that's what's changed for the business and the project since 2021. And of course, I joined the company in 2022, which is pretty interesting. I had known Nick since forever. We were both engineers at Facebook a long time ago. I joined Facebook in 2010 and he was there before that. And he was working on GraphQL, which is a very successful open source standard for fetching data.
And I was working on ReactJS at Facebook, which is a very popular open source library for building user interfaces. So we were hanging with the same crew. When he started Dagster, I was an angel investor early on, and he was looking for a head of engineering as the company started to scale. I had sold my last startup and was thinking about what to do next. And he approached me and said, "Hey, can you help me find a VP of engineering?" I was helping him with the search for a while, and I finally decided, hey, maybe I should just be that head of engineering. So I joined the company and we had a great first year. And I started flexing some of my old CEO muscles from my last company. By the end of that first year, he was like, "Hey, man, listen, I'm the sole founder of a Series A company scaling to Series B. It's a lot of work for one person to do." And frankly, I think he wanted to do the fun creative stuff that CTOs do: talk to customers, be a technical visionary, write a lot of code. So we agreed that I would step into the CEO seat and clean the toilets, and he could do all the fun creative work. It's actually worked out great since then. So I'd say that's how I got into the picture. And obviously since 2021, the world of data, machine learning, and AI has changed dramatically. We've been fortunate to be positioned really well for that, and we've benefited a lot from the rise of AI being used within data pipelines, AI being used by data and ML engineers to write more code, and new classes of applications coming up that are powered by data pipelines. So it's been a very exciting past couple of years for us.

Eric Anderson: So I meet people all the time that want to become CEO of a company like Dagster, but they only want to do it in a direct approach, like, just make me CEO of Dagster or something like that. But I always tell them to do what you did. Not that you ever planned on becoming a CEO, I don't think that was the plan, and maybe that's why it worked. The advice is: just become a great executive in a company, and opportunities come up. Things happen.

Pete Hunt: Yeah, I definitely didn't join thinking that I would be CEO. I think anybody that goes through an M&A process needs to take a little break from the CEO seat for a while. So I definitely took that break. I was really excited to come in and lead engineering, and I did that for a year and it was really fun. But what I've found in my career is that what's really worked for me, frankly, has been to try to create this halo of value around me, in the sense that maybe I'm not doing the coolest or most sexy or hot thing, but the people around me are being successful. That to me has worked out really well throughout my career. A really good example goes all the way back to my days at Facebook. When I was there, there was a big transition from web to mobile, native mobile engineering, and I actually sat that one out. I was working on the web stack, which was really being almost deprioritized at that time and wasn't cool, but it was still really important. There were still tens of millions of DAUs, hundreds of millions of MAUs on that thing. And only in that environment could something like React come out and be really, really successful. I was fortunate to be a part of that. But again, it was this technology that was being worked on, and I helped out and I enabled other people. I continued that pattern at Dagster and other places too.
That, to me, is how I think about it. And again, I didn't ask to be CEO. I was doing the head of engineering thing. I saw, hey, there are some things in growth or marketing where maybe nobody's stepping into that area of responsibility. I'll just go in and try to make that better even though that's not my job. And I think that really made it clear that there was a great partnership between me and Nick that felt more like co-founders than like CEO and head of engineering.

Eric Anderson: That's great. And then I know we have more important things to talk about, but are you surprised that... 15 years, I don't know how long it's been, but that React is still the top? I mean, people are still hiring for the most progressive apps, and we're starting with React. Is that surprising to you?

Pete Hunt: Well, I didn't create it. I was maybe the third person to be on the team or something like that. So when I say I thought it was really damn good technology, it's not to toot my own horn, but yeah, I think the technology is really, really good. I think we put a lot of heart and soul into it and got a lot of community adoption. It's evolved a lot since the first version came out in 2012 or 2013. So I'm not surprised it was successful. But I remember when we launched, I was like, man, I hope that this thing eventually gets to be as big as jQuery. If we do that, then we've made it. And I was measuring from when I thought jQuery became the standard, which I think was 2006, to 2013. I was like, if we have a run of being the number one JavaScript framework as long as jQuery did, and doing my math, that's like seven years, I'll be like, man, we did something super special. And it's been a lot longer than seven years now. So that's been cool.

Eric Anderson: All right. When I talk to people about Dagster and related technologies, one thing I hear is there are the DAG-oriented solutions and there are maybe more code-oriented or other solutions. Is that the right framing? If people are choosing a framework, should they start with whether they're DAG people or not? How would you tell somebody to think about project selection here, framework selection?

Pete Hunt: Yeah, I mean, I think you first have to think about the category and maybe the challenges that modern data platform teams have. Then we can talk about whether Dagster is the right fit or not. Obviously, as the CEO I'm going to say it's the right fit, but I truly believe it. So data platforms got really complex going up into and through 2020-2021. There were a lot of venture-backed startups. Basically, the way that we think about it is there was this explosion of what we call big complexity. Once you've got a Snowflake or Databricks or BigQuery in place, suddenly you have the ability to analyze and build a ton of value on your data sets no matter how big they are. So everybody starts tapping that oil well of value and creates a lot of complexity. And managing that complexity higher up in the stack becomes a really big problem. So the classic problem that a lot of companies have is they want to understand, first of all, is their data fresh? How often have you, Eric, been in a situation where you're looking at a dashboard and it's missing the last week of data, or there's something wrong with the last month of data, or it looks weird? That's a data freshness problem.
Every company has this all the time, and it's because there's this big, complicated mess of plumbing between where the data comes into the organization and how it pops out at the end in its enriched, activated form. So understanding data freshness, and then other axes of quality as well. Is it passing the quality checks? All that stuff is really important. Understanding the integrity of your data, and whether you can trust it, is really what organizations are trying to do. These large-scale compute layers let you compute on your data, but they don't help you with that problem. And so we think that data orchestration is the right place to start to solve those problems for our customers. Historically, orchestration has been a workflow engine: basically a scheduler that will run your data pipeline at 9:00 AM every morning, and then the data will land in a dashboard by 10:00 AM or something like that. The problem with thinking of orchestration in that way is that a lot of things can happen between that 9:00 AM and that 10:00 AM that can cause issues. Different steps can fail. Different upstream systems can have challenges with their data, nulls in places that weren't expected or whatever. And things can get more expensive over time, especially as companies are computing on more unstructured data in 2025. They're leveraging APIs like the ChatGPT API or whatever large language model they're using. These things are expensive. They're priced on consumption, and they're nondeterministic. So if you have an agentic step in there, for example, you don't really know exactly what it's going to do or how much it's going to cost. So there are a lot of observability challenges that have to be solved in that layer. And then you have to be able to act on those observations in real time. As the system is observing the execution of a data pipeline, you might want to make changes. You might say, hey, listen, this thing failed its quality check. Let's go rerun it, or let's go page an engineer. Or this job is taking too long or is accruing too much spend. Let's shut it down before it bankrupts us. So there's a wide variety of things that can happen. And existing workflow orchestrators, the way that people used to do it, really treat these things as a series of black-box steps. Run this task, then run this other task. And what that task does, the workflow engine doesn't care. It's just running the workflow and retrying it if it fails. So what teams tend to do is integrate tons of different tools to try to get that single pane of glass of observability across their data pipelines, and then they try to write a bunch of custom code to react automatically to different failure events. And that is not fun for anybody. In many ways, the reason why I joined Dagster was that it reminded me a lot of front-end pre-React and pre-modules. A real software engineering discipline would not have a lifecycle like that. And so really, with Dagster, I don't think of it as a DAG-versus-not-DAG way of thinking. I think of it more as: are you going to treat this as a real software engineering discipline, with all the benefits that you get from that, or are you going to treat it as more of an ad hoc, point-and-click type of... Are you stuck in that old way of thinking or not?

Eric Anderson: Right. Right. Right. The front-end pre-React is a really good example, because I think there were a lot of people who were like, "This is annoying.
It's not ideal." But at the same time, they couldn't say, "This is what it should look like." But once they're in the world of React, they wouldn't go back. Pete Hunt: I think the lesson that I learned from that process and that we're applying to Dagster is I think the right way to approach this is to think about the tail complexity first. What are the really hard challenges that organizations are facing when they're building data pipelines, or in the React case, when they're building web apps? Let's build a tool that solves those, that gives you observability into the really tough stuff, that gives you the flexibility to write complex code when you need to. And you start there and you win over the shops that have the complex use cases. And then there's going to be a lot of shops that say, listen, it's great that you solve these complex problems, but we don't have the complex problems. And in fact, because you're so good at solving the complex problems, it makes the easier problems harder to solve because we're paralyzed by the amount of choice or we're not able to really understand the tool. It is much easier to solve that problem through abstraction and documentation and best practices and simplification of the product than it is to go the other way. Take something that's really good at Hello, World! and the basic stuff and make it work for the really tail complexity type of use cases. And that's what we learned from React. When React started, it was really tough to... The really good people that consider themselves software engineers were really excited about it. The people that came from a design background or a non-traditional background thought it was really hard to use. And over time we built starter kits. We simplified the API. We built great learning resources, and eventually now everybody can pick up React and just start using it. It's the favorite tool of vibe coders now. It really completely turned around. and I think that's the right way to approach any developer tool or infrastructure. Eric Anderson: You said something at the beginning that I wrote down because I thought it was interesting. I think it was called virtual assets. You have a paradigm for how you describe your data assets and what is this. Pete Hunt: We call it software-defined assets. So we start out by talking about this big complexity problem and solving big complexities for organizations at scale. And what that really means is less data downtime and higher developer productivity and higher engineering retention. But it boils down to one key different way of thinking about the problem than maybe the alternatives, which is what we call asset orientation. So the idea here is that unlike a workflow engine where every step is a black box, our orchestrator understands the data assets and their global lineage at a very, very deep level. We understand every single data asset in the organization, every single state transition that it's been through. We understand what it depends on, what its historical scheduling and failure rates and success rates look like. And we can do a lot of really interesting stuff with that information, and we can deliver a better experience for developers and operators of data platforms. Again, contrast that with a workflow oriented approach where these tasks are black boxes and you can't really reason about what they're doing. So the implications of that are pretty profound. It is similar. 
Again, I hate to keep going back to this React example, but it's like going from full-page monolithic templates to reusable components in React. It's a similar thing here, where you go from these monolithic, opaque workflows in a workflow engine to a set of data assets, which are the things you really care about. Is my dashboard up to date? Is my machine learning model hitting its proper metrics? With Dagster, you get this global asset lineage, which you would normally have to buy a separate observability product to get. You get this searchable data catalog that's always up to date. You get to understand freshness anomalies and things like that. You basically get a lot of value out of a single tool, whereas if you were using a workflow engine, you'd have to compose a bunch of different tools together. You might be able to get it to work, but it would be pretty clunky, and it wouldn't be part of a real software development lifecycle. And what's really interesting about this is our largest competitors have all copied us over the last year. Apache Airflow is a widely deployed workflow orchestrator, and as of 3.0 they now have carbon copies of our APIs and this notion of a data asset in there. It's not as powerful as what we have. Imitation is the most sincere form of flattery, so we feel pretty flattered.

Eric Anderson: Totally. And I think it's encouraging and validating internally. You're like, well, we're thinking about the next step and apparently others are just kind of... It's hard to think about the future when you're copying the past, I guess.

Pete Hunt: That's how we feel as well. We think they're fighting the last war, and we're really pushing forward. Again, we started by solving, in a very principled way, the tail complexity, the really hard problems, the P99 complexity. We have a tool that has a multi-year track record of doing that super well and is the preferred choice for a lot of the teams that have those types of challenges. Now the question is, how do we bring this to everybody else? So we've been working on this low-code approach that we call Dagster Components. The idea here is that we think this is the next frontier of how these platforms are going to be built and operated. And the way to think about it is that if you're a team that has these problems of big complexity and you're building data pipelines, over here you've got the full Python-oriented approach: full code, real software engineering. That would be like Dagster today, or the Dagster of last year, or an Airflow or something like that. And then over here you have the basically no-code or low-code approach, which would be like an Informatica or Kestra or a dbt Cloud, for example. A lot of teams start over here, and when they hit some complexity that they can't solve with those low-code or no-code tools, there's this giant leap that they have to make all the way over to an industrial-strength tool, or at least an industrial-strength tool that embodies software engineering best practices, which we think is important to solving these big complexity problems. And it sucks that they have to do that. So we're trying to build something in the middle, where you can get the benefits of these low-code solutions. You write in YAML or you use an in-browser editor. And at the same time, you can drop in... It's a layered abstraction, so you can punch a layer below and use the full power of Python or whatever programming language you want to solve those complex problems.
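To make that bottom, full-code layer concrete, here is a minimal sketch of what asset-oriented Dagster code looks like in Python. The asset names, toy data, and check are illustrative rather than anything from the episode; the point is that naming raw_orders as a parameter of enriched_orders is what hands the orchestrator the lineage graph, and that a quality check attaches to the asset itself rather than living in a separate tool.

```python
from dagster import AssetCheckResult, Definitions, asset, asset_check


@asset
def raw_orders():
    # Illustrative stand-in for an ingestion step (API pull, file load, etc.).
    return [{"id": 1, "amount": 100}, {"id": 2, "amount": None}]


@asset
def enriched_orders(raw_orders):
    # Naming the upstream asset as a parameter declares the dependency,
    # so the orchestrator sees lineage between assets, not opaque tasks.
    return [order for order in raw_orders if order["amount"] is not None]


@asset_check(asset=enriched_orders)
def no_null_amounts(enriched_orders):
    # A quality check attached to the asset, so a failure can drive a rerun
    # or a page instead of being discovered post hoc in a dashboard.
    return AssetCheckResult(
        passed=all(order["amount"] is not None for order in enriched_orders)
    )


defs = Definitions(
    assets=[raw_orders, enriched_orders],
    asset_checks=[no_null_amounts],
)
```

Because the unit is the asset rather than a black-box task, the lineage, the catalog entry, and the check status all fall out of the same few definitions.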
And we think that this layered approach is really important. The way to think about it is that the developers who are using Python are the data platform engineers. It's a small number of people who are enabling a wide, diverse set of data practitioners within the organization to build in that lower-code style. And this is super important over the next couple of years, because, like it or not, everybody in your organization is using a tool like Cursor or Claude Code, and they are writing more and more code using AI. And that is great, except people are definitely writing code that they don't fully understand. Code reviews are great, but they don't solve all problems. So it's very, very important for the vibe coders, the people who are not necessarily experts in this particular domain, to be able to set up guardrails and make it very, very clear: hey, this is the area where you're interacting with the data platform, and it's got clear guardrails so you can't shoot yourself or others in the foot. That really limits the blast radius of technical debt that these large language models can produce. Because while they are super, super powerful and transformative tools, and I use Cursor all the time, I think it's great, they can also create an enormous amount of technical debt very quickly and introduce very subtle bugs that, without proper guardrails and observability, could create a lot of problems for you. So we think of this as both our take on low-code without the mess and also the first... This is really one of the first frameworks that's been built with LLMs as a primary customer. So I think it's an interesting approach and not one that we've taken in the past. That's pretty neat.

Eric Anderson: So the code generation tools, you mentioned Cursor, and there are also the prompt-to-app things like Lovable and Bolt. My understanding is that the latter have really benefited from frameworks. I mean, you mentioned that React is the thing that people are using, but even in addition to that, there's a Supabase for user management or what have you. And so I think what you're saying is that we could have Cursor just generate data pipelines at will, but then you're left with this mess: who's going to maintain them, in their different forms? Some are Java, some are Python. But with these components, now my Bolt, Lovable, Cursor, whatever agent I'm using, is going to generate components that are already accepted within my organization. Maybe somebody built them within the organization.

Pete Hunt: That's exactly right. We ship some default components out of the box. You want to do an ETL job or you want to do a RAG pipeline or something? We've got one pre-built for you that you can just take off the shelf. But what enterprises and larger teams are going to do is build their own internal library of components that fit their standards. So it's going to say, oh, you want to move data from Postgres to Snowflake? Here's our in-house component. We've put certain rules around which data can move between which regions and stuff like that. So we put the guardrails on there, and then you can give those components to, for lack of a better term, the vibe data engineers or the data vibe coders. And what we've done is we've built a stack of tools optimized for the LLM. So we can give the LLM very rapid feedback about...
It's through things like MCP and command-line tools, basically, that give it rapid feedback about how it's doing, and that really increases the quality of the code it's able to generate. The other thing is that by boiling it down, going from hundreds or thousands of lines of Python code to dozens of lines of YAML, you can fit a lot more in the context window. And that is really where a lot of these AI-enabled tools start to fall over. You hear this from everybody: the first 2,500 to 10,000 lines of a project go super, super well, and then it just falls apart. That's because the project gets too big. This is one of the reasons I think React is such a great fit for LLMs: the LLM only needs to think about the component it's working on and maybe a couple of the components it's using. So you've massively constrained the search space in the context window. And we're trying to do the same thing with Dagster Components.

Eric Anderson: It feels like, going back to an earlier topic, whenever I've done data pipeline work, I've always copied and pasted somebody's SQL and edited it, and I've copied and pasted somebody's Spark code and edited it. I'm assembling a similar thing from my coworkers' work, and then it doesn't work. And of course, whoever's maintaining my coworker's thing isn't maintaining my thing, even though it's more or less the same thing. So you've given us a way to do this that's both safer and more efficient than what we were already doing before, in a sense.

Pete Hunt: Yeah. I mean, we refer to this internally as having pre-baked product-market fit, which is normally not something you say. But in this case, we would talk to all of our big customers and they would say, "Hey, we've built a DSL on top of Dagster that puts guardrails in place for our stakeholders." We've got dozens of them that came to us and told us about this, so there clearly was demand for it. But when you actually look at what they built, they did the best they could with the time and focus that they had. They didn't have time to build a VS Code extension for this. They didn't have time to do evals with large language models to figure out whether it was going to actually work with Claude Code or Cursor. They'd always get weird stack traces out of the tooling that were really hard for people onboarding to understand. So in many ways, this is a packaging up of patterns that everybody had been rolling their own versions of. I almost liken it to a tool like dbt, where it seems like every company pre-dbt had some sort of system for templating SQL statements, bundling them together, and running them. Then dbt standardized that, built really good tooling around it, and became a very successful project. We're doing the same thing with data orchestration.

Eric Anderson: Totally. Well, let me ask you: data orchestration and data pipelines have been the standard way people have thought about things. Has AI changed how people think about data pipelines? I mean, RAG pipelines are one example where I feel like AI is a new, pretty common element, and then there's often maybe an inference step or a human-in-the-loop step. Are these things I would do within Dagster, and how would I do them in Dagster?

Pete Hunt: Yeah, so let's maybe split these applications. RAG is a great example, where you have some sort of context repository. It's usually a database or search engine or vector DB with all the knowledge that your LLM system needs.
They have some product that sits on top of that that users use, and they have some set of data sources that feed into that context store. We operate on that side: all the stuff feeding into the context store. So you're integrating all of the documents in your organization, all the customer support tickets. You're doing a lot of different types of data integration and enrichment tasks. Those all have to land in a context store or a set of context stores.

Eric Anderson: The population effort, kind of a...

Pete Hunt: Exactly. Exactly. And so we have a ton of traction with vertical AI companies. What they end up doing is they start by saying, "We're doing analytics on our app." So they bring in Dagster to start to build some dashboards and do some data engineering. Then they realize, wait, this actually solves core production app problems for us. We have to pull in all these different data sources and populate the context store. Between the context store and the user-facing application, that's probably not us; that's a different class of tool. But we see a lot of use in that first category of stuff. And the other change that we're seeing related to AI, besides obviously the rise of vibe coding, is that people are doing way more with their data pipelines. They are now able to reliably ingest... A good example is they can ingest PDF scans of dental invoices from 3 million different types of dental offices, and they're able to actually reliably structure that unstructured information and do something with it in their data pipeline. And that requires much more focus on data quality, understanding, hey, okay, we're now doing a lot more work with unstructured data, and error rates can change over time. How do we handle that? So bringing data quality checks directly into the data pipeline is, we think, really important for that type of thing. We definitely see a lot of our customers and prospects asking, "Hey, we've adopted AI. It's introduced a bunch of non-determinism into our pipelines. We are using data quality tools, but we get this post-hoc observation a couple hours after the bad data's already propagated through the system. What can we do with Dagster to stop that low-quality data from making its way through the whole process?"

Eric Anderson: I don't know if this is going into the data quality stuff, and I don't know if this is true, but I've always imagined that some people are like, well, can't Snowflake do the data catalog or the data quality? And I've always pointed out, well, certainly a big enterprise has so many different data stores. They don't migrate away from them; they just collect them over time. And what you want in a catalog is universality. You want to cover everything, I would assume. So having it separate from your storage system, separate from your processing systems, is helpful. And I'm just using catalogs as an example; that could probably apply to quality or other things. But what's interesting about orchestration is that it actually maybe leans more universal than it does... Your orchestration system works with all of your different storage and processing systems, I would think. It makes sense to me that you could serve these metadata things that end up being quality and catalog and whatever else.

Pete Hunt: It's interesting you bring that up, because when I was talking to Nick about joining, I was like, "Hey, what if Snowflake or Databricks just does everything and vertically integrates the whole stack?"
And he's like, "I mean, in that future world, maybe that's bad for us, but the real world is much different." Every enterprise that we talk to, really heterogeneous data stack and trending more heterogeneous over time. So they're using multiple compute systems. They have legacy systems. They have new modern systems. They're now interacting with multiple AI APIs, multiple modern data tools. And the other thing is that there's also trends that make me think that this is going to accelerate. So there's this rise of these open table formats for building these data lake houses. We think there's a really great opportunity for companies to save a lot of money and get a lot of flexibility by doing more single node compute. So rather than spinning up a big and expensive job on a data warehouse, you can use something like a DuckDB or a Polars. Whereas before you wouldn't have been able to do that on a single machine. And having a vendor-agnostic orchestrator that lets you... It gives you some optionality to swap out different compute engines at will. We think it's a really big advantage for organizations. If we're talking to an organization and they say, "Hey, we're all in on AWS exclusively for everything or whatever it is," we say, "Listen, number one, you're not getting best of breed tools. There's a lot of reasons why you would want a best of breed in this category. And number two, it's only a matter of time until you're going to face this problem and you're going to have to migrate everything into a more vendor-agnostic posture." We think orchestration is a really, really strategic place to start there. Eric Anderson: I mean, I'm into the single node compute, but I hadn't thought that my orchestrator could be the thing that makes it easy to swap around. I already have this pipeline built, and I can go to one node and be like, let's just flip you to DuckDB. Let's flip you, this other one, to Polars. And maybe that's a standard component I can offer to my team. Pete Hunt: We're not going to rewrite your SQL query from whatever the other dialect is to DuckDB, for example. That's only part of the journey. The other part is getting it into production and validating that it's running correctly and doing an A/B test between the old way and the new way. And that's all stuff that we can help you with, and it provides a ton of leverage for our customers. Eric Anderson: Great. Pete, I want to give you the mic for a bit. What should we cover today? What have I missed? What would be interesting to talk about? Pete Hunt: Oh man. I mean, I think we covered a lot of the hot button stuff from a technology perspective. We talked about the low-code. We're trying to build the iPad of low-code. There's a laptop, there's a phone, and you need something in the middle. Eric Anderson: Oh, interesting. Pete Hunt: That's kind of how I think about it. If I had a black turtleneck, I'd pitch it that way. You asked what's changed since 2021, and I think there are a lot of things to talk about from a startup perspective. The profile of customer changes and how do you change how you position that product and talk about it. And one of the big areas is, as we were talking to larger and larger enterprises that are a little more risk averse, observability becomes a really, really big deal. One of our big focuses for this year has been to really build the best set of observability tools that the data platform's ever seen. And I think we've got some really cool stuff coming down the pipe for that. Eric Anderson: That's awesome. 
Eric Anderson: That's awesome. I think data observability has been an orphaned effort. I think some people think that, I don't know, other observability vendors might address this, but I don't think they have the right tools and the right buy-in from teams. And so it's largely been build-your-own. I don't know.

Pete Hunt: Well, it's because everybody's got a different impression. We partner with a lot of great observability tools, and data observability means different things to different people. When it comes to examining the content of your data and making an assessment, say, running anomaly detection on a column in your table over time, that's not something that we really focus on. We do focus on anomalies and challenges around data freshness and cost. What we've found with a lot of our customers is that when they talk about data quality, they're really talking about freshness. And then when we talk to other categories of customers, they say, "No, no, no, no, no. This really is about, again, finding the nulls in my column," or whatever. So it does mean a different thing to different people, and we do like to partner in that second category. But in that first category, it's really the job of the orchestrator to understand what's going on and help users understand how they can fix it.

Eric Anderson: Yeah, that makes sense. The orchestrator is where you manipulate timing and on-time delivery, the schedule, so data freshness feels at home there.

Pete Hunt: Totally.

Eric Anderson: Well, Pete, I think what you've given the world in Dagster is pretty amazing. I mean, I think what's fun about talking about open source is that it's a gift to humanity in addition to what you're doing for the company and your customers and stakeholders. We get this exhaust byproduct that just benefits the world. So appreciate what you've done there.

Pete Hunt: Yeah, yeah, I appreciate it. I'm a cynical guy, but I do think that the fact that customers today demand at the very least open core from most of their vendors has created very, very good incentives for everybody. First of all, there's a lot of free, high-quality software out there now that people can just use and get value out of immediately. And the commercial products are also getting really, really good, because these commercial open source vendors are really trying to earn everyone's business. So I just think that the way these companies are going to market these days is super interesting and pro-customer.

Eric Anderson: Awesome. Thanks for coming on the show today.

Pete Hunt: Yeah, thanks, Eric. Good to see you.

Eric Anderson: You can subscribe to the podcast and check out our community Slack and newsletter at contributor.fyi. If you like the show, please leave a rating and review on Apple Podcasts, Spotify, or wherever you get your podcasts. Until next time, I'm Eric Anderson, and this has been Contributor.