The following is a rough transcript which has not been revised by High Signal or the guest. Please check with us before using any quotations from this transcript. Thank you. === anu: [00:00:00] The way we think about agents is: how do teams of humans and agents work together? In all of these different contexts that I mentioned, there are hundreds of Rovo agents that get deployed across marketing teams, engineering teams, product teams, HR teams, just across the company, to really figure out which workflows agents can step in and help with, versus which unique decision points require human beings to really make those decisions, while giving them the ability to supercharge their day-to-day work with these assisting agents. In the short term, really, how do you bring agents into the day-to-day life of our end users such that they can assist them with various tasks? But over the long term, how will the very nature of teamwork change? How will the very nature of creating software, the very nature of working together, change in a world where AI agents become a lot more ubiquitous? hugo: That was Anu Bharadwaj, president of Atlassian, talking about [00:01:00] how AI agents are beginning to change the very nature of teamwork. In this episode of High Signal, I have the great pleasure of speaking with Anu about what happens when agents move from hype to daily work. Atlassian employees have already built thousands of internal agents, from onboarding assistants in HR to developer tools like the dev agent and the code reviewer. And with over a million active users on Atlassian's AI platform, customers like HarperCollins are cutting manual work by 4x while industries from publishing to finance rethink their core workflows. We dig into how Atlassian's culture makes space for this kind of bottom-up experimentation, what it really takes to move agents from demo to production, and why technical teams are often the ones building the most useful assistants. We also look ahead to the big questions: reliability, multiplayer collaboration between humans and agents, and what governance and compliance mean for the future of [00:02:00] enterprise AI. It's a conversation about experimentation at scale and the next chapter of software development in an agent world. If you enjoy these conversations, please leave us a review, give us five stars, and share the podcast with your friends. Links are in the show notes. Let's now check in with Duncan Gilchrist from Delphina before we jump into the interview. Hey Duncan. duncan: Hey Hugo. How are you? hugo: So before we jump into the conversation with Anu, I'd love for you to tell us a bit about what you're up to at Delphina and why we make High Signal. duncan: At Delphina, we're building AI agents for data science, and through the nature of our work, we speak with the very best in the field. So with the podcast, we're sharing that high signal. hugo: We covered a lot of ground with Anu, and I was just wondering if you could let us know what resonated with you the most. duncan: You know, Anu sits at such an interesting intersection of software engineering and AI. I think Dustin Moskovitz coined the phrase 'the work about work' to describe how much effort goes into organizing [00:03:00] around the actual task at hand. And so much of the work about work lives in Atlassian's tools, like Jira, the project management tool.
So she really sees how things get done across the largest companies in the world, and it's absolutely fascinating to hear her speak about the emergence of a kind of artificial collaborative intelligence, where it's not about individuals and their tools; rather, it's the team-agent hybrid. Let's get into it. hugo: Hi there, Anu, and welcome to the show. anu: Hi, Hugo. Great to be here. Thanks for having me. hugo: Such a pleasure. Well, it was so much fun to chat with you at VentureBeat Transform and to have a fireside chat, and it's so exciting to continue the conversation about everything you're up to at Atlassian, particularly with respect to agents and experimentation. So I'm wondering if you could give us a brief overview of what you're up to with agents at Atlassian. anu: Yeah, definitely. So Atlassian is a collaboration company. We [00:04:00] build the apps and agents for teams to really be able to work together in different contexts. What we have with our AI offerings is Rovo, which is basically our search, chat, and agent platform, which helps users collaborate with each other in different contexts: project management, document sharing, knowledge management, as well as service management and ticketing. So the way we think about agents is: how do teams of humans and agents work together in all of these different contexts that I mentioned? Internally, we use a lot of our own products; there are hundreds of Rovo agents that get deployed across marketing teams, engineering teams, product teams, HR teams, just across the company, to really figure out which workflows agents can step in and help with, versus which unique decision points require human beings to really make those decisions, while giving them the ability to supercharge [00:05:00] their day-to-day work with these assisting agents. So we think very much about, in the short term, how do you bring agents into the day-to-day life of our end users such that they can assist them with various tasks. But over the long term, how will the very nature of teamwork change? How will the very nature of creating software, the very nature of working together, change in a world where AI agents become a lot more ubiquitous? hugo: I love it, and I'm really excited to get into the nitty gritty of how you're using them internally. I also can't wait to talk about how you're enabling a lot of your collaborators and customers to use them as well. But I'm interested first in the experimental approach, because as opposed to classic software, where we roll things out and have a sense of how they work, with agents it seems like experimentation has to be a grounding, first principle. So I'm wondering what motivates this approach of experimentation for you? anu: I think the way that agents [00:06:00] differ from traditional software is that they're adaptive. They adapt to the context that you send, adapt to the data that's available to make decisions, and also adapt to the underlying model that's driving the behavior of these agents. So to that end, it's important that we all try out how agents behave. The fact that they're not deterministic, and that you can't pinpoint with 100% precision exactly what the output is going to be, is a blessing and a curse. It's a blessing in that in many ways you can adapt them to contexts that it was hard to adapt deterministic software for.
But it also means that you have to make sure you understand in which contexts agents are particularly helpful, what context you need to pass to the agent for it to make decisions and take actions that are the right, contextual actions and decisions for that particular situation, and also what kind of data is helpful to inform those actions. As a very specific [00:07:00] example, we have an agent that does new-user onboarding at Atlassian. The more we used it, the more we discovered ways in which we could enhance the agent to help new users get onboarded in different locations, in different roles, with different functions. So the more we used the agent, the more data it was able to collect and optimize for, and the more use cases we organically discovered and started addressing with the agent, the more useful it became to our end users. So I think there is a virtuous loop there: experimenting with how you can deploy agents in the real world and what you really need to do to make them more useful. And sometimes you also discover cases where they're not that useful, and you're probably better off just using a piece of deterministic automation, or just doing the task manually. hugo: Absolutely. And I think onboarding is such a wonderful use case. LLMs and agent systems, even thinking about retrieval, are about getting the right information into the hands of the right [00:08:00] people at the right time. I'm interested, with new users, how you manage expectations, or is there another human in the loop to let them know how to use the agent? anu: Yeah, so I think of it mostly as a spectrum. There are some jobs or tasks that agents can perform, with the right context, the right guardrails around permissions, the set of actions it has access to, and the amount of knowledge it has access to, where they can run fairly autonomously. What are examples? For instance, ticket resolution. Given a set of knowledge base articles and a definitive set of actions, the interactions that a virtual agent can have with an end user submitting a ticket tend to be pretty fertile ground for agents to make their own decisions, run a reasonable set of workflows, et cetera. Whereas there are certain situations where you need a human in the loop because the spectrum of possible options is very different. A good [00:09:00] example is what we use, for instance, in code review. The code review agent is able to look at what constitutes a good code review and what constitutes a good comment; it understands the code base deeply and is able to use that context to say: here's a code review for a PR that's been submitted. But it does take a human in the loop to say, okay, is this the right set of review comments to action, and how do I use my judgment to make sure the right decisions get taken in the whole workflow? So I think we will see a spectrum of that over time, moving a fair number of workflows from the left to the right, where the agent gets more and more capable as both model capability improves and the context and tools that we are able to provide to the agent improve.
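To make the spectrum Anu describes concrete, here is a minimal sketch of the routing gate it implies: actions inside a pre-approved guardrail set run autonomously, while everything else is escalated to a human. All the action names and the structure here are hypothetical illustrations, not Atlassian's implementation.

```python
# Sketch of the autonomy spectrum: guardrailed actions run on their own;
# higher-stakes actions stop at a human decision point.
from dataclasses import dataclass, field

@dataclass
class AgentAction:
    name: str                      # e.g. "resolve_ticket", "merge_pr"
    payload: dict = field(default_factory=dict)

# Hypothetical guardrails: the definitive set of actions the agent may take alone.
AUTONOMOUS_ACTIONS = {"answer_from_kb", "resolve_ticket", "post_review_comment"}
HUMAN_APPROVAL_ACTIONS = {"merge_pr", "close_incident"}

def route_action(action: AgentAction, approve) -> str:
    """Execute autonomously when permitted; otherwise ask a human."""
    if action.name in AUTONOMOUS_ACTIONS:
        return f"executed {action.name} autonomously"
    if action.name in HUMAN_APPROVAL_ACTIONS:
        if approve(action):        # the human-in-the-loop decision point
            return f"executed {action.name} after human approval"
        return f"{action.name} rejected by human reviewer"
    return f"{action.name} blocked: outside the agent's permitted action set"

# A code-review agent proposes a merge; a human must sign off.
proposed = AgentAction("merge_pr", {"pr": 42})
print(route_action(proposed, approve=lambda a: False))
```

Moving a workflow "from the left to the right" of the spectrum then amounts to promoting an action from the approval set into the autonomous set once the agent has earned that trust.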
hugo: Yeah, absolutely, and I like that a lot. I think of it as even a two-by-two, right? Where you have the amount of AI agency and the amount of human in the [00:10:00] loop, and tools are wonderful with the amount of agency that they have, depending on how much human is in the loop. I think Cursor or Claude Code is an example: if you're in the loop in the right way as an end user, great; if you're not, you may end up letting it delete your production database, and we've seen cases like that happen recently. I am interested, and this is something that we touched upon at the Transform conference, in how Atlassian cultivates a bottom-up environment where employees feel safe enough to be honest, to experiment, and to create thousands of custom agents. And for a bit more context: these types of things we don't necessarily see often, and I think it's a difference between short-term incentives and medium- to long-term incentives, right? The ability to experiment in a workplace is wonderful for the medium to long term, but it may not show short-term results. So I'm interested in how you create that sense of comfort and safety. anu: It's a great question, and it's one that we are very intentional about at Atlassian, because one of the [00:11:00] fundamental things that drives us is a fundamental belief that Atlassian needs to be a hundred-year company. That really puts the DNA of long-term thinking into every Atlassian, so that when we think about the decisions we make, we don't just think about the short-term impact, but really the long-term impact of said decision. From that context, one of our company values is 'open company, no bullshit.' Openness is a key part of the culture, and we are very open about saying: look, we have to try different AI interactions, not only in our portfolio and our tool set, but also whatever's available externally, so we can understand what the evolution of work is going to be. Because unless you live the future, it's difficult to build the future. And in order to live the future, there's always uncertainty involved. I think as leaders we often underestimate the emotional impact of questions like: will my job be there? Will AI take over my job? What is [00:12:00] the impact of me trying out all of these different things? What does the long term look like? At Atlassian, we fundamentally believe that with AI, the amount of software that gets created is going to go up. It's going to explode. And one of the good things about it is that the people who were traditionally the ones able to create software, technically adept coders and developers, are not going to be the only ones who create software. That opens up the field to a lot of non-technical folks who haven't necessarily learned programming or coding. That is an immense opportunity. So it's really about being able to think: what can I do that I wasn't able to do in the past? No doubt my job might change, but what are the superpowers I also get? My day-to-day may look different, but what else can I do that I wasn't able to do before? It's about making sure that we are able to communicate that to our employees, and also making sure that we don't penalize [00:13:00] failures. Of the agents I talked about, the Rovo agents, many of them worked very well, and many of them are in the graveyard; they just did not work at all. But we don't think of the time that people spent creating those agents as wasted.
Really, that time is helpful in terms of understanding what is possible and what is not, and what other creative ways there are, ways we haven't thought about, in which work is going to change, and how we can lead from the front on that. So I think the long-term thinking, the openness, and not treating failed experiments as time wasted: all of those together help create a safe space for experimentation. hugo: Without a doubt. And you know, there's this saying that only 5%, or whatever it is, of models make it to production, and people say we need to change that. And I always say: hey, if you're doing the right experiments, maybe it's great that 95% aren't working and you're experimenting quickly. Also, it's always in my mind that Atlassian was founded in Australia and is an Australian company at heart, and [00:14:00] you're reminding me that the motto is 'open company, no bullshit.' It's very tough not to notice that's very Australian in some ways as well. anu: Indeed. hugo: I'm interested: you've already mentioned several use cases, and I'm wondering if you could walk us through one internal agent example from conception, from idea to daily use, what problems it solved, and how adoption spread throughout the organization. anu: Yeah, so the agent I'll talk about is one we have been tinkering with for a good 18 months now. It's called the Rovo Dev agent. We started out with the thesis that, of course, code generation is greatly improved in an AI era; that's why you see so many of these code gen tools. So we started Rovo Dev with the question: of course code gen is going to get better, and we should support code generation, but let's think about what it is that developers spend a lot of their daily time on. It turns out that an [00:15:00] average developer spends less than 25% of their time coding, and they really think about coding as one of those tasks where they're in flow; it's one of the creative parts of their job. A good bulk of their time is spent on handoffs across their teams: asking the designer for mockups, talking to the product manager about customer feedback and how the bug fix is going to address said feedback, talking to the DevOps folks about when this went into production, what the incident was, how to debug it, et cetera. So when we built the Rovo Dev agent, we started out by saying: let's give developers the ability to go from intent to code. From a Jira ticket, you can directly generate code, because we understand the context of the Jira ticket, or the Confluence page that describes the intent of what you want to build, and we understand the code base context, which was phenomenal. We had a fair amount of adoption internally, as well as [00:16:00] a bunch of customers telling us they appreciated the feature. But the question of 'what about the rest of my time, how can you help me with that?' continued to be an important part of our thinking. So we built the code reviewer agent. We have a large engineering team, so we have a fair number of senior devs who can do a number of code reviews and help a lot of our junior devs really ramp up and elevate their thinking, but often there is a line for these experts.
So the code reviewer agent was really built to help with that particular process, the waiting and handoff process for code reviews. Not only can it do an automatic code review, but it understands what good code reviews look like for your organization: for Atlassian, what are the coding guidelines that we deploy, what are the [00:17:00] specific Atlassian-isms that are appreciated in the code, and how can we incorporate that into the code reviewer agent? And then, what about deployment and the CI/CD pipeline? So really, we thought about the entire software development lifecycle, and not just the code generation. But doing that took many, many cycles of iteration; we didn't get it right in the first shot. As we saw adoption and usage feedback, and as the models got more powerful and our data loops got more powerful, we've now gotten to the place where the Rovo Dev CLI is at the top of the SWE-bench leaderboard, which is a phenomenal achievement, and it really speaks to how useful it is end to end, not just in the siloed part of code generation. That, to me, is a great example of how you can't build the perfect agent right away and deploy it to everybody. It really takes several iterations and loops. hugo: What a wonderful example. And it does dovetail nicely into another topic I wanted your thoughts on: how do you enable non-technical people to assemble and work with [00:18:00] useful agents, and are there any illuminating examples for you? anu: Oh yeah, many of them. In fact, I think one of the great promises of AI in software development is how it unlocks creativity from a lot of the folks who were previously dependent on technical people to bring their creations to life. We see this both internally and amongst our customers. A lot of product managers, designers, HR people, comms people, and marketing people use what we call Studio to go build their own agents. And the way we think about agent construction is a spectrum. At the beginning of the spectrum, it's entirely visual: with the designer, you can basically drag and drop a bunch of components and, presto, your agent is ready. An example of that could be: generate a blog post based on a particular topic, or generate an onboarding workflow based on 'here's the set of SaaS apps [00:19:00] I want you to auto-provision people to.' And then along that spectrum you can keep adding more and more sophistication, with more third-party data connectors and more code-driven actions that the agent can take, such that even pro-code developers can do more sophisticated agent building. But what we've seen is that many more of the non-technical teams tend to build their own agents and find great value in them. So we've been investing more and more in Studio, such that more of the non-technical users can build more powerful agents. The example I was telling you about, Nora, our onboarding agent, was in fact built by our HR team, very much in a no-code format. hugo: I'm interested: something we've seen is that a lot of AI products show a lot of promise and then stall at proof of concept. I jokingly call it proof-of-concept purgatory. I don't want any secret sauce or trade secrets, but I'm wondering what playbook helps you move an agent from sandbox to production rollout. anu: Yeah, and honestly, there isn't [00:20:00] a secret sauce there either.
At Atlassian we publish what we call a playbook, which is effectively where we document all our learnings and the practices that have worked for us, so that our customers, 300,000 of them, are able to access it and see if some of those plays work for them. And we actually have a play for exactly the thing you talked about: how do you take something that's a proof of concept or a demo and roll it out into production, and what are the best ways to do that? Fundamentally, I would think of it as three things. One, it's helpful to anchor your rollout of the agent on a well-defined problem and measurable outcomes. Because agent development tends to be more iterative, the scope tends to expand to 'my agent can do A, B, and C altogether,' and the overall sense of whether the agent is doing what it's supposed to be doing starts to get diluted. So it's helpful to have a well-defined problem and measurable outcomes, such that with the evals [00:21:00] you're able to say: okay, I can clearly understand the quality of the agent, the accuracy of the agent; is this what it's supposed to do? The second is to make sure that you embrace feedback and iterate often. For a lot of our own first-party agents that we deploy in Rovo, we test with real users in multiple rounds. We generate our own eval data sets to check whether the agent is behaving the way it should in different contextual scenarios. That's especially helpful for horizontal agents, because the way Hugo uses the agent may be different from the way Anu uses the agent, and you really want to test whether it makes sense in all of those contexts. So embracing feedback from both real users and synthetically generated users, and iterating often, is helpful when the agent is in the sandbox as well as when you deploy it to production, just to make sure that you have a wide user base covered. And last but not least, I think it's important to think about adoption as well, because when you flood [00:22:00] an organization with a number of different agents, it can be overwhelming. So one of the things we have used in our own products is the user experience of discovering agents: what's the right agent to use in this context? It's very helpful to keep an eye on which adoption practices can help with discovery of the agent and deployment in the right context.
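The first play, a well-defined problem with measurable outcomes checked by evals, can be made concrete with a tiny harness. This is a minimal sketch with a placeholder agent and invented eval cases, not Rovo's actual eval tooling.

```python
# Pin down a small, fixed eval set and score the agent on every iteration,
# so "is the agent doing what it's supposed to?" stays a measurable question.
EVAL_CASES = [
    {"input": "Where are the onboarding docs?", "must_contain": "confluence"},
    {"input": "Who owns the billing service?", "must_contain": "billing-team"},
]

def agent(query: str) -> str:
    """Placeholder for the agent under test."""
    return "See the Confluence onboarding space."

def run_evals(agent_fn, cases) -> float:
    """Return accuracy on the fixed eval set, comparable run to run."""
    passed = sum(
        1 for case in cases
        if case["must_contain"].lower() in agent_fn(case["input"]).lower()
    )
    return passed / len(cases)

score = run_evals(agent, EVAL_CASES)
print(f"eval accuracy: {score:.0%}")  # e.g. gate promotion to production on this
```

Because the eval set stays fixed while the agent iterates, the score is comparable across rounds, which is what lets a team gate the sandbox-to-production move on a number rather than on a demo.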
hugo: Relatedly, and I think we hinted at this earlier: reliability. Reliability is really interesting here, because the non-determinism, or flip-floppiness, or stochasticity of LLMs, generative AI, and agents can, as you said, be challenging, but it can also be a superpower. So I'm wondering how you think about using it as a superpower, but also, on the other side, how do you ground agents, curb hallucinations, and make them more deterministic when they need to be? anu: Yeah, I think this is particularly important for our context, because Atlassian operates [00:23:00] very much in a B2B context, right? We are building software for teams that are part of a business, and a lot of these applications require high degrees of accuracy and reliability. So how do you bring that? There are two ways we think about grounding. One is grounding the factual claims in the answer, such as the answers to 'who owns component X,' or 'where is the documentation for feature Y,' or 'when is some release going to happen?' It's usually the who, where, when, what, and why kinds of questions. For those, we think about grounding agents in the knowledge sources that contain the answers. It's helpful to converge and say: you know what, the knowledge sources we want the agent to be grounded on are the following: some documents in Google Drive, some documents in Confluence, something else across email, sources that are already verified to be sound sources of knowledge. In and of itself, just drawing the boundary around the knowledge sources the agent has access to is a [00:24:00] helpful place to start, because it's clean, verifiable data, coupled with the right permissions. This is why, for agents to function well in an enterprise context, they require deep technical investment that respects permissions, security, and role-based access in an enterprise setting. This is precisely what we have done with our data connectors; we have about 50 of those, and they are smart about the permission sets to be deployed in any given situation, so that agents are grounded on the information that is supposed to be available to the caller. A second way to think about grounding is reasoning grounding: the process by which the agent is thinking, giving answers, or making decisions. We think about that as: how do you show the reasoning traces and inform users step by step, such that when there is a human in the loop, they're able to see a full log of why the agent did what it did? That way, even post [00:25:00] facto, if there is a correction or a tweak to make, the full context is available. And that, I think, is going to be even more important in the coming days: as agents get more powerful and make more decisions, auditing and traceability get more important.
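A rough sketch of the two groundings Anu distinguishes, factual grounding against an allowlisted, permission-filtered set of sources, and reasoning grounding via a step-by-step trace, might look like the following. The sources, roles, and documents are invented for illustration and are not Atlassian's data connectors.

```python
# Factual grounding: answer only from verified, permission-filtered sources.
# Reasoning grounding: keep an auditable trace of each step taken.
KNOWLEDGE = [
    {"source": "confluence", "allowed_roles": {"employee"},
     "text": "Component X is owned by the platform team."},
    {"source": "gdrive", "allowed_roles": {"hr"},
     "text": "Salary bands for 2025 ..."},
]
VERIFIED_SOURCES = {"confluence", "gdrive"}  # the boundary drawn around knowledge

def grounded_answer(question: str, role: str):
    trace = [f"question: {question}", f"caller role: {role}"]
    # Only verified sources that the caller is permitted to see are retrievable.
    visible = [d for d in KNOWLEDGE
               if d["source"] in VERIFIED_SOURCES and role in d["allowed_roles"]]
    trace.append(f"retrieved {len(visible)} permitted document(s)")
    for doc in visible:
        if "component x" in question.lower() and "Component X" in doc["text"]:
            trace.append(f"grounded answer on source: {doc['source']}")
            return doc["text"], trace
    trace.append("no grounded answer found; declining rather than guessing")
    return "I don't have a verified source for that.", trace

answer, trace = grounded_answer("Who owns Component X?", role="employee")
print(answer)
print("\n".join(trace))  # the reasoning trace a human can audit post facto
```

The same question asked with `role="contractor"` would retrieve nothing and decline, which is the point: the permission filter and the declining branch are what curb hallucination, and the trace is what makes the behavior auditable.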
hugo: I love that. We index quite heavily on observability and monitoring and auditing and evaluation as well. As you may recall, I do a bunch of work in consulting, helping people build and ship data, ML, and AI products, and teaching as well. And a stumbling block I come up against is when I tell people: you need to look at your data; let's look at 20 traces together, whatever it may be, to assess failure modes early on. I kid you not, half the time people say to me, 'Can I get an agent to do that instead?' At the moment I'm saying: no, you really need to look at this yourself. But I am very interested in the future of self-healing [00:26:00] agentic systems. In fact, I think it was just last month that Anthropic put out their wonderful post about their research agent and how they developed it internally, and one thing that caught my eye in there was that the orchestrator agent in the system became quite good at self-healing prompts given a failure mode. It would iterate on its own prompts once it was, quote unquote, aware of the particular failure mode. That felt a bit like future music to me. So I'm wondering about your thoughts, and what you've seen, with respect to potentially self-healing systems. anu: Yeah, it's really fascinating. I know the post that you're referencing, and the amount of progress happening in that stream is quite mind-boggling. Again, I think there is a spectrum of self-healing, because self-healing as a concept predates AI too, right? Self-healing systems have certain attributes. I think there is a lot of low-hanging fruit today that companies building AI agents, as well as users of [00:27:00] AI agents, can already deploy, much of which used to be prompt engineering basics but is getting more and more subsumed into the way models behave automatically, such that you don't have to worry about explicitly editing your prompt to get said behavior. A great example: we have a deep research agent in Rovo, very similar to the research agent you mention, and what it does is check its own work. A lot of 'are you sure this is the right answer?' and cross-checking against a verified source of knowledge can be built into the behavior of the agent itself. Ultimately, it's a trade-off between the time to response, how expensive the query is, and how important it is that the answer be absolutely accurate. So I do think orchestration of multiple agents will require some degree of self-healing across the spectrum of self-healing; that just feels like the organic next step. But there is plenty you can do already today [00:28:00] just by treating the input to the agent smartly and treating the construction of the agent smartly. hugo: Absolutely. And in fact, it was another Anthropic blog post, from December last year, 'Building Effective AI Agents,' where they said: let's just step back and think about what we're doing. Sure, agents are great, but let's look at augmented LLMs, LLMs augmented with memory, retrieval, and tool calls. And then they showed several patterns that they've seen occur a lot in the use of Anthropic's Claude. The final one was the evaluator-optimizer, where they had a small loop in which one LLM would generate something and the other would provide feedback, and it would iterate. So even thinking about these basic workflows, I think, can take us a long way.
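The evaluator-optimizer pattern Hugo describes fits in a few lines: one model call drafts, a second critiques, and the loop repeats until the draft passes or a budget runs out. `call_llm` below is a stand-in for any chat-completion API, and the pass check is a toy; this is a sketch of the pattern, not Anthropic's code.

```python
# Evaluator-optimizer loop: generator drafts, evaluator critiques, repeat.
def call_llm(prompt: str) -> str:
    """Placeholder for a real model call."""
    return "DRAFT: sprint summary ... [sources cited]"

def generate(task: str, feedback: str | None) -> str:
    suffix = f"\nAddress this feedback: {feedback}" if feedback else ""
    return call_llm(f"Write: {task}{suffix}")

def evaluate(draft: str) -> tuple[bool, str]:
    """A second model call acts as the critic."""
    critique = call_llm(f"Critique this draft; does it cite sources? {draft}")
    ok = "sources cited" in draft.lower()  # toy stand-in for parsing the verdict
    return ok, critique

def evaluator_optimizer(task: str, max_rounds: int = 3) -> str:
    feedback = None
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        ok, feedback = evaluate(draft)
        if ok:
            return draft
    return draft  # best effort once the budget is exhausted

print(evaluator_optimizer("summarize this sprint"))
```

The `max_rounds` budget is the trade-off Anu names: every extra loop buys accuracy at the cost of latency and tokens.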
hugo: I'm also interested, since we've talked about the importance of evaluation and end-to-end evaluation: what metrics matter most when you're proving business value? Is it time saved, task completion, [00:29:00] stickiness, latency? anu: Yeah, so two parts. Just to react to the first part of your question: I think that's going to become increasingly interesting, the whole thing of one agent doing one part of the workflow, another doing another part, and how you stitch the two together. For us at Atlassian, we think a lot about what happens when multiple agents interact, perhaps with multiple humans. In the team of the future, where many humans and many agents work together, what would be the next unlock of value? We are very excited about the notion of multiplayer collaboration. Even in a pre-AI world, that was really what Atlassian products optimized for: bring Hugo and Anu and two others together, and really help them work together such that they unleash their collective knowledge. And in an AI world, what can we do to bring Hugo and Anu and a few agents together, and how can we help them work together very well? So far we've been able to unlock value where Anu works with one agent, or Anu works [00:30:00] with multiple agents, and how that workflow really gets unlocked. But Anu, Hugo, and a bunch of agents working together, the multiplayer, multi-agent collaboration: there are so many permutations there that have not yet been explored. And as agents get more powerful at handling more sophisticated reasoning, as well as orchestration across humans and agents, I think we are going to see brand-new workflows emerge that were just not possible before, and that's what's exciting about teamwork in an AI-enabled world. The second part of your question: what's important, how do you measure business value? Rovo already has over a million active users, and I spend a lot of my time talking to Rovo customers about how they perceive business value and what is important to them. The answer varies across a range. Sometimes they think about time saved. For instance, HarperCollins, one of our Rovo customers, saved time on manual effort around the publishing business using Rovo, such that they [00:31:00] reduced the amount of manual effort by 4x. Time savings is important for that specific business, and they're able to measure business value through that. Task completion is another one. One of our early customers, a construction-based company, is able to complete the tasks of product roadmap creation and backlog creation and make those a lot more automatic, completing the sets of tasks leading up to product roadmap creation a lot more quickly. And then there's also the question of hours saved converting to revenue: how can you generate revenue, right? One side is about how you optimize; another is how you really generate revenue through new workflows that you weren't able to run before. Customer support is a great example there. A lot of the time, with our customer service reps, and in fact we see this internally, we have saved millions of [00:32:00] dollars, but also been able to increase customer satisfaction, by deploying some of the Rovo agents in a CSM scenario. That tends to be a pretty popular way to think about business value as well. hugo: Super cool. I'm wondering, across your customer base, which industries and industry workflows are seeing the fastest ROI from these types of custom agents, and why do you think those are happening first? anu: Yeah, I would think of it more as: what are the workflows that see the most value? Because across the board, which industries see value from it depends a lot on speed of adoption. The examples I was just giving you: the publishing industry; we have a financial services company; we have a bank; we have customers across pretty much every industry seeing value with AI. But I think what distinguishes companies that are able to unlock a lot of value from AI versus those that are not is how well they understand their business [00:33:00] workflows and are able to choose the right workflows to be assisted with AI. So the question you were asking, the one somebody asked you: hey, can I give this to an agent? The answer to that question is yes and no, depending on what the workflow is, how deterministic it is, and how automatable it is. The companies that are able to clearly establish which workflows can be AI-assisted or AI-powered, those are the ones I observe getting the maximum benefit from AI deployment. hugo: Super interesting. And I'm wondering what types of challenges come up in adoption, in terms of meeting people where they really are and what they really need. Something I see quite a bit: I think people would generally prefer to talk to agents in certain situations and not type, right?
And voice is something where latency is a far more serious concern. So I'm wondering what types of things crop up along those lines that you're trying to solve. anu: Yeah, I think we are in the very early stages of figuring out which user interactions work [00:34:00] best, not only for agents but for any AI-powered workflow. On the example you mentioned around audio: we have a feature called Audio Overviews in Confluence, which basically gives you a 30-second audio description of a maybe 15-minute-long page, and that tends to get used a lot, particularly in the context of a mobile app. We have Rovo voice mode, audio overviews, all of those getting used by people on the go, and they very much want the voice-mode interaction. But also, I think they have a bit more tolerance in those interactions for, for example, how precise or how concise the output is, or exactly what the output format is. So voice, I think, is definitely one of those modes that is getting a lot more mainstream. Again, customer support, service tickets, all of those tend to deploy voice a lot, so we see our voice capabilities getting used. But I think what's going to be more interesting is that chat is definitely not the end state. Nobody wants to be chatting all the time. So really: [00:35:00] what are the kinds of background agents, smart agents, that are able to pick the right time at which to engage with the user, without the user explicitly having to ask? A great example is the Loom AI agent: it's an AI note taker that sits in the background, listening to meetings, transcribing notes, capturing work items, and connecting them to your existing work such that it actually moves work forward, not just taking notes. But imagine a world where the AI agent was also able to interject in the meeting. For instance, if I was telling you, 'Oh, Hugo, I think Sharif on my team works on that,' and the agent was able to interject and say, 'Actually, Gunjan on Sharif's team now owns that particular work item, and maybe we can route Hugo to Gunjan,' that would be very valuable. So the interaction patterns of when agents can interject, and when they can read and write into a given experience: I think we are just starting to see some [00:36:00] exciting progress there. hugo: I love that example so much. The example I always state: I would love it if, when we're on a call and I say, 'Oh, Anu, maybe we can chat at this time in a week,' and you agree, the agent doing the transcription could just send us a calendar invite, so we wouldn't have to deal with time zones. But I feel like I'm playing checkers and you're playing 4D chess with that example. There are actually a couple of things we've touched upon that aren't quite happening yet. One is multiplayer mode, and I'm a bit surprised that hasn't happened on the consumer side so much yet. One of the only examples I've seen is actually Meta AI in WhatsApp and Messenger and Instagram, where you can bring it into group chats. But in all honesty, its memory doesn't seem fantastic; it won't remember who said what, and these types of things. And of course there are a bunch of different incentives there, but I'm interested in what the challenges are there, where memory is an issue, and retaining context. The other thing, which you've spoken to explicitly [00:37:00] and which hints at something broader, is reactive AI systems: I go and chat with Claude or ChatGPT or whatever it is when I need things done.
But imagine having something that would come to me on Monday morning and say: hey, I see all these things; I can take this off your plate; I have no idea about this one; if you answer this question, I can help with that; and you need to reply to this email. Those types of things. So I'm wondering: what needs to happen in the space for us to really solve multiplayer mode and move beyond reactive systems? anu: Yeah, that's a great question, because I think there are multiple dimensions, multiple vectors along which we need to make progress for systems to truly behave like multiplayer collaboration systems. So what are those axes? One is the capability of agents along the axes of model, context, and tooling. Models themselves are getting more capable by the day, so that's one place where you'll continue to see progress. Context: the kind of context [00:38:00] you can pass to agents determines the quality of their output, and when thinking about multiplayer, context becomes even more important, because what Anu knows may not be what Hugo knows, and what Anu knows she is allowed to communicate to Hugo may be a smaller subset of what Anu actually knows. So in a multiplayer context, there is an additional layer of abstraction on top of permissions. You need to think about how that context translates when it's Anu and Hugo working with an agent, versus Anu, Hugo, and the general counsel working with the agent. What kind of shared context can these people have in that particular setting? That tends to be very sophisticated: understanding the enterprise graph and the permissions graph, how work objects relate to people, and what the Venn diagram overlap is across all of that. So context can get fairly complex the more people you add, especially in a company context, [00:39:00] not a consumer context. In terms of tooling: for example, Atlassian released a remote MCP server, such that you can use tooling to automatically create a Jira ticket or reference a Confluence page through Claude. I think we are seeing more and more innovative ways of improving the kinds of tools that agents can have, as well as the tools that agents can call into, and I think that will also deeply help with the multiplayer scenario, because as the toolbox grows, what you can do in a multiplayer setting becomes more and more interesting. Take Anu and Hugo and Christine working on a whiteboard together. You can whiteboard some ideas, but wouldn't it be marvelous if you could take those ideas, convert them to actual prototypes next, take those prototypes, and potentially ask a few users what they think? There, I think, automation is your friend. So this is an area where we at Atlassian run a lot of experiments: [00:40:00] what if we could combine the deterministic nature of automations with the creativity of agents, and then string those into end-to-end multiplayer workflows, such that you can simulate real-world work to the extent possible? We've had some failures and some successes there, and it's always fun to see what works in an automation-plus-agent context, but I think that will play an important role in the evolution of multiplayer experiences. hugo: Super interesting. And I must apologize: I committed the cardinal sin of any interview; I got too excited and asked you two questions at once.
So I'm interested in any further thoughts on the concept of not just reactive agents, but proactive agents. anu: Yeah. I think proactive agents can manifest in multiple ways. One really interesting way, especially in a teamwork context, is the example you were mentioning, where you get a debrief of what your day looks like without having to ask an agent, or [00:41:00] prep material, 'here is what's going to happen next,' that an agent can proactively deliver. We think about those as really smart ways to inject agent experiences proactively into a user's day-to-day workflow, and we do a number of those things. One really popular agent is called Next Best Work Item. If you log in to your Atlassian system, you're able to look across everything that's on your plate today, and the broader you go and the more projects you take on, the harder that gets, right? The agent is able to actually look across all of those pieces of work, understand your context, and say: this is the next best work item to take on, for the following reasons. That is a great way for agents to proactively assist users with what needs to happen next. A second category is what we think of as ambient agents: they run in the background and are triggered by a particular event. For example, if I get an email from a [00:42:00] particular person that requires a turnaround time much faster than anything else I see, an agent can be smart enough to say: Holly emailed Anu, this is a time-sensitive thing regarding a flight ticket booking, and you need to respond within the next 30 minutes. So event-triggered ambient agents that understand the context of Anu's daily work, how important that flight ticket booking is in Anu's life, and what time sensitivity is involved: I think there are multiple opportunities for us to innovate on there.
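The ambient pattern Anu sketches, an agent that stays silent until an event matches a trigger rule, can be roughed out as below. The event shape and the 30-minute rule mirror her flight-booking example; everything else is hypothetical, not an Atlassian API.

```python
# Ambient agent sketch: event-triggered, silent unless a rule fires.
from dataclasses import dataclass

@dataclass
class Event:
    kind: str                      # e.g. "email", "ticket", "meeting_note"
    sender: str
    subject: str
    deadline_minutes: int | None = None

def is_time_sensitive(event: Event) -> bool:
    """Trigger rule: emails that need a response within 30 minutes."""
    return event.kind == "email" and (event.deadline_minutes or 10**9) <= 30

def ambient_agent(event: Event) -> str | None:
    if is_time_sensitive(event):
        # In a real system this might draft a reply or ping the user's chat.
        return (f"Heads up: '{event.subject}' from {event.sender} "
                f"needs a response within {event.deadline_minutes} minutes.")
    return None  # stay silent; ambient agents interject only when warranted

events = [
    Event("email", "travel-desk", "Flight ticket booking", deadline_minutes=30),
    Event("email", "newsletter", "Weekly digest"),
]
for e in events:
    nudge = ambient_agent(e)
    if nudge is not None:
        print(nudge)
```

The important design choice is the default of returning `None`: a proactive agent earns trust by interjecting rarely and only on events that clear a meaningful threshold.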
hugo: Fascinating. I'm also interested in something that may seem like an odd question, but I'll explain it after the fact: how do we make agents' lives easier? What I mean by that is, a while ago I used OpenAI's Operator to search for shoes online for me, and I watched it, and it found what I needed. But it was clicking around a web browser, going to websites, and looking at [00:43:00] dropdowns and that type of stuff, and the structure of the human-readable internet doesn't seem like an ideal medium for agents to be working through. So I wonder how you think about the present and future of creating the types of information systems that allow agents to do their job as well as possible. anu: Yeah. In the present, I would say two things matter. First, the human-readable web becoming more of an agent-readable web is already playing out today, right? You see the robots.txt files on a lot of websites getting friendlier to agents, listing the do's and don'ts. We see that with our documentation sites and our main website as well, and we are very much progressing on making a lot of comms, documentation, and websites agent-readable as well as human-readable. A second part of the same thing is actions, taking actions. This is why things like the MCP server are so interesting: [00:44:00] think of them as your offerings built for agents, and they're going to get more and more powerful and ubiquitous over time. This is why we spend so much time building out those servers and making them available. But what is important in the future, as you think about making agents' lives easier while also making agents powerful in the right contexts, is that governance and ownership and security and permissioning become a lot more important. I think we have not even discovered the kinds of security issues that will pop up from using agents that unlock a wide swath of tools; you can see that with the Operator example itself, right? It will become increasingly important to think about the ways in which we can help agents act securely and in compliance with privacy, security, and governance requirements, especially in the professional context. hugo: What a wild future. One way I've thought about it: when electricity, the ability for humans to create electricity, came about, we [00:45:00] didn't have a grid, and it took us a while to figure that out. And similarly with the internet. Tim O'Reilly has a really interesting take, which he actually talked about on this podcast, that OpenAI is the AOL of generative AI, and we're yet to see the Google emerge, in a lot of ways. What I take from that is that we have a lot of individual units and wonderful agentic capabilities now, but a global network of language models and agent capabilities is yet to emerge, and that's a lot of what we're working on now. anu: Yeah, that's definitely a fascinating metaphor. And I think this grows exponentially, which is why it's so exciting. We are very much in the first mile of a marathon, and often we are eager to declare winners, but as we are able to really take the fundamental capabilities and string them together in a meaningful network, I'm pretty sure we will see scenarios emerge that we haven't even thought of yet. hugo: Absolutely. And to your point [00:46:00] earlier about the ability of people from all walks of life and all backgrounds to create and contribute here: the barrier to entry has been lowered so much in a variety of ways. We've talked here and there about security issues, privacy, and authentication, and I don't think I'd be doing my job properly if I didn't ask you about regulated sectors, which have very significant and important concerns about oversight and audit trails. So I'm wondering: how are forward-leaning companies reconciling necessary compliance with speed of adoption? anu: Yeah. At Atlassian we have 300,000 customers, and a lot of them come from regulated spaces and the federal government space, with access to sensitive information, and protecting and governing that is critical. It trumps pretty much everything else we can talk about in terms of productivity improvement or increasing revenue. All of that is great, but the number one thing is to [00:47:00] make sure it happens in an auditable, safe, secure way. There really are a few key principles that have helped. One is having an audit log for all agents, whether they're built on top of Rovo by us or they're custom agents being built by customers. I think this is going to get just as sophisticated as in the non-agent world, where observability, bringing these different windows of context together, and stringing them into meaningful insights, even without actively looking at them,
doing that more as a background process, is going to get more and more important. We've invested heavily in this through Atlassian Guard Premium and a number of our offerings. The second principle I would call out is transparency: how are agents making the decisions they're making? Reasoning chains, and explaining what the deciding factor really was. Explainability itself [00:48:00] is a really large, deep area that different companies look at differently, but in the enterprise context I think it's very important to have citations and citation quality, especially given the risk of hallucination, so that there is clear source attribution and transparency about what the reasoning was and what source of data was used to support that reasoning. The third principle, which is super critical, is governance and ownership. The way the head of HR runs an agent and the results she gets, versus the way an IC developer runs the agent and the results I get, need to be fairly different. So governing agent behavior and attributing ownership of agent outcomes to a particular human: that's a very sophisticated, complex area that we are just scratching the surface of at the moment. And that, I think, is another important principle, especially in regulated industries. You put all of that together, and [00:49:00] there is a really comprehensive set of security, privacy, governance, and compliance capabilities that are required for AI agents to be deployed in production. So this is one of those deep areas of investment that we've embarked on, which is going to be very important for our customers, as well as the many SaaS users that are regulated companies. hugo: Without a doubt, and thank you for such a thoughtful breakdown. Sadly, it's time to wrap up in a minute. I feel I've got so many other questions we could chat about, but we've talked somewhat about the future, both the near term and the medium to long term. Looking a year ahead, what's the next leap you expect in agent capabilities or tooling, and what should our listeners prepare for now? anu: Excellent question. At the risk of making predictions in a highly uncertain area, I would say the most exciting thing, at least personally for me, and one that I expect to emerge very soon, is very [00:50:00] much multiplayer interaction with agents, like we talked about. I don't think the interaction and the experience have come close to settling, and I think we will discover more and more unique ways of interacting with and experiencing agents in our lives. But what will be truly exciting is if we can unlock the same level of multiplayer collaboration across humans and agents as we have today in a purely human environment. I expect that over the next year we will make rapid progress in that area, and it's exciting to see how that really impacts the fabric of teamwork. hugo: Fantastic. Well, thank you for yet another wonderful conversation, and for sharing everything you're doing at the forefront of AI agent experimentation, internally and with your hundreds of thousands of customers as well. I appreciate your time and wisdom, Anu. anu: It was wonderful to be here, Hugo. Thank you so much for having me. hugo: Thanks so much for listening to High Signal, brought to you by Delphina.
If you enjoyed this [00:51:00] episode, don't forget to sign up for our newsletter, follow us on YouTube, and share the podcast with your friends and colleagues. Like and subscribe on YouTube, and give us five stars and a review on iTunes and Spotify. This will help us bring you more of the conversations you love. All the links are in the show notes. We'll catch you next time.