Agent tools part 2 | Durable asynchronous tool calls
A promise ID is the secret ingredient in the special sauce that solves our distributed system problems.
If you are building an MCP server then you are also building a distributed system.
The square hole
The main issue with MCP, and all LLM tool calling functionality at present, is that it is synchronous with no built-in mechanisms for handling failure.
Even though MCP standardizes a tool calling convention, it ignores the other issues that arise from building a distributed system. As the developer, you are burdened with figuring out:
Supervision, such as timeouts, to detect issues
Retry logic for application level errors
Deduplication and/or idempotency guarantees
Recovery for process crashes
But… what if you weren’t?
Pinky promise
The MCP Server Quickstart guide uses weather forecasting as an example use case. Let’s flip it around and use historic weather data gathering as our example, a use case that could take much longer and better reflects the need for a background job.
To enable asynchronous behavior, instead of a single tool such as get_weather_data, we will create three tools:
start_gathering
probe_status
await_result
And here is the key: the start_gathering tool, instead of blocking on the result of the data gathering job, kicks off a background job and returns a promise ID, the secret ingredient for solving our distributed system problems.
Let’s have a look at the code that makes it possible with Resonate:
In the previous code example, the job_name doubles as the promise ID. We don’t need to burden our LLM with learning about promises and concurrency. We just need to make it clear that this tool starts a background job and returns the name of the job, which can be used to get the status or result.
How is this possible?
Decorate with Resonate
Resonate breaks the code up into two worlds:
Ephemeral World
Durable World
MCP lives in the Ephemeral World. If you are using MCP alone and your server crashes in the middle of a tool call, you are on your own in figuring out how to handle that.
Resonate enables MCP tools to transition to the Durable World. As soon as you make that transition you are covered: Resonate gives you everything you need to handle idempotency, application errors, process failures, distribution, and concurrency.
Decorate the start_gathering() function with @mcp.tool, registering it as a tool with MCP. This function will execute in the Ephemeral World.
Decorate the weather_data() function with @resonate.register, registering the function with Resonate. This function will execute in the Durable World.
Within the start_gathering() function we call weather_data.run(), which transitions the call chain from the Ephemeral World to the Durable World. This call returns a handle to the invocation. You could wait right there for the result using the handle, but instead we will return the promise ID (job name) we used to invoke the weather_data() function.
The promise ID (job name) is also an idempotency key, enabling us to rejoin that invocation from almost anywhere. This is because Resonate promises are Durable Promises. That job name will always give you the same handle, and once resolved, will always give you the same result, instantly.
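To see why a durable promise can be rejoined by ID, consider a toy promise store backed by a SQLite file. This is a sketch under stated assumptions and not how Resonate is implemented: `PromiseStore` and its methods are hypothetical names, and closing and reopening the store stands in for a process crash and restart.

```python
import json
import os
import sqlite3
import tempfile


class PromiseStore:
    """Toy durable promise store: promise state lives in a SQLite file."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS promises "
            "(id TEXT PRIMARY KEY, state TEXT NOT NULL, value TEXT)"
        )
        self.db.commit()

    def create(self, promise_id):
        """Idempotent: creating an existing promise leaves it untouched."""
        self.db.execute(
            "INSERT OR IGNORE INTO promises (id, state) VALUES (?, 'pending')",
            (promise_id,),
        )
        self.db.commit()

    def resolve(self, promise_id, value):
        """Only a pending promise can be resolved; later attempts are ignored."""
        self.db.execute(
            "UPDATE promises SET state = 'resolved', value = ? "
            "WHERE id = ? AND state = 'pending'",
            (json.dumps(value), promise_id),
        )
        self.db.commit()

    def get(self, promise_id):
        """Return (state, value) for a promise, or None if it does not exist."""
        row = self.db.execute(
            "SELECT state, value FROM promises WHERE id = ?", (promise_id,)
        ).fetchone()
        if row is None:
            return None
        state, raw = row
        return state, (json.loads(raw) if raw is not None else None)

    def close(self):
        self.db.close()


path = os.path.join(tempfile.mkdtemp(), "promises.db")

store = PromiseStore(path)
store.create("sf-2023")
store.create("sf-2023")                    # duplicate start: no effect
store.resolve("sf-2023", {"records": 365})
store.resolve("sf-2023", {"records": 0})   # second resolve is ignored
store.close()                              # simulate the process going away

rejoined = PromiseStore(path)              # a "new process" rejoins by ID
state, value = rejoined.get("sf-2023")
rejoined.close()
```

Because the ID is the primary key and a resolved promise is never overwritten, any caller that knows the job name sees the same state and, once resolved, the same value.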
Are we there yet?
In the previous example, to keep things simple, we are accepting a single location at a time. However, you could alter it to kick off a set of jobs for a set of locations.
But to hammer home the value of Resonate Durable Promises, we want to check the status and await the results in sets, so that you can do something like this with your LLM:
The await_result() and probe_status() tools take a set of promise IDs (job names).
The probe_status() tool just checks if the data is ready by using the promise ID to get the handle of the invocation and calling .done() to check if the promise is Resolved yet.
So you can do something like this:
The await_result() tool blocks and awaits the results of the background jobs by calling .result() on the handle.
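The two tools can be sketched over a set of job names as follows. This is a stand-in rather than Resonate code: Python's `concurrent.futures.Future` happens to expose `.done()` and `.result()` as well, so a dictionary of futures plays the role of looking up a handle by promise ID. The `_jobs` registry, the `_job` function, and the `gates` events are hypothetical and only make the example deterministic.

```python
import concurrent.futures
import threading

_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
_jobs = {}  # job_name -> Future; stand-in for fetching a handle by promise ID


def probe_status(job_names):
    """Non-blocking: report whether each job has finished."""
    return {name: _jobs[name].done() for name in job_names}


def await_result(job_names, timeout=None):
    """Blocking: wait for every job and return its result, keyed by name."""
    return {name: _jobs[name].result(timeout=timeout) for name in job_names}


def _job(name, gate):
    """Hypothetical background job that finishes once its gate is opened."""
    gate.wait()
    return f"data for {name}"


# Two jobs whose completion we control through events.
gates = {"sf-2023": threading.Event(), "nyc-2023": threading.Event()}
for name, gate in gates.items():
    _jobs[name] = _pool.submit(_job, name, gate)

gates["sf-2023"].set()
_jobs["sf-2023"].result(timeout=5)           # make sure the first job finished
status = probe_status(["sf-2023", "nyc-2023"])

gates["nyc-2023"].set()
results = await_result(["sf-2023", "nyc-2023"], timeout=5)
```

At the point `status` is taken, one job has finished and the other is still running, which is the situation a status tool lets the LLM report on before it decides to block on the full set.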
And this turns what was previously a completely synchronous and fragile tool interaction into a durable and asynchronous data gathering and analysis experience.
The best of both worlds
Integrating Resonate into an MCP server preserves standardized tool calling while completely transforming your application into a Durable Distributed System.
You don’t need to teach the model about distributed systems.
You don’t need to wrap every tool in scaffolding just to handle timeouts or retries.
You don’t need to worry if your server crashes halfway through a job.
Check out the weather data agent tool example application to see for yourself.