Building a Local-First Agent Framework in Rust (Part 12): Health Checks, doctor, and Testing HTTP
See Part 0 for the latest table of contents and sample code. New chapters will be added over time.
Chapter 12: Health Checks, doctor, and Testing HTTP
In the previous chapter, abcb crossed an important line. The provider stopped being only an in-process mock and became something that could talk to an OpenAI-compatible model server over HTTP. That was the first real contact with the outside world.
But once a program crosses a process boundary, a new kind of question appears. Is the server running? Is the URL correct? Can we read the config file? Is the model actually loaded, or is the server only half alive? A mock provider cannot answer those questions because the mock lives in the same process as the test. A real provider can fail before the agent loop even begins.
The project needs a place for those checks before the agent loop starts. The new framework feature is not a smarter agent. It is a better doctor command. abcb doctor started as a simple sanity check. Now it reads the project config, reports the configured model endpoint, and optionally probes the model server's health endpoint.
That sounds small, but it is an important shift. The agent loop should not be the first place where we discover that the local model server is missing. Diagnostics are more useful when they run before the main workflow and report the environment as it is. They help us understand the setup before we ask the agent to do useful work.
The sample code for this chapter is in chapter12/abcb/.
12.1 doctor Reports, It Does Not Judge
The doctor command now has a real job:
File: abcb/crates/abcb-cli/src/main.rs
async fn run_doctor() -> Result<(), Box<dyn Error>> {
    println!("abcb doctor");
    println!("workspace: ok");
    let config = match load_config(Path::new("abcb.toml"))? {
        Some(config) => {
            println!("config: found abcb.toml");
            if let Some(name) = config.project_name() {
                println!("project: {name}");
            }
            config
        }
        None => {
            println!("config: abcb.toml not found (ok for now)");
            return Ok(());
        }
    };
    match &config.model {
        Some(model) => {
            println!("model: {} @ {}", model.model, model.base_url);
            check_mlx_health(model).await;
        }
        None => println!("model: no [model] section configured"),
    }
    Ok(())
}
There is one continuity detail from Chapter 11. doctor used to be synchronous:
Command::Doctor => run_doctor()?,
Now it may call check_mlx_health(model).await, so run_doctor becomes an async fn, and the main match arm awaits it:
File: abcb/crates/abcb-cli/src/main.rs
Command::Doctor => run_doctor().await?,
This is the same async rule we saw in the previous chapter. Once one function needs to wait on an async operation, the caller usually has to become async too, or at least explicitly await the async work somewhere along the call path.
The design choice here is that doctor is a reporting command. If there is no abcb.toml, it prints that fact and exits successfully. If there is a config file but no [model] section, it prints that too. The command is not trying to enforce that every project has a model configured at every moment.
That fits how this book has been building the project. The mock path is still useful. The real-provider path needs model config, but the whole CLI should not become unusable when the config is absent. doctor tells us what it sees. It does not turn every missing optional piece into a fatal error.
This is different from the real chat and run paths. When those commands run without --mock, they do require model config:
File: abcb/crates/abcb-cli/src/main.rs
fn load_required_config() -> Result<Config, Box<dyn Error>> {
    let config = load_config(Path::new("abcb.toml"))?
        .ok_or("no abcb.toml found; add a [model] section or pass --mock")?;
    Ok(config)
}
The distinction is intentional. A command that needs a model should fail loudly when the model config is missing. A diagnostic command should show the current condition and keep going as far as it can.
12.2 Optional Health Checks
Chapter 10 added an optional health_url to the model configuration:
File: abcb/crates/abcb-cli/src/main.rs
#[derive(Debug, Deserialize, PartialEq, Eq)]
struct ModelConfig {
    /// OpenAI-compatible base URL, e.g. `http://localhost:8083/v1`.
    base_url: String,
    /// Model identifier sent in the request body (the local model path).
    model: String,
    /// Optional health endpoint used by `abcb doctor`.
    health_url: Option<String>,
}
The base URL and model name are required when the [model] section exists. The health URL is not. Not every OpenAI-compatible server exposes the same health endpoint. Some have /health, some have something else, and some may not expose a useful health endpoint at all.
So doctor treats the health check as optional:
File: abcb/crates/abcb-cli/src/main.rs
async fn check_mlx_health(model: &ModelConfig) {
    let Some(health_url) = model.health_url.as_deref() else {
        println!("mlx: no health_url configured (skipping health check)");
        return;
    };
    match check_health(health_url).await {
        Ok(report) => {
            println!("mlx: {} ({health_url})", report.status);
            match report.loaded_model {
                Some(loaded) => println!("mlx model: {loaded}"),
                None => println!("mlx model: (none reported)"),
            }
        }
        Err(e) => println!("mlx: unreachable ({e})"),
    }
}
The function name and printed mlx: prefix reflect the local setup I was using while building this chapter. The surrounding design is still OpenAI-compatible rather than MLX-specific: the config provides a URL, and doctor probes that configured health endpoint. If another local server exposes a compatible health response, the same shape can still work.
The first statement is the important one:
let Some(health_url) = model.health_url.as_deref() else {
    println!("mlx: no health_url configured (skipping health check)");
    return;
};
It does two jobs. First, it borrows the optional health URL as text. Second, it leaves the function early (return;) when there is no health URL to check.
Start with the field. model.health_url has this type:
Option<String>
That means the config owns the string when it exists. But check_health does not need to own that string. It only needs to read it as a &str. Calling as_deref() turns:
Option<String>
into:
Option<&str>
It does not clone the string and it does not move the String out of model. If the original value is:
Some(String::from("http://localhost:8083/health"))
then as_deref() gives us:
Some("http://localhost:8083/health")
where the inner value is a borrowed &str.
After as_deref(), the right side of the let statement is an Option<&str>. Then the pattern on the left side decides what to do with it:
Some(health_url)
matches the Some(...) case and binds the inner &str to a new local variable named health_url. After the let-else statement, health_url has type &str, so we can pass it directly to:
check_health(health_url).await
If the value is None, the pattern does not match, so Rust runs the else block and returns from the function. That is the let-else part. It is useful when one pattern is the path forward and every other case should leave early.
In other words, the line is roughly equivalent to:
let health_url = match model.health_url.as_deref() {
    Some(value) => value,
    None => {
        println!("mlx: no health_url configured (skipping health check)");
        return;
    }
};
We could write this with if let:
if let Some(health_url) = model.health_url.as_deref() {
    // do the health check
} else {
    println!("mlx: no health_url configured (skipping health check)");
    return;
}
That works, but it nests the main path one level deeper. let-else keeps the rest of the function at the normal indentation level. It reads like a guard: if we do not have the required value for this block of work, leave now.
This is the same ownership habit we have been using throughout the book. Configuration owns its data. A function that only needs to read a field should borrow it.
Rust: let-else
let-else is a pattern match for the common "continue only if this pattern matches" shape. The else block must not fall through to the next line. It has to leave the current flow, usually with return, break, continue, or panic!. That is why Rust can trust that, after the statement, the matched values are available in the surrounding scope.
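The borrow and the early return can be tried in isolation. The following is a minimal std-only sketch, not the project's code: this ModelConfig is a trimmed-down stand-in with a single field, and health_target is a hypothetical helper that returns the line instead of printing it.

```rust
/// Hypothetical, trimmed-down config with only the field discussed here.
struct ModelConfig {
    health_url: Option<String>,
}

/// Returns the line a doctor-style check might print for this config.
fn health_target(model: &ModelConfig) -> String {
    // `as_deref()` borrows: Option<String> -> Option<&str>, no clone, no move.
    let Some(health_url) = model.health_url.as_deref() else {
        // The else block must diverge; here it returns early.
        return "no health_url configured (skipping health check)".to_string();
    };
    // From here on, `health_url` is a plain `&str`.
    format!("probing {health_url}")
}
```

Because the function only borrows, the caller can keep using model.health_url after the call.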
12.3 A Typed Health Report
The health endpoint returns JSON, so the model crate adds a small response type:
File: abcb/crates/abcb-models/src/lib.rs
#[derive(Debug, Deserialize)]
pub struct HealthReport {
    pub status: String,
    #[serde(default)]
    pub loaded_model: Option<String>,
    #[serde(default)]
    pub loaded_adapter: Option<String>,
}
This type is based on the shape observed from the local MLX server:
{"status":"healthy","loaded_model":"/path/...","loaded_adapter":null}
The status field is required. If the server says it is healthy, starting, or in some other state, doctor can print that. The model and adapter fields are optional because the server may omit them or send null.
That is why the fields are Option<String>:
pub loaded_model: Option<String>,
pub loaded_adapter: Option<String>,
And that is why each has #[serde(default)]:
#[serde(default)]
pub loaded_model: Option<String>,
For an Option<T> field, #[serde(default)] spells out "if this field is missing, use None." Serde's derive already treats a missing Option<T> field as None, so the attribute is not strictly required here. It makes the intent visible at the field, and it keeps the tolerance in place if a field's type later changes to a non-Option type that implements Default. This is different from the config structs in Chapter 10, where I preferred accessor methods for defaults because the struct represented a user-authored file. Here the struct represents a server response, and a missing field genuinely means "not reported." Letting Serde fill in None is the simpler and more honest choice.
With that, both of these bodies are accepted:
{"status":"healthy","loaded_model":"/models/gemma4","loaded_adapter":null}
{"status":"starting"}
The tests make that behavior explicit. They call parse_health, the small parser we will look at in the next section:
File: abcb/crates/abcb-models/src/lib.rs
#[test]
fn parse_health_reads_status_and_loaded_model() {
    let body = r#"{"status":"healthy","loaded_model":"/models/gemma4","loaded_adapter":null}"#;
    let report = parse_health(body).expect("should parse");
    assert_eq!(report.status, "healthy");
    assert_eq!(report.loaded_model.as_deref(), Some("/models/gemma4"));
    assert_eq!(report.loaded_adapter, None);
}

#[test]
fn parse_health_tolerates_missing_loaded_model() {
    let body = r#"{"status":"starting"}"#;
    let report = parse_health(body).expect("should parse");
    assert_eq!(report.status, "starting");
    assert_eq!(report.loaded_model, None);
}
The first test covers the full response shape. The second test covers a smaller response. This is the kind of tolerance that belongs at a provider boundary. The server might evolve slightly, but as long as the core facts are present, the diagnostic command can still report something useful.
12.4 Checking Health Over HTTP
The actual health check lives in abcb-models:
File: abcb/crates/abcb-models/src/lib.rs
pub async fn check_health(health_url: &str) -> Result<HealthReport, ProviderError> {
    let response = reqwest::Client::new()
        .get(health_url)
        .send()
        .await
        .map_err(|e| ProviderError::Backend(Box::new(e)))?;
    let status = response.status();
    let body = response
        .text()
        .await
        .map_err(|e| ProviderError::Backend(Box::new(e)))?;
    if !status.is_success() {
        return Err(ProviderError::Backend(
            format!("HTTP {status} from {health_url}: {body}").into(),
        ));
    }
    parse_health(&body)
}
The shape should look familiar from OpenAiCompatProvider::complete. We send a request, await the response, read the status, consume the body as text, turn non-success HTTP statuses into provider backend errors, and parse the JSON body.
The status is read before the body is consumed:
let status = response.status();
let body = response
    .text()
    .await
    .map_err(|e| ProviderError::Backend(Box::new(e)))?;
That matters because response.text().await consumes the response. After that call, the response value is gone. If we want the status and the body in the error message, we need to capture the status first.
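The ordering constraint comes from ownership: reqwest's text method takes self by value. A small std-only sketch shows the same shape. FakeResponse and describe are hypothetical names invented for this illustration, and the methods are synchronous to keep the example self-contained.

```rust
/// Hypothetical response type: `text` takes `self`, so calling it consumes the value.
struct FakeResponse {
    status: u16,
    body: String,
}

impl FakeResponse {
    /// Borrows the response; it stays usable afterwards.
    fn status(&self) -> u16 {
        self.status
    }

    /// Moves the response; it cannot be used again after this call.
    fn text(self) -> String {
        self.body
    }
}

fn describe(response: FakeResponse) -> String {
    // Copy the status out while `response` is still available.
    let status = response.status();
    // `text(self)` moves `response` into the call.
    let body = response.text();
    // Calling `response.status()` here would be a use-after-move compile error.
    format!("HTTP {status}: {body}")
}
```

Swapping the two let lines makes the compiler reject the function, which is the same rule that forces the order in check_health.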
The parser itself is small:
File: abcb/crates/abcb-models/src/lib.rs
fn parse_health(body: &str) -> Result<HealthReport, ProviderError> {
    serde_json::from_str(body).map_err(|e| ProviderError::Backend(Box::new(e)))
}
This is another example of the boundary we chose in Chapter 11. abcb-models can know about serde_json and reqwest. abcb-core does not. Provider-specific failures are mapped into ProviderError::Backend at the edge.
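The mapping pattern itself does not depend on serde_json or reqwest. Here is a std-only sketch under stated assumptions: this ProviderError has only the Backend variant, its Display text is invented for the example, and parse_port and reject_status are hypothetical stand-ins for the real parsing and status handling.

```rust
use std::error::Error;
use std::fmt;

/// Sketch of a provider error with one opaque backend variant.
#[derive(Debug)]
enum ProviderError {
    Backend(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for ProviderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The "backend error:" prefix is an assumption made for this sketch.
            ProviderError::Backend(e) => write!(f, "backend error: {e}"),
        }
    }
}

/// Hypothetical parser standing in for `serde_json::from_str`.
fn parse_port(text: &str) -> Result<u16, ProviderError> {
    // A concrete library error (`ParseIntError`) is boxed at the edge.
    text.trim()
        .parse::<u16>()
        .map_err(|e| ProviderError::Backend(Box::new(e)))
}

/// A plain message becomes a boxed error through `.into()`.
fn reject_status(status: u16) -> ProviderError {
    ProviderError::Backend(format!("HTTP {status}").into())
}
```

Callers see only ProviderError::Backend. The concrete error type stays inside the function that produced it, which is what keeps the library dependency out of the core crate.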
Then the CLI decides how to present that result:
File: abcb/crates/abcb-cli/src/main.rs
match check_health(health_url).await {
    Ok(report) => {
        println!("mlx: {} ({health_url})", report.status);
        match report.loaded_model {
            Some(loaded) => println!("mlx model: {loaded}"),
            None => println!("mlx model: (none reported)"),
        }
    }
    Err(e) => println!("mlx: unreachable ({e})"),
}
Notice that the error is not propagated with ?. In the Err(e) branch, doctor only prints the problem:
Err(e) => println!("mlx: unreachable ({e})"),
That branch does not return an error. The match completes, control goes back to run_doctor, and run_doctor reaches its final Ok(()).
That is a policy decision. In chat or run, an unreachable provider is fatal because the command cannot do its main job. In doctor, an unreachable provider is exactly the kind of thing the command is supposed to report.
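The two policies can be placed side by side over the same Result. This is a minimal sketch, not the CLI's code: Report is a hypothetical one-field type, errors are plain strings, and doctor_health_line and chat_precheck are illustrative names.

```rust
/// Hypothetical report type with just the field this sketch needs.
struct Report {
    status: String,
}

/// Diagnostic policy: both outcomes become a line of output.
fn doctor_health_line(result: Result<Report, String>) -> String {
    match result {
        Ok(report) => format!("mlx: {}", report.status),
        Err(e) => format!("mlx: unreachable ({e})"),
    }
}

/// Command policy: the same failure stops the command through `?`.
fn chat_precheck(result: Result<Report, String>) -> Result<String, String> {
    let report = result?;
    Ok(report.status)
}
```

The input is identical in both functions. Only the caller's decision about what an Err means is different.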
12.5 What to Test and What to Trust
HTTP code creates a testing temptation. We can try to test every possible network failure: connection refused, timeout, malformed response, 500 response, missing fields, and so on. Some of those tests are useful. Some are brittle. The question is not "can we test it?" The better question is "what does this test prove, and will it fail for the right reason?"
For this chapter, the testing rule is:
Test the pure parsing surface thoroughly. Test the HTTP wiring with a small local mock server. Do not make the unit test suite depend on a real model server being up.
That gives us three layers.
The local mock server in this chapter is wiremock. It starts a real local TCP server for the test, lets the test define which requests should match, and returns controlled responses. The production code still uses reqwest; only the server side is fake.
The tests import it like this:
File: abcb/crates/abcb-models/src/lib.rs
use serde_json::json;
use wiremock::matchers::{body_partial_json, method, path};
use wiremock::{Mock, MockServer, ResponseTemplate};
First, pure parsing tests:
File: abcb/crates/abcb-models/src/lib.rs
#[test]
fn parse_health_errors_on_malformed_json() {
    let err = parse_health("not json").expect_err("malformed should error");
    assert!(matches!(err, ProviderError::Backend(_)));
}
This test has no network. It is fast, deterministic, and precise. If it fails, the parser changed.
Second, HTTP integration tests with a local mock server:
File: abcb/crates/abcb-models/src/lib.rs
#[tokio::test]
async fn check_health_returns_report_on_success() {
    let server = MockServer::start().await;
    Mock::given(method("GET"))
        .and(path("/health"))
        .respond_with(ResponseTemplate::new(200).set_body_json(json!({
            "status": "healthy",
            "loaded_model": "/models/gemma4",
            "loaded_adapter": null
        })))
        .mount(&server)
        .await;
    let report = check_health(&format!("{}/health", server.uri()))
        .await
        .expect("health check should succeed");
    assert_eq!(report.status, "healthy");
    assert_eq!(report.loaded_model.as_deref(), Some("/models/gemma4"));
}
This test does use HTTP, but it does not use the real model server. wiremock starts a local test server, reqwest sends a real request to it, and the test controls the response body.
The json! macro from serde_json builds a JSON value inline:
json!({
    "status": "healthy",
    "loaded_model": "/models/gemma4",
    "loaded_adapter": null
})
Here, wiremock serializes that value into the response body returned by the mock server. In the next section, we will also use json! inside a request matcher.
The builder chain reads in three steps. Mock::given(method("GET")).and(path("/health")) says which request should match. respond_with(...) says what the server should return when that request arrives. mount(&server).await attaches that mock rule to the running local server.
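To make "a real local server with a controlled response" concrete, here is a std-only sketch of the idea. It is not how wiremock is implemented and it is not a replacement for it: serve_once and get_health are hypothetical helpers, the server answers a single request without inspecting it, and the client speaks raw HTTP over a TcpStream instead of using reqwest.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread::{self, JoinHandle};

/// Starts a server that answers exactly one request with `body`, then stops.
/// Binding to port 0 lets the OS pick a free port, so tests do not collide.
fn serve_once(body: &'static str) -> std::io::Result<(SocketAddr, JoinHandle<()>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let handle = thread::spawn(move || {
        if let Ok((mut stream, _peer)) = listener.accept() {
            // Read the request head; this sketch ignores what was asked.
            let mut buf = [0u8; 1024];
            let _ = stream.read(&mut buf);
            let response = format!(
                "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
                body.len(),
                body
            );
            let _ = stream.write_all(response.as_bytes());
        }
    });
    Ok((addr, handle))
}

/// Sends a bare GET /health and returns the raw response text.
fn get_health(addr: SocketAddr) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(b"GET /health HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")?;
    let mut raw = String::new();
    // The server closes the connection after responding, which ends the read.
    stream.read_to_string(&mut raw)?;
    Ok(raw)
}
```

The request travels over a real socket, yet the test decides every byte of the response. wiremock adds what this sketch lacks: request matching, multiple mounted rules, and readable failures.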
Third, live smoke testing by hand. If I want to confirm that MLX and the configured model work on my machine today, I can run abcb doctor against the live endpoint. But that check belongs in my local workflow, not in the repeatable test suite. A test suite that fails because my model server is not running is not a good test suite.
Decision: test the stable surface
Pure functions should get direct unit tests. Thin I/O wiring should get a few integration tests against controlled local replacements, like a mock HTTP server. A real external service should be checked manually or in a dedicated environment, not as an ordinary unit test dependency.
This rule also applies backward, to code from earlier chapters. read_events is pure enough to test with in-memory readers and temporary files. run_replay mostly wires file I/O to printing, so we keep the deeper tests closer to the parser. The same rule will apply forward when more of the framework touches disk, time, tools, and eventually editor integration.
12.6 wiremock Matchers
The new dev dependency is wiremock:
File: abcb/Cargo.toml
[workspace.dependencies]
wiremock = "0.6.5"
The model crate uses it only for tests:
File: abcb/crates/abcb-models/Cargo.toml
[dev-dependencies]
tokio = { workspace = true }
wiremock = { workspace = true }
The same tool also tests the chat completion path, and this test adds one more matcher:
File: abcb/crates/abcb-models/src/lib.rs
#[tokio::test]
async fn complete_posts_to_chat_completions_and_returns_message() {
    let server = MockServer::start().await;
    Mock::given(method("POST"))
        .and(path("/chat/completions"))
        // Also asserts we send non-streaming requests.
        .and(body_partial_json(json!({"stream": false})))
        .respond_with(ResponseTemplate::new(200).set_body_json(json!({
            "choices": [{"message": {"role": "assistant", "content": "hello from mock"}}]
        })))
        .mount(&server)
        .await;
    let mut provider = OpenAiCompatProvider::new(server.uri(), "/models/gemma4");
    let mut session = Session::new("s");
    session.push_message(Message::new(Role::User, "hi"));
    let reply = provider
        .complete(&session)
        .await
        .expect("complete should succeed");
    assert_eq!(reply, Message::new(Role::Assistant, "hello from mock"));
}
The matcher is doing two jobs:
.and(body_partial_json(json!({"stream": false})))
It checks that the body contains "stream": false, and it leaves the rest of the request body flexible. That is the right level of strictness for this test. We care that this chapter still sends non-streaming requests. We do not need this one test to assert every byte of the serialized request body.
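The "contains this, ignores the rest" rule is a subset check. The sketch below models it with flat string maps instead of nested JSON, which is a deliberate simplification. partial_match is a hypothetical function for illustration, not part of wiremock.

```rust
use std::collections::HashMap;

/// Simplified model of partial matching: flat string maps instead of nested JSON.
fn partial_match(expected: &HashMap<&str, &str>, actual: &HashMap<&str, &str>) -> bool {
    // Every expected pair must be present; extra keys in `actual` are ignored.
    expected
        .iter()
        .all(|(key, value)| actual.get(key) == Some(value))
}
```

A request with extra fields still matches, while a request that sets the expected field to another value, or omits it, does not.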
For non-success statuses, the test checks the error category:
File: abcb/crates/abcb-models/src/lib.rs
#[tokio::test]
async fn check_health_maps_non_success_status_to_backend_error() {
    let server = MockServer::start().await;
    Mock::given(method("GET"))
        .and(path("/health"))
        .respond_with(ResponseTemplate::new(503))
        .mount(&server)
        .await;
    let err = check_health(&format!("{}/health", server.uri()))
        .await
        .expect_err("a 503 should map to an error");
    assert!(matches!(err, ProviderError::Backend(_)));
}
Again, the test does not compare the entire error value. That would not work well with ProviderError::Backend(Box<dyn Error + Send + Sync>), because a boxed trait object has no PartialEq implementation to compare against. The important fact is that a non-success HTTP status is mapped into the backend-error side of the provider boundary.
12.7 The Real Model Still Fails
At this point, the network boundary is better tested and easier to inspect. That does not mean the agent is ready.
This is where the local-first story becomes more honest. If we point abcb run at a raw local Gemma model here, the model may not return the exact JSON envelope that the loop expects. It might answer in prose. It might wrap JSON in a Markdown code fence. It might use the wrong field name. It might describe a tool call instead of emitting the schema:
{"kind":"tool_call","tool_name":"echo","arguments":{"text":"hello"}}
Note: tool call means our envelope here
In this book, a "tool call" usually means a JSON envelope emitted as normal assistant text, such as {"kind":"tool_call", ...}. It is not the provider's native tool-calling API. That distinction matters here. The HTTP request can succeed, and the model can still fail to produce the envelope shape that abcb expects. In other words, the wire protocol can be healthy while the agent protocol is still broken.
That is not an HTTP problem. doctor can say the server is reachable. check_health can say the model is loaded. wiremock can prove that our client sends and parses the expected wire shapes. The remaining failure is the contract between the agent loop and the model's text output.
That distinction matters. Without doctor, every failure feels like "the model did not work." With doctor, we can separate at least three questions:
- Is the local server reachable?
- Is the HTTP provider speaking the expected wire protocol?
- Is the model following the agent envelope contract?
This chapter only addresses the first two. The third one is postponed deliberately. Chapter 18 will add the system prompt and more robust parsing that teach the model the envelope format and tolerate common Markdown fences.
For now, the failure is useful evidence. It tells us that a local model is not simply a drop-in replacement for a commercial assistant product. The framework has to make the contract visible, testable, and recoverable.
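A tiny check shows how a reply can arrive over a perfectly healthy HTTP exchange and still miss the envelope. This is a crude heuristic written only for illustration. looks_like_envelope is a hypothetical function, not abcb's parser and not the approach of the later chapter.

```rust
/// Crude shape check (an assumption for illustration, not abcb's parser):
/// the whole reply must be one JSON-looking object that mentions "kind".
fn looks_like_envelope(reply: &str) -> bool {
    let text = reply.trim();
    text.starts_with('{') && text.ends_with('}') && text.contains("\"kind\"")
}
```

A bare envelope passes. A prose answer or a reply wrapped in a Markdown code fence fails, even though the server returned 200 in every case.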
12.8 What Changed
Chapter 11 made abcb able to send a real HTTP request to an OpenAI-compatible chat endpoint. Chapter 12 adds the supporting work around that boundary.
doctor now reads abcb.toml, reports the configured model, and optionally checks a health endpoint. HealthReport gives the health response a typed shape. check_health reuses the same provider-boundary error pattern as the chat-completions provider. let-else gives the optional health URL a clean early-return shape. as_deref() lets the CLI borrow an optional string without cloning it.
The testing story also improves. Pure parsing is tested directly. HTTP wiring is tested through wiremock, which gives us real local HTTP without depending on a live model server. Live model checks remain manual, because they verify the local machine's current state rather than the framework's repeatable behavior.
The important limitation is still present: a healthy server does not guarantee a well-behaved agent turn. The model still has to speak the envelope language. That failure is not hidden here. It is the reason the later system-prompt and parsing chapters need to exist.
To be continued