Building a Local-First Agent Framework in Rust (Part 17): The Run Summary
See Part 0 for the latest table of contents and sample code. New chapters will be added over time.
Chapter 17: The Run Summary: A Read-Model From the Log
Chapter 14 gave the loop an event log. Chapter 16 added another kind of event: a denied tool call. By now, every abcb run can leave behind a small history of what happened: the user message, model responses, tool results, denied tools, and final answer.
This chapter turns that history around.
This post is also available on Medium. If you’re a paid Medium member and happen to read it there, it helps fund my next cup of coffee. Much appreciated ☕️😄
Until now, the log has mostly been useful for replay. Replay shows the sequence. First this happened, then that happened, then the run ended. That is valuable, but it is only one view of the same data.
Another view is a summary. How many model steps did this run take? Which tools ran? Which tools were denied? Did the run complete, hit the step limit, or fail? Which endpoint answered?
The important design choice is where that summary comes from. We do not make the loop accumulate counters while it runs. We already have the facts in events.jsonl, so the summary is derived from the log after the run. The event log becomes the source of truth, and the summary becomes a read-model over that log.
That phrase, read-model, is a little formal, but the idea is plain: write down facts once, then build useful views from them.
The sample code for this chapter is in chapter17/abcb/.
17.1 Bringing the Log Types Into the CLI
Most of this chapter lives in the CLI crate, because the core loop is already doing its job: it writes events. The CLI owns the run command, the session directory, the event file, and the terminal output.
The import list now includes the event-log types and loop error type that the summary needs:
File: abcb/crates/abcb-cli/src/main.rs
use abcb_core::{
AllowAll, Event, LoggedEvent, LoopError, Message, MockProvider, Role, Session, ToolRegistry,
one_turn, read_events, run_loop, write_event,
};
Two names matter especially here: LoggedEvent and LoopError.
LoggedEvent is the timestamped event record that read_events returns. A summary is not built from raw JSON strings. It is built from parsed events.
LoopError matters because the summary also needs to know how the run ended. The log tells us what events were written. The loop result tells us whether the run completed, hit the step limit, or failed for some other reason. Those are related facts, but they come from two different places.
The CLI also imports fmt:
File: abcb/crates/abcb-cli/src/main.rs
use std::fmt;
That is for the display layer. The summary will be a small Rust value, and then Display will decide how it appears in the terminal.
17.2 A Small Status Enum
The summary starts with the run status:
File: abcb/crates/abcb-cli/src/main.rs
#[derive(Debug, Eq, PartialEq)]
enum RunStatus {
Completed,
MaxStepsExceeded,
Failed,
}
The derives are the same small testing helpers we have used before. Debug makes failures readable, and Eq plus PartialEq lets tests compare statuses directly with assert_eq!.
This is intentionally not read from the event log. The event log records what happened inside the loop, but it does not record why the loop returned to the caller. That distinction belongs to the Result returned by run_loop.
The three cases are intentionally coarse. For a first summary, I do not need to distinguish every possible provider or tool failure. I only want to separate a completed run, a run that reached the configured step ceiling, and everything else that failed.
The conversion is small:
File: abcb/crates/abcb-cli/src/main.rs
impl RunStatus {
fn from_outcome(outcome: &Result<String, LoopError>) -> Self {
match outcome {
Ok(_) => RunStatus::Completed,
Err(LoopError::MaxStepsExceeded { .. }) => RunStatus::MaxStepsExceeded,
Err(_) => RunStatus::Failed,
}
}
}
The match reads from most specific to most general. Ok(_) means the loop returned an answer. The answer itself is not needed for the status, so _ ignores it.
Err(LoopError::MaxStepsExceeded { .. }) catches one specific error variant. The { .. } ignores the fields inside that variant, because the status only needs to know that the limit was hit.
The final Err(_) catches every other loop error. This arm must come after the specific MaxStepsExceeded arm. If we put Err(_) first, it would match all errors before Rust ever reached the more specific case.
The type of outcome is worth noticing:
File: abcb/crates/abcb-cli/src/main.rs
&Result<String, LoopError>
The successful value is the final answer string. The error value is the loop error. The summary does not need to own either one, so it borrows the result.
Keeping MaxStepsExceeded separate will matter when we compare agent behavior across runs. A step-limit run is not successful, but it tells a different story from an unknown tool, provider error, event-log error, or other failure.
The enum also knows how to label itself:
File: abcb/crates/abcb-cli/src/main.rs
fn label(&self) -> &'static str {
match self {
RunStatus::Completed => "completed",
RunStatus::MaxStepsExceeded => "max steps exceeded",
RunStatus::Failed => "failed",
}
}
This returns &'static str because the labels are string literals stored in the program itself. We are not allocating new String values for labels that never change.
17.3 The Summary Shape
The summary stores five facts:
File: abcb/crates/abcb-cli/src/main.rs
#[derive(Debug, Eq, PartialEq)]
struct RunSummary {
status: RunStatus,
steps: usize,
tools_called: Vec<String>,
tools_denied: Vec<String>,
endpoint: String,
}
This is deliberately small. It is not a transcript. It does not duplicate the event log. It answers a few questions that are useful after a run:
- Did the run complete?
- How many model turns happened?
- Which tools ran?
- Which tools were denied?
- Which endpoint was used?
The summary is a projection. It is a smaller view derived from a richer record.
17.4 Walking Events Into a Summary
The core method is from_run:
File: abcb/crates/abcb-cli/src/main.rs
impl RunSummary {
fn from_run(
events: &[LoggedEvent],
outcome: &Result<String, LoopError>,
endpoint: String,
) -> Self {
let mut steps = 0;
let mut tools_called = Vec::new();
let mut tools_denied = Vec::new();
for logged in events {
match &logged.event {
Event::ModelResponse { .. } => steps += 1,
Event::ToolResult { tool_name, .. } => tools_called.push(tool_name.clone()),
Event::ToolDenied { tool_name } => tools_denied.push(tool_name.clone()),
Event::UserMessage { .. } | Event::FinalAnswer { .. } => {}
}
}
RunSummary {
status: RunStatus::from_outcome(outcome),
steps,
tools_called,
tools_denied,
endpoint,
}
}
}
This is the read-model in code. It takes a slice of logged events, walks through them, and keeps only the fields the summary cares about. Conceptually, we are folding many event records into one summary value, but the code uses a plain for loop because it is easier to read at this point in the book.
The loop borrows each event:
File: abcb/crates/abcb-cli/src/main.rs
match &logged.event {
That & matters. events is the borrowed slice passed into from_run. The for logged in events loop gives us one borrowed LoggedEvent at a time. Because the function does not own the slice, it also does not own each LoggedEvent inside it. Matching on &logged.event lets the code inspect the event value while leaving the original log records intact.
ModelResponse increments steps. This means one model response equals one model turn for the summary. That includes turns that led to a denied tool, recovery feedback, or a malformed output. The model answered, so the run took a step.
The two tool lists answer a slightly different question from steps. steps tells us how many model turns happened. tools_called and tools_denied tell us what the model tried to do during those turns. That distinction matters when looking at a run later. A run with five model responses and no tools is very different from a run with five model responses, three tool calls, and one denied action.
ToolResult adds the tool name to tools_called.
ToolDenied adds the tool name to tools_denied.
Both tool-name branches use .clone() because the summary owns its lists:
File: abcb/crates/abcb-cli/src/main.rs
tools_called.push(tool_name.clone())
Inside the match, tool_name is borrowed from the event. The event log still owns the original string. Cloning copies just the tool name into the summary's Vec<String>, so the summary can live independently after from_run returns.
UserMessage and FinalAnswer are ignored for this summary. They are still important events, but they do not change these counters. The final answer is already represented by the loop outcome: if outcome is Ok(answer), the run completed. This summary is not trying to show the answer text itself. It is trying to show status, step count, tool activity, denials, and endpoint. If we later add a richer report, we might include the final answer or its length, but that would be a different view.
This is the payoff of the event log. We do not have to change run_loop to return a bag of statistics. We can derive a new view from the facts already written down.
17.4.1 Why Take &[LoggedEvent]?
The first parameter is a slice:
File: abcb/crates/abcb-cli/src/main.rs
events: &[LoggedEvent]
A slice is a borrowed view into a sequence. It does not own the events. It just says: give me access to zero or more LoggedEvent values in order.
That makes from_run more flexible than a function that takes Vec<LoggedEvent>. A Vec<LoggedEvent> can be borrowed as a slice, but so can an array, a subrange of a vector, or any other contiguous sequence of logged events. The function does not need ownership because it only reads.
This is a common Rust API habit: if a function only needs to read a list, accept a slice.
17.4.2 The .. Pattern
Some match arms use ..:
File: abcb/crates/abcb-cli/src/main.rs
Event::ModelResponse { .. } => steps += 1,
Event::ToolResult { tool_name, .. } => tools_called.push(tool_name.clone()),
.. means "ignore the remaining fields." For ModelResponse, we do not need the content at all. For ToolResult, we need the tool_name, but not the output.
This keeps the match focused on the data the summary actually uses.
17.5 Displaying the Summary
The summary becomes printable by implementing Display:
File: abcb/crates/abcb-cli/src/main.rs
impl fmt::Display for RunSummary {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let list = |xs: &[String]| format!("[{}]", xs.join(", "));
write!(
f,
"run summary: status: {}, steps: {}, tools: {}, denied: {}, endpoint: {}",
self.status.label(),
self.steps,
list(&self.tools_called),
list(&self.tools_denied),
self.endpoint,
)
}
}
We already used Display in Chapter 5 when we made error types printable. The same trait is useful here. A RunSummary is not an error, but it is still a value that needs a human-readable representation.
The small closure:
File: abcb/crates/abcb-cli/src/main.rs
let list = |xs: &[String]| format!("[{}]", xs.join(", "));
turns a list of tool names into a compact display string. If tools_called contains echo and session_note_append, the display becomes:
[echo, session_note_append]
That is enough for now. Later, if the summary grows into JSON output or a table, the underlying RunSummary value can stay the same while the display layer changes.
17.6 Naming the Endpoint
The summary also reports which endpoint was used:
File: abcb/crates/abcb-cli/src/main.rs
fn endpoint_label(config: &Config) -> String {
match &config.model {
Some(model) => format!("{} ({})", model.base_url, model.model),
None => "unknown".to_string(),
}
}
For the real provider path, this produces a label such as:
http://localhost:8083/v1 (/models/gemma4)
For the mock path, the code uses "mock" directly. The point is not to make a perfect telemetry system. It is to make a run summary useful when comparing runs. Knowing which endpoint answered is often the first thing I want to check.
17.7 Capturing the Outcome Before ?
The most important control-flow change is inside run_run.
It helps to name the layers before looking at the code. run_run is the CLI command handler for abcb run. It prepares the registry, session, provider, event-log writer, and terminal output. Inside it, run_loop is the actual agent loop. run_loop returns a Result<String, LoopError>: Ok(answer) if the agent reaches a final answer, or Err(...) if the loop cannot finish normally.
Before this chapter, the CLI could call run_loop(...).await?. If the loop failed, ? would return immediately from run_run. That is normally good Rust style. But it prevents us from summarizing failed runs.
This chapter changes that relationship. run_run still calls run_loop, but it does not immediately unwrap or propagate the result. It keeps the result in a variable named outcome, then uses that value to build the summary.
The outer shape of run_run is easier to see as a flow first:
run_run(message, mock)
create the registry and session
run either the mock branch or the real-provider branch
setup can return early with ?
each branch opens events.jsonl
each branch awaits run_loop
normal branch completion returns (outcome, endpoint, events_path)
read events_path back into LoggedEvent values
build and print RunSummary
apply outcome?
print the final answer on success
This is the map for the rest of the section. The tuple binding:
File: abcb/crates/abcb-cli/src/main.rs
let (outcome, endpoint, events_path) = ...
means the expression on the right must produce three values. The first value is the loop result. The second value is the endpoint label. The third value is the path to the event log.
There is also a small path-handling change around the event log. In Chapter 14, open_event_log took a session directory and joined events.jsonl inside the helper. Now run_run needs the full log path twice: once to open the writer, and again after the loop to reopen the same file for reading. So the caller builds events_path directly.
The next snippet is only the branch expression from inside run_run. Read each branch in two parts. The setup lines prepare a directory, event writer, endpoint, and provider. The final line of each branch returns the tuple for normal completion.
File: abcb/crates/abcb-cli/src/main.rs
let (outcome, endpoint, events_path) = if mock {
let dir = create_session_dir(Path::new(DEFAULT_MEMORY_DIR), &session.id)?;
let events_path = dir.join("events.jsonl");
let mut events = open_event_log(&events_path)?;
let mut provider = MockProvider::new([format!(
r#"{{"kind":"final","content":"mock run: you said {message}"}}"#
)]);
let outcome = run_loop(
&mut provider,
®istry,
&mut session,
DEFAULT_MAX_STEPS,
&mut events,
&mut AllowAll,
)
.await;
events.flush()?;
(outcome, "mock".to_string(), events_path)
} else {
let config = load_required_config()?;
let dir = create_session_dir(config.memory_dir(), &session.id)?;
let events_path = dir.join("events.jsonl");
let mut events = open_event_log(&events_path)?;
let endpoint = endpoint_label(&config);
let mut provider = build_provider(&config)?;
let outcome = run_loop(
&mut provider,
®istry,
&mut session,
config.max_steps(),
&mut events,
&mut AllowAll,
)
.await;
events.flush()?;
(outcome, endpoint, events_path)
};
The if mock { ... } else { ... } is an expression. In Rust, an if used this way produces a value. That also means both branches need to produce compatible values. Here, both normal branch endings produce the same tuple type:
(Result<String, LoopError>, String, PathBuf)
That tuple type describes the normal value of the if expression. The ? lines are different. They are early exits from the surrounding run_run function, not values returned by the branch.
For example:
File: abcb/crates/abcb-cli/src/main.rs
let mut events = open_event_log(&events_path)?;
does not mean open_event_log returns (outcome, endpoint, events_path). It means: if opening the file succeeds, keep the BufWriter<File> in events and continue. If opening the file fails, return early from run_run with an error.
So there are two paths:
setup succeeds -> branch reaches its final tuple
setup fails -> ? returns early from run_run
This chapter delays only one kind of error: the Result returned by run_loop. Setup errors and file I/O errors still use ? immediately, because there may be no event log to summarize if setup itself failed.
That is where events_path comes from in the next section.
The helper now accepts that full file path:
File: abcb/crates/abcb-cli/src/main.rs
fn open_event_log(path: &Path) -> io::Result<BufWriter<File>> {
let file = OpenOptions::new().append(true).create(true).open(path)?;
Ok(BufWriter::new(file))
}
Inside each branch, the important line is the same: let outcome = run_loop(...).await;.
There is no ? after .await. The result is stored in outcome, whether it is Ok(answer) or Err(error). That captured outcome is then returned from the branch as part of the tuple.
events.flush()? writes any buffered event data before the branch returns. Then the BufWriter drops at the end of the branch scope. After that, the CLI can open events_path again for reading.
This is a subtle but important Rust move. We are not ignoring errors. We are delaying propagation. The code needs the Result value long enough to summarize the run, then it can still propagate the error afterward.
17.8 Building the Summary After the Run
There is no separate summary command yet. The summary happens as the final part of abcb run.
At this point, the agent loop has already finished. The mock or real-provider branch has returned three values: outcome, endpoint, and events_path. The event log file has been flushed, and the writer has gone out of scope. Now the CLI immediately reads that same log back, derives a RunSummary, and prints it before deciding whether the run should exit successfully or with an error.
File: abcb/crates/abcb-cli/src/main.rs
let logged = read_events(BufReader::new(File::open(&events_path)?))?;
let summary = RunSummary::from_run(&logged, &outcome, endpoint);
eprintln!("{summary}");
let answer = outcome?;
println!("{answer}");
The order is the whole point. First, read_events turns the JSONL file back into LoggedEvent values. Then RunSummary::from_run combines those events with the captured outcome and endpoint label. Then eprintln! prints the summary for the person watching the command. Only after that does outcome? run.
If the run failed inside run_loop, the summary still prints before the error leaves run_run. If the run succeeded, outcome? extracts the answer and the CLI prints it.
This connects back to two earlier design choices. In Chapter 13, the caller owned the session, so the CLI could inspect run state after the loop returned. In Chapter 14, the loop wrote partial event logs even when later parsing or tool execution failed. Those choices now pay off: failed runs can still leave enough information to summarize.
17.9 Why Summary Goes to stderr
The summary is printed with eprintln!, not println!:
File: abcb/crates/abcb-cli/src/main.rs
eprintln!("{summary}");
The final answer still goes to standard output:
File: abcb/crates/abcb-cli/src/main.rs
println!("{answer}");
This is a command-line contract. The answer is the primary output of abcb run. If a user writes:
abcb run "write a short plan for tomorrow" | pbcopy
they probably want only the assistant's answer copied. The run summary is metadata about how the command behaved: status, step count, tool activity, denials, and endpoint. It is useful for the human watching the command, but it is not the answer to the user's prompt.
That is the reason for stderr. In command-line tools, stdout is usually the data stream other commands may consume. stderr is for diagnostics, progress, warnings, and other human-facing information. The run summary belongs to that second category.
With no redirection, both streams appear in the terminal:
abcb run "write a short plan for tomorrow"
If stdout is piped, only the assistant's answer goes into the pipe. The summary still appears in the terminal:
abcb run "write a short plan for tomorrow" | pbcopy
If I want to capture the two streams separately, the shell can redirect them:
abcb run "write a short plan for tomorrow" >answer.txt 2>summary.txt
Rust's type system does not enforce this distinction. Both values are strings. This is an interface decision for CLI users.
17.10 Testing the Projection
The projection is pure enough to test with hand-built events. That is the nice part of keeping summary logic outside the loop. We do not need a provider, a tool registry, a session directory, or a real JSONL file. We can build the same LoggedEvent values that read_events would have returned, then call RunSummary::from_run directly.
File: abcb/crates/abcb-cli/src/main.rs
#[test]
fn run_summary_folds_events_and_reports_outcome() {
let events = vec![
LoggedEvent {
at: 1,
event: Event::UserMessage {
content: "hi".into(),
},
},
LoggedEvent {
at: 2,
event: Event::ModelResponse {
content: "{...}".into(),
},
},
LoggedEvent {
at: 3,
event: Event::ToolResult {
tool_name: "session_note_append".into(),
output: "ok".into(),
},
},
LoggedEvent {
at: 4,
event: Event::ModelResponse {
content: "{...}".into(),
},
},
LoggedEvent {
at: 5,
event: Event::ToolDenied {
tool_name: "danger".into(),
},
},
LoggedEvent {
at: 6,
event: Event::ModelResponse {
content: "{...}".into(),
},
},
LoggedEvent {
at: 7,
event: Event::FinalAnswer {
content: "done".into(),
},
},
];
let summary = RunSummary::from_run(&events, &Ok("done".into()), "mock".into());
assert_eq!(summary.status, RunStatus::Completed);
assert_eq!(summary.steps, 3); // three ModelResponse events
assert_eq!(
summary.tools_called,
vec!["session_note_append".to_string()]
);
assert_eq!(summary.tools_denied, vec!["danger".to_string()]);
assert_eq!(summary.endpoint, "mock");
}
The test name still uses the word folds, because conceptually many events are being folded into one summary. The implementation uses a plain for loop, but the shape is the same: construct events, project them into a summary, then assert the summary fields.
The event list is deliberately mixed. UserMessage and FinalAnswer are present, but they do not change the summary counters. Three ModelResponse events become steps == 3. One ToolResult records a called tool. One ToolDenied records a denied tool. The separate Ok("done") outcome becomes RunStatus::Completed.
This test does not open files. It does not run the model. It does not call tools. That is a good sign. The summary is a pure projection from already parsed event records.
17.11 Testing Status and Display
Status has its own test because it does not come from the event list. It comes from the loop outcome:
File: abcb/crates/abcb-cli/src/main.rs
#[test]
fn run_status_reflects_the_loop_outcome() {
assert_eq!(
RunStatus::from_outcome(&Ok("x".into())),
RunStatus::Completed
);
assert_eq!(
RunStatus::from_outcome(&Err(LoopError::MaxStepsExceeded { max_steps: 5 })),
RunStatus::MaxStepsExceeded
);
assert_eq!(
RunStatus::from_outcome(&Err(LoopError::UnknownTool("x".into()))),
RunStatus::Failed
);
}
This keeps status classification separate from event projection. One function decides how the run ended. Another function counts what happened during the run. The tests mirror that split.
The display test checks the human-readable line:
File: abcb/crates/abcb-cli/src/main.rs
#[test]
fn run_summary_display_lists_each_field() {
let summary = RunSummary {
status: RunStatus::Completed,
steps: 2,
tools_called: vec!["echo".into()],
tools_denied: vec![],
endpoint: "mock".into(),
};
let line = format!("{summary}");
assert!(line.contains("status: completed"));
assert!(line.contains("steps: 2"));
assert!(line.contains("tools: [echo]"));
assert!(line.contains("denied: []"));
assert!(line.contains("endpoint: mock"));
}
This is not a beautiful formatter yet. It is a stable line with the fields we care about. The test uses contains checks instead of comparing one full string, because the important contract is that each field appears. The exact punctuation can still change later without breaking the whole test for the wrong reason.
17.12 Why Not Accumulate in the Loop?
It would be tempting to make run_loop return a richer value:
Hypothetical sketch:
struct RunReport {
answer: String,
steps: usize,
tools_called: Vec<String>,
}
That would work for this one summary, but it would make the loop responsible for every future view we might want.
The event log gives us a better boundary. The loop records facts. Other code can project those facts into views. Replay is one view: a chronological sequence. Summary is another view: counts and lists. A future evaluator might be another view: pass or fail against expectations.
This is why the chapter calls the summary a read-model. We are not inventing new truth. We are reading the log and shaping it for a different question.
17.13 What Changed
Chapter 17 adds a run summary derived from events.jsonl.
The framework lesson is that logs are not only for replay. Once the loop writes durable events, we can build multiple projections from the same record. replay shows sequence. RunSummary shows aggregate information.
The Rust lesson is how to shape that projection cleanly. RunSummary::from_run accepts a slice, walks over parsed LoggedEvent values, and returns a small value. RunStatus classifies the loop's Result. Display turns the value into a terminal line.
The control-flow lesson is the captured Result: let outcome = run_loop(...).await; instead of run_loop(...).await?. That lets the CLI summarize both successful and failed runs, then use outcome? afterward to preserve normal error behavior.
The next chapter goes back to the model boundary. We have a loop, tools, memory, approval, and summaries. Now we need to make the model-facing contract clearer, starting with the system prompt and the rules the local model is expected to follow.
To be continued