Storage Backends Benchmark¶

This page tracks local vs S3-compatible backend behavior for CLI-level CRUD workflows.

Scope¶

Compare backend behavior for add, get, search, and delete.
Measure end-to-end CLI latency (includes command startup + daemon path), not storage microbenchmarks.
Keep embeddings/model loading disabled for stable backend-path comparisons.

Cloudflare R2: validated in this repository, including temp_credentials flow.
AWS S3: supported by the same S3-compatible path, but not currently part of automated validation here.

Backend	Store (files/s)	Retrieve mean (ms)	Retrieve ops/s	Search mean (ms)	Search qps	Delete mean (ms)	Delete ops/s
local	40.34	1675.76	0.60	1519.15	0.66	1500.77	0.67
r2	1.14	5126.69	0.20	4398.37	0.23	4190.88	0.24

Remote fallback guard stayed clean: R2 iterations wrote 0 local object files.
CRUD gate passed (PUT/GET/UPDATE/DELETE semantics validated by harness checks).
List output currently reports duplicate rows (120 rows for 60 unique paths) for both backends; benchmark normalizes to unique paths for timing.

Backend	Profile	Transport	Clients	Ops/s	Fail rate	Search p95 (ms)
local	mixed	daemon_ipc	4	23.80	0.0000	5.67
local	mixed	mcp	4	23.83	0.0000	5.75
local	mixed	daemon_ipc	8	47.56	0.0000	5.75
local	mixed	mcp	8	47.63	0.0000	5.79
local	external_agent_churn	daemon_ipc	4	23.82	0.0000	5.72
local	external_agent_churn	mcp	4	23.81	0.0000	7.56
local	external_agent_churn	daemon_ipc	8	47.52	0.0000	5.69
local	external_agent_churn	mcp	8	47.63	0.0000	7.41
r2	mixed	daemon_ipc	4	19.32	0.2308	5.76
r2	mixed	mcp	4	19.35	0.2308	5.89
r2	mixed	daemon_ipc	8	34.28	0.3913	5.72
r2	mixed	mcp	8	34.19	0.3913	5.90
r2	external_agent_churn	daemon_ipc	4	20.85	0.1429	5.70
r2	external_agent_churn	mcp	4	20.87	0.1429	7.30
r2	external_agent_churn	daemon_ipc	8	38.61	0.2308	5.70
r2	external_agent_churn	mcp	8	38.75	0.2308	7.34

Key findings:

Local backend scaled near-linearly from 4 to 8 clients, with zero observed failures in this run.
R2 had lower throughput and non-zero failure rates under this load pattern (roughly 14% to 39%).
MCP and daemon IPC showed similar throughput bands within each backend/profile/client combination.

Benchmark harness scripts are internal and are not shipped as part of public docs.
Storage backend validation remains a stabilization priority before expanding benchmark scope.