Storage Backends Benchmark¶
This page tracks local vs S3-compatible backend behavior for CLI-level CRUD workflows.
Scope¶
- Compare backend behavior for `add`, `get`, `search`, and `delete`.
- Measure end-to-end CLI latency (includes command startup + daemon path), not storage microbenchmarks.
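To make "end-to-end CLI latency" concrete, here is a minimal Python sketch that times a whole process invocation, so interpreter or binary startup is part of the measurement. The function `time_cli` and the stand-in command are hypothetical; the internal harness is not shown on this page and may measure differently.

```python
import subprocess
import sys
import time


def time_cli(argv, runs=3):
    """Return wall-clock latency in ms for each full run of a command."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises on a non-zero exit code, so a failed run is
        # never recorded as a (misleadingly fast) latency sample.
        subprocess.run(argv, check=True, capture_output=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


# Stand-in command (assumption): a Python interpreter that starts and exits.
# Substitute the real CLI invocation under test.
STAND_IN = [sys.executable, "-c", "pass"]

samples = time_cli(STAND_IN)
print(f"mean latency over {len(samples)} runs: {sum(samples) / len(samples):.2f} ms")
```

Timing the whole subprocess rather than an in-process call is what separates this kind of number from a storage microbenchmark: process startup and any daemon round trip are included by construction.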
- Keep embeddings/model loading disabled for stable backend-path comparisons.
Provider validation status¶
- Cloudflare R2: validated in this repository, including `temp_credentials` flow.
- AWS S3: supported by the same S3-compatible path, but not currently part of automated validation here.
Latest validated run (2026-03-08)¶
- Backends: `local`, `r2`
- Params: `iterations=1`, `files=60`, `file_size_kb=8`, `retrieve_count=20`
- R2 auth mode: `temp_credentials` (bearer token + account id bootstrap)
| Backend | Store (files/s) | Retrieve mean (ms) | Retrieve ops/s | Search mean (ms) | Search qps | Delete mean (ms) | Delete ops/s |
|---|---|---|---|---|---|---|---|
| local | 40.34 | 1675.76 | 0.60 | 1519.15 | 0.66 | 1500.77 | 0.67 |
| r2 | 1.14 | 5126.69 | 0.20 | 4398.37 | 0.23 | 4190.88 | 0.24 |
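The ops/s and qps columns above are numerically consistent with `1000 / mean latency in ms`, which is what strictly serial execution would give. Assuming that relationship (the harness code is not shown here, so this is an inference from the numbers), the following Python sketch reproduces the throughput columns and computes how much slower R2 is per operation.

```python
# Mean latencies in ms, copied from the table above.
MEAN_MS = {
    "local": {"retrieve": 1675.76, "search": 1519.15, "delete": 1500.77},
    "r2": {"retrieve": 5126.69, "search": 4398.37, "delete": 4190.88},
}


def ops_per_second(mean_ms):
    """Serial throughput implied by a mean per-operation latency."""
    return 1000.0 / mean_ms


def slowdown(op):
    """Ratio of R2 mean latency to local mean latency for one operation."""
    return MEAN_MS["r2"][op] / MEAN_MS["local"][op]


for op in ("retrieve", "search", "delete"):
    local = ops_per_second(MEAN_MS["local"][op])
    r2 = ops_per_second(MEAN_MS["r2"][op])
    print(f"{op}: local {local:.2f} ops/s, r2 {r2:.2f} ops/s, "
          f"r2 latency {slowdown(op):.2f}x local")
```

By this measure R2 is roughly 3x slower than local for retrieve, search, and delete, while the store column (40.34 vs 1.14 files/s) shows a much larger gap of about 35x.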
Validation notes¶
- Remote fallback guard stayed clean: R2 iterations wrote `0` local object files.
- CRUD gate passed (`PUT/GET/UPDATE/DELETE` semantics validated by harness checks).
- List output currently reports duplicate rows (`120` rows for `60` unique paths) for both backends; benchmark normalizes to unique paths for timing.
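The normalization to unique paths mentioned in the last note could look like the Python sketch below. The path format and the helper name `unique_paths` are hypothetical, and the real harness may dedupe differently; the point is only that order-preserving deduplication turns 120 listed rows into 60 paths.

```python
def unique_paths(rows):
    """Return paths in first-seen order, dropping repeated rows."""
    # dict keys keep insertion order, so this dedupes without reordering.
    return list(dict.fromkeys(rows))


# Hypothetical list output in which every path is reported twice.
listed = [f"bench/file_{i:03d}.txt" for i in range(60)] * 2

paths = unique_paths(listed)
print(len(listed), "rows ->", len(paths), "unique paths")
```

Preserving first-seen order keeps the operation sequence stable between runs, which matters when the same paths are later retrieved or deleted for timing.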
Multi-client backend benchmark track¶
- Scope: agent-like read-heavy workloads.
- Profiles: `mixed`, `external_agent_churn`.
- Transports: `daemon_ipc`, `mcp`.
- Client counts: `4`, `8` total concurrent clients.
- Backends: `local`, `r2` (R2 in `temp_credentials` mode).
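The four dimensions listed above form a full cross product. As a rough illustration, this Python sketch enumerates it; the dictionary keys and the function name `benchmark_matrix` are assumptions made for the example, not the harness's actual configuration format.

```python
from itertools import product

BACKENDS = ("local", "r2")
PROFILES = ("mixed", "external_agent_churn")
TRANSPORTS = ("daemon_ipc", "mcp")
CLIENT_COUNTS = (4, 8)


def benchmark_matrix():
    """Enumerate every backend/profile/transport/client-count combination."""
    return [
        {"backend": b, "profile": p, "transport": t, "clients": c}
        for b, p, t, c in product(BACKENDS, PROFILES, TRANSPORTS, CLIENT_COUNTS)
    ]


matrix = benchmark_matrix()
print(len(matrix), "combinations")
for case in matrix[:3]:
    print(case)
```

Two values per dimension give 2 x 2 x 2 x 2 = 16 combinations, so a complete run of this track reports 16 result rows.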
Latest validated multi-client run (2026-03-08)¶
- Parameters: `iterations=1`, `files=120`, `file_size_kb=8`, `ops_per_client=12`
| Backend | Profile | Transport | Clients | Ops/s | Fail rate | Search p95 (ms) |
|---|---|---|---|---|---|---|
| local | mixed | daemon_ipc | 4 | 23.80 | 0.0000 | 5.67 |
| local | mixed | mcp | 4 | 23.83 | 0.0000 | 5.75 |
| local | mixed | daemon_ipc | 8 | 47.56 | 0.0000 | 5.75 |
| local | mixed | mcp | 8 | 47.63 | 0.0000 | 5.79 |
| local | external_agent_churn | daemon_ipc | 4 | 23.82 | 0.0000 | 5.72 |
| local | external_agent_churn | mcp | 4 | 23.81 | 0.0000 | 7.56 |
| local | external_agent_churn | daemon_ipc | 8 | 47.52 | 0.0000 | 5.69 |
| local | external_agent_churn | mcp | 8 | 47.63 | 0.0000 | 7.41 |
| r2 | mixed | daemon_ipc | 4 | 19.32 | 0.2308 | 5.76 |
| r2 | mixed | mcp | 4 | 19.35 | 0.2308 | 5.89 |
| r2 | mixed | daemon_ipc | 8 | 34.28 | 0.3913 | 5.72 |
| r2 | mixed | mcp | 8 | 34.19 | 0.3913 | 5.90 |
| r2 | external_agent_churn | daemon_ipc | 4 | 20.85 | 0.1429 | 5.70 |
| r2 | external_agent_churn | mcp | 4 | 20.87 | 0.1429 | 7.30 |
| r2 | external_agent_churn | daemon_ipc | 8 | 38.61 | 0.2308 | 5.70 |
| r2 | external_agent_churn | mcp | 8 | 38.75 | 0.2308 | 7.34 |
Key findings:
- Local backend throughput roughly doubled from 4 to 8 clients (about 23.8 to 47.6 ops/s), with zero observed failures in this run.
- R2 had lower throughput (about 19 to 21 ops/s at 4 clients and 34 to 39 ops/s at 8) and non-zero failure rates under this load pattern (roughly 14% to 39%).
- MCP and daemon IPC throughput stayed within about 0.5% of each other for every backend/profile/client combination.
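One way to quantify "near-linear" scaling is efficiency: the observed speedup from 4 to 8 clients divided by the ideal speedup of 2. The Python sketch below applies that definition to the Ops/s column of the table; the metric is a choice made for this example and is not something the harness is known to report.

```python
# (backend, profile, transport) -> {clients: ops/s}, copied from the table.
OPS = {
    ("local", "mixed", "daemon_ipc"): {4: 23.80, 8: 47.56},
    ("local", "mixed", "mcp"): {4: 23.83, 8: 47.63},
    ("local", "external_agent_churn", "daemon_ipc"): {4: 23.82, 8: 47.52},
    ("local", "external_agent_churn", "mcp"): {4: 23.81, 8: 47.63},
    ("r2", "mixed", "daemon_ipc"): {4: 19.32, 8: 34.28},
    ("r2", "mixed", "mcp"): {4: 19.35, 8: 34.19},
    ("r2", "external_agent_churn", "daemon_ipc"): {4: 20.85, 8: 38.61},
    ("r2", "external_agent_churn", "mcp"): {4: 20.87, 8: 38.75},
}


def scaling_efficiency(ops_by_clients, low=4, high=8):
    """Observed speedup divided by ideal speedup; 1.0 means perfectly linear."""
    speedup = ops_by_clients[high] / ops_by_clients[low]
    return speedup / (high / low)


for key, ops in OPS.items():
    backend, profile, transport = key
    print(f"{backend:5} {profile:21} {transport:10} "
          f"efficiency {scaling_efficiency(ops):.3f}")
```

On these numbers the local backend sits at essentially 1.0 in every row, while R2 lands between roughly 0.88 and 0.93, which matches the rise in R2 failure rate at 8 clients.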
Notes¶
- Benchmark harness scripts are internal and are not shipped as part of public docs.
- Storage backend validation remains a stabilization priority before expanding benchmark scope.