# ADR-0030: A2A Protocol Implementation

**Status**: Implemented
**Date**: 2026-02-07
**Deciders**: VAPORA Team
**Technical Story**: Standardized agent-to-agent communication for interoperability with external systems (Google kagent, ADK)

---

## Decision

Implement the A2A (Agent-to-Agent) protocol as two crates:

- **`vapora-a2a`**: Axum HTTP server exposing A2A endpoints (JSON-RPC 2.0, Agent Card discovery, SurrealDB persistence, NATS async coordination, Prometheus metrics)
- **`vapora-a2a-client`**: HTTP client with exponential backoff retry, smart error classification (5xx/network retried, 4xx not retried), full protocol type serialization
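The client's error classification rule (5xx and network failures are retried, 4xx are not) can be sketched in a few lines. This is a minimal illustration, not the code in `vapora-a2a-client`: the `AttemptError` type and the `is_retryable` function are hypothetical names introduced here, and the sketch assumes every non-5xx status is treated as final.

```rust
/// Hypothetical summary of one failed HTTP attempt, reduced to what the
/// retry decision needs. Not a type from the actual client crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttemptError {
    /// No HTTP response was received (connect failure, timeout, reset).
    Network,
    /// The server answered with this HTTP status code.
    Status(u16),
}

/// Returns true when the failure is worth retrying.
/// Network failures and 5xx responses may be transient; 4xx responses
/// indicate a problem with the request itself, so repeating it will not help.
pub fn is_retryable(err: AttemptError) -> bool {
    match err {
        AttemptError::Network => true,
        AttemptError::Status(code) => (500..=599).contains(&code),
    }
}
```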

---

## Rationale

**Why Axum?** Type-safe routing with compile-time verification, composable middleware, and direct Tokio integration, consistent with ADR-0002.

**Why JSON-RPC 2.0?** It is an industry-standard RPC format that runs over plain HTTP/1.1 with no special infrastructure, fits the A2A specification naturally, and is simpler than gRPC for the current load profile.

**Why separate client/server crates?** External systems can depend on the client alone, the two crates can be versioned independently, and the split gives a clear API surface for testing and mocking.

**Why SurrealDB?** It follows existing VAPORA patterns (ProjectService, TaskService) and has multi-tenant scopes built in. Tasks persist across server restarts instead of living in an in-memory HashMap.

**Why NATS for async coordination?** It follows the existing `orchestrator.rs` pattern. A `DashMap<String, oneshot::Sender>` delivers task results to callers without polling, and the server degrades gracefully if NATS is unavailable.

---

## Alternatives Considered

**gRPC**: rejected because it is more complex than JSON-RPC, less portable, and requires HTTP/2 infrastructure.

**PostgreSQL / SQLite**: rejected because SurrealDB is already used in VAPORA, and adding a second database engine increases operational burden.

**Redis for result caching**: rejected because SurrealDB is sufficient for the current load; Redis can be added later without architectural change.

---

## Trade-offs

**Pros:**

- Full A2A protocol compliance enables interoperability with Google kagent, ADK, and compliant third-party agents
- Production-ready persistence: tasks survive server restarts
- Real async coordination: no `tokio::sleep` stubs; NATS-driven oneshot channels deliver actual results
- Resilient client: exponential backoff (100ms initial, 5s max, 2× multiplier, ±20% jitter)
- Full observability: Prometheus metrics on task lifecycle, DB ops, NATS messages

**Cons:**

- Requires SurrealDB at runtime (hard dependency)
- NATS is optional, but without it there is no real-time task completion
- Integration tests require external services (marked `#[ignore]`)
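The backoff parameters listed under Pros (100ms initial, 5s max, 2× multiplier, ±20% jitter) translate into a delay schedule roughly as follows. This is a sketch under stated assumptions, not the `RetryPolicy` from `retry.rs`: `BackoffConfig` and `delay_for_attempt` are hypothetical names, jitter is assumed to be applied after the cap (so a delay can exceed 5s by up to 20%), and the caller supplies the random value so the function stays deterministic and needs no external crate.

```rust
use std::time::Duration;

/// Hypothetical container for the backoff parameters quoted in this ADR.
#[derive(Debug, Clone, Copy)]
pub struct BackoffConfig {
    pub initial: Duration,
    pub max: Duration,
    pub multiplier: f64,
    /// Fractional jitter: 0.2 means the delay varies by up to ±20%.
    pub jitter: f64,
}

impl Default for BackoffConfig {
    fn default() -> Self {
        Self {
            initial: Duration::from_millis(100),
            max: Duration::from_secs(5),
            multiplier: 2.0,
            jitter: 0.2,
        }
    }
}

/// Computes the wait before retry number `attempt` (zero-based).
/// `unit_random` must lie in [0, 1); it is mapped onto the jitter range,
/// so 0.5 yields the un-jittered delay.
pub fn delay_for_attempt(cfg: &BackoffConfig, attempt: u32, unit_random: f64) -> Duration {
    // Clamp the exponent so very large attempt counts cannot wrap the cast.
    let exponent = attempt.min(32) as i32;
    let base = cfg.initial.as_secs_f64() * cfg.multiplier.powi(exponent);
    // Cap the exponential growth at the configured maximum.
    let capped = base.min(cfg.max.as_secs_f64());
    // Map [0, 1) onto [1 - jitter, 1 + jitter).
    let factor = 1.0 + cfg.jitter * (2.0 * unit_random - 1.0);
    Duration::from_secs_f64(capped * factor)
}
```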

---

## Implementation

**Key files:**

- `crates/vapora-a2a/src/protocol.rs`: type-safe message structures, JSON-RPC 2.0 envelope, task state machine
- `crates/vapora-a2a/src/task_manager.rs`: `Surreal<Client>` persistence, parameterized queries
- `crates/vapora-a2a/src/bridge.rs`: NATS subscribers + `DashMap<String, oneshot::Sender>` coordination
- `crates/vapora-a2a/src/metrics.rs`: Prometheus counters and histograms
- `crates/vapora-a2a-client/src/retry.rs`: `RetryPolicy` with exponential backoff
- `migrations/007_a2a_tasks_schema.surql`: SurrealDB schema (SCHEMAFULL `a2a_tasks`)
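The coordination pattern attributed to `bridge.rs` (a map from task id to a one-time sender, resolved when a result message arrives) can be illustrated with standard-library stand-ins. This sketch is not the actual bridge: it swaps `DashMap` for a `Mutex<HashMap>` and `oneshot::Sender` for `std::sync::mpsc::Sender` so it compiles without external crates, and `PendingTasks` and `TaskResult` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical result payload used only for this illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TaskResult {
    pub task_id: String,
    pub output: String,
}

/// Maps each task id to the channel of the caller waiting for that task.
#[derive(Default)]
pub struct PendingTasks {
    waiters: Mutex<HashMap<String, Sender<TaskResult>>>,
}

impl PendingTasks {
    /// Registers interest in a task and returns the receiving end,
    /// which the caller blocks on instead of polling.
    pub fn register(&self, task_id: &str) -> Receiver<TaskResult> {
        let (tx, rx) = mpsc::channel();
        self.waiters.lock().unwrap().insert(task_id.to_string(), tx);
        rx
    }

    /// Called by the message subscriber when a result arrives.
    /// Removing the entry first guarantees at-most-once delivery per task.
    /// Returns false if nobody is waiting or the waiter has gone away.
    pub fn complete(&self, result: TaskResult) -> bool {
        let sender = self.waiters.lock().unwrap().remove(&result.task_id);
        match sender {
            Some(tx) => tx.send(result).is_ok(),
            None => false,
        }
    }
}
```

A caller would invoke `register` before dispatching the task and then wait on the returned receiver, while the subscriber side calls `complete` for each incoming result message.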

**A2A endpoints:**

```text
GET  /.well-known/agent.json   Agent Card discovery
POST /                         JSON-RPC 2.0 dispatch (tasks/send, tasks/get, tasks/cancel)
GET  /metrics                  Prometheus metrics
```
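Dispatch on the single `POST /` endpoint amounts to routing on the JSON-RPC `method` string. The sketch below shows one way that routing might look; `A2aMethod`, `parse_method`, and `error_response` are hypothetical names, not the server's API. The error code `-32601` is the "Method not found" code reserved by the JSON-RPC 2.0 specification.

```rust
/// The three JSON-RPC methods named in the endpoint list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum A2aMethod {
    TasksSend,
    TasksGet,
    TasksCancel,
}

/// JSON-RPC 2.0 reserved error code for an unknown method.
pub const METHOD_NOT_FOUND: i32 = -32601;

/// Maps the `method` field of a request onto a known method,
/// or returns the JSON-RPC error code to report.
pub fn parse_method(name: &str) -> Result<A2aMethod, i32> {
    match name {
        "tasks/send" => Ok(A2aMethod::TasksSend),
        "tasks/get" => Ok(A2aMethod::TasksGet),
        "tasks/cancel" => Ok(A2aMethod::TasksCancel),
        _ => Err(METHOD_NOT_FOUND),
    }
}

/// Builds a JSON-RPC 2.0 error response body.
/// `id_json` is the request id as raw JSON text (for example `1` or `"abc"`).
/// Assumption: `message` contains no characters that need JSON escaping;
/// real code would serialize with a JSON library instead.
pub fn error_response(id_json: &str, code: i32, message: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","error":{{"code":{},"message":"{}"}},"id":{}}}"#,
        code, message, id_json
    )
}
```

A request naming any other method would be answered with `error_response(id, METHOD_NOT_FOUND, "Method not found")` rather than an HTTP-level error.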

**Prometheus metrics:**

- `vapora_a2a_tasks_total` (by status)
- `vapora_a2a_task_duration_seconds`
- `vapora_a2a_nats_messages_total` (by subject, result)
- `vapora_a2a_db_operations_total` (by operation, result)
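To make "by status" concrete, the sketch below renders a labeled counter in the Prometheus text exposition format using only the standard library. It is a stand-in for illustration, not the code in `metrics.rs`, which would normally rely on a metrics crate. `StatusCounter` is a hypothetical type, and the label name `status` and the values `completed` and `failed` are assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical counter with a single `status` label.
pub struct StatusCounter {
    name: &'static str,
    // BTreeMap keeps label values sorted, so the rendered output is stable.
    counts: BTreeMap<String, u64>,
}

impl StatusCounter {
    pub fn new(name: &'static str) -> Self {
        Self { name, counts: BTreeMap::new() }
    }

    /// Increments the series for the given status value.
    pub fn inc(&mut self, status: &str) {
        *self.counts.entry(status.to_string()).or_insert(0) += 1;
    }

    /// Renders the counter as Prometheus text exposition lines:
    /// a `# TYPE` header followed by one sample per label value.
    pub fn render(&self) -> String {
        let mut out = format!("# TYPE {} counter\n", self.name);
        for (status, count) in &self.counts {
            out.push_str(&format!("{}{{status=\"{}\"}} {}\n", self.name, status, count));
        }
        out
    }
}
```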

---

## Verification

```bash
cargo clippy --workspace -- -D warnings
cargo test -p vapora-a2a-client                            # 5/5 pass
cargo test -p vapora-a2a --test integration_test --no-run  # compiles
# requires SurrealDB + NATS:
cargo test -p vapora-a2a --test integration_test -- --ignored
```

---

## Consequences

- External agents compliant with the A2A specification can dispatch tasks to VAPORA and receive structured results
- `vapora-a2a` has a hard dependency on SurrealDB; deployments must include a DB readiness probe
- Future A2A protocol version bumps are isolated to `vapora-a2a/src/protocol.rs` and the client crate

---

## References

- `crates/vapora-a2a/`: server implementation
- `crates/vapora-a2a-client/`: client library
- `migrations/007_a2a_tasks_schema.surql`: schema
- [A2A Protocol Specification](https://a2a-spec.dev)
- [JSON-RPC 2.0](https://www.jsonrpc.org/specification)

**Related ADRs:**

- [ADR-0031](./0031-kubernetes-deployment-kagent.md): Kubernetes deployment for kagent
- [ADR-0032](./0032-a2a-error-handling-json-rpc.md): A2A error handling and JSON-RPC compliance
- [ADR-0002](./0002-axum-backend.md): Axum backend framework
- [ADR-0005](./0005-nats-jetstream.md): NATS JetStream coordination
- [ADR-0004](./0004-surrealdb-database.md): SurrealDB persistence