# ADR-0030: A2A Protocol Implementation

**Status**: Implemented
**Date**: 2026-02-07
**Deciders**: VAPORA Team
**Technical Story**: Standardized agent-to-agent communication for interoperability with external systems (Google kagent, ADK)

---

## Decision

Implement the A2A (Agent-to-Agent) protocol as two crates:

- **`vapora-a2a`**: Axum HTTP server exposing A2A endpoints (JSON-RPC 2.0, Agent Card discovery, SurrealDB persistence, NATS async coordination, Prometheus metrics)
- **`vapora-a2a-client`**: HTTP client with exponential backoff retry, smart error classification (5xx/network retried, 4xx not retried), full protocol type serialization
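
The client's retry rule (5xx and network failures retried, 4xx not retried) can be sketched as a small classifier. This is an illustration only, not the crate's actual code: `CallOutcome` and `is_retryable` are hypothetical names, and treating every status outside 5xx as non-retryable is an assumption.

```rust
/// Hypothetical outcome of a single HTTP attempt (not the crate's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallOutcome {
    /// The server answered with this HTTP status code.
    Status(u16),
    /// No response arrived (connection refused, timeout, DNS failure, ...).
    NetworkError,
}

/// Returns true when the attempt is worth repeating under the rule above:
/// 5xx responses and network failures are retried, everything else is not.
pub fn is_retryable(outcome: CallOutcome) -> bool {
    match outcome {
        CallOutcome::NetworkError => true,
        CallOutcome::Status(code) => (500..=599).contains(&code),
    }
}
```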

---

## Rationale

**Why Axum?** Type-safe routing with compile-time verification, composable middleware, direct Tokio integration — consistent with ADR-0002.

**Why JSON-RPC 2.0?** Industry-standard RPC over HTTP/1.1 (no special infrastructure), natural fit with the A2A specification, simpler than gRPC for the current load profile.

**Why separate client/server crates?** Allows external systems to depend on only the client. Independent versioning possible. Clear API surface for testing and mocking.

**Why SurrealDB?** Follows existing VAPORA patterns (ProjectService, TaskService). Multi-tenant scopes built-in. Tasks persist across server restarts — no in-memory HashMap.

**Why NATS for async coordination?** Follows existing `orchestrator.rs` pattern. `DashMap<String, oneshot::Sender>` delivers task results to callers without polling. Graceful degradation if NATS is unavailable.
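
The waiter-map pattern described above can be sketched with the standard library alone. This is a stand-in under stated assumptions: `Mutex<HashMap<..>>` replaces `DashMap`, `std::sync::mpsc` replaces the async oneshot channel, results are plain `String`s, and `PendingTasks`, `register`, and `complete` are hypothetical names rather than the bridge's real API.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical registry of callers waiting on a task result, keyed by task ID.
/// Std-only stand-in for `DashMap<String, oneshot::Sender<_>>`.
#[derive(Default)]
pub struct PendingTasks {
    waiters: Mutex<HashMap<String, Sender<String>>>,
}

impl PendingTasks {
    /// Registers interest in `task_id` and returns the receiving half.
    pub fn register(&self, task_id: &str) -> Receiver<String> {
        let (tx, rx) = channel();
        self.waiters.lock().unwrap().insert(task_id.to_string(), tx);
        rx
    }

    /// Called when a completion message arrives (for example from a NATS
    /// subscriber). The waiter is removed first, so each result is delivered
    /// at most once. Returns false if nobody was waiting or the caller has
    /// already dropped its receiver.
    pub fn complete(&self, task_id: &str, result: String) -> bool {
        let waiter = self.waiters.lock().unwrap().remove(task_id);
        match waiter {
            Some(tx) => tx.send(result).is_ok(),
            None => false,
        }
    }
}
```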

---

## Alternatives Considered

**gRPC** — rejected: more complex than JSON-RPC, less portable, requires HTTP/2 infrastructure.

**PostgreSQL / SQLite** — rejected: SurrealDB already used in VAPORA; adding a second database engine increases operational burden.

**Redis for result caching** — rejected: SurrealDB sufficient for current load; addable later without architectural change.

---

## Trade-offs

**Pros:**

- Full A2A protocol compliance enables interoperability with Google kagent, ADK, and compliant third-party agents
- Production-ready persistence: tasks survive server restarts
- Real async coordination: zero `tokio::sleep` stubs — NATS oneshot channels deliver actual results
- Resilient client: exponential backoff (100ms initial, 5s max, 2× multiplier, ±20% jitter)
- Full observability: Prometheus metrics on task lifecycle, DB ops, NATS messages
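
The backoff parameters listed above (100ms initial, 5s max, 2× multiplier, ±20% jitter) translate into a delay calculation along the following lines. This is a sketch, not the contents of `retry.rs`: `BackoffPolicy` and `delay` are hypothetical names, the jitter sample is passed in by the caller so that the function stays deterministic, and applying the 5s cap before the jitter is an assumption.

```rust
use std::time::Duration;

/// Hypothetical policy mirroring the parameters listed above.
#[derive(Debug, Clone, Copy)]
pub struct BackoffPolicy {
    pub initial: Duration,
    pub max: Duration,
    pub multiplier: f64,
    /// Maximum relative jitter, e.g. 0.2 for plus or minus 20%.
    pub jitter: f64,
}

impl Default for BackoffPolicy {
    fn default() -> Self {
        Self {
            initial: Duration::from_millis(100),
            max: Duration::from_secs(5),
            multiplier: 2.0,
            jitter: 0.2,
        }
    }
}

impl BackoffPolicy {
    /// Delay before retry number `attempt` (0-based).
    /// `unit_jitter` is a random sample in [-1.0, 1.0] supplied by the caller.
    pub fn delay(&self, attempt: u32, unit_jitter: f64) -> Duration {
        // Clamp the exponent so the cast to i32 cannot wrap.
        let exp = attempt.min(i32::MAX as u32) as i32;
        let base = self.initial.as_secs_f64() * self.multiplier.powi(exp);
        // Assumption: the cap is applied first, then the jitter.
        let capped = base.min(self.max.as_secs_f64());
        let factor = 1.0 + self.jitter * unit_jitter.clamp(-1.0, 1.0);
        Duration::from_secs_f64(capped * factor)
    }
}
```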

**Cons:**

- Requires SurrealDB at runtime (hard dependency)
- NATS is optional but reduces functionality when absent (no real-time task completion)
- Integration tests require external services (marked `#[ignore]`)

---

## Implementation

**Key files:**

- `crates/vapora-a2a/src/protocol.rs` — Type-safe message structures, JSON-RPC 2.0 envelope, task state machine
- `crates/vapora-a2a/src/task_manager.rs` — `Surreal<Client>` persistence, parameterized queries
- `crates/vapora-a2a/src/bridge.rs` — NATS subscribers + `DashMap<String, oneshot::Sender>` coordination
- `crates/vapora-a2a/src/metrics.rs` — Prometheus counters and histograms
- `crates/vapora-a2a-client/src/retry.rs` — `RetryPolicy` with exponential backoff
- `migrations/007_a2a_tasks_schema.surql` — SurrealDB schema (SCHEMAFULL `a2a_tasks`)

**A2A endpoints:**

```text
GET  /.well-known/agent.json  — Agent Card discovery
POST /                        — JSON-RPC 2.0 dispatch (tasks/send, tasks/get, tasks/cancel)
GET  /metrics                 — Prometheus metrics
```
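
Because all three task methods share the single `POST /` route, the server has to dispatch on the JSON-RPC `method` field. A minimal sketch of that step follows; `A2aMethod` and `resolve_method` are hypothetical names, and only the error code is taken from the JSON-RPC 2.0 specification, which reserves -32601 for "Method not found".

```rust
/// The three A2A methods listed above, as a hypothetical enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum A2aMethod {
    TasksSend,
    TasksGet,
    TasksCancel,
}

/// JSON-RPC 2.0 reserves -32601 for "Method not found".
pub const METHOD_NOT_FOUND: i32 = -32601;

/// Maps the `method` field of a request to a handler variant,
/// or to the standard error code when the method is unknown.
pub fn resolve_method(method: &str) -> Result<A2aMethod, i32> {
    match method {
        "tasks/send" => Ok(A2aMethod::TasksSend),
        "tasks/get" => Ok(A2aMethod::TasksGet),
        "tasks/cancel" => Ok(A2aMethod::TasksCancel),
        _ => Err(METHOD_NOT_FOUND),
    }
}
```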

**Prometheus metrics:**

- `vapora_a2a_tasks_total` (by status)
- `vapora_a2a_task_duration_seconds`
- `vapora_a2a_nats_messages_total` (by subject, result)
- `vapora_a2a_db_operations_total` (by operation, result)
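
To show roughly what a scrape of `GET /metrics` could return for a labeled counter such as `vapora_a2a_tasks_total`, here is a std-only sketch that renders the Prometheus text exposition format. It does not reflect how `metrics.rs` is written: `LabeledCounter` is a hypothetical type, the label name `status` is inferred from "(by status)", and label values are assumed to need no escaping.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory counter family with a single label.
pub struct LabeledCounter {
    name: &'static str,
    label: &'static str,
    values: BTreeMap<String, u64>,
}

impl LabeledCounter {
    pub fn new(name: &'static str, label: &'static str) -> Self {
        Self { name, label, values: BTreeMap::new() }
    }

    /// Increments the series for one label value.
    pub fn inc(&mut self, label_value: &str) {
        *self.values.entry(label_value.to_string()).or_insert(0) += 1;
    }

    /// Renders the family in the Prometheus text exposition format,
    /// one line per label value, sorted by label value.
    pub fn render(&self) -> String {
        let mut out = format!("# TYPE {} counter\n", self.name);
        for (value, count) in &self.values {
            out.push_str(&format!(
                "{}{{{}=\"{}\"}} {}\n",
                self.name, self.label, value, count
            ));
        }
        out
    }
}
```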

---

## Verification

```bash
cargo clippy --workspace -- -D warnings
cargo test -p vapora-a2a-client # 5/5 pass
cargo test -p vapora-a2a --test integration_test --no-run # compiles
# requires SurrealDB + NATS (--ignored is a test-harness flag, so it goes after `--`):
cargo test -p vapora-a2a --test integration_test -- --ignored
```

---

## Consequences

- External agents compliant with the A2A specification can dispatch tasks to VAPORA and receive structured results
- `vapora-a2a` takes a hard runtime dependency on SurrealDB; deployments must include a DB readiness probe
- Future A2A protocol version bumps are isolated to `vapora-a2a/src/protocol.rs` and the client crate

---

## References

- `crates/vapora-a2a/` — Server implementation
- `crates/vapora-a2a-client/` — Client library
- `migrations/007_a2a_tasks_schema.surql` — Schema
- [A2A Protocol Specification](https://a2a-spec.dev)
- [JSON-RPC 2.0](https://www.jsonrpc.org/specification)

**Related ADRs:**

- [ADR-0031](./0031-kubernetes-deployment-kagent.md) — Kubernetes deployment for kagent
- [ADR-0032](./0032-a2a-error-handling-json-rpc.md) — A2A error handling and JSON-RPC compliance
- [ADR-0002](./0002-axum-backend.md) — Axum backend framework
- [ADR-0005](./0005-nats-jetstream.md) — NATS JetStream coordination
- [ADR-0004](./0004-surrealdb-database.md) — SurrealDB persistence