# ADR 0003: Error Handling and JSON-RPC 2.0 Compliance

**Status:** Implemented

**Date:** 2026-02-07 (Initial) | 2026-02-07 (Completed)

**Authors:** VAPORA Team

## Context

The A2A protocol implementation required:

- Consistent error representation across client and server
- Full JSON-RPC 2.0 specification compliance
- Clear error semantics for protocol debugging
- Type-safe error handling in Rust
- Seamless integration with the Axum HTTP framework

## Decision

We implemented a **two-layer error handling strategy**:

### Layer 1: Domain Errors (Rust)

Domain-specific error types using `thiserror`:

```rust
// vapora-a2a: errors raised by the server-side domain logic
pub enum A2aError {
    TaskNotFound(String),
    InvalidStateTransition { current: String, target: String },
    CoordinatorError(String),
    UnknownSkill(String),
    SerdeError,
    IoError,
    InternalError(String),
}

// vapora-a2a-client: errors surfaced to callers of the client
pub enum A2aClientError {
    HttpError,
    TaskNotFound(String),
    ServerError { code: i32, message: String },
    ConnectionRefused(String),
    Timeout(String),
    InvalidResponse,
    InternalError(String),
}
```

### Layer 2: Protocol Representation (JSON-RPC)

Automatic conversion to JSON-RPC 2.0 error format:

```rust
use serde_json::json;

impl A2aError {
    pub fn to_json_rpc_error(&self) -> serde_json::Value {
        json!({
            "jsonrpc": "2.0",
            "error": {
                "code": /* numeric code from the Error Code Mapping table */,
                "message": /* human-readable error message */
            }
        })
    }
}
```

### Error Code Mapping

| Category | JSON-RPC Code | Examples |
|----------|---------------|----------|
| Server/Domain Errors | -32000 | TaskNotFound, UnknownSkill, InvalidStateTransition |
| Internal Errors | -32603 | SerdeError, IoError, InternalError |
| Parse Errors | -32700 | (Handled by JSON parser) |
| Invalid Request | -32600 | (Handled by Axum) |

## Rationale

**Why two layers?**

- Layer 1: Type-safe Rust error handling with `Result`
- Layer 2: Protocol-compliant transmission to clients
- Separation prevents protocol knowledge from leaking into domain code

**Why JSON-RPC 2.0 codes?**

- Industry standard (not custom codes)
- Tools and clients already understand them
- Specification defines code ranges clearly
- Enables generic error handling in clients

**Why `thiserror` crate?**

- Minimal boilerplate for error types
- Automatic `Display` implementation
- Works well with the `?` operator
- Type-safe error composition

**Why conversion methods?**

- One-way conversion (domain → protocol)
- Protocol details isolated in the conversion method
- Testable independently
- Future protocol changes stay contained

## Consequences

**Positive:**

- Type-safe error handling throughout
- Clear error semantics for API consumers
- Automatic response formatting via `IntoResponse`
- Easy to audit error paths
- Specification compliance verified at compile time

**Negative:**

- Requires explicit conversion at response boundaries
- Client must parse the JSON-RPC error format
- Some error context is lost in translation (by design)
- Error code documentation must be maintained

## Error Flow Example

```
User Action
    ↓
vapora-a2a handler
    ↓
TaskManager::get(id)
    ↓
Returns Result
    ↓
Error handler catches and converts via to_json_rpc_error()
    ↓
(StatusCode::NOT_FOUND, Json(error_json))
    ↓
HTTP response sent to client
    ↓
vapora-a2a-client parses response
    ↓
Returns A2aClientError::TaskNotFound
```

## Testing Strategy

1. **Domain Errors:** Unit tests for error variants
2. **Conversion:** Tests for JSON-RPC format correctness
3. **Integration:** End-to-end client-server error flows
4. **Specification:** Validate against the JSON-RPC 2.0 spec

## Alternative Approaches Considered

1. **Custom Error Codes**
   - Rejected: Non-standard, clients can't understand them
   - Harder to debug for users

2. **Single Error Type**
   - Rejected: Loses type safety in Rust
   - Difficult to handle specific errors

3. **No Protocol Conversion**
   - Rejected: Non-compliant with JSON-RPC 2.0
   - Would break client expectations

## Implementation Status

✅ **Completed (2026-02-07):**

1. ✅ **Error Types**: Complete `thiserror`-based error hierarchy (`A2aError`, `A2aClientError`)
2. ✅ **JSON-RPC Conversion**: Automatic `to_json_rpc_error()` with proper code mapping
3. ✅ **Structured Logging**: Contextual error logging with `tracing` (task_id, operation, error details)
4. ✅ **Prometheus Metrics**: Error tracking via the `A2A_DB_OPERATIONS` and `A2A_NATS_MESSAGES` counters
5. ✅ **Retry Logic**: Client-side exponential backoff with smart error classification

**Future Enhancements:**

- Error recovery strategies (automated retry at service level)
- Error aggregation and trending
- Error rate alerting (Prometheus alerts)

## Related Decisions

- ADR-0001: A2A Protocol Implementation
- ADR-0002: Kubernetes Deployment Strategy

## References

- thiserror crate: https://docs.rs/thiserror/
- JSON-RPC 2.0 Specification: https://www.jsonrpc.org/specification
- Axum error handling: https://docs.rs/axum/latest/axum/response/index.html
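
## Appendix: Code Mapping Sketch

The Error Code Mapping table and the `to_json_rpc_error()` conversion from the Decision section can be illustrated with a small, dependency-free sketch. It is not the actual `vapora-a2a` implementation: it uses only the standard library, so `Display` is written by hand where the real code relies on `thiserror`, and the conversion returns a `String` rather than a `serde_json::Value`. The message wording, the `json_rpc_code` and `escape_json` helpers, and the choice of -32000 for `CoordinatorError` (which the table does not list) are all assumptions made for illustration.

```rust
use std::fmt;

// Simplified copy of the domain error enum from the Decision section.
#[derive(Debug)]
pub enum A2aError {
    TaskNotFound(String),
    InvalidStateTransition { current: String, target: String },
    CoordinatorError(String),
    UnknownSkill(String),
    SerdeError,
    IoError,
    InternalError(String),
}

// Hand-written Display so the sketch needs no external crates.
// The message texts are hypothetical, not taken from the real code base.
impl fmt::Display for A2aError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            A2aError::TaskNotFound(id) => write!(f, "task not found: {}", id),
            A2aError::InvalidStateTransition { current, target } => {
                write!(f, "invalid state transition: {} -> {}", current, target)
            }
            A2aError::CoordinatorError(msg) => write!(f, "coordinator error: {}", msg),
            A2aError::UnknownSkill(skill) => write!(f, "unknown skill: {}", skill),
            A2aError::SerdeError => write!(f, "serialization error"),
            A2aError::IoError => write!(f, "I/O error"),
            A2aError::InternalError(msg) => write!(f, "internal error: {}", msg),
        }
    }
}

// Minimal JSON string escaping (hypothetical helper for this sketch only).
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl A2aError {
    /// JSON-RPC code for this error, following the Error Code Mapping table.
    pub fn json_rpc_code(&self) -> i32 {
        match self {
            // Server/Domain Errors
            A2aError::TaskNotFound(_)
            | A2aError::UnknownSkill(_)
            | A2aError::InvalidStateTransition { .. } => -32000,
            // Internal Errors
            A2aError::SerdeError | A2aError::IoError | A2aError::InternalError(_) => -32603,
            // Assumption: the table does not list CoordinatorError;
            // it is treated as a server error here.
            A2aError::CoordinatorError(_) => -32000,
        }
    }

    /// Serializes the error in the shape shown in the Decision section.
    pub fn to_json_rpc_error(&self) -> String {
        format!(
            r#"{{"jsonrpc":"2.0","error":{{"code":{},"message":"{}"}}}}"#,
            self.json_rpc_code(),
            escape_json(&self.to_string())
        )
    }
}
```

Calling `A2aError::TaskNotFound("task-42".to_string()).to_json_rpc_error()` yields an object with code -32000, while `A2aError::IoError` maps to -32603. Keeping the code lookup in one exhaustive `match` means that adding a new variant fails to compile until a code is chosen for it. The JSON-RPC 2.0 specification also requires an `id` member in every response object; it is left out here only to match the snippet in the Decision section.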