Vapora/docs/architecture/adr/0003-error-handling-and-json-rpc-compliance.md
Jesús Pérez b6a4d77421
feat: add Leptos UI library and modularize MCP server
2026-02-14 20:10:55 +00:00


# ADR 0003: Error Handling and JSON-RPC 2.0 Compliance

**Status:** Implemented

**Date:** 2026-02-07 (Initial) | 2026-02-07 (Completed)

**Authors:** VAPORA Team

## Context
The A2A protocol implementation required:
- Consistent error representation across client and server
- Full JSON-RPC 2.0 specification compliance
- Clear error semantics for protocol debugging
- Type-safe error handling in Rust
- Seamless integration with Axum HTTP framework
## Decision
We implemented a **two-layer error handling strategy**:
### Layer 1: Domain Errors (Rust)
Domain-specific error types using `thiserror`:
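```rust
use thiserror::Error;

// vapora-a2a (the #[error] messages here are illustrative)
#[derive(Debug, Error)]
pub enum A2aError {
    #[error("task not found: {0}")]
    TaskNotFound(String),
    #[error("invalid state transition: {current} -> {target}")]
    InvalidStateTransition { current: String, target: String },
    #[error("coordinator error: {0}")]
    CoordinatorError(String),
    #[error("unknown skill: {0}")]
    UnknownSkill(String),
    #[error("serialization error")]
    SerdeError,
    #[error("I/O error")]
    IoError,
    #[error("internal error: {0}")]
    InternalError(String),
}

// vapora-a2a-client (the #[error] messages here are illustrative)
#[derive(Debug, Error)]
pub enum A2aClientError {
    #[error("HTTP error")]
    HttpError,
    #[error("task not found: {0}")]
    TaskNotFound(String),
    #[error("server error {code}: {message}")]
    ServerError { code: i32, message: String },
    #[error("connection refused: {0}")]
    ConnectionRefused(String),
    #[error("timeout: {0}")]
    Timeout(String),
    #[error("invalid response")]
    InvalidResponse,
    #[error("internal error: {0}")]
    InternalError(String),
}
```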
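
To make the `thiserror` choice concrete, here is a minimal, dependency-free sketch of roughly what the derive produces for a subset of `A2aError`: a `Display` impl per variant plus `std::error::Error`. The message strings and the `find_task` / `task_status_upper` helpers are hypothetical and exist only to show `?` propagating a typed error.

```rust
use std::fmt;

// Trimmed-down copy of the domain enum; the messages are illustrative.
#[derive(Debug)]
pub enum A2aError {
    TaskNotFound(String),
    InvalidStateTransition { current: String, target: String },
    InternalError(String),
}

// Hand-written equivalent of what `#[derive(thiserror::Error)]` with
// `#[error("...")]` attributes generates: one `Display` arm per variant.
impl fmt::Display for A2aError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            A2aError::TaskNotFound(id) => write!(f, "task not found: {id}"),
            A2aError::InvalidStateTransition { current, target } => {
                write!(f, "invalid state transition: {current} -> {target}")
            }
            A2aError::InternalError(msg) => write!(f, "internal error: {msg}"),
        }
    }
}

impl std::error::Error for A2aError {}

// Hypothetical lookup, used only to produce a domain error.
pub fn find_task(id: &str) -> Result<&'static str, A2aError> {
    if id == "task-1" {
        Ok("running")
    } else {
        Err(A2aError::TaskNotFound(id.to_string()))
    }
}

// Hypothetical caller: `?` forwards the typed error unchanged.
pub fn task_status_upper(id: &str) -> Result<String, A2aError> {
    let status = find_task(id)?;
    Ok(status.to_uppercase())
}
```

Callers can still `match` on the concrete variant, which is the type safety that Layer 1 is meant to provide.
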
### Layer 2: Protocol Representation (JSON-RPC)
Automatic conversion to JSON-RPC 2.0 error format:
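```rust
use serde_json::json;

impl A2aError {
    // Code per the "Error Code Mapping" table; variants the table does not
    // list are assumed to fall under -32000 here.
    pub fn code(&self) -> i32 {
        match self {
            Self::SerdeError | Self::IoError | Self::InternalError(_) => -32603,
            _ => -32000,
        }
    }

    pub fn to_json_rpc_error(&self) -> serde_json::Value {
        json!({
            "jsonrpc": "2.0",
            "error": {
                "code": self.code(),
                "message": self.to_string()
            }
        })
    }
}
```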
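
The JSON-RPC 2.0 specification also requires every response object to carry an `id` member (`null` when the request id could not be determined), which the snippet above leaves out. As a hedged sketch of a complete error response, the std-only helper below builds the envelope by hand. `error_response` and `escape_json` are illustrative names, not part of vapora-a2a; ids are assumed to be numeric, and a real implementation would more likely go through `serde_json`.

```rust
// Escape the characters that JSON requires to be escaped inside a string.
pub fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

// Hypothetical helper: build a full JSON-RPC 2.0 error response.
// `id` is the numeric request id, or `None` when it could not be determined.
pub fn error_response(id: Option<u64>, code: i32, message: &str) -> String {
    let id_json = match id {
        Some(n) => n.to_string(),
        None => "null".to_string(),
    };
    format!(
        r#"{{"jsonrpc":"2.0","error":{{"code":{},"message":"{}"}},"id":{}}}"#,
        code,
        escape_json(message),
        id_json
    )
}
```

Whether the actual server attaches `id` at a different layer is not stated in this ADR; the sketch only shows the shape a compliant response has to end up with.
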
### Error Code Mapping
| Category | JSON-RPC Code | Examples |
|----------|---------------|----------|
| Server/Domain Errors | -32000 | TaskNotFound, UnknownSkill, InvalidStateTransition |
| Internal Errors | -32603 | SerdeError, IoError, InternalError |
| Parse Errors | -32700 | (Handled by JSON parser) |
| Invalid Request | -32600 | (Handled by Axum) |
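
The table above can be sketched as an exhaustive `match`, so that adding a variant without assigning it a code fails to compile. This is an illustration, not the crate's actual code: the constant names and `is_server_defined` are hypothetical, and `CoordinatorError`, which the table does not list, is assumed to belong with the server errors. The range check reflects the specification's reserved block of -32000 to -32099 for implementation-defined server errors.

```rust
// Reserved JSON-RPC 2.0 codes referenced in the table above.
pub const PARSE_ERROR: i32 = -32700;
pub const INVALID_REQUEST: i32 = -32600;
pub const INTERNAL_ERROR: i32 = -32603;
// First code of the implementation-defined server error range.
pub const SERVER_ERROR: i32 = -32000;

#[derive(Debug)]
pub enum A2aError {
    TaskNotFound(String),
    InvalidStateTransition { current: String, target: String },
    CoordinatorError(String),
    UnknownSkill(String),
    SerdeError,
    IoError,
    InternalError(String),
}

impl A2aError {
    // Exhaustive on purpose: no wildcard arm, so new variants must be mapped.
    pub fn code(&self) -> i32 {
        match self {
            A2aError::TaskNotFound(_)
            | A2aError::UnknownSkill(_)
            | A2aError::InvalidStateTransition { .. } => SERVER_ERROR,
            // Assumption: not listed in the table; grouped with server errors.
            A2aError::CoordinatorError(_) => SERVER_ERROR,
            A2aError::SerdeError | A2aError::IoError | A2aError::InternalError(_) => {
                INTERNAL_ERROR
            }
        }
    }
}

// True when `code` lies in the spec's implementation-defined server range.
pub fn is_server_defined(code: i32) -> bool {
    (-32099..=-32000).contains(&code)
}
```
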
## Rationale
**Why two layers?**

- Layer 1: Type-safe Rust error handling with `Result<T, A2aError>`
- Layer 2: Protocol-compliant transmission to clients
- Separation prevents protocol knowledge from leaking into domain code

**Why JSON-RPC 2.0 codes?**

- Industry standard (not custom codes)
- Tools and clients already understand them
- Specification defines code ranges clearly
- Enables generic error handling in clients

**Why `thiserror` crate?**

- Minimal boilerplate for error types
- Automatic `Display` implementation
- Works well with the `?` operator
- Type-safe error composition

**Why conversion methods?**

- One-way conversion (domain → protocol)
- Protocol details isolated in the conversion method
- Testable independently
- Future protocol changes stay contained

## Consequences
**Positive:**

- Type-safe error handling throughout
- Clear error semantics for API consumers
- Automatic response formatting via `IntoResponse`
- Easy to audit error paths
- Error variants are checked at compile time; conformance to the JSON-RPC 2.0 format is covered by tests (see Testing Strategy)

**Negative:**

- Requires explicit conversion at response boundaries
- Client must parse the JSON-RPC error format
- Some error context is lost in translation (by design)
- Error code documentation must be maintained

## Error Flow Example
```
User Action
  -> vapora-a2a handler
  -> TaskManager::get(id)
  -> returns Err(A2aError::TaskNotFound(..))
  -> error handler converts it via to_json_rpc_error()
  -> (StatusCode::NOT_FOUND, Json(error_json))
  -> HTTP response sent to client
  -> vapora-a2a-client parses the response
  -> returns A2aClientError::TaskNotFound
```
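
The last two steps of the flow can be sketched on the client side as follows. Because several domain errors share the code -32000, this sketch assumes the client relies on the HTTP status to recognise a missing task; that rule, the `ErrorReply` struct, and the `classify` function are hypothetical and are not taken from vapora-a2a-client.

```rust
// Subset of the client error type, enough for this illustration.
#[derive(Debug, PartialEq)]
pub enum A2aClientError {
    TaskNotFound(String),
    ServerError { code: i32, message: String },
    InvalidResponse,
}

// Hypothetical: the already-decoded pieces of an error response.
pub struct ErrorReply {
    pub http_status: u16,
    // `error.code` from the JSON-RPC body, if present and numeric.
    pub code: Option<i32>,
    // `error.message` from the JSON-RPC body, if present.
    pub message: Option<String>,
}

// Map a decoded reply onto a typed client error.
pub fn classify(reply: ErrorReply, task_id: &str) -> A2aClientError {
    match (reply.http_status, reply.code, reply.message) {
        // Assumed rule: 404 plus a server-range code means the task is missing.
        (404, Some(-32000), _) => A2aClientError::TaskNotFound(task_id.to_string()),
        // Any other well-formed JSON-RPC error is surfaced as-is.
        (_, Some(code), Some(message)) => A2aClientError::ServerError { code, message },
        // A body without a usable `error` object is treated as malformed.
        _ => A2aClientError::InvalidResponse,
    }
}
```
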
## Testing Strategy
1. **Domain Errors:** Unit tests for error variants
2. **Conversion:** Tests for JSON-RPC format correctness
3. **Integration:** End-to-end client-server error flows
4. **Specification:** Validate against JSON-RPC 2.0 spec
## Alternative Approaches Considered
1. **Custom Error Codes**
   - Rejected: Non-standard, clients can't understand
   - Harder to debug for users
2. **Single Error Type**
   - Rejected: Loses type safety in Rust
   - Difficult to handle specific errors
3. **No Protocol Conversion**
   - Rejected: Non-compliant with JSON-RPC 2.0
   - Would break client expectations
## Implementation Status
**Completed (2026-02-07):**

1. **Error Types**: Complete `thiserror`-based error hierarchy (`A2aError`, `A2aClientError`)
2. **JSON-RPC Conversion**: Automatic `to_json_rpc_error()` with proper code mapping
3. **Structured Logging**: Contextual error logging with `tracing` (task_id, operation, error details)
4. **Prometheus Metrics**: Error tracking via `A2A_DB_OPERATIONS`, `A2A_NATS_MESSAGES` counters
5. **Retry Logic**: Client-side exponential backoff with smart error classification

**Future Enhancements:**

- Error recovery strategies (automated retry at service level)
- Error aggregation and trending
- Error rate alerting (Prometheus alerts)

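
The client-side retry logic listed under completed items is not spelled out in this ADR. As a rough sketch of exponential backoff with error classification, the std-only code below treats only `ConnectionRefused` and `Timeout` as retryable. That classification, the function names, and the blocking `thread::sleep` are assumptions; the real client may classify differently and is likely asynchronous.

```rust
use std::time::Duration;

#[derive(Debug)]
pub enum A2aClientError {
    HttpError,
    TaskNotFound(String),
    ServerError { code: i32, message: String },
    ConnectionRefused(String),
    Timeout(String),
    InvalidResponse,
    InternalError(String),
}

// Assumed classification: only transient transport failures are retried.
pub fn is_retryable(err: &A2aClientError) -> bool {
    matches!(
        err,
        A2aClientError::ConnectionRefused(_) | A2aClientError::Timeout(_)
    )
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

// Run `op` up to `max_attempts` times, sleeping between retryable failures.
pub fn retry<T>(
    max_attempts: u32,
    base: Duration,
    max_delay: Duration,
    mut op: impl FnMut() -> Result<T, A2aClientError>,
) -> Result<T, A2aClientError> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if attempt + 1 < max_attempts && is_retryable(&e) => {
                std::thread::sleep(backoff_delay(attempt, base, max_delay));
                attempt += 1;
            }
            // Non-retryable error, or the attempt budget is exhausted.
            Err(e) => return Err(e),
        }
    }
}
```

A `TaskNotFound` returns immediately after the first attempt, while a `Timeout` is retried with delays of `base`, `2 * base`, `4 * base`, and so on, up to the cap.
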
## Related Decisions
- ADR-0001: A2A Protocol Implementation
- ADR-0002: Kubernetes Deployment Strategy
## References
- thiserror crate: https://docs.rs/thiserror/
- JSON-RPC 2.0 Specification: https://www.jsonrpc.org/specification
- Axum error handling: https://docs.rs/axum/latest/axum/response/index.html