Vapora/docs/adrs/0020-audit-trail.md
Jesús Pérez 7110ffeea2
chore: extend doc: adr, tutorials, operations, etc
2026-01-12 03:32:47 +00:00

ADR-020: Audit Trail for Compliance

Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Security & Compliance Team
Technical Story: Logging all significant workflow events for compliance and incident investigation


Decision

Implement a comprehensive audit trail that logs all workflow events and can be queried by workflow, actor, or event type.


Rationale

  1. Compliance: Regulations require an audit trail (HIPAA, SOC2, etc.)
  2. Incident Investigation: Reconstruct what happened and when
  3. Event Sourcing Ready: The audit trail can serve as the basis for an event-sourcing architecture
  4. User Accountability: Track who did what and when
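
Items 2 and 3 above can be illustrated with a small, std-only sketch: events for one resource are ordered into a timeline and then folded into a current state. The `Event`, `TaskState`, `timeline`, and `replay` names, the integer timestamp, and the `detail` field are all hypothetical simplifications, not the types from `audit.rs`.

```rust
// Hypothetical, simplified event; the real AuditEvent carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub timestamp: u64, // seconds since epoch (simplified)
    pub actor: String,
    pub action: String, // e.g. "Create", "Assign", "Complete"
    pub resource_id: String,
    pub detail: String, // for "Assign", the assignee (assumption)
}

// Hypothetical projection of a task derived purely from its events.
#[derive(Debug, Default, PartialEq)]
pub struct TaskState {
    pub exists: bool,
    pub assignee: Option<String>,
    pub completed: bool,
}

/// Incident investigation: all events for one resource, oldest first.
pub fn timeline<'a>(events: &'a [Event], resource_id: &str) -> Vec<&'a Event> {
    let mut hits: Vec<&Event> = events
        .iter()
        .filter(|e| e.resource_id == resource_id)
        .collect();
    hits.sort_by_key(|e| e.timestamp);
    hits
}

/// Event-sourcing style replay: fold the timeline into the current state.
pub fn replay(events: &[Event], resource_id: &str) -> TaskState {
    let mut state = TaskState::default();
    for e in timeline(events, resource_id) {
        match e.action.as_str() {
            "Create" => state.exists = true,
            "Assign" => state.assignee = Some(e.detail.clone()),
            "Complete" => state.completed = true,
            "Delete" => state = TaskState::default(),
            _ => {} // other actions do not affect this projection
        }
    }
    state
}
```

Because the state is derived only from stored events, the same log answers both "what happened to this task" and "what is its state now".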

Alternatives Considered

Logs Only (No Structured Audit)

  • Pros: Simple
  • Cons: Hard to query, no compliance value

Application-Embedded Logging

  • Pros: Close to business logic
  • Cons: Fragmented, easy to miss events

Centralized Audit Trail (CHOSEN)

  • Queryable, compliant, comprehensive

Trade-offs

Pros:

  • Queryable by workflow, actor, event type
  • Compliance-ready
  • Incident investigation support
  • Event sourcing ready

Cons:

  • ⚠️ Storage overhead (every event logged)
  • ⚠️ Query performance depends on indexing
  • ⚠️ Retention policy tradeoff

Implementation

Audit Event Model:

// crates/vapora-backend/src/audit.rs

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditEvent {
    pub id: String,
    pub timestamp: DateTime<Utc>,
    pub actor: String,                // User ID or service name
    pub action: AuditAction,          // Create, Update, Delete, Execute, ...
    pub resource_type: String,        // Project, Task, Agent, Workflow
    pub resource_id: String,
    pub details: serde_json::Value,   // Action-specific details
    pub outcome: AuditOutcome,        // Success, Failure, PartialSuccess
    pub error: Option<String>,        // Error message if failed
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum AuditAction {
    Create,
    Update,
    Delete,
    Execute,
    Assign,
    Complete,
    Override,
    QuerySecret,
    ViewAudit,
}

// Serialize/Deserialize are required because AuditOutcome is a field of AuditEvent
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum AuditOutcome {
    Success,
    Failure,
    PartialSuccess,
}

Logging Events:

use chrono::Utc;
use serde_json::json;
use surrealdb::{engine::remote::ws::Client, Surreal};

pub async fn log_event(
    db: &Surreal<Client>,
    actor: &str,
    action: AuditAction,
    resource_type: &str,
    resource_id: &str,
    details: serde_json::Value,
    outcome: AuditOutcome,
) -> Result<String> {
    let event_id = uuid::Uuid::new_v4().to_string();
    let event = AuditEvent {
        id: event_id.clone(),
        timestamp: Utc::now(),
        actor: actor.to_string(),
        action,
        resource_type: resource_type.to_string(),
        resource_id: resource_id.to_string(),
        details,
        outcome,
        error: None,
    };

    // Insert the event; check() turns a failed statement into an error
    db.query("CREATE audit_events CONTENT $event")
        .bind(("event", event))
        .await?
        .check()?;

    Ok(event_id)
}

pub async fn log_event_with_error(
    db: &Surreal<Client>,
    actor: &str,
    action: AuditAction,
    resource_type: &str,
    resource_id: &str,
    error: String,
) -> Result<String> {
    let event_id = uuid::Uuid::new_v4().to_string();
    let event = AuditEvent {
        id: event_id.clone(),
        timestamp: Utc::now(),
        actor: actor.to_string(),
        action,
        resource_type: resource_type.to_string(),
        resource_id: resource_id.to_string(),
        details: json!({}),
        outcome: AuditOutcome::Failure,
        error: Some(error),
    };

    // Failed operations are recorded the same way, with the error message attached
    db.query("CREATE audit_events CONTENT $event")
        .bind(("event", event))
        .await?
        .check()?;

    Ok(event_id)
}
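
The two logging functions differ only in the `outcome` and `error` fields they set. As a minimal sketch, a caller could derive both fields from the `Result` of the operation being audited. The `outcome_of` helper is hypothetical and not part of `audit.rs`; `AuditOutcome` is repeated here only so the snippet stands alone.

```rust
// Repeated from the audit model so this sketch compiles on its own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuditOutcome {
    Success,
    Failure,
    PartialSuccess,
}

/// Hypothetical helper: map an operation's Result to the `outcome` and
/// `error` fields of an audit event.
pub fn outcome_of<T, E: std::fmt::Display>(
    result: &Result<T, E>,
) -> (AuditOutcome, Option<String>) {
    match result {
        Ok(_) => (AuditOutcome::Success, None),
        // The error's Display text becomes the stored error message
        Err(e) => (AuditOutcome::Failure, Some(e.to_string())),
    }
}
```

`PartialSuccess` cannot be inferred from a plain `Result`, so callers that need it would still have to set the outcome explicitly.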

Audit Integration in Handlers:

// In task creation handler
pub async fn create_task(
    State(app_state): State<AppState>,
    Path(project_id): Path<String>,
    Json(req): Json<CreateTaskRequest>,
) -> Result<Json<Task>, ApiError> {
    let user = get_current_user()?;

    // Create task
    let task = app_state
        .task_service
        .create_task(&user.tenant_id, &project_id, &req)
        .await?;

    // Log audit event
    app_state.audit_log(
        &user.id,
        AuditAction::Create,
        "task",
        &task.id,
        json!({
            "project_id": &project_id,
            "title": &task.title,
            "priority": &task.priority,
        }),
        AuditOutcome::Success,
    ).await.ok();  // Don't fail if audit logging fails

    Ok(Json(task))
}

Querying Audit Trail:

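use surrealdb::{engine::remote::ws::Client, Surreal};

pub async fn query_audit_trail(
    db: &Surreal<Client>,
    filters: AuditQuery,
) -> Result<Vec<AuditEvent>> {
    // Build the WHERE clause from fixed fragments only. Filter values are
    // bound as parameters, never interpolated into the query text.
    let mut conditions: Vec<&str> = Vec::new();
    if filters.workflow_id.is_some() {
        conditions.push("resource_id = $workflow_id");
    }
    if filters.actor.is_some() {
        conditions.push("actor = $actor");
    }
    if filters.action.is_some() {
        conditions.push("action = $action");
    }
    if filters.since.is_some() {
        conditions.push("timestamp > $since");
    }

    let mut query = String::from("SELECT * FROM audit_events");
    if !conditions.is_empty() {
        query.push_str(" WHERE ");
        query.push_str(&conditions.join(" AND "));
    }
    query.push_str(" ORDER BY timestamp DESC LIMIT 1000");

    // Parameters for filters that were not set are bound but unused
    let mut response = db
        .query(query)
        .bind(("workflow_id", filters.workflow_id))
        .bind(("actor", filters.actor))
        .bind(("action", filters.action))
        .bind(("since", filters.since))
        .await?;

    Ok(response.take(0)?)
}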

Compliance Report:

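pub async fn generate_compliance_report(
    db: &Surreal<Client>, // Client = surrealdb::engine::remote::ws::Client
    start_date: Date,
    end_date: Date,
) -> Result<ComplianceReport> {
    // One row per (actor, action, outcome) group in the date range
    #[derive(serde::Deserialize)]
    struct Row {
        event_count: usize,
        actor: String,
        action: String,
        outcome: String,
    }

    let mut response = db
        .query(
            "SELECT count() AS event_count, actor, action, outcome \
             FROM audit_events \
             WHERE timestamp >= $start AND timestamp < $end \
             GROUP BY actor, action, outcome",
        )
        .bind(("start", start_date))
        .bind(("end", end_date))
        .await?;
    let rows: Vec<Row> = response.take(0)?;

    // Aggregate in memory. The field types used here (usize counts and a
    // HashMap keyed by action name) are assumptions about ComplianceReport.
    let mut actors = std::collections::HashSet::new();
    let mut actions_by_type = std::collections::HashMap::new();
    let (mut total_events, mut failures) = (0usize, 0usize);
    for row in &rows {
        total_events += row.event_count;
        actors.insert(row.actor.as_str());
        *actions_by_type.entry(row.action.clone()).or_insert(0usize) += row.event_count;
        if row.outcome == "Failure" {
            failures += row.event_count;
        }
    }

    Ok(ComplianceReport {
        period: (start_date, end_date),
        total_events,
        unique_actors: actors.len(),
        actions_by_type,
        failures,
    })
}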

Key Files:

  • /crates/vapora-backend/src/audit.rs (audit implementation)
  • /crates/vapora-backend/src/api/ (audit logging in handlers)
  • /crates/vapora-backend/src/services/ (audit logging in services)

Verification

# Test audit event creation
cargo test -p vapora-backend test_audit_event_logging

# Test audit trail querying
cargo test -p vapora-backend test_query_audit_trail

# Test filtering by actor/action/resource
cargo test -p vapora-backend test_audit_filtering

# Test error logging
cargo test -p vapora-backend test_audit_error_logging

# Integration: full workflow with audit
cargo test -p vapora-backend test_audit_full_workflow

# Compliance report generation
cargo test -p vapora-backend test_compliance_report_generation

Expected Output:

  • All significant events logged
  • Queryable by workflow/actor/action
  • Timestamps accurate
  • Errors captured with messages
  • Compliance reports generated correctly

Consequences

Data Management

  • Audit events retained per compliance policy
  • Separate archive for long-term retention
  • Immutable logs (append-only)
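
The three data-management points can be sketched together with an in-memory stand-in for the store. The `AuditLog` and `StoredEvent` types, the second-resolution timestamps, and the retention window are hypothetical; the real store is the `audit_events` table. The type exposes append and read only, and archiving moves whole records out unchanged.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct StoredEvent {
    pub id: u64,
    pub timestamp_secs: u64,
    pub payload: String,
}

/// Append-only store: there is deliberately no update or delete method.
#[derive(Default)]
pub struct AuditLog {
    next_id: u64,
    events: Vec<StoredEvent>,
}

impl AuditLog {
    /// Record a new event and return its id.
    pub fn append(&mut self, timestamp_secs: u64, payload: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.events.push(StoredEvent {
            id,
            timestamp_secs,
            payload: payload.to_string(),
        });
        id
    }

    /// Read-only view of the events still in the primary store.
    pub fn events(&self) -> &[StoredEvent] {
        &self.events
    }

    /// Move events older than the retention window to the caller's archive.
    /// Records are returned unmodified; nothing is edited in place.
    pub fn archive_older_than(&mut self, now_secs: u64, retention_secs: u64) -> Vec<StoredEvent> {
        let cutoff = now_secs.saturating_sub(retention_secs);
        let (old, keep): (Vec<_>, Vec<_>) = std::mem::take(&mut self.events)
            .into_iter()
            .partition(|e| e.timestamp_secs < cutoff);
        self.events = keep;
        old
    }
}
```

Ids come from a counter rather than the vector length, so they stay unique after older events have been archived.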

Performance

  • Audit logging should not block main operation
  • Async logging to avoid latency impact
  • Indexes on (resource_id, timestamp) for queries
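
One way to keep audit logging off the request path is to hand events to a background writer through a channel. The sketch below uses only `std::sync::mpsc` and a thread as a stand-in for an async task. `AuditSink` and `spawn_writer` are hypothetical names, events are plain strings for brevity, and the `write` closure stands in for the database insert.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Handle used by request handlers to queue audit events.
pub struct AuditSink {
    tx: Sender<String>,
}

impl AuditSink {
    /// Queue an event and return immediately. Returns false only if the
    /// background writer has already stopped.
    pub fn log(&self, event: impl Into<String>) -> bool {
        self.tx.send(event.into()).is_ok()
    }
}

/// Spawn the background writer. The returned handle yields the number of
/// events written once every sink has been dropped.
pub fn spawn_writer<F>(mut write: F) -> (AuditSink, JoinHandle<usize>)
where
    F: FnMut(String) + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<String>();
    let handle = thread::spawn(move || {
        let mut written = 0;
        // The loop ends when all senders are dropped and the queue is drained
        for event in rx {
            write(event);
            written += 1;
        }
        written
    });
    (AuditSink { tx }, handle)
}
```

The channel here is unbounded, so `log` never blocks; a production version would likely want a bounded queue and an explicit policy for what happens when it fills up.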

Privacy

  • Sensitive data (passwords, keys) not logged
  • PII handled per data protection regulations
  • Access to audit trail restricted
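
To keep secrets out of the `details` payload, values can be masked by key name before the event is written. This is a minimal sketch under stated assumptions: the `redact` function, the list of sensitive key fragments, and the flat string map (instead of `serde_json::Value`) are all illustrative.

```rust
use std::collections::BTreeMap;

/// Key fragments treated as sensitive (illustrative list, not exhaustive).
const SENSITIVE: [&str; 4] = ["password", "secret", "token", "api_key"];

/// Return a copy of `details` with the values of sensitive keys masked.
/// Matching is case-insensitive and by substring of the key name.
pub fn redact(details: &BTreeMap<String, String>) -> BTreeMap<String, String> {
    details
        .iter()
        .map(|(key, value)| {
            let lower = key.to_lowercase();
            let masked = if SENSITIVE.iter().any(|s| lower.contains(s)) {
                "[REDACTED]".to_string()
            } else {
                value.clone()
            };
            (key.clone(), masked)
        })
        .collect()
}
```

Key-name matching only catches fields that are labeled as sensitive; secrets embedded in free-text values would need separate handling.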

Compliance

  • Supports HIPAA, SOC2, GDPR requirements
  • Incident investigation support
  • Regulatory audit trail available

References

  • /crates/vapora-backend/src/audit.rs (implementation)
  • ADR-011 (SecretumVault - secrets management)
  • ADR-025 (Multi-Tenancy - tenant isolation)

Related ADRs: ADR-011 (Secrets), ADR-025 (Multi-Tenancy), ADR-009 (Istio)