Jesús Pérez b6a4d77421
feat: add Leptos UI library and modularize MCP server
2026-02-14 20:10:55 +00:00


VAPORA Grafana Dashboards

This directory contains 4 pre-configured Grafana dashboards for monitoring VAPORA.

Dashboards

1. VAPORA Overview (vapora-overview.json)

UID: vapora-overview

Panels:

  • Request Rate (req/sec)
  • Error Rate (%)
  • P95 Latency (ms)
  • Request Rate by Endpoint (timeseries)
  • Response Latency (P50, P95, P99) (timeseries)
  • Response Status Distribution (pie chart)
  • Database Operations (timeseries)

Metrics Used:

  • vapora_http_requests_total
  • vapora_http_request_duration_seconds_bucket
  • vapora_db_operations_total

Refresh: 10 seconds


2. VAPORA Agent Metrics (agent-metrics.json)

UID: vapora-agents

Panels:

  • Active Agents (count)
  • Task Assignment Rate (assignments/sec)
  • Task Failure Rate (%)
  • Average Agent Load
  • Task Execution Time by Agent Role (P50, P95, P99)
  • Task Assignments by Skill (stacked)
  • Agent Load Distribution (donut chart)
  • Agent Expertise Scores (Learning Profiles)
  • NATS Message Coordination (A2A)

Metrics Used:

  • vapora_swarm_agents_registered
  • vapora_swarm_task_assignments_total
  • vapora_swarm_agent_load
  • vapora_agent_task_duration_seconds_bucket
  • vapora_agent_expertise_score
  • vapora_a2a_nats_messages_total

Refresh: 10 seconds


3. VAPORA LLM Cost Tracking (llm-cost-tracking.json)

UID: vapora-llm-cost

Panels:

  • Total LLM Cost (USD)
  • Total Input Tokens
  • Total Output Tokens
  • Budget Usage % (gauge)
  • Cost by Provider (timeseries)
  • Token Usage by Provider (timeseries)
  • Cost Distribution by Provider (donut chart)
  • Cost Distribution by Role (donut chart)
  • Request Distribution by Provider (donut chart)
  • Hourly Budget Usage by Role (bars)
  • Budget Status by Role (table)

Metrics Used:

  • vapora_llm_cost_total_cents
  • vapora_llm_provider_token_usage
  • vapora_llm_role_budget_used_cents
  • vapora_llm_role_budget_limit_cents
  • vapora_llm_provider_requests_total

Refresh: 10 seconds


4. VAPORA Knowledge Graph Analytics (knowledge-graph-analytics.json)

UID: vapora-kg-analytics

Panels:

  • Total Executions in KG
  • KG Nodes
  • KG Relationships
  • Average Learning Curve Slope
  • Learning Curves (Improvement Over Time)
  • Average Execution Duration by Task Type
  • Execution Count by Task Type (table)
  • Execution Status Distribution (donut chart)
  • Recency Bias Weights (7-day 3×, 30-day 1×)
  • Similarity Searches (Hourly)
  • Agent Success Rates by Task Type (table)

Metrics Used:

  • vapora_kg_total_executions
  • vapora_kg_total_nodes
  • vapora_kg_total_relationships
  • vapora_kg_learning_curve_slope
  • vapora_kg_learning_curve_improvement
  • vapora_kg_execution_duration_seconds
  • vapora_kg_executions_by_task_type
  • vapora_kg_executions_by_status
  • vapora_kg_recency_bias_weight
  • vapora_kg_similarity_searches_total
  • vapora_kg_agent_success_rate

Refresh: 30 seconds


Import Instructions

Option 1: Grafana UI (Manual)

  1. Access Grafana:

    kubectl port-forward -n observability svc/grafana 3000:3000
    

    Open: http://localhost:3000

  2. Login:

    • Username: admin
    • Password: prom-operator (or your configured password)
  3. Import Dashboards:

    • Click "+" → "Import" in the left sidebar
    • Click "Upload JSON file" or "Import via panel json"
    • Select one of the JSON files from this directory
    • Select Prometheus as the datasource
    • Click "Import"
  4. Repeat for all 4 dashboards

Option 2: Kubernetes ConfigMap (Automated)

Create a ConfigMap to auto-provision dashboards:

# Create ConfigMap for dashboards
kubectl create configmap vapora-dashboards \
  --from-file=vapora-overview.json \
  --from-file=agent-metrics.json \
  --from-file=llm-cost-tracking.json \
  --from-file=knowledge-graph-analytics.json \
  -n observability

# Label for Grafana auto-discovery
kubectl label configmap vapora-dashboards \
  grafana_dashboard=1 \
  -n observability

Note: This assumes your Grafana instance is configured with a dashboard provider that watches for ConfigMaps with the grafana_dashboard=1 label.
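`kubectl create configmap` fails if the ConfigMap already exists, which gets in the way when the dashboards are updated later. As a possible alternative, the sketch below renders the ConfigMap locally, adds the label, and applies the result in one pipeline so it can be re-run. The function name `apply_vapora_dashboards` is hypothetical; the ConfigMap name, namespace, file names, and label are the ones used above.

```bash
# Hypothetical helper: create or update the dashboards ConfigMap in one step.
# Run it from the directory that contains the four dashboard JSON files.
apply_vapora_dashboards() {
  # Render the ConfigMap locally instead of creating it on the cluster
  kubectl create configmap vapora-dashboards \
    --from-file=vapora-overview.json \
    --from-file=agent-metrics.json \
    --from-file=llm-cost-tracking.json \
    --from-file=knowledge-graph-analytics.json \
    -n observability --dry-run=client -o yaml |
  # Add the auto-discovery label to the rendered manifest
  kubectl label --local -f - grafana_dashboard=1 -o yaml |
  # Create the ConfigMap, or update it if it already exists
  kubectl apply -f -
}
```

Calling `apply_vapora_dashboards` again after editing a JSON file should update the existing ConfigMap rather than fail.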

Option 3: Direct File Mount (Docker/Local)

If running Grafana locally via Docker:

# Copy dashboards to Grafana provisioning directory
cp *.json /path/to/grafana/provisioning/dashboards/

# Restart Grafana
docker restart grafana
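Grafana only loads JSON files from a provisioning directory when a dashboard provider file points at that directory. If your setup does not have one yet, the sketch below writes a minimal provider. The function name, the file name `vapora-provider.yaml`, the 30-second rescan interval, and the default container path are assumptions; adjust them to match your volume mount.

```bash
# Hypothetical helper: write a file-based dashboard provider for Grafana.
# $1 = provisioning/dashboards directory on the host
# $2 = the same directory as seen inside the container (assumed default below)
write_dashboard_provider() {
  dir=$1
  container_path=${2:-/etc/grafana/provisioning/dashboards}
  cat > "$dir/vapora-provider.yaml" <<EOF
apiVersion: 1
providers:
  - name: vapora
    folder: VAPORA
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30
    options:
      path: $container_path
EOF
}
```

For example, `write_dashboard_provider /path/to/grafana/provisioning/dashboards` places the provider next to the copied JSON files. The `folder` value puts the dashboards in a "VAPORA" folder.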

Verification

After importing, verify dashboards are working:

  1. Check Prometheus Data Source:

    • Go to Configuration → Data Sources
    • Verify Prometheus datasource exists and is reachable
    • Test connection
  2. Check Metrics Availability:

    Open Prometheus UI:

    kubectl port-forward -n observability svc/prometheus 9090:9090
    

    Query test metrics:

    • vapora_http_requests_total
    • vapora_agent_task_duration_seconds_bucket
    • vapora_llm_cost_total_cents
    • vapora_kg_total_executions
  3. View Dashboards:

    • Go to Dashboards → Browse
    • Look for "VAPORA" folder or tag
    • Open each dashboard
    • Verify panels show data (may take a few minutes after VAPORA starts)

Customization

Update Datasource

If your Prometheus datasource has a different name:

  1. Open dashboard JSON file
  2. Find all instances of "uid": "${DS_PROMETHEUS}"
  3. Replace with your datasource UID
  4. Re-import
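Steps 2 and 3 can be scripted. The following is a minimal sketch, assuming the datasource UID contains only letters, digits, hyphens, and underscores (characters such as `/` or `&` would need escaping for `sed`). The function name `set_datasource_uid` is hypothetical.

```bash
# Hypothetical helper: print a dashboard JSON with every ${DS_PROMETHEUS}
# placeholder replaced by a concrete datasource UID.
# $1 = datasource UID, $2 = dashboard JSON file.
# Output goes to stdout, so the original file is left untouched.
set_datasource_uid() {
  sed 's/[$]{DS_PROMETHEUS}/'"$1"'/g' "$2"
}
```

For example, `set_datasource_uid my-prometheus-uid vapora-overview.json > vapora-overview.local.json` produces a copy ready for re-import, where `my-prometheus-uid` stands for your own datasource UID.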

Adjust Refresh Rate

To change auto-refresh interval:

  1. Open dashboard in Grafana
  2. Click Dashboard settings (gear icon)
  3. Go to General tab
  4. Update Refresh dropdown
  5. Click Save dashboard

Add Custom Panels

To add new panels:

  1. Edit dashboard
  2. Click "Add panel" → "Add a new panel"
  3. Select Prometheus datasource
  4. Write PromQL query (see Metrics Used above for examples)
  5. Configure visualization
  6. Click "Apply"
  7. Save dashboard
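As a starting point for step 4, the sketch below prints two common query shapes built from metric names listed under Metrics Used. The helper names and the 5-minute rate window are assumptions, and the queries aggregate away all labels except the histogram's `le` bucket label, so add your own `by (...)` clauses as needed.

```bash
# Hypothetical helpers that print PromQL to paste into a new panel.
# The 5m window is an assumption; adjust it to your scrape interval.

# Per-second rate of a counter, summed over all series.
rate_query() {
  printf 'sum(rate(%s[5m]))\n' "$1"
}

# Quantile estimated from a histogram's _bucket series.
# $1 = quantile between 0 and 1, $2 = bucket metric name.
quantile_query() {
  printf 'histogram_quantile(%s, sum by (le) (rate(%s[5m])))\n' "$1" "$2"
}

rate_query vapora_http_requests_total
quantile_query 0.95 vapora_http_request_duration_seconds_bucket
```

The duration histograms are recorded in seconds, so a panel that displays milliseconds would multiply the quantile result by 1000.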

Troubleshooting

No Data Shown

Problem: Panels show "No data"

Solutions:

  1. Check VAPORA is running:

    kubectl get pods -n vapora
    # All pods should be Running
    
  2. Check Prometheus is scraping VAPORA:

    kubectl port-forward -n observability svc/prometheus 9090:9090
    

    Open: http://localhost:9090/targets

    Look for vapora-backend, vapora-a2a, etc. targets

  3. Check metrics endpoint manually:

    kubectl port-forward -n vapora svc/vapora-backend 8001:8001
    curl http://localhost:8001/metrics | grep vapora_
    

    Should show Prometheus-format metrics

  4. Wait a few minutes for metrics to accumulate
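The `grep` in step 3 shows what is exported but not what is absent. As a rough sketch, the function below reads the `/metrics` output on stdin and prints every expected metric name that has no sample line. The function name `missing_metrics` is hypothetical, and it assumes metric names contain only letters, digits, and underscores.

```bash
# Hypothetical helper: report expected metrics that are absent from
# Prometheus text-format output.
# stdin = scrape output, arguments = metric names to look for.
missing_metrics() {
  scrape=$(cat)
  for name in "$@"; do
    # A sample line starts with the metric name followed by '{' or a space;
    # "# HELP" and "# TYPE" comment lines do not count as samples.
    printf '%s\n' "$scrape" | grep -Eq "^${name}(\{| )" || printf '%s\n' "$name"
  done
}
```

With the port-forward from step 3 still running, `curl -s http://localhost:8001/metrics | missing_metrics vapora_http_requests_total vapora_db_operations_total` prints nothing when both metrics are present.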

Wrong Datasource

Problem: Dashboard shows "Data source not found"

Solution:

  • Edit dashboard
  • Click Dashboard settings → Variables
  • Update DS_PROMETHEUS variable to match your datasource name
  • Save

Missing Metrics

Problem: Some panels show "No data" while others work

Solution:

  • Check if specific VAPORA features are enabled:
    • Agent metrics: Requires vapora-agents running
    • LLM cost: Requires LLM provider configured
    • KG analytics: Requires Knowledge Graph enabled
  • Some metrics only appear after certain actions (e.g., task assignments, LLM calls)
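To work out which dashboard, and therefore which feature, a missing metric belongs to, the metric prefix is usually enough. The sketch below encodes the prefixes from the Metrics Used lists in this README; the function name `dashboard_for_metric` is hypothetical, and any prefix not listed here is reported as `unknown`.

```bash
# Hypothetical helper: map a metric name to the UID of the dashboard that
# uses it, based on the Metrics Used lists above.
dashboard_for_metric() {
  case "$1" in
    vapora_http_*|vapora_db_*)                  echo vapora-overview ;;
    vapora_swarm_*|vapora_agent_*|vapora_a2a_*) echo vapora-agents ;;
    vapora_llm_*)                               echo vapora-llm-cost ;;
    vapora_kg_*)                                echo vapora-kg-analytics ;;
    *)                                          echo unknown ;;
  esac
}

dashboard_for_metric vapora_swarm_agent_load
```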

Dashboard Organization

Recommended Grafana folder structure:

📁 VAPORA/
├── 📊 Overview (vapora-overview)
├── 📊 Agent Metrics (vapora-agents)
├── 📊 LLM Cost Tracking (vapora-llm-cost)
└── 📊 Knowledge Graph Analytics (vapora-kg-analytics)

To create folder:

  1. Go to Dashboards → Browse
  2. Click "New" → "New folder"
  3. Name: "VAPORA"
  4. Move imported dashboards into this folder

Alerting (Optional)

To set up alerts based on dashboard panels:

Example: High Error Rate Alert

  1. Open VAPORA Overview dashboard
  2. Edit "Error Rate" panel
  3. Go to Alert tab
  4. Click "Create alert rule from this panel"
  5. Configure:
    • Name: "VAPORA High Error Rate"
    • Condition: avg() > 0.05 (5%, assuming the panel query returns a ratio; use 5 if it returns a percentage)
    • For: 5 minutes
    • Annotations: "VAPORA error rate exceeded 5%"
  6. Save

Example: Budget Exceeded Alert

  1. Open VAPORA LLM Cost Tracking dashboard
  2. Edit "Budget Usage %" panel
  3. Create alert:
    • Name: "LLM Budget Near Limit"
    • Condition: last() > 0.9 (90%, assuming the panel query returns a ratio; use 90 if it returns a percentage)
    • For: 1 minute
    • Annotations: "LLM budget usage exceeded 90%"
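The 90% condition can be checked outside Grafana as well. The sketch below prints a ratio query built from the two budget metrics listed for the LLM Cost Tracking dashboard and compares a value against a threshold. The `role` label and both function names are assumptions; confirm the label against your exported series before relying on the query.

```bash
# Hypothetical helper: PromQL for budget usage as a ratio (0 to 1) per role.
# The "role" label is an assumption about how the metrics are labelled.
budget_ratio_query() {
  printf '%s\n' 'sum by (role) (vapora_llm_role_budget_used_cents) / sum by (role) (vapora_llm_role_budget_limit_cents)'
}

# Succeeds (exit status 0) when value $1 is greater than threshold $2.
over_threshold() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

budget_ratio_query
over_threshold 0.93 0.9 && echo "budget usage above 90%"
```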

Maintenance

Update Dashboards

When VAPORA metrics change:

  1. Export current dashboard JSON
  2. Edit JSON file with new metrics
  3. Increment version number
  4. Re-import (overwrites existing)

Backup Dashboards

# Export all VAPORA dashboards (one backup file per dashboard UID)
for uid in vapora-overview vapora-agents vapora-llm-cost vapora-kg-analytics; do
  # The response wraps the dashboard model: {"meta": ..., "dashboard": ...}
  curl -fsS -H "Authorization: Bearer $GRAFANA_API_KEY" \
    "http://localhost:3000/api/dashboards/uid/${uid}" \
    > "${uid}-backup.json"
done

Support

For dashboard issues:

  • Check VAPORA Metrics Documentation: docs/architecture/metrics.md
  • Check Prometheus Setup: docs/operations/monitoring.md
  • Review Grafana Docs: https://grafana.com/docs/

For VAPORA metrics questions:

  • See: .claude/CLAUDE.md → Debugging & Monitoring section
  • Check: crates/*/src/metrics.rs files for metric definitions

Last Updated: 2026-02-08
VAPORA Version: 1.2.0
Grafana Version: 10.0+
Prometheus Version: 2.40+