Vapora/docs/features/notification-channels.md
Jesús Pérez 027b8f2836
feat(channels): webhook notification channels with built-in secret resolution
Add vapora-channels crate with trait-based Slack/Discord/Telegram webhook
  delivery. ${VAR}/${VAR:-default} interpolation is mandatory inside
  ChannelRegistry::from_config — callers cannot bypass secret resolution.
  Fire-and-forget dispatch via tokio::spawn in both vapora-workflow-engine
  (four lifecycle events) and vapora-backend (task Done, proposal approve/reject).
  New REST endpoints: GET /channels, POST /channels/:name/test.
  dispatch_notifications extracted as pub(crate) fn for inline testability;
  5 handler tests + 6 workflow engine tests + 7 secret resolution unit tests.

  Closes: vapora-channels bootstrap, notification gap in workflow/backend layer
  ADR: docs/adrs/0035-notification-channels.md
2026-02-26 14:49:34 +00:00

# Notification Channels
Real-time outbound alerts to Slack, Discord, and Telegram via webhook delivery.
## Overview
`vapora-channels` provides a trait-based webhook notification layer. When VAPORA events occur (task completion, proposal decisions, workflow lifecycle), configured channels receive a message immediately — no polling required.
**Key properties**:
- No vendor SDKs — plain HTTP POST to webhook URLs
- Secret tokens resolved from environment variables at startup; a raw `${VAR}` placeholder never reaches the HTTP layer
- Fire-and-forget delivery: channel failures never surface as API errors
## Configuration
All channel configuration lives in `vapora.toml`.
### Declaring channels
```toml
[channels.team-slack]
type = "slack"
webhook_url = "${SLACK_WEBHOOK_URL}"
[channels.ops-discord]
type = "discord"
webhook_url = "${DISCORD_WEBHOOK_URL}"
[channels.alerts-telegram]
type = "telegram"
bot_token = "${TELEGRAM_BOT_TOKEN}"
chat_id = "${TELEGRAM_CHAT_ID}"
```
Channel names (`team-slack`, `ops-discord`, `alerts-telegram`) are arbitrary identifiers used in event routing below.
### Routing events to channels
```toml
[notifications]
on_task_done = ["team-slack"]
on_proposal_approved = ["team-slack", "ops-discord"]
on_proposal_rejected = ["ops-discord"]
```
Each key is an event name; the value is a list of channel names declared in `[channels.*]`. An empty list or absent key means no notification for that event.
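The routing rule above can be sketched as a lookup that splits an event's targets into registered and unregistered names. This is only an illustration: `resolve_targets` and its parameter types are hypothetical and do not come from `vapora-channels`.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical helper: splits the targets configured for `event` into
/// names that match a `[channels.*]` entry and names that do not.
pub fn resolve_targets<'a>(
    routes: &'a HashMap<String, Vec<String>>,
    registered: &BTreeSet<String>,
    event: &str,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut known = Vec::new();
    let mut unknown = Vec::new();
    // An absent key yields no iterations, just like an empty list.
    for name in routes.get(event).into_iter().flatten() {
        if registered.contains(name) {
            known.push(name.as_str());
        } else {
            unknown.push(name.as_str());
        }
    }
    (known, unknown)
}
```

Returning the unmatched names separately lets a caller log them and still deliver to the remaining targets.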
### Workflow lifecycle notifications
Per-workflow notification targets are set in the workflow template:
```toml
[[workflows]]
name = "nightly_analysis"
trigger = "schedule"
[workflows.notifications]
on_stage_complete = ["team-slack"]
on_stage_failed = ["team-slack", "ops-discord"]
on_completed = ["team-slack"]
on_cancelled = ["ops-discord"]
```
## Secret Resolution
Token values in `[channels.*]` blocks are interpolated from the environment before any network call is made. Two syntaxes are supported:

| Syntax | Behaviour |
|--------|-----------|
| `"${VAR}"` | Replaced with the value of the environment variable `VAR`; startup fails if the variable is unset |
| `"${VAR:-default}"` | Replaced with the value of `VAR` if it is set, otherwise with the literal `default` |

Resolution happens inside `ChannelRegistry::from_config`, the single mandatory call site. There is no way to construct a registry with an unresolved placeholder.
**Example**:
```bash
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/T.../..."
export DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/..."
export TELEGRAM_BOT_TOKEN="123456:ABC..."
export TELEGRAM_CHAT_ID="-1001234567890"
```
If a required variable is absent and no default is provided, VAPORA exits at startup with:
```text
Error: Secret reference '${SLACK_WEBHOOK_URL}' not resolved: env var not set and no default provided
```
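A minimal sketch of the two interpolation syntaxes, assuming a simple left-to-right scan. The names `interpolate` and the `lookup` parameter are hypothetical, and the real logic inside `ChannelRegistry::from_config` may differ. Only the `SecretNotFound` name and the error text are taken from this page.

```rust
use std::fmt;

/// Error for a `${VAR}` reference with no value and no default.
#[derive(Debug, PartialEq)]
pub struct SecretNotFound(pub String);

impl fmt::Display for SecretNotFound {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Secret reference '{}' not resolved: env var not set and no default provided",
            self.0
        )
    }
}

/// Hypothetical helper: expands `${VAR}` and `${VAR:-default}` in `input`.
/// `lookup` stands in for the process environment so the sketch is testable.
pub fn interpolate(
    input: &str,
    lookup: &dyn Fn(&str) -> Option<String>,
) -> Result<String, SecretNotFound> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = match after.find('}') {
            Some(end) => end,
            // Sketch-only choice: an unterminated "${" is copied through as-is.
            None => {
                out.push_str(&rest[start..]);
                return Ok(out);
            }
        };
        let reference = &after[..end];
        let (name, default) = match reference.split_once(":-") {
            Some((name, default)) => (name, Some(default)),
            None => (reference, None),
        };
        match (lookup(name), default) {
            (Some(value), _) => out.push_str(&value),
            (None, Some(default)) => out.push_str(default),
            (None, None) => return Err(SecretNotFound(format!("${{{}}}", reference))),
        }
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

To read from the real environment, a caller could pass `&|name| std::env::var(name).ok()` as `lookup`.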
## Supported Channel Types
### Slack
Uses the [Incoming Webhooks](https://api.slack.com/messaging/webhooks) API. The webhook URL is obtained from Slack's app configuration.
```toml
[channels.my-slack]
type = "slack"
webhook_url = "${SLACK_WEBHOOK_URL}"
```
Payload format: `{ "text": "**Title**\nBody" }`. No SDK dependency.
### Discord
Uses the [Discord Webhook](https://discord.com/developers/docs/resources/webhook) endpoint. The webhook URL includes the token — obtain it from the channel's Integrations settings.
```toml
[channels.my-discord]
type = "discord"
webhook_url = "${DISCORD_WEBHOOK_URL}"
```
Payload format: `{ "embeds": [{ "title": "...", "description": "...", "color": <level-color> }] }`.
### Telegram
Uses the [Bot API](https://core.telegram.org/bots/api#sendmessage) `sendMessage` endpoint. Requires a bot token from `@BotFather` and the numeric chat ID of the target group or channel.
```toml
[channels.my-telegram]
type = "telegram"
bot_token = "${TELEGRAM_BOT_TOKEN}"
chat_id = "${TELEGRAM_CHAT_ID}"
```
Payload format: `{ "chat_id": "...", "text": "**Title**\nBody", "parse_mode": "Markdown" }`.
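The three payload shapes described in this section can be sketched with plain string formatting. The function names and the hand-written `json_escape` helper are hypothetical and exist only to keep the example dependency-free. A real implementation would more likely build these bodies with a JSON library.

```rust
/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Slack: a single `text` field holding title and body.
pub fn slack_payload(title: &str, body: &str) -> String {
    let text = format!("**{}**\n{}", title, body);
    format!(r#"{{"text":"{}"}}"#, json_escape(&text))
}

/// Discord: one embed; `color` is the level colour as an integer.
pub fn discord_payload(title: &str, body: &str, color: u32) -> String {
    format!(
        r#"{{"embeds":[{{"title":"{}","description":"{}","color":{}}}]}}"#,
        json_escape(title),
        json_escape(body),
        color
    )
}

/// Telegram: `sendMessage` body with Markdown parse mode.
pub fn telegram_payload(chat_id: &str, title: &str, body: &str) -> String {
    let text = format!("**{}**\n{}", title, body);
    format!(
        r#"{{"chat_id":"{}","text":"{}","parse_mode":"Markdown"}}"#,
        json_escape(chat_id),
        json_escape(&text)
    )
}
```

Each function returns the request body only; posting it to the webhook URL is left to whichever HTTP client the caller uses.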
## Message Levels
Every notification carries a level that controls colour and emoji in the rendered message:
| Level | Constructor | Use case |
|-------|-------------|----------|
| `Info` | `Message::info(title, body)` | General status updates |
| `Success` | `Message::success(title, body)` | Task done, workflow completed |
| `Warning` | `Message::warning(title, body)` | Proposal rejected, stage failed |
| `Error` | `Message::error(title, body)` | Unrecoverable failure |
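
The constructors in the table can be pictured as thin wrappers that fix the level. The sketch below is an assumption about their shape, not the crate's actual definition. The field names, the `&str` parameters, and the emoji and colour values are all hypothetical.

```rust
/// Severity attached to every notification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Info,
    Success,
    Warning,
    Error,
}

impl Level {
    /// Hypothetical emoji per level; the real mapping is not documented here.
    pub fn emoji(self) -> &'static str {
        match self {
            Level::Info => "💬",
            Level::Success => "✅",
            Level::Warning => "🟡",
            Level::Error => "❌",
        }
    }

    /// Hypothetical 24-bit RGB colour, e.g. for a Discord embed `color` field.
    pub fn color(self) -> u32 {
        match self {
            Level::Info => 0x3498DB,
            Level::Success => 0x2ECC71,
            Level::Warning => 0xF1C40F,
            Level::Error => 0xE74C3C,
        }
    }
}

/// A notification: level plus title and body text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub level: Level,
    pub title: String,
    pub body: String,
}

impl Message {
    fn with_level(level: Level, title: &str, body: &str) -> Self {
        Message { level, title: title.to_string(), body: body.to_string() }
    }
    pub fn info(title: &str, body: &str) -> Self {
        Self::with_level(Level::Info, title, body)
    }
    pub fn success(title: &str, body: &str) -> Self {
        Self::with_level(Level::Success, title, body)
    }
    pub fn warning(title: &str, body: &str) -> Self {
        Self::with_level(Level::Warning, title, body)
    }
    pub fn error(title: &str, body: &str) -> Self {
        Self::with_level(Level::Error, title, body)
    }
}
```

For example, `Message::warning("Proposal rejected", "Needs rework")` yields a message whose `level` is `Level::Warning`.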
## REST API
Two endpoints are available under `/api/v1/channels`:
### List channels
```http
GET /api/v1/channels
```
Returns the names of all registered channels, sorted alphabetically, or an empty list when no channels are configured.
**Response**:
```json
{
"channels": ["ops-discord", "team-slack"]
}
```
### Test a channel
```http
POST /api/v1/channels/:name/test
```
Sends a connectivity test message to the named channel and returns synchronously.
| Status | Meaning |
|--------|---------|
| `200 OK` | Message delivered successfully |
| `404 Not Found` | Channel name unknown or no channels configured |
| `502 Bad Gateway` | Delivery attempt failed at the remote platform |

**Example**:
```bash
curl -X POST http://localhost:8001/api/v1/channels/team-slack/test
```
Expected Slack message: `Test notification — Connectivity test from VAPORA backend for channel 'team-slack'`
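The behaviour of both endpoints can be sketched without an HTTP framework. This is a sketch under assumptions: `Sender`, `list_channels`, and `test_channel` are hypothetical names, and the registry is modelled as an optional `BTreeMap` rather than the real `ChannelRegistry`.

```rust
use std::collections::BTreeMap;

/// Hypothetical sender: takes the message text, returns Ok on delivery.
pub type Sender = Box<dyn Fn(&str) -> Result<(), String>>;

/// GET /api/v1/channels: names in alphabetical order, empty when unconfigured.
pub fn list_channels(registry: Option<&BTreeMap<String, Sender>>) -> Vec<String> {
    // BTreeMap iterates keys in sorted order, so no explicit sort is needed.
    registry
        .map(|r| r.keys().cloned().collect::<Vec<String>>())
        .unwrap_or_default()
}

/// POST /api/v1/channels/:name/test: returns the HTTP status code.
pub fn test_channel(registry: Option<&BTreeMap<String, Sender>>, name: &str) -> u16 {
    let send = match registry.and_then(|r| r.get(name)) {
        Some(send) => send,
        // Unknown name, or no channels configured at all.
        None => return 404,
    };
    let text = format!(
        "Connectivity test from VAPORA backend for channel '{}'",
        name
    );
    // Unlike event notifications, the test call waits for the result.
    match send(&text) {
        Ok(()) => 200,
        Err(_) => 502,
    }
}
```

Keeping the test call synchronous is what lets the status code reflect the actual delivery result.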
## Delivery Semantics
Delivery is **fire-and-forget**: `AppState::notify` spawns a background Tokio task and returns immediately. The API response does not wait for webhook delivery to complete.
Behaviour on failure:
- Unknown channel name: `warn!` log, delivery to other targets continues
- HTTP error from the remote platform: `warn!` log, delivery to other targets continues
- No channels configured (`channel_registry = None`): silent no-op
There is no built-in retry. A channel that is consistently unreachable produces `warn!` log lines but no escalation. Use the `/test` endpoint to confirm connectivity after configuration changes.
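The failure handling described above can be sketched with an OS thread standing in for the Tokio task. Everything here is hypothetical: `Registry`, `Sender`, and `notify_in_background` are illustrative names, and warnings are collected in a `Vec` instead of being emitted through `warn!`. The real `AppState::notify` may be structured differently.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical sender: takes the message text, returns Ok on delivery.
pub type Sender = Box<dyn Fn(&str) -> Result<(), String> + Send + Sync>;

/// Hypothetical registry of named channels.
pub struct Registry {
    pub channels: HashMap<String, Sender>,
}

/// Spawns delivery in the background and returns immediately.
/// The handle is returned only so a test can observe the outcome;
/// a fire-and-forget caller would simply drop it.
pub fn notify_in_background(
    registry: Option<Arc<Registry>>,
    targets: Vec<String>,
    text: String,
) -> Option<JoinHandle<Vec<String>>> {
    // No channels configured: silent no-op.
    let registry = registry?;
    Some(thread::spawn(move || {
        let mut warnings = Vec::new();
        for name in &targets {
            match registry.channels.get(name) {
                // Unknown name: record a warning, keep going.
                None => warnings.push(format!("unknown channel '{}'", name)),
                Some(send) => {
                    // Remote failure: record a warning, keep going. No retry.
                    if let Err(e) = send(&text) {
                        warnings.push(format!("channel '{}' failed: {}", name, e));
                    }
                }
            }
        }
        warnings
    }))
}
```

Because each target is handled independently inside the loop, one failing or misspelled channel does not block delivery to the others.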
## Events Reference
| Event key | Trigger | Default level |
|-----------|---------|---------------|
| `on_task_done` | Task moved to `Done` status | `Success` |
| `on_proposal_approved` | Proposal approved via API | `Success` |
| `on_proposal_rejected` | Proposal rejected via API | `Warning` |
| `on_stage_complete` | Workflow stage finished | `Info` |
| `on_stage_failed` | Workflow stage failed | `Warning` |
| `on_completed` | Workflow reached terminal `Completed` state | `Success` |
| `on_cancelled` | Workflow cancelled | `Warning` |
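
The table above can be expressed as a lookup from event key to default level. The function name `default_level` is hypothetical; only the keys and levels come from the table.

```rust
/// Severity attached to every notification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Info,
    Success,
    Warning,
    Error,
}

/// Hypothetical lookup mirroring the events table.
/// Returns `None` for keys the table does not list.
pub fn default_level(event_key: &str) -> Option<Level> {
    match event_key {
        "on_task_done" | "on_proposal_approved" | "on_completed" => Some(Level::Success),
        "on_proposal_rejected" | "on_stage_failed" | "on_cancelled" => Some(Level::Warning),
        "on_stage_complete" => Some(Level::Info),
        _ => None,
    }
}
```

None of the listed events defaults to `Error`; that level is described earlier as reserved for unrecoverable failures.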
## Troubleshooting
### Channel not receiving messages
1. Verify the channel name in `[notifications]` matches the name in `[channels.*]` exactly (case-sensitive).
2. Confirm the env variable is set in the environment the backend process runs in: `echo $SLACK_WEBHOOK_URL`.
3. Send a test message: `POST /api/v1/channels/<name>/test`.
4. Check backend logs for `warn` entries with `channel = "<name>"`.
### Startup fails with `SecretNotFound`
The env variable referenced in `webhook_url` or `bot_token`/`chat_id` is not set. Either export the variable or add a default value:
```toml
webhook_url = "${SLACK_WEBHOOK_URL:-https://hooks.slack.com/...}"
```
### Discord returns 400
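The `discord` channel type posts Discord's native `embeds` payload, so `webhook_url` must be the raw webhook URL copied from the channel's Integrations settings, without modification. A URL ending in `/slack` selects Discord's Slack-compatible mode, which expects a Slack-style payload instead.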
### Telegram chat_id not found
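The bot must be a member of the target group or channel. Group chat IDs are negative, and supergroup and channel IDs additionally start with `-100` (e.g. `-1001234567890`), so keep the leading `-` when setting `TELEGRAM_CHAT_ID`. Use `@userinfobot` in Telegram to retrieve your chat ID.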
## Related Documentation
- [Workflow Orchestrator](./workflow-orchestrator.md) — workflow lifecycle events and notification config
- [ADR-0035: Notification Channels](../adrs/0035-notification-channels.md) — design rationale
- [ADR-0011: SecretumVault](../adrs/0011-secretumvault.md) — secret management philosophy