provisioning/docs/src/development/ctrl-c-implementation-notes.md
2026-01-14 03:09:18 +00:00

2 lines
8.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CTRL-C Handling Implementation Notes\n\n## Overview\n\nImplemented graceful CTRL-C handling for sudo password prompts during server creation/generation operations.\n\n## Problem Statement\n\nWhen `fix_local_hosts: true` is set, the provisioning tool requires sudo access to\nmodify `/etc/hosts` and SSH config. When a user cancels the sudo password prompt (no\npassword, wrong password, timeout), the system would:\n\n1. Exit with code 1 (sudo failed)\n2. Propagate null values up the call stack\n3. Show cryptic Nushell errors about pipeline failures\n4. Leave the operation in an inconsistent state\n\n**Important Unix Limitation**: Pressing CTRL-C at the sudo password prompt sends SIGINT to the entire process group, interrupting Nushell before exit\ncode handling can occur. This **cannot be caught** and is expected Unix behavior.\n\n## Solution Architecture\n\n### Key Principle: Return Values, Not Exit Codes\n\nInstead of using `exit 130` which kills the entire process, we use **return values**\nto signal cancellation and let each layer of the call stack handle it gracefully.\n\n### Three-Layer Approach\n\n1. **Detection Layer** (ssh.nu helper functions)\n - Detects sudo cancellation via exit code + stderr\n - Returns `false` instead of calling `exit`\n\n2. **Propagation Layer** (ssh.nu core functions)\n - `on_server_ssh()`: Returns `false` on cancellation\n - `server_ssh()`: Uses `reduce` to propagate failures\n\n3. **Handling Layer** (create.nu, generate.nu)\n - Checks return values\n - Displays user-friendly messages\n - Returns `false` to caller\n\n## Implementation Details\n\n### 1. Helper Functions (ssh.nu:11-32)\n\n```\ndef check_sudo_cached []: nothing -> bool {\n let result = (do --ignore-errors { ^sudo -n true } | complete)\n $result.exit_code == 0\n}\n\ndef run_sudo_with_interrupt_check [\n command: closure\n operation_name: string\n]: nothing -> bool {\n let result = (do --ignore-errors { do $command } | complete)\n if $result.exit_code == 1 and ($result.stderr | str contains "password is required") {\n print "\n⚠ Operation cancelled - sudo password required but not provided"\n print " Run 'sudo -v' first to cache credentials, or run without --fix-local-hosts"\n return false # Signal cancellation\n } else if $result.exit_code != 0 and $result.exit_code != 1 {\n error make {msg: $"($operation_name) failed: ($result.stderr)"}\n }\n true\n}\n```\n\n**Design Decision**: Return `bool` instead of throwing error or calling `exit`. This allows the caller to decide how to handle cancellation.\n\n### 2. Pre-emptive Warning (ssh.nu:155-160)\n\n```\nif $server.fix_local_hosts and not (check_sudo_cached) {\n print "\n⚠ Sudo access required for --fix-local-hosts"\n print " You will be prompted for your password, or press CTRL-C to cancel"\n print " Tip: Run 'sudo -v' beforehand to cache credentials\n"\n}\n```\n\n**Design Decision**: Warn users upfront so they're not surprised by the password prompt.\n\n### 3. CTRL-C Detection (ssh.nu:171-199)\n\nAll sudo commands wrapped with detection:\n\n```\nlet result = (do --ignore-errors { ^sudo <command> } | complete)\nif $result.exit_code == 1 and ($result.stderr | str contains "password is required") {\n print "\n⚠ Operation cancelled"\n return false\n}\n```\n\n**Design Decision**: Use `do --ignore-errors` + `complete` to capture both exit code and stderr without throwing exceptions.\n\n### 4. State Accumulation Pattern (ssh.nu:122-129)\n\nUsing Nushell's `reduce` instead of mutable variables:\n\n```\nlet all_succeeded = ($settings.data.servers | reduce -f true { |server, acc|\n if $text_match == null or $server.hostname == $text_match {\n let result = (on_server_ssh $settings $server $ip_type $request_from $run)\n $acc and $result\n } else {\n $acc\n }\n})\n```\n\n**Design Decision**: Nushell doesn't allow mutable variable capture in closures. Use `reduce` for accumulating boolean state across iterations.\n\n### 5. Caller Handling (create.nu:262-266, generate.nu:269-273)\n\n```\nlet ssh_result = (on_server_ssh $settings $server "pub" "create" false)\nif not $ssh_result {\n _print "\n✗ Server creation cancelled"\n return false\n}\n```\n\n**Design Decision**: Check return value and provide context-specific message before returning.\n\n## Error Flow Diagram\n\n```\nUser presses CTRL-C during password prompt\n ↓\nsudo exits with code 1, stderr: "password is required"\n ↓\ndo --ignore-errors captures exit code & stderr\n ↓\nDetection logic identifies cancellation\n ↓\nPrint user-friendly message\n ↓\nReturn false (not exit!)\n ↓\non_server_ssh returns false\n ↓\nCaller (create.nu/generate.nu) checks return value\n ↓\nPrint "✗ Server creation cancelled"\n ↓\nReturn false to settings.nu\n ↓\nsettings.nu handles false gracefully (no append)\n ↓\nClean exit, no cryptic errors\n```\n\n## Nushell Idioms Used\n\n### 1. `do --ignore-errors` + `complete`\n\nCaptures both stdout, stderr, and exit code without throwing:\n\n```\nlet result = (do --ignore-errors { ^sudo command } | complete)\n# result = { stdout: "...", stderr: "...", exit_code: 1 }\n```\n\n### 2. `reduce` for Accumulation\n\nInstead of mutable variables in loops:\n\n```\n# ❌ BAD - mutable capture in closure\nmut all_succeeded = true\n$servers | each { |s|\n $all_succeeded = false # Error: capture of mutable variable\n}\n\n# ✅ GOOD - reduce with accumulator\nlet all_succeeded = ($servers | reduce -f true { |s, acc|\n $acc and (check_server $s)\n})\n```\n\n### 3. Early Returns for Error Handling\n\n```\nif not $condition {\n print "Error message"\n return false\n}\n# Continue with happy path\n```\n\n## Testing Scenarios\n\n### Scenario 1: CTRL-C During First Sudo Command\n\n```\nprovisioning -c server create\n# Password: [CTRL-C]\n\n# Expected Output:\n# ⚠ Operation cancelled - sudo password required but not provided\n# Run 'sudo -v' first to cache credentials\n# ✗ Server creation cancelled\n```\n\n### Scenario 2: Pre-cached Credentials\n\n```\nsudo -v\nprovisioning -c server create\n\n# Expected: No password prompt, smooth operation\n```\n\n### Scenario 3: Wrong Password 3 Times\n\n```\nprovisioning -c server create\n# Password: [wrong]\n# Password: [wrong]\n# Password: [wrong]\n\n# Expected: Same as CTRL-C (treated as cancellation)\n```\n\n### Scenario 4: Multiple Servers, Cancel on Second\n\n```\n# If creating multiple servers and CTRL-C on second:\n# - First server completes successfully\n# - Second server shows cancellation message\n# - Operation stops, doesn't proceed to third\n```\n\n## Maintenance Notes\n\n### Adding New Sudo Commands\n\nWhen adding new sudo commands to the codebase:\n\n1. Wrap with `do --ignore-errors` + `complete`\n2. Check for exit code 1 + "password is required"\n3. Return `false` on cancellation\n4. Let caller handle the `false` return value\n\nExample template:\n\n```\nlet result = (do --ignore-errors { ^sudo new-command } | complete)\nif $result.exit_code == 1 and ($result.stderr | str contains "password is required") {\n print "\n⚠ Operation cancelled - sudo password required"\n return false\n}\n```\n\n### Common Pitfalls\n\n1. **Don't use `exit`**: It kills the entire process\n2. **Don't use mutable variables in closures**: Use `reduce` instead\n3. **Don't ignore return values**: Always check and propagate\n4. **Don't forget the pre-check warning**: Users should know sudo is needed\n\n## Future Improvements\n\n1. **Sudo Credential Manager**: Optionally use a credential manager (keychain, etc.)\n2. **Sudo-less Mode**: Alternative implementation that doesn't require root\n3. **Timeout Handling**: Detect when sudo times out waiting for password\n4. **Multiple Password Attempts**: Distinguish between CTRL-C and wrong password\n\n## References\n\n- Nushell `complete` command: <https://www.nushell.sh/commands/docs/complete.html>\n- Nushell `reduce` command: <https://www.nushell.sh/commands/docs/reduce.html>\n- Sudo exit codes: man sudo (exit code 1 = authentication failure)\n- POSIX signal conventions: SIGINT (CTRL-C) = 130\n\n## Related Files\n\n- `provisioning/core/nulib/servers/ssh.nu` - Core implementation\n- `provisioning/core/nulib/servers/create.nu` - Calls on_server_ssh\n- `provisioning/core/nulib/servers/generate.nu` - Calls on_server_ssh\n- `docs/troubleshooting/CTRL-C_SUDO_HANDLING.md` - User-facing docs\n- `docs/quick-reference/SUDO_PASSWORD_HANDLING.md` - Quick reference\n\n## Changelog\n\n- **2025-01-XX**: Initial implementation with return values (v2)\n- **2025-01-XX**: Fixed mutable variable capture with `reduce` pattern\n- **2025-01-XX**: First attempt with `exit 130` (reverted, caused process termination)