When `fix_local_hosts: true` is set, the provisioning tool requires sudo access to modify `/etc/hosts` and SSH config. Before this fix, when a user cancelled the sudo password prompt (no password, wrong password, timeout), the system would:

1. Exit with code 1 (sudo failed)
2. Propagate null values up the call stack
3. Show cryptic Nushell errors about pipeline failures
4. Leave the operation in an inconsistent state

Important Unix Limitation: Pressing CTRL-C at the sudo password prompt sends SIGINT to the entire process group, interrupting Nushell before exit code handling can occur. This cannot be caught from within the script and is expected Unix behavior.

## Solution Architecture

### Key Principle: Return Values, Not Exit Codes

Instead of using `exit 130`, which kills the entire process, we use return values to signal cancellation and let each layer of the call stack handle it gracefully.

### Three-Layer Approach

1. Detection Layer (`ssh.nu` helper functions)
   - Detects sudo cancellation via exit code + stderr
   - Returns `false` instead of calling `exit`

2. Propagation Layer (`ssh.nu` core functions)
   - `on_server_ssh()`: Returns `false` on cancellation
   - `server_ssh()`: Uses `reduce` to propagate failures

3. Handling Layer (`create.nu`, `generate.nu`)
   - Checks return values
   - Displays user-friendly messages
   - Returns `false` to caller

## Implementation Details

### 1. Helper Functions (ssh.nu:11-32)

    # True when sudo credentials are already cached (`sudo -n` never prompts).
    def check_sudo_cached []: nothing -> bool {
        let result = (do --ignore-errors { ^sudo -n true } | complete)
        $result.exit_code == 0
    }

    # Runs a sudo closure; returns false on cancellation, true otherwise.
    def run_sudo_with_interrupt_check [
        command: closure
        operation_name: string
    ]: nothing -> bool {
        let result = (do --ignore-errors { do $command } | complete)
        if $result.exit_code == 1 and ($result.stderr | str contains "password is required") {
            print "\n⚠ Operation cancelled - sudo password required but not provided"
            print "ℹ Run 'sudo -v' first to cache credentials, or run without --fix-local-hosts"
            return false # Signal cancellation
        } else if $result.exit_code != 0 and $result.exit_code != 1 {
            # Any exit code other than 0 or 1 is treated as a real failure
            error make {msg: $"($operation_name) failed: ($result.stderr)"}
        }
        true
    }

Design Decision: Return `bool` instead of throwing an error or calling `exit`. This allows the caller to decide how to handle cancellation.

### 2. Pre-emptive Warning (ssh.nu:155-160)

    if $server.fix_local_hosts and not (check_sudo_cached) {
        print "\n⚠ Sudo access required for --fix-local-hosts"
        print "ℹ You will be prompted for your password, or press CTRL-C to cancel"
        print " Tip: Run 'sudo -v' beforehand to cache credentials\n"
    }

Design Decision: Warn users upfront so they're not surprised by the password prompt.

### 3. CTRL-C Detection (ssh.nu:171-199)

All sudo commands are wrapped with detection:

    let result = (do --ignore-errors { ^sudo <command> } | complete)
    if $result.exit_code == 1 and ($result.stderr | str contains "password is required") {
        print "\n⚠ Operation cancelled"
        return false
    }

Design Decision: Use `do --ignore-errors` + `complete` to capture both exit code and stderr without throwing exceptions.

### 4. State Accumulation Pattern (ssh.nu:122-129)

Using Nushell's `reduce` instead of mutable variables:

    let all_succeeded = ($settings.data.servers | reduce -f true { |server, acc|
        if $text_match == null or $server.hostname == $text_match {
            let result = (on_server_ssh $settings $server $ip_type $request_from $run)
            $acc and $result
        } else {
            $acc
        }
    })

Design Decision: Nushell doesn't allow mutable variable capture in closures. Use `reduce` for accumulating boolean state across iterations.

### 5. Caller Handling (create.nu:262-266, generate.nu:269-273)

    let ssh_result = (on_server_ssh $settings $server "pub" "create" false)
    if not $ssh_result {
        _print "\n✗ Server creation cancelled"
        return false
    }

Design Decision: Check the return value and print a context-specific message before returning.

## Error Flow Diagram

    User presses CTRL-C during password prompt
        ↓
    sudo exits with code 1, stderr: "password is required"
        ↓
    do --ignore-errors captures exit code & stderr
        ↓
    Detection logic identifies cancellation
        ↓
    Print user-friendly message
        ↓
    Return false (not exit!)
        ↓
    on_server_ssh returns false
        ↓
    Caller (create.nu/generate.nu) checks return value
        ↓
    Print "✗ Server creation cancelled"
        ↓
    Return false to settings.nu
        ↓
    settings.nu handles false gracefully (no append)
        ↓
    Clean exit, no cryptic errors

## Nushell Idioms Used

### 1. `do --ignore-errors` + `complete`

Captures stdout, stderr, and exit code without throwing:

    let result = (do --ignore-errors { ^sudo command } | complete)
    # result = { stdout: "...", stderr: "...", exit_code: 1 }

### 2. `reduce` for Accumulation

Instead of mutable variables in loops:

    # ❌ BAD - mutable capture in closure
    mut all_succeeded = true
    $servers | each { |s|
        $all_succeeded = false # Error: capture of mutable variable
    }

    # ✅ GOOD - reduce with accumulator
    let all_succeeded = ($servers | reduce -f true { |s, acc|
        $acc and (check_server $s)
    })

### 3. Early Returns for Error Handling

    if not $condition {
        print "Error message"
        return false
    }
    # Continue with happy path

## Testing Scenarios

### Scenario 1: CTRL-C During First Sudo Command

    provisioning -c server create
    # Password: [CTRL-C]

    # Expected Output:
    # ⚠ Operation cancelled - sudo password required but not provided
    # ℹ Run 'sudo -v' first to cache credentials, or run without --fix-local-hosts
    # ✗ Server creation cancelled

### Scenario 2: Pre-cached Credentials

    sudo -v
    provisioning -c server create

    # Expected: No password prompt, smooth operation

### Scenario 3: Wrong Password 3 Times

    provisioning -c server create
    # Password: [wrong]
    # Password: [wrong]
    # Password: [wrong]

    # Expected: Same as CTRL-C (treated as cancellation)

### Scenario 4: Multiple Servers, Cancel on Second

    # If creating multiple servers and CTRL-C on second:
    # - First server completes successfully
    # - Second server shows cancellation message
    # - Operation stops, doesn't proceed to third

## Maintenance Notes

### Adding New Sudo Commands

When adding new sudo commands to the codebase:

1. Wrap with `do --ignore-errors` + `complete`
2. Check for exit code 1 + "password is required"
3. Return `false` on cancellation
4. Let the caller handle the `false` return value

Example template:

    let result = (do --ignore-errors { ^sudo new-command } | complete)
    if $result.exit_code == 1 and ($result.stderr | str contains "password is required") {
        print "\n⚠ Operation cancelled - sudo password required"
        return false
    }

### Common Pitfalls

1. Don't use `exit`: it kills the entire process
2. Don't use mutable variables in closures: use `reduce` instead
3. Don't ignore return values: always check and propagate
4. Don't forget the pre-check warning: users should know sudo is needed

## Future Improvements

1. Sudo Credential Manager: Optionally use a credential manager (keychain, etc.)
2. Sudo-less Mode: Alternative implementation that doesn't require root
3. Timeout Handling: Detect when sudo times out waiting for password
4. Multiple Password Attempts: Distinguish between CTRL-C and wrong password

## References

- Nushell `complete` command: https://www.nushell.sh/commands/docs/complete.html
- Nushell `reduce` command: https://www.nushell.sh/commands/docs/reduce.html
- Sudo exit codes: `man sudo` (exit code 1 = authentication failure)
- POSIX signal conventions: SIGINT (CTRL-C) is signal 2; shells report a process killed by it as exit status 130 (128 + 2)

## Related Files

- provisioning/core/nulib/servers/ssh.nu - Core implementation
- provisioning/core/nulib/servers/create.nu - Calls `on_server_ssh`
- provisioning/core/nulib/servers/generate.nu - Calls `on_server_ssh`
- docs/troubleshooting/CTRL-C_SUDO_HANDLING.md - User-facing docs
- docs/quick-reference/SUDO_PASSWORD_HANDLING.md - Quick reference

## Changelog

- 2025-01-XX: Initial implementation with return values (v2)
- 2025-01-XX: Fixed mutable variable capture with reduce pattern
- 2025-01-XX: First attempt with exit 130 (reverted, caused process termination)
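
## Appendix: Minimal Sketch of the Three-Layer Pattern

The detection, propagation, and handling layers described above can be exercised without calling real sudo. The following is a minimal sketch under stated assumptions, not the code in ssh.nu: `fake_sudo`, `sudo_cancelled`, `configure_server`, `configure_all`, and the `cancel` field are hypothetical names introduced only for illustration, and `fake_sudo` simply returns a record shaped like the output of `complete`, using the stderr substring this document matches on.

```nushell
# Hypothetical stand-in for `do --ignore-errors { ^sudo ... } | complete`.
# Returns a record shaped like the output of `complete`; no real sudo is run.
def fake_sudo [cancel: bool]: nothing -> record {
    if $cancel {
        {stdout: "", stderr: "sudo: a password is required", exit_code: 1}
    } else {
        {stdout: "", stderr: "", exit_code: 0}
    }
}

# Detection layer: exit code 1 plus the stderr substring means "cancelled".
def sudo_cancelled [result: record]: nothing -> bool {
    $result.exit_code == 1 and ($result.stderr | str contains "password is required")
}

# Propagation layer: report cancellation as a return value, never via `exit`.
def configure_server [server: record]: nothing -> bool {
    let result = (fake_sudo $server.cancel)
    if (sudo_cancelled $result) {
        print $"⚠ Operation cancelled for ($server.hostname)"
        return false
    }
    true
}

# Accumulate success with `reduce`; once the accumulator is false,
# the remaining servers are skipped instead of being processed.
def configure_all [servers: list]: nothing -> bool {
    $servers | reduce -f true {|server, acc|
        if $acc { configure_server $server } else { false }
    }
}

# Handling layer: the caller checks the boolean and prints its own message.
let servers = [
    {hostname: "web-1", cancel: false}
    {hostname: "web-2", cancel: true}
    {hostname: "web-3", cancel: false}
]
let ok = (configure_all $servers)
if not $ok {
    print "✗ Server creation cancelled"
}
```

One design note on the sketch: it consults the accumulator before calling `configure_server`, so `web-3` is never attempted after `web-2` is cancelled, which is the behavior Scenario 4 expects. The excerpt in section 4 calls `on_server_ssh` before combining the result with `$acc`, so as excerpted it would still visit later servers; whether that is intended is not something this sketch can confirm.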