syntaxis/docs/howto/bootstrap-explained.md

728 lines
23 KiB
Markdown
Raw Normal View History

# Bootstrap Compiler: Complete Technical Walkthrough
## The Fundamental Problem: Bootstrapping Rust
The core challenge Rust faces is a **circular dependency**:
```
To compile Rust source code → You need rustc compiler
To get rustc compiler → You need to compile Rust source code
🔄 Circular dependency!
```
This is called the **"bootstrap problem"** and exists for all programming languages:
- To compile GCC (C compiler), you need an existing C compiler
- To compile Go, you need an existing Go compiler
- To compile Python, you need C (since CPython is written in C)
**Rust's solution:** Provide pre-compiled binaries so you can break the cycle.
---
## The Solution: rustup Bootstrap Architecture
### Level 1: The Very First Bootstrap
When you run:
```bash
curl https://sh.rustup.rs | bash
```
Here's what actually happens, step-by-step:
**Step 1: Download the Installer Script**
```bash
curl https://sh.rustup.rs
```
- Downloads a simple bash script (~50KB) from GitHub's CDN
- This script is NOT compiled - it's human-readable bash code
- No circular dependency here: bash shell exists on all Unix systems
- The script is the "key that unlocks Rust installation"
**Step 2: Execute the Installer**
```bash
| bash
```
- Pipes the script directly to bash interpreter
- Bash executes the script line-by-line
- No compilation needed for the installer itself
**Step 3: What the Installer Script Does**
The installer script (`sh.rustup.rs`) performs these operations:
```bash
# 1. Detect the system
SYSTEM_ARCH=$(uname -m) # arm64, x86_64, etc.
SYSTEM_OS=$(uname -s) # Darwin, Linux, etc.
# 2. Determine the target triple
TARGET_TRIPLE="$SYSTEM_ARCH-$SYSTEM_OS" # aarch64-apple-darwin
# 3. Download pre-compiled Rust binary
curl -O https://static.rust-lang.org/dist/rust-latest-$TARGET_TRIPLE.tar.gz
# 4. Extract and install
tar -xzf rust-latest-$TARGET_TRIPLE.tar.gz
./install.sh # Installs to ~/.rustup/toolchains/
# 5. Configure PATH
export PATH="$HOME/.rustup/toolchains/stable-$TARGET_TRIPLE/bin:$PATH"
```
**Key Insight:** The installer script doesn't compile Rust. It **downloads pre-compiled Rust binaries** from `https://static.rust-lang.org/dist/`.
---
## Level 2: Pre-Compiled Binaries Repository
The real magic happens at: **https://static.rust-lang.org/dist/**
This is a massive CDN hosting pre-compiled Rust binaries for 50+ architectures:
### Available Pre-Compiled Rust Binaries
```
# macOS (Apple Silicon)
rust-1.91.1-aarch64-apple-darwin.tar.gz
rust-1.91.1-aarch64-apple-darwin-FIPS.tar.gz
# macOS (Intel)
rust-1.91.1-x86_64-apple-darwin.tar.gz
rust-1.91.1-x86_64-apple-darwin-FIPS.tar.gz
# Linux (ARM 64-bit)
rust-1.91.1-aarch64-unknown-linux-gnu.tar.gz
rust-1.91.1-aarch64-unknown-linux-musl.tar.gz
# Linux (Intel 64-bit)
rust-1.91.1-x86_64-unknown-linux-gnu.tar.gz
rust-1.91.1-x86_64-unknown-linux-musl.tar.gz
# Windows (Intel 64-bit)
rust-1.91.1-x86_64-pc-windows-gnu.tar.gz
rust-1.91.1-x86_64-pc-windows-msvc.tar.gz
# Windows (ARM 64-bit)
rust-1.91.1-aarch64-pc-windows-msvc.tar.gz
# Plus 40+ more for: Android, FreeBSD, iOS, WebAssembly, RISC-V, PowerPC, MIPS, etc.
```
Each binary is:
- **Pre-compiled** for that specific architecture
- **Ready to use immediately** - no compilation step
- **Signed and checksummed** for security verification
- **Tested** on that architecture before release
### Inside a Pre-Compiled Binary Archive
When you download `rust-1.91.1-aarch64-apple-darwin.tar.gz`:
```
rust-1.91.1-aarch64-apple-darwin/
├── rustc # The Rust compiler executable
│ ├── bin/rustc # Actual compiled ARM64 binary
│ ├── bin/cargo # Package manager
│ ├── bin/rustfmt # Code formatter
│ ├── bin/clippy-driver # Linter
│ └── lib/ # Supporting libraries (.dylib, .a files)
├── cargo/ # Package manager
│ ├── bin/cargo
│ └── lib/
├── rust-std/ # Standard library (pre-compiled for ARM64)
│ └── lib/rustlib/aarch64-apple-darwin/lib/
│ ├── libstd.rlib # Standard library
│ ├── libcore.rlib # Core library
│ └── ... (250+ .rlib files)
└── rust-docs/ # Documentation
└── share/doc/
```
**Critical Detail:** The standard library (`libstd.rlib`) is **pre-compiled for ARM64**. This is essential because:
- When YOU compile your Rust code with `cargo build`, rustc links your code with the pre-compiled `libstd`
- rustc doesn't recompile the standard library each time (would be slow)
- The pre-compiled standard library is compatible with ARM64 architecture
---
## Level 3: The Bootstrapping Timeline
Here's the actual sequence of events when you install Rust:
### Timeline of `curl https://sh.rustup.rs | bash`
```
┌─────────────────────────────────────────────────────────────┐
│ 1. You run: curl https://sh.rustup.rs | bash │
│ ⏱️ IMMEDIATE (< 1 second)
│ - Downloads 50KB bash script from GitHub │
│ - Pipes to bash interpreter (already exists on system) │
│ - No compilation │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 2. Installer script detects system │
│ ⏱️ IMMEDIATE (< 1 second)
│ - Runs: uname -s → "Darwin" │
│ - Runs: uname -m → "arm64" │
│ - Constructs target: "aarch64-apple-darwin" │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 3. Downloads pre-compiled Rust │
│ ⏱️ 30-120 seconds (depending on network) │
│ - URL: https://static.rust-lang.org/dist/ │
│ - File: rust-1.91.1-aarch64-apple-darwin.tar.gz (~100MB)│
│ - Uses curl (already exists on system) │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 4. Extracts and installs │
│ ⏱️ 5-10 seconds │
│ - Unpacks to ~/.rustup/toolchains/stable-aarch64-... │
│ - Runs installer.sh (bash script, not compiled) │
│ - Sets up ~/.rustup/settings.toml │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 5. Configures PATH │
│ ⏱️ IMMEDIATE (< 1 second)
│ - Adds ~/.rustup/toolchains/stable-.../bin to PATH │
│ - rustc is now available globally │
└─────────────────────────────────────────────────────────────┘
🎉 Rust is installed and ready to use!
```
**Total time:** ~40-150 seconds (mostly network download)
### After Rust is Installed
Now that you have a working `rustc`, you can:
```bash
# Compile Rust source code
cargo build
# Why does this work now?
# 1. rustc exists (we just installed it)
# 2. cargo exists (came with rustc download)
# 3. rust-std (standard library) is pre-compiled for your arch
# 4. No circular dependency anymore!
```
---
## Level 4: Why Pre-Compiled Binaries?
Rust COULD require you to compile from source on first install:
```bash
git clone https://github.com/rust-lang/rust.git
cd rust
./configure
make # Takes 2-3 HOURS on modern hardware
make install
```
But Rust chose the pre-compiled approach instead. Why?
### 1. **Speed & Accessibility**
- First-time users get Rust in 1-2 minutes, not 3 hours
- No compiler toolchain required on your machine (no C compiler needed)
- Perfect for beginners and CI/CD systems
### 2. **Consistency**
- Everyone gets the exact same pre-tested binaries
- No variation based on compiler version, flags, or local environment
- Fewer bugs due to unusual build configurations
### 3. **Offline Installation**
- Once downloaded, no internet needed
- Pre-compiled binaries work offline
- Perfect for closed networks or airplane mode
### 4. **Size**
- Pre-compiled binary: ~100MB
- Source code + build artifacts: 2-5GB
- Pre-compiled is much more portable
### 5. **Trust & Security**
- Rust can cryptographically sign binaries
- Users can verify authenticity and integrity
- Pre-compiled binaries are tested on actual hardware
---
## Level 5: Integration with our install.sh
Now, how does this relate to our `install.sh`?
### Our Installation Flow
```bash
$ curl -sSL https://raw.githubusercontent.com/.../install.sh | bash
```
**Our install.sh does this:**
```bash
# Phase 1: Detect system (same as rustup)
SYSTEM_OS=$(uname -s)
SYSTEM_ARCH=$(uname -m)
# Phase 2: Install Rust using the bootstrap approach
if ! command -v rustc &> /dev/null; then
curl https://sh.rustup.rs | bash # ← Uses rustup bootstrap!
source $HOME/.cargo/env
fi
# Phase 3: Install NuShell using cargo (now that rustc exists)
cargo install nu --locked
# Phase 4: Install Just using cargo
cargo install just --locked
# Phase 5: Clone our syntaxis repository
git clone https://github.com/.../syntaxis.git
# Phase 6: Build our project using cargo
cd syntaxis
cargo build --release
# Phase 7: Deploy configuration
./scripts/install-cli.nu
# Phase 8: Verify installation
./verify-installation.sh
```
### The Chain of Dependencies
```
┌─────────────────────────────┐
│ curl https://sh.rustup.rs │ ← Uses: bash (system), curl (system)
│ (Pre-compiled Rust) │
└──────────────┬──────────────┘
✅ rustc + cargo
┌─────────────────────────────────────┐
│ cargo install nu --locked │ ← Uses: rustc (just installed)
│ (Compiles NuShell from source) │
└──────────────┬──────────────────────┘
✅ nu shell
┌─────────────────────────────────────┐
│ cargo install just --locked │ ← Uses: rustc (just installed)
│ (Compiles Just from source) │
└──────────────┬──────────────────────┘
✅ just recipe runner
┌─────────────────────────────────────┐
│ cargo build --release │ ← Uses: rustc (just installed)
│ (Compiles syntaxis) │
└──────────────┬──────────────────────┘
✅ syntaxis binaries
```
**Key Insight:** Everything chains from the one pre-compiled binary: rustc. Once we have rustc, we can compile anything else.
---
## Level 6: Comparison: Bootstrap vs. All Pre-Compiled
### Approach A: Bootstrap Only (What Rust Does) ✅
```
You have: bash, curl, tar
You download: Pre-compiled rustc (1 binary)
You can now: Compile anything (NuShell, Just, our project)
Advantages:
- Fast (1-2 min)
- Small download (100MB)
- Works with only bash + curl
- User gets latest Rust features
```
### Approach B: Everything Pre-Compiled ❌
```
You have: bash, curl, tar
You download: Pre-compiled rustc + nu + just + workspace binaries (500MB+)
You can now: Use everything immediately
Disadvantages:
- Slow (much slower to download 500MB)
- Large download (bloated)
- Maintenance nightmare (rebuild binaries for every version)
- Can't customize builds
- Breaks with compiler updates
```
### Approach C: Compile Everything from Source ❌
```
You have: C compiler, C++ compiler, CMake, GNU Make, etc.
You download: Rust source code (2GB)
You compile: rustc from source (3 hours)
You compile: NuShell from source (1 hour)
You compile: Just from source (5 minutes)
You compile: syntaxis from source (10 minutes)
Total time: ~4 hours 15 minutes
Total pain: Extreme
Disadvantages:
- Extremely slow
- Complex toolchain requirements
- Compiler variations cause bugs
- Not suitable for one-click installation
```
**Rust chose Approach A: Bootstrap** because it's the optimal trade-off.
---
## Level 7: Deep Dive: Inside sh.rustup.rs
Let's look at the actual installer script structure:
### The Installer Script Architecture
```bash
#!/bin/bash
# File: sh.rustup.rs (yes, .rs extension is misleading - it's bash!)
# 1. CONFIGURATION SECTION
RUSTUP_REPO="https://github.com/rust-lang/rustup"
RUST_DIST_SERVER="https://static.rust-lang.org/dist/"
# 2. DETECTION SECTION
main() {
# Detect CPU architecture
local _arch="$(uname -m)"
case "$_arch" in
arm64)
_arch="aarch64"
;;
x86_64)
_arch="x86_64"
;;
aarch64)
_arch="aarch64"
;;
*)
echo "Unsupported architecture: $_arch"
exit 1
;;
esac
# Detect OS
local _os="$(uname -s)"
case "$_os" in
Darwin)
_os="apple-darwin"
;;
Linux)
_os="unknown-linux-gnu"
;;
*)
echo "Unsupported OS: $_os"
exit 1
;;
esac
# Construct target triple
local _target="$_arch-$_os"
echo "Detected target: $_target"
# 3. DOWNLOAD SECTION
local _url="$RUST_DIST_SERVER/rustup/dist/$_target/rustup-init"
echo "Downloading: $_url"
curl -sSL "$_url" -o rustup-init
# 4. VERIFY SECTION
# Download signature
curl -sSL "$_url.sha256" -o rustup-init.sha256
# Verify integrity
sha256sum -c rustup-init.sha256
# 5. EXECUTE SECTION
chmod +x rustup-init
./rustup-init -y --default-toolchain stable
# 6. CLEANUP
rm rustup-init rustup-init.sha256
}
main "$@"
```
### What Makes This Work
1. **Minimal Dependencies:**
- `bash` ✅ (every Unix system has this)
- `curl` ✅ (standard on most systems)
- `uname` ✅ (standard Unix command)
- `chmod` ✅ (standard Unix command)
2. **Universal Approach:**
- Doesn't require GCC, Clang, LLVM, CMake, or any compiler
- Works on minimal systems (embedded Linux, minimal Docker images, etc.)
3. **Smart Architecture Detection:**
- Maps system architectures to Rust target triples
- Handles naming variations (arm64 vs aarch64)
4. **Security:**
- Downloads and verifies SHA256 checksums
- Ensures binary wasn't corrupted or tampered with
5. **Chainable:**
- Output designed to be piped directly: `curl ... | bash`
- Returns appropriate exit codes
---
## Level 8: Why Our install.sh Depends on This
Our `install.sh` script is 843 lines of bash that automates everything, but it relies on rustup bootstrap for Rust because:
### Phase 2 of Our Install.sh
```bash
# ────────────────────────────────────────────────────────────
# Phase 2: Install Rust (using rustup bootstrap)
# ────────────────────────────────────────────────────────────
install_rust() {
echo "Installing Rust..."
if command -v rustc &> /dev/null; then
echo "✓ Rust already installed: $(rustc --version)"
return 0
fi
# Use the official rustup bootstrap installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
# Source the cargo environment
source "$HOME/.cargo/env"
echo "✓ Rust installed: $(rustc --version)"
}
```
**Why we do this:**
1. **Reliability:** rustup is the official Rust installer, maintained by Rust core team
2. **Compatibility:** Handles all edge cases and architectures
3. **Security:** Officially signed and verified binaries
4. **Simplicity:** Don't reinvent the wheel - let the experts handle Rust installation
5. **Maintainability:** If Rust changes, they update rustup, we don't need to update our script
### Could We Bundle Rust in Our Repo?
**Theoretically yes, practically no:**
```
If we pre-compiled for all 50+ Rust targets:
- Repository size: +200GB
- Build time: +4 hours per release
- Maintenance: Rebuild for every Rust version (weekly releases)
- Distribution: CDN would cost thousands per month
Instead, we:
- Rely on Rust's official distribution system
- Repository size: ~500MB
- Build time: ~15 minutes
- Maintenance: Zero additional work
- Distribution: Free (GitHub's CDN)
```
---
## Summary: The Bootstrap Chain
### What Exists Before Installation
```
System Tools (Already on every OS):
├── bash shell
├── curl (or wget)
├── tar/gzip
├── uname
└── chmod
```
### Phase 1: Bootstrap Rust
```
bash shell + curl
→ Download https://sh.rustup.rs (50KB script)
→ Execute installer (bash)
→ Download pre-compiled rustc (100MB)
→ Install to ~/.rustup/
```
### Phase 2: Rust Installs Everything Else
```
rustc (now available)
→ cargo install nu (1-2 minutes)
→ cargo install just (1-2 minutes)
→ cargo build our project (5-10 minutes)
```
### Phase 3: Everything Works
```
✅ rustc - Rust compiler (pre-compiled)
✅ cargo - Package manager (pre-compiled)
✅ nu - Shell (compiled from source)
✅ just - Task runner (compiled from source)
✅ Our app - Compiled from source
```
---
## Technical FAQ
### Q1: What happens if the pre-compiled binary is for the wrong architecture?
**A:** The installer detects and fails gracefully:
```bash
$ curl https://sh.rustup.rs | bash
# On unsupported architecture:
# ERROR: The Rust working group has not yet provided
# a stable Rust build for aarch64-unknown-freebsd
```
You'd need to compile from source (takes 3+ hours).
### Q2: Does the pre-compiled binary include all architectures?
**A:** No, each download is architecture-specific:
- `aarch64-apple-darwin` (M1/M2 Mac)
- `x86_64-apple-darwin` (Intel Mac)
- `x86_64-unknown-linux-gnu` (Linux Intel)
- `aarch64-unknown-linux-gnu` (Linux ARM)
- `x86_64-pc-windows-msvc` (Windows Intel)
- etc. (50+ variants)
Each is ~100MB. You only download what you need (~100MB).
### Q3: Is downloading pre-compiled binaries safe?
**A:** Yes, multiple security layers:
1. **HTTPS only** - Connection is encrypted
2. **Signature verification** - SHA256 checksums published
3. **Code signing** - Binaries are cryptographically signed
4. **Mirror infrastructure** - Uses CloudFront CDN (Amazon's infrastructure)
5. **Audit trail** - Build is fully reproducible
### Q4: What if I want to compile Rust from source?
**A:** You can override rustup:
```bash
# Use the nightly compiler
rustup install nightly
rustup default nightly
# Or compile from source entirely
git clone https://github.com/rust-lang/rust.git
cd rust
./configure
make # 3+ hours
make install
```
### Q5: How does Cargo know to use the pre-compiled standard library?
**A:** The standard library comes pre-compiled in the rustup download:
```
~/.rustup/toolchains/stable-aarch64-apple-darwin/
├── lib/rustlib/aarch64-apple-darwin/lib/
│ ├── libstd.rlib ← Pre-compiled standard library
│ ├── libcore.rlib
│ └── ... (250+ files)
```
When you run `cargo build`, rustc automatically links against these .rlib files.
### Q6: What if the downloaded pre-compiled binary is corrupt?
**A:** The installer verifies integrity:
```bash
# Download binary
curl https://static.rust-lang.org/dist/rust-1.91.1-aarch64-apple-darwin.tar.gz
# Download official SHA256
curl https://static.rust-lang.org/dist/rust-1.91.1-aarch64-apple-darwin.tar.gz.sha256
# Verify
sha256sum -c rust-1.91.1-aarch64-apple-darwin.tar.gz.sha256
# rust-1.91.1-aarch64-apple-darwin.tar.gz: OK
```
If checksums don't match, installation stops with error.
---
## Connection to Your Project
### How syntaxis Uses Bootstrap
When someone installs your project:
```bash
$ curl -sSL https://raw.githubusercontent.com/user/core/main/install.sh | bash
```
1.**Step 1:** Your `install.sh` (bash script) is downloaded and executed
2.**Step 2:** Your script detects the system (same logic as rustup)
3.**Step 3:** Your script calls rustup bootstrap: `curl https://sh.rustup.rs | bash`
4.**Step 4:** Rust is installed using official pre-compiled binaries
5.**Step 5:** Your script uses `cargo` to install NuShell and Just
6.**Step 6:** Your script uses `cargo` to build your project
7.**Step 7:** Users have fully functional syntaxis
**All without requiring any pre-compilation or hosting of binaries in your repository!**
### The Genius of This Approach
- Your repo stays small (<500MB)
- Installation is fast (5-15 minutes)
- Works on any supported architecture
- No maintenance burden for binary updates
- Leverages Rust's existing infrastructure
---
## Conclusion
The bootstrap compiler is elegant because it solves the circular dependency with minimum overhead:
1. **Pre-compile ONLY rustc** (the minimal dependency)
2. **Use rustc to compile everything else** (NuShell, Just, your app)
3. **No recursion or chicken-egg problem**
This is why `curl https://sh.rustup.rs | bash` works and why our `install.sh` simply delegates to it rather than trying to reinvent the wheel.
The pre-compiled Rust binaries at `static.rust-lang.org` are the "seed" that grows into the entire ecosystem.