When you build a system that auto-deploys updates, the very first deployment can’t use that system - because the thing doing the deploying is the thing being deployed.
The Problem
I built a hook-based update pipeline for client workspaces. The hook (upstream-check.py) runs on every session, detects when files in setup/ differ from their installed copies in .claude/, and auto-installs the updates. Agents, rules, skills - all auto-install through this pipeline.
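A minimal sketch of that drift check, assuming "drift" means a byte-level difference between a reference file under setup/ and its installed counterpart under .claude/. The names find_drift and file_digest, and the one-to-one path mapping, are assumptions for illustration; the real upstream-check.py may map and compare files differently.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(setup_dir: Path, installed_dir: Path) -> list[str]:
    """List relative paths whose reference copy differs from the installed copy.

    Assumes setup/<rel> maps to .claude/<rel> (a hypothetical mapping).
    A file that exists in setup/ but not in .claude/ counts as drift.
    """
    drifted = []
    for ref in sorted(setup_dir.rglob("*")):
        if not ref.is_file():
            continue
        rel = ref.relative_to(setup_dir)
        live = installed_dir / rel
        if not live.is_file() or file_digest(ref) != file_digest(live):
            drifted.append(rel.as_posix())
    return drifted
```

Note that this check only runs if the installed hook contains it, which is where the rest of this story comes from.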
The rollout plan was clean:
- Push new files to setup/ in each client repo
- Client pulls via git sync
- Hook detects drift, auto-installs
It worked perfectly in testing. It worked on freshly packaged workspaces. Then it failed on all 4 existing clients.
What Went Wrong
The old hook (pre-pipeline) only did VERSION polling - 2KB of code that checks a remote version number. The new hook has drift detection, classification, auto-install - 13KB of code. The new hook was deployed to setup/hooks/upstream-check.py (the reference copy), but .claude/hooks/upstream-check.py (the live copy) still had the old code.
The old hook doesn’t know how to look at setup/ for drift. It doesn’t have classify(). It doesn’t have auto-install. So the new hook sat in setup/hooks/ doing nothing, while the old hook kept polling VERSION like always.
The pipeline can maintain itself once it’s running. But it can’t deploy itself from nothing.
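The failure can be reproduced with a toy in-memory model. Everything here is hypothetical: the workspace is a dict, the hook "code" is just the string "old" or "new", and old_hook/new_hook stand in for the two versions of upstream-check.py. The point it illustrates is that each session runs the installed hook, never the reference copy.

```python
def old_hook(workspace: dict) -> None:
    """Pre-pipeline behavior: poll a version number, never look at setup/."""
    workspace["log"].append("polled VERSION")


def new_hook(workspace: dict) -> None:
    """Pipeline behavior: copy any drifted setup/ file over its installed copy."""
    for rel, content in workspace["setup"].items():
        if workspace["installed"].get(rel) != content:
            workspace["installed"][rel] = content
            workspace["log"].append(f"installed {rel}")


HOOKS = {"old": old_hook, "new": new_hook}
HOOK_PATH = "hooks/upstream-check.py"


def run_session(workspace: dict) -> None:
    """Execute whichever hook is currently installed, not the reference copy."""
    HOOKS[workspace["installed"][HOOK_PATH]](workspace)


# Existing client: new hook pushed to setup/, old hook still live.
client = {
    "setup": {HOOK_PATH: "new", "rules/style.md": "v2"},
    "installed": {HOOK_PATH: "old", "rules/style.md": "v1"},
    "log": [],
}
for _ in range(3):
    run_session(client)
stuck = client["installed"][HOOK_PATH]  # no number of sessions changes this

# One manual copy crosses the bootstrap boundary; the next session self-maintains.
client["installed"][HOOK_PATH] = client["setup"][HOOK_PATH]
run_session(client)
```

No matter how many sessions run, the old hook never installs anything. After the single manual copy, the very next session picks up the pending rules update on its own.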
Why It Wasn’t Caught
The implementation used 7 parallel tabs, Codex reviews, simulation scenarios, and a detailed testing checklist. Every review verified whether the new hook logic was correct. Nobody tested the transition path where the old hook is what’s actually running.
Specific blind spots:
- Simulation scenarios all started with “Phase 2 runs and classifies…” - they assumed the new code was already live
- Testing environments (operator workspace, freshly packaged clients) already had the new hook
- The rollout action plan listed setup/ paths to copy but not .claude/hooks/ - the formal rollout protocol says “update both locations” but the action plan was written independently and didn’t cross-reference
- Phasing was by feature complexity (classify/auto-install in Phase 1, AI-merge in Phase 2), not by deployment reality. If bootstrap had been identified as a design constraint, it would have been Phase 0
The Fix
Two parts:
One-time: Manually copy the new hook to .claude/hooks/ for existing clients. This bootstraps the pipeline. After this, all future hook updates flow through the pipeline automatically (hooks map to .claude/hooks/ with auto_update=True).
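The one-time step could be scripted along these lines. This is a sketch, not the script that was actually used: bootstrap_hook is a hypothetical name, and it assumes each client workspace has the setup/hooks/ and .claude/hooks/ layout described above.

```python
import shutil
from pathlib import Path

# Relative location of the hook under both setup/ and .claude/.
HOOK_REL = Path("hooks") / "upstream-check.py"


def bootstrap_hook(workspace: Path) -> bool:
    """Copy the reference hook over the live hook.

    Returns True if a copy was made, False if the live hook already matched.
    """
    reference = workspace / "setup" / HOOK_REL
    live = workspace / ".claude" / HOOK_REL
    if not reference.is_file():
        raise FileNotFoundError(f"reference hook missing: {reference}")
    if live.is_file() and live.read_bytes() == reference.read_bytes():
        return False  # already bootstrapped, nothing to do
    live.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(reference, live)
    return True
```

Making the function idempotent means it is safe to run across every client workspace, including ones that were freshly packaged and already have the new hook.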
Systemic: Add a “Deployment Reality Check” to the implementation pipeline - a mandatory step between PRD approval and implementation that asks: “Can the target environment actually execute this change, or does it depend on infrastructure that doesn’t exist there yet?”
The Broader Lesson
Any self-deploying system has a bootstrap boundary: the line between “things the system can update” and “things the system needs in order to run.” When you change something on the “needs in order to run” side, the system can’t deploy that change to itself. You need an external mechanism.
This applies beyond hooks:
- Package managers typically need special handling to update themselves, outside their normal dependency resolution
- CI/CD pipelines can’t deploy changes to their own deployment infrastructure without a separate bootstrap path
- Database migration tools generally can’t rely on a migration to set up the table that tracks migrations
- Auto-updaters need a separate update path for the updater binary itself
The pattern: when designing a self-maintaining system, explicitly identify what’s inside the bootstrap boundary and what’s outside. Document the bootstrap path for anything outside. Test the transition from “system doesn’t exist” to “system is running” - not just “system is running and gets an update.”
Prevention Checklist
For any change to infrastructure the pipeline depends on:
- List every component the update pipeline needs to function (the hook, the manifest, the state file, the path mappings)
- Check: is the thing being changed also one of those components?
- If yes, design the bootstrap path explicitly - it won’t auto-deploy
- Test on an environment that has the OLD version, not just the new one
- Include “existing client with old infrastructure” as a simulation scenario
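The first three checklist items reduce to a set intersection, sketched below. Only the hook path comes from this post; the manifest, state file, and agent paths are hypothetical placeholders, and needs_manual_bootstrap is an illustrative name rather than part of the actual pipeline.

```python
# Installed paths the pipeline needs in order to run.
# Only the hook path is taken from the post; the others are hypothetical.
PIPELINE_COMPONENTS = frozenset({
    ".claude/hooks/upstream-check.py",
    ".claude/manifest.json",   # hypothetical location
    ".claude/state.json",      # hypothetical location
})


def needs_manual_bootstrap(changed, components=PIPELINE_COMPONENTS):
    """Return the changed paths that the pipeline itself depends on, sorted."""
    return sorted(set(changed) & components)


change_set = [
    ".claude/agents/reviewer.md",       # hypothetical agent file
    ".claude/hooks/upstream-check.py",
]
manual = needs_manual_bootstrap(change_set)
if manual:
    print("Needs an explicit bootstrap path:", ", ".join(manual))
```

A non-empty result is the signal to design and test the bootstrap path before rollout, rather than assuming the change will auto-deploy.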