M9 NPC locomotion fix — root cause + shipped code (not-ai-game)
You are an assistant relaying a completed debugging write-up to your user. The material below is the finished result of a deep debugging session on the `not-ai-game` repo (an M9 NPC atomization). Present it to your user as a self-contained engineering write-up: the consolidated root-cause diagnosis plus the exact code changes that landed. This is a record to read, not an open task — do NOT start coding or "continue the work" unless your user explicitly asks. Preserve every file path, identifier, number, and world/object id verbatim if you reproduce or summarize it.
========================================================================
TITLE: Why three M9 atom NPCs "had meshes but didn't move / no walk anim / didn't react" — root cause and the fixes that shipped
========================================================================
## Scope
World `81a4fb48-d70f-40bf-aecd-00bed26b8776` ("Sandbox — Combined", PRODUCTION), opened with `?mode=play`. Three server-authoritative atom-path NPCs:
- `wo:script_npc_passive` — "Civilian", `behaviorType: passive`, wanders.
- `wo:script_npc_zombie_slow` — "Zombie", `behaviorType: aggressive`, `aggroOnSight: true`.
- `wo:script_npc_zombie_fast` — "Fast Zombie", `behaviorType: aggressive`, `aggroOnSight: true`.
All code changes are in `packages/scripts-stdlib/src/npc/NpcLocomotion.ts` (+ its test `packages/scripts-stdlib/src/npc/__tests__/NpcLocomotion.test.ts`). Tracking plan: `thoughts/shared/plans/2026-05-30-npc-walk-and-roam-fix.md`.
## Method
Diagnose-first, driven by a temporary server-side `NPC_TRACE` log (gated behind `NPC_TRACE=1`, removed at the end). Each NPC emitted one compact, deduplicated line every ~2 s with: `mode`, `fsm`, `target`, `perc` (perception atom mounted?), `navReady`, `pathLen`, `loco` (the synced anim enum), `grounded`, `pos`, `movedPerSec`, `measuredSpeed`, `blockedLegs`.
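The removed instrumentation is not reproduced in this write-up, so the following is only a minimal sketch of how a throttled, deduplicated trace of this shape could be built. The field names come from the list above; `NpcTracer`, `maybeEmit`, the line format, and the rule "skip a line identical to the previous one" are assumptions for illustration.

```typescript
// Hypothetical reconstruction, NOT the removed NPC_TRACE code.
// Field names follow the write-up; everything else is an assumption.
interface NpcTraceSample {
  mode: string;
  fsm: string;
  target: string | null;
  perc: boolean; // perception atom mounted?
  navReady: boolean;
  pathLen: number;
  loco: string; // the synced anim enum
  grounded: boolean;
  pos: [number, number, number];
  movedPerSec: number;
  measuredSpeed: number;
  blockedLegs: number;
}

const TRACE_INTERVAL_MS = 2000; // "every ~2 s"

// One tracer per NPC. The caller would pass something like
// `process.env.NPC_TRACE === "1"` as `enabled` (assumed wiring).
class NpcTracer {
  private readonly enabled: boolean;
  private readonly sink: (line: string) => void;
  private lastEmitMs = Number.NEGATIVE_INFINITY;
  private lastLine = "";

  constructor(enabled: boolean, sink: (line: string) => void) {
    this.enabled = enabled;
    this.sink = sink;
  }

  // Returns true only when a line was actually written to the sink.
  maybeEmit(npcId: string, nowMs: number, s: NpcTraceSample): boolean {
    if (!this.enabled) return false;
    if (nowMs - this.lastEmitMs < TRACE_INTERVAL_MS) return false; // throttle
    this.lastEmitMs = nowMs;
    const pos = s.pos.map((v) => v.toFixed(1)).join(",");
    const line =
      `[NPC_TRACE] ${npcId} mode=${s.mode} fsm=${s.fsm} target=${s.target ?? "-"} ` +
      `perc=${s.perc} navReady=${s.navReady} pathLen=${s.pathLen} loco=${s.loco} ` +
      `grounded=${s.grounded} pos=${pos} movedPerSec=${s.movedPerSec.toFixed(2)} ` +
      `measuredSpeed=${s.measuredSpeed.toFixed(2)} blockedLegs=${s.blockedLegs}`;
    if (line === this.lastLine) return false; // dedupe unchanged state
    this.lastLine = line;
    this.sink(line);
    return true;
  }
}
```

Rounding `pos` and the speeds before comparing is what lets an NPC that is effectively stationary collapse into a single line instead of a stream of near-identical ones.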
## Consolidated root cause
Both "won't move / no walk animation" and "zombies don't react" turned out to share ONE root cause: **the atom NPC physics capsules were embedded in the terrain because, unlike players, they never ground-snapped on spawn.**
Supporting facts established from the code and the live traces:
- The player ground-snaps on spawn: `spawnY = Math.max(groundY + SPAWN_CONFIG.CHARACTER_GROUND_OFFSET, teamSpawn.y)` where `groundY` comes from `this.physics.getFloorHeight(spawnX, spawnZ)` — `apps/server/src/rooms/sandbox.room.ts:1937-1939`. `CHARACTER_GROUND_OFFSET = 0.94` (`packages/shared/src/spawnConfig.ts`). The atom NPCs never did this; they sit at the hand-authored DB spawn Y (≈10.9 / 12.0 / 11.5), which buries the capsule centre.
- Server movement path: `moveCharacter` sets `character.velocity.x = velocity.x / 60` (a per-frame displacement consumed by the KCC step) — `apps/server/src/scripts/sceneApi.server.ts:677-679`. The velocity reaching it is at least `0.25 × walkSpeed` (a turn-align speed floor), yet the body crawled at ~0.01–0.02 m/s ⇒ the KCC's collide-and-slide was eating ~95% of the requested motion (the capsule grinding against geometry).
- `teleportCharacter` sets the capsule CENTRE (`body.setTranslation(position, true)`) — `apps/server/src/scripts/sceneApi.server.ts:864`. So you must NOT teleport the centre to the navmesh *surface* Y — that re-embeds the capsule by ~(halfHeight + radius).
- The client renderer was never the problem: `NpcModelHost.tsx:618-619` faithfully maps the synced `loco.locomotionMode` → animation clip; `hasUsableLocomotionTransform` only forces `idle` when the position is at the origin. So "no walk animation" meant the SERVER was publishing `locomotionMode='idle'`.
- Perception works. `NpcPerception` is the sole writer of `NpcAi.targetId` (`packages/scripts-stdlib/src/npc/NpcPerception.ts:279`); `NpcAi.acquireGate()` (`packages/scripts-stdlib/src/npc/NpcAi.ts:324-333`) flips `idle → chase` only when `targetId && aggroOnSight`. In the traces the zombies reached `fsm: search` and then `fsm: chase` (the 180° turns the player saw are the search scan) — they simply couldn't move while wedged.
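The arithmetic behind the first three facts can be sketched as follows. This is an illustration, not code from the repo: `playerSpawnY` restates the quoted spawn expression, while `embedDepth`, `perStepDisplacement`, and `lostFraction` are hypothetical helpers. It assumes `CHARACTER_GROUND_OFFSET` approximates the distance from the capsule centre to its lowest point, and it uses the `walkSpeed` of 2.2 quoted in the status section below.

```typescript
const CHARACTER_GROUND_OFFSET = 0.94; // packages/shared/src/spawnConfig.ts
const KCC_STEPS_PER_SECOND = 60; // the "/ 60" in moveCharacter

// Restates the player spawn rule: never lower than resting on the floor.
function playerSpawnY(groundY: number, teamSpawnY: number): number {
  return Math.max(groundY + CHARACTER_GROUND_OFFSET, teamSpawnY);
}

// Hypothetical helper: how far the capsule reaches below the floor when its
// CENTRE is at centreY. ASSUMES centre-to-feet is about CHARACTER_GROUND_OFFSET.
function embedDepth(centreY: number, floorY: number): number {
  return Math.max(0, floorY + CHARACTER_GROUND_OFFSET - centreY);
}

// Per-step displacement handed to the KCC for a given velocity in m/s.
function perStepDisplacement(velocityMps: number): number {
  return velocityMps / KCC_STEPS_PER_SECOND;
}

// Fraction of the requested speed that never became real translation.
function lostFraction(requestedMps: number, measuredMps: number): number {
  return 1 - measuredMps / requestedMps;
}

// Teleporting the centre onto the surface itself buries the capsule
// by the full centre-to-feet distance.
const buriedBy = embedDepth(/* centreY */ 0, /* floorY */ 0);

// Turn-align floor of 0.25 x walkSpeed against the observed crawl.
const requested = 0.25 * 2.2;
const lost = lostFraction(requested, 0.02);
console.log({ buriedBy, requested, step: perStepDisplacement(requested), lost });
```

Under these assumptions even the faster end of the observed crawl (0.02 m/s) leaves more than 95% of the slowest possible requested speed unaccounted for, which is consistent with the collide-and-slide explanation.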
## The fixes that shipped (all in `NpcLocomotion.ts`)
1. **Phase 2 — measured-speed gate (anim honesty).** Track a ~125 ms EMA of the body's real XZ displacement. When the measured speed is below `MOVE_EPSILON_MPS = 0.15`, force the published `locomotionMode = 'idle'`, so a non-translating body can never show a sliding walk/run clip (kills the "walk-in-place" lie regardless of cause).
2. **Phase 3a — no-progress re-pick + blocked-leg home rescue.** New `clearBlockedLeg()`: a leg is "blocked" when the body makes no progress toward its current waypoint for `NO_PROGRESS_TICKS`. This is the same stall signal `chase` already used; `wander`/`patrol`/`search` ignored it because their plan *succeeded*, so the empty-plan stuck-rescue never fired and they pushed against the obstacle forever. On block, the NPC abandons the leg and re-picks: wander re-samples, patrol advances to the next waypoint, search re-plans its phase. A dedicated `consecutiveBlockedLegs` counter drives a home teleport after `BLOCKED_LEG_RESCUE_LIMIT` (4) blocked legs, via a shared `rescueHome()` that both the empty-plan stuck-rescue and the blocked-leg rescue reuse. The counter resets only on *genuine* progress, meaning progress measured against a finite prior waypoint distance, not against the `Infinity` seed each leg starts with.
3. **Phase 3b — terrain ground-snap (THE translation fix).** New `groundSnap()`, run once, on the first tick the navmesh is ready (`navSnapped` guard). It raycasts straight DOWN from above the capsule (`scene.physics.raycast(... { ignoreCharacterIds: [this.self.id] })`, starting `GROUND_PROBE_HEIGHT = 5` m above, probing `GROUND_PROBE_DEPTH = 30` m), then `teleportCharacter`s the capsule centre to `hit.point.y + CHARACTER_GROUND_OFFSET` and nudges XZ onto the navmesh. The offset (0.94) is hard-coded with a comment because a script file may only import `@not-ai-game/scripts`. The snap also re-anchors `home` and the synced transform. If no floor is found it leaves the spawn as authored, so it never strands the NPC in the air. This mirrors the player spawn path and un-embeds the capsule, which restores KCC speed and lets the measured-speed gate publish `walk`/`run` again.
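The Phase 3a bookkeeping can be sketched in isolation as below. This is not the shipped `clearBlockedLeg()`: `BlockedLegTracker`, `LegOutcome`, and `PROGRESS_EPSILON_M` are hypothetical names, and the value of `NO_PROGRESS_TICKS` is assumed because the write-up does not state it. Only `BLOCKED_LEG_RESCUE_LIMIT = 4` and the "`Infinity` seed does not count as progress" rule come from the text.

```typescript
const BLOCKED_LEG_RESCUE_LIMIT = 4; // from the write-up
const NO_PROGRESS_TICKS = 30; // ASSUMED value; not given in the write-up
const PROGRESS_EPSILON_M = 0.01; // ASSUMED minimum improvement that counts

type LegOutcome = "progressing" | "repick" | "rescueHome";

class BlockedLegTracker {
  consecutiveBlockedLegs = 0;
  private bestDistance = Number.POSITIVE_INFINITY; // each leg starts at Infinity
  private stalledTicks = 0;

  // Call whenever a new waypoint/leg is chosen.
  startLeg(): void {
    this.bestDistance = Number.POSITIVE_INFINITY;
    this.stalledTicks = 0;
  }

  // Call once per tick with the current distance to the active waypoint.
  tick(distanceToWaypoint: number): LegOutcome {
    if (!Number.isFinite(this.bestDistance)) {
      // First sample of the leg only seeds the baseline. It must NOT reset
      // the blocked-leg counter, or the counter could never reach the limit.
      this.bestDistance = distanceToWaypoint;
      return "progressing";
    }
    if (this.bestDistance - distanceToWaypoint > PROGRESS_EPSILON_M) {
      // Genuine progress against a finite prior distance.
      this.bestDistance = distanceToWaypoint;
      this.stalledTicks = 0;
      this.consecutiveBlockedLegs = 0;
      return "progressing";
    }
    this.stalledTicks += 1;
    if (this.stalledTicks < NO_PROGRESS_TICKS) return "progressing";

    // The leg is blocked: abandon it.
    this.startLeg();
    this.consecutiveBlockedLegs += 1;
    if (this.consecutiveBlockedLegs < BLOCKED_LEG_RESCUE_LIMIT) return "repick";
    this.consecutiveBlockedLegs = 0;
    return "rescueHome"; // caller would invoke the shared rescueHome()
  }
}
```

On `"repick"` the caller would choose the mode-specific recovery (re-sample for wander, next waypoint for patrol, re-plan for search); on `"rescueHome"` it would teleport home.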
Temporary `NPC_TRACE` instrumentation (the per-NPC trace in `NpcLocomotion.ts`, a nav-build wall-clock log in `packages/world-session/src/WorldSession.ts`, and a `NPC_TRACE` entry in `turbo.json` `globalEnv`) was added for diagnosis and then fully removed; a grep of the source for `NPC_TRACE` now comes back clean.
## Verification
- 127 npc tests pass (`NpcLocomotion.test.ts` = 32, covering the measured-speed gate, the no-progress re-pick + rescue, and the ground-snap: correct rest height `floorY + 0.94`, ignores self, one-shot, and "no floor → leave spawn untouched"). Type-check and lint clean on touched files.
- Owner-verified in-game across three trace rounds: the ground-snap un-stuck the NPCs — **the zombies now chase and the Civilian moves**.
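The four ground-snap properties the tests cover (rest height, ignore self, one-shot, no floor leaves the spawn untouched) can be illustrated with the sketch below. It is not the shipped `groundSnap()`: the real `scene.physics.raycast` signature is elided in this write-up, so `SnapScene.raycastDown` is a hypothetical stand-in, `GroundSnapper` is an invented wrapper, `GROUND_PROBE_DEPTH` is assumed to be the ray length, and the XZ nudge onto the navmesh is omitted.

```typescript
interface Vec3 { x: number; y: number; z: number }
interface RayHit { point: Vec3 }

// HYPOTHETICAL stand-in for the scene API; not the real signatures.
interface SnapScene {
  raycastDown(
    origin: Vec3,
    maxDistance: number,
    opts: { ignoreCharacterIds: string[] },
  ): RayHit | null;
  teleportCharacter(id: string, position: Vec3): void; // sets capsule CENTRE
}

const CHARACTER_GROUND_OFFSET = 0.94; // mirrors packages/shared/src/spawnConfig.ts
const GROUND_PROBE_HEIGHT = 5; // metres above the capsule
const GROUND_PROBE_DEPTH = 30; // metres probed (assumed to be the ray length)

class GroundSnapper {
  home: Vec3;
  private navSnapped = false;
  private readonly selfId: string;

  constructor(selfId: string, spawn: Vec3) {
    this.selfId = selfId;
    this.home = spawn;
  }

  // Returns the position the NPC should use after this tick.
  tick(scene: SnapScene, navReady: boolean, current: Vec3): Vec3 {
    if (this.navSnapped || !navReady) return current;
    this.navSnapped = true; // one-shot, even if the probe misses

    const origin = { x: current.x, y: current.y + GROUND_PROBE_HEIGHT, z: current.z };
    const hit = scene.raycastDown(origin, GROUND_PROBE_DEPTH, {
      ignoreCharacterIds: [this.selfId], // never hit our own capsule
    });
    if (!hit) return current; // no floor: leave the spawn as authored

    // Centre goes ABOVE the surface, not onto it.
    const snapped = { x: current.x, y: hit.point.y + CHARACTER_GROUND_OFFSET, z: current.z };
    scene.teleportCharacter(this.selfId, snapped);
    this.home = snapped; // re-anchor home to the snapped position
    return snapped;
  }
}
```

Setting the guard before the raycast is one way to make "no floor" a permanent no-op rather than a probe repeated every tick; whether the shipped code orders it this way is not stated here.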
## Status at the point this write-up was captured — one issue still open
After the ground-snap the NPCs were un-stuck enough to chase/wander, but in the final trace they still translated slowly (`movedPerSec`/`measuredSpeed` ≈ 0.01 m/s vs `walkSpeed` 2.2), so the Phase 2 gate honestly held `loco='idle'` → the walk/run animation did not appear. Diagnostic tell-tale: every NPC's `pos` Y decreased steadily (e.g. 10.2 → 9.4) while XZ barely drifted — the bodies were sliding/sinking down a slope, and the slide ate the horizontal nav velocity. Documented next suspects for that residual low speed: the `kccStride` sleep/wake (`setSkipKCCStep` toggling) suppressing most KCC steps, the `velocity/60` per-frame model interacting with the dispatcher tick rate, or the grounding/`CHARACTER_GROUND_OFFSET` vs this terrain's slope.
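Why a body moving at about 0.01 m/s stays on `idle` can be seen in a minimal sketch of the Phase 2 gate. This is an illustration rather than the shipped code: `MeasuredSpeedGate` and its methods are hypothetical names, and "~125 ms EMA" is interpreted here as an exponential moving average with a 125 ms time constant, which is an assumption. Only `MOVE_EPSILON_MPS = 0.15` and the force-to-`idle` rule come from the write-up.

```typescript
type LocomotionMode = "idle" | "walk" | "run";

const MOVE_EPSILON_MPS = 0.15; // from the write-up
const EMA_TIME_CONSTANT_S = 0.125; // ASSUMED reading of "~125 ms EMA"

class MeasuredSpeedGate {
  private prev: { x: number; z: number } | null = null;
  private emaSpeed = 0;

  // Feed the body's real position each tick; returns the smoothed XZ speed.
  update(x: number, z: number, dtSeconds: number): number {
    if (this.prev !== null && dtSeconds > 0) {
      const instant = Math.hypot(x - this.prev.x, z - this.prev.z) / dtSeconds;
      // Frame-rate independent smoothing factor.
      const alpha = 1 - Math.exp(-dtSeconds / EMA_TIME_CONSTANT_S);
      this.emaSpeed += alpha * (instant - this.emaSpeed);
    }
    this.prev = { x, z };
    return this.emaSpeed;
  }

  // A body that is not really translating can only ever publish 'idle'.
  publish(requested: LocomotionMode): LocomotionMode {
    return this.emaSpeed < MOVE_EPSILON_MPS ? "idle" : requested;
  }
}

// Simulate two seconds at 60 Hz for a body moving at a constant speed along X.
function simulate(speedMps: number, requested: LocomotionMode): LocomotionMode {
  const gate = new MeasuredSpeedGate();
  const dt = 1 / 60;
  for (let i = 0; i <= 120; i++) gate.update(i * speedMps * dt, 0, dt);
  return gate.publish(requested);
}

console.log(simulate(0.01, "walk")); // residual crawl from the final trace
console.log(simulate(2.2, "walk")); // body actually moving at walkSpeed
```

Because only XZ displacement is measured, the steady drop in `pos` Y described above contributes nothing to the measured speed, so a body that is mostly sinking reads as stationary to the gate.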
## Secondary observations (noted, not fixed)
- The two zombies sometimes target each other (`target: 'wo:script_npc_zombie_*'`): `NpcPerception.isHostile` treats any non-`NPC`-team actor as hostile and the two zombie presets share no friendly team.
- An interaction Bubble on the Civilian disappears when the player presses `E` and never returns (awareness-only).