{"conversation":{"id":"pub_iaBFRRSyx4hMGRYI","title":"AI Features and Models Powering Mirror Mirror"},"messages":[{"parts":[{"content":"Please describe the list of AI features and models and techniques that are at play from end-to-end to make mirror mirror happen.","timestamp":"2026-03-31T12:29:28.674072+00:00","part_kind":"user-prompt"}],"kind":"request","metadata":{"zmid":"joiqfxqa95mj"}},{"parts":[{"content":"The user is asking about the \"mirror mirror\" project. They have the folder `/home/workspace/mirrormirror` open. Let me explore this project to understand what it is and what AI features/models/techniques are involved.","id":null,"signature":"EpUDClkIDBgCKkAilTr1JPy894M80S8x/YqlupXrpZHOK3siOFsFgzWFEzPGqCyj9qqwmBpP7B+aso+G/FZNb+GD8DOVxk0occ03Mg9jbGF1ZGUtb3B1cy00LTY4ABIM5PLVNQmwAggxxfoKGgytHSa/fqfCU6+lmvMiMLETgXQ0GxelAxawVy6p6Ec9k4GnVWK35IDTAXRe0I4kQ5pq/VnSXXHIr7Z+1LChEirpAekE5NIiQEc97O9KdqYUvd45Gcnd7AmyCaLOiVabMf5d5CI14vJ9178jEb1zoje4SYmMw6zecZ9C5Xpw1CasUZGpvdmvqXDygkRs2Q0CWAeT6Zpb4knnAnZR9URLA7cdPhzPVW4zvzBp3qku7Eknr/uihYSw4VRlh3H08Y2yVNTuMyR78u85q4/VHpE5JHYTOWncOGOLZnsXAPSiMxwIJEJSeVt3P9I1wbqumnwFXnC0PKzl5i3fPnDyRk/vwpXDpf2ycL1inWc13mUMAMJRarNm3sbFjFzFyleqIQEoXN1eREE8eLnwyFJjGAE=","provider_name":"anthropic","provider_details":null,"part_kind":"thinking"},{"tool_name":"list_files","args":"{\"path\": \"/home/workspace/mirrormirror\"}","tool_call_id":"toolu_01K2k935jBunKLUmkHGgRSQL","id":null,"provider_name":null,"provider_details":null,"part_kind":"tool-call"}],"model_name":"zo:anthropic/claude-opus-4-6","input_tokens":36363,"output_tokens":120,"cache_read_tokens":null,"cache_write_tokens":36353,"timestamp":"2026-03-31T12:29:33.134342+00:00","kind":"response","persona_id":null,"persona_name":null,"persona_image":null,"metadata":{"zmid":"8ibeki6wzduf"}},{"parts":[{"tool_name":"list_files","content":"Showing 74 entries (12 directories, 62 files).\n\n- /home/workspace/mirrormirror/\n  - docs/\n    - superpowers/\n      - plans/\n        - 
2026-03-24-3d-face-replica.md\n        - 2026-03-25-depth-backend.md\n        - 2026-03-26-eye-fill-and-multi-pose-head-coverage.md\n        - 2026-03-28-face-deformation.md\n        - 2026-03-28-scan-persistence.md\n        - 2026-03-29-mirror-voice-ux.md\n        - 2026-03-29-text-to-voice-mask.md\n      - specs/\n        - 2026-03-24-3d-face-replica-design.md\n        - 2026-03-25-depth-backend-design.md\n        - 2026-03-26-eye-fill-and-multi-pose-head-coverage.md\n        - 2026-03-28-face-deformation-design.md\n        - 2026-03-28-scan-persistence-design.md\n        - 2026-03-29-mirror-frame-design.md\n        - 2026-03-29-text-to-voice-mask-design.md\n  - index.html\n  - package-lock.json\n  - package.json\n  - public/\n    - assets/\n      - mirror-frame.png\n  - server/\n    - __init__.py\n    - chat.py\n    - depth.py\n    - main.py\n    - requirements.txt\n    - tests/\n      - __init__.py\n      - test_chat.py\n      - test_depth.py\n      - test_voice.py\n    - voice.py\n  - src/\n    - calibrate.js\n    - capture.js\n    - conversation.js\n    - deform.js\n    - face-topology.js\n    - gaze.js\n    - main.js\n    - motion.js\n    - phoneme.js\n    - reconstruction.js\n    - ... 
(3 more entries)\n    - storage.js\n    - style.css\n    - tests/\n      - conversation.test.js\n      - phoneme.test.js\n      - recorder.test.js\n      - storage-voice.test.js\n      - tts.test.js\n      - viseme.test.js\n    - tts.js\n    - viseme.js\n  - start.sh\n  - tests/\n    - capture.test.js\n    - deform.test.js\n    - face-topology.test.js\n    - motion.test.js\n    - reconstruction.test.js\n    - renderer.test.js\n    - setup.js\n    - storage.test.js\n  - vite.config.js","tool_call_id":"toolu_01K2k935jBunKLUmkHGgRSQL","metadata":null,"timestamp":"2026-03-31T12:29:36.327508+00:00","outcome":"success","part_kind":"tool-return"}],"kind":"request","metadata":{"zmid":"gtoufnlm6zt4"}},{"parts":[{"tool_name":"run_parallel_cmds","args":"{\"cmd_list\": [\"cat /home/workspace/mirrormirror/package.json\",\"cat /home/workspace/mirrormirror/server/requirements.txt\",\"cat /home/workspace/mirrormirror/server/main.py\",\"cat /home/workspace/mirrormirror/src/main.js\",\"cat /home/workspace/mirrormirror/start.sh\"]}","tool_call_id":"toolu_01Ay3pNSqVhtmxaGAex4kTpc","id":null,"provider_name":null,"provider_details":null,"part_kind":"tool-call"}],"model_name":"zo:anthropic/claude-opus-4-6","input_tokens":37187,"output_tokens":148,"cache_read_tokens":36353,"cache_write_tokens":833,"timestamp":"2026-03-31T12:29:39.086779+00:00","kind":"response","persona_id":null,"persona_name":null,"persona_image":null,"metadata":{"zmid":"p4yck40ej3cw"}},{"parts":[{"tool_name":"run_parallel_cmds","content":{"stdout":"[Command: cat /home/workspace/mirrormirror/package.json]\n{\n  \"name\": \"mirrormirror\",\n  \"type\": \"module\",\n  \"scripts\": {\n    \"dev\": \"vite\",\n    \"server\": \"uvicorn server.main:app --reload --port 8001\",\n    \"start\": \"npx concurrently -n vite,api -c cyan,magenta \\\"vite\\\" \\\"uvicorn server.main:app --reload --port 8001\\\"\",\n    \"build\": \"vite build\",\n    \"test\": \"vitest run\"\n  },\n  \"dependencies\": {\n    \"@mediapipe/face_mesh\": 
\"^0.4.1633559619\",\n    \"@mediapipe/tasks-vision\": \"^0.10.14\"\n  },\n  \"devDependencies\": {\n    \"fake-indexeddb\": \"^6.2.5\",\n    \"vite\": \"^5.0.0\",\n    \"vitest\": \"^1.0.0\"\n  }\n}\n\n[Command: cat /home/workspace/mirrormirror/server/requirements.txt]\nfastapi==0.111.0\nuvicorn[standard]==0.29.0\ntransformers==4.41.0\ntorch==2.3.0\nPillow==10.3.0\nnumpy>=1.26,<2\npytest==8.2.0\nhttpx==0.27.0\npython-dotenv==1.0.1\nrespx==0.21.1\npytest-asyncio==0.23.6\n\n[Command: cat /home/workspace/mirrormirror/server/main.py]\n# server/main.py\nimport asyncio\nimport logging\nfrom contextlib import asynccontextmanager\nfrom pathlib import Path\n\nlogging.basicConfig(\n    level=logging.INFO,\n    format=\"%(asctime)s [%(name)s] %(message)s\",\n    datefmt=\"%H:%M:%S\",\n)\n\nfrom fastapi import FastAPI, HTTPException, Request, Response\nfrom fastapi.staticfiles import StaticFiles\nfrom dotenv import load_dotenv\nfrom pydantic import BaseModel\nfrom starlette.middleware.base import BaseHTTPMiddleware\n\nload_dotenv()\n\nfrom server.depth import load_model, _infer\nfrom server.voice import clone_voice, tts_with_timestamps, transcribe_audio\nfrom server.chat import chat as chat_fn\n\n\n@asynccontextmanager\nasync def lifespan(app: FastAPI):\n    import server.depth as _depth_module\n    if _depth_module._model is None:\n        try:\n            load_model()\n        except ModuleNotFoundError:\n            # transformers not installed (e.g. 
in test environments where\n            # the model is injected via mocking); skip loading silently.\n            pass\n    yield\n\napp = FastAPI(lifespan=lifespan)\n\n@app.middleware(\"http\")\nasync def cache_control(request: Request, call_next):\n    response = await call_next(request)\n    if response.status_code >= 400:\n        response.headers[\"Cache-Control\"] = \"no-store, no-cache, must-revalidate\"\n        response.headers[\"CDN-Cache-Control\"] = \"no-store\"\n        response.headers[\"Cloudflare-CDN-Cache-Control\"] = \"no-store\"\n        response.headers[\"Surrogate-Control\"] = \"no-store\"\n    return response\n\n\nasync def run_depth(jpeg_bytes: bytes) -> bytes:\n    loop = asyncio.get_running_loop()\n    try:\n        return await asyncio.wait_for(\n            loop.run_in_executor(None, _infer, jpeg_bytes),\n            timeout=15.0,\n        )\n    except asyncio.TimeoutError:\n        raise HTTPException(status_code=503, detail=\"Depth inference timed out\")\n\n\n@app.post(\"/depth\")\nasync def depth_endpoint(request: Request) -> Response:\n    body = await request.body()\n    if not body:\n        raise HTTPException(status_code=400, detail=\"Empty body\")\n    png_bytes = await run_depth(body)\n    return Response(content=png_bytes, media_type=\"image/png\")\n\n\n@app.post(\"/api/clone\")\nasync def clone_endpoint(request: Request) -> dict:\n    audio_bytes = await request.body()\n    if not audio_bytes:\n        raise HTTPException(status_code=400, detail=\"Empty body\")\n    voice_id = await clone_voice(audio_bytes)\n    return {\"voice_id\": voice_id}\n\n\nclass TTSRequest(BaseModel):\n    text: str\n    voice_id: str\n\n@app.post(\"/api/tts\")\nasync def tts_endpoint(req: TTSRequest) -> dict:\n    return await tts_with_timestamps(req.text, req.voice_id)\n\n\n@app.post(\"/api/transcribe\")\nasync def transcribe_endpoint(request: Request) -> dict:\n    audio_bytes = await request.body()\n    if not audio_bytes:\n        raise 
HTTPException(status_code=400, detail=\"Empty body\")\n    text = await transcribe_audio(audio_bytes)\n    return {\"text\": text}\n\n\nclass ChatMessage(BaseModel):\n    role: str\n    content: str\n\nclass ChatRequest(BaseModel):\n    messages: list[ChatMessage]\n    system: str | None = None\n\n@app.post(\"/api/chat\")\nasync def chat_endpoint(req: ChatRequest) -> dict:\n    kwargs = {}\n    if req.system:\n        kwargs[\"system\"] = req.system\n    text = await chat_fn(\n        [m.model_dump() for m in req.messages],\n        **kwargs,\n    )\n    return {\"text\": text}\n\n\n# Serve compiled frontend from dist/ — only if the directory exists\n_dist = Path(__file__).parent.parent / \"dist\"\nif _dist.exists():\n    app.mount(\"/\", StaticFiles(directory=str(_dist), html=True), name=\"static\")\n\n[Command: cat /home/workspace/mirrormirror/src/main.js]\nwindow.__BUILD_TS = Date.now();\n// src/main.js\nimport { runCapture, POSE_INSTRUCTIONS, POSE_SEQUENCE } from './capture.js';\nimport { buildMesh } from './reconstruction.js';\nimport { createRenderer } from './renderer.js';\nimport { createMotionController, bindDeviceOrientation, bindMouseDrag } from './motion.js';\nimport { saveScan, listScans, loadScan, deleteScan, loadVoiceData } from './storage.js';\nimport { startGazeTracking } from './gaze.js';\nimport { createDeformer } from './deform.js';\nimport { startCalibration }                    from './calibrate.js';\nimport { INCANTATION }                         from './phoneme.js';\nimport { createConversation }                  from './conversation.js';\nimport { synthesize, transcribe }              from './tts.js';\nimport { textToPhonemes }                      from './phoneme.js';\nimport { buildTimeline, createVisemePlayer }   from './viseme.js';\nimport { createRecorder } from './recorder.js';\n\n// ── Utilities ─────────────────────────────────────────────────────────────────\n\nfunction showScreen(id) {\n  
document.querySelectorAll('.screen').forEach(s => s.classList.remove('active'));\n  document.getElementById(id).classList.add('active');\n}\n\n\nfunction setInstruction(text) {\n  document.getElementById('pose-instruction').textContent = text;\n}\n\nconst ARROW_CHARS = { left: '◀', right: '▶', up: '▲', down: '▼' };\n\nfunction setPoseArrow(pose, countdown) {\n  const el = document.getElementById('pose-arrow');\n  if (countdown === 0 || !pose || pose === 'center') {\n    el.className = 'hidden'; return;\n  }\n  el.textContent = ARROW_CHARS[pose];\n  if (countdown === null) {\n    el.className = '';\n  } else {\n    el.className = '';\n    void el.offsetWidth;\n    el.classList.add('ticking');\n  }\n}\n\nfunction triggerFlash() {\n  const el = document.getElementById('flash-overlay');\n  el.classList.remove('flashing');\n  void el.offsetWidth;\n  el.classList.add('flashing');\n}\n\nfunction setProgress(pct) {\n  document.getElementById('progress-fill').style.width = `${pct}%`;\n}\n\nfunction formatDate(ts) {\n  const d = new Date(ts);\n  return d.toLocaleDateString('en-US', { month: 'short', day: 'numeric' }).toLowerCase();\n}\n\n// ── Module-scope viewer handles (set by processAndRender, cleared by teardownViewer) ──\nlet renderer, rafHandle, stopGaze = () => {}, deformer, currentBundles;\nlet captureInProgress = false;\nlet conversation = null;\nlet visemePlayer = null;\nlet currentScanId = null;\nlet currentMeshData = null;\nlet audioCtx = null;\nlet calibStop      = null;\nlet questionRec    = null;\nlet voiceState     = 'idle';\n\n// States: idle | recording-incantation | invoking | calibrated | recording-question | thinking | speaking\nfunction setVoiceUI(state) {\n  voiceState = state;\n  const btn     = document.getElementById('btn-invoke');\n  const portrait = document.getElementById('viewer-portrait');\n  const incant  = document.getElementById('incantation-text');\n\n  portrait.classList.remove('glow-record', 'glow-speak', 'glow-think');\n  btn.disabled = 
false;\n\n  switch (state) {\n    case 'idle':\n      btn.textContent = 'mirror, mirror ...';\n      incant.classList.add('hidden');\n      break;\n    case 'recording-incantation':\n      btn.textContent = '[ done ]';\n      portrait.classList.add('glow-record');\n      break;\n    case 'invoking':\n      btn.textContent = '[ invoking… ]';\n      btn.disabled = true;\n      incant.classList.add('hidden');\n      break;\n    case 'calibrated':\n      btn.textContent = 'mirror, mirror ...';\n      incant.classList.add('hidden');\n      break;\n    case 'recording-question':\n      btn.textContent = '[ done ]';\n      incant.classList.add('hidden');\n      portrait.classList.add('glow-speak');\n      break;\n    case 'thinking':\n      btn.textContent = '[ … ]';\n      btn.disabled = true;\n      incant.classList.add('hidden');\n      portrait.classList.add('glow-think');\n      break;\n    case 'speaking':\n      btn.textContent = '[ … ]';\n      btn.disabled = true;\n      incant.classList.add('hidden');\n      portrait.classList.add('glow-speak');\n      break;\n  }\n}\n\nfunction teardownViewer() {\n  cancelAnimationFrame(rafHandle);\n  if (deformer)  deformer.destroy();\n  stopGaze();\n  if (renderer) renderer.destroy();\n  renderer = rafHandle = deformer = currentBundles = undefined;\n  stopGaze = () => {};\n  // Reset voice state\n  if (calibStop) { calibStop(); calibStop = null; }\n  if (questionRec) { questionRec.stop().catch(() => {}); questionRec = null; }\n  if (visemePlayer) visemePlayer.stop();\n  visemePlayer = null;\n  conversation = null;\n  currentScanId = null;\n  currentMeshData = null;\n  window._calibOnFrame = null;\n  if (audioCtx) { audioCtx.close().catch(() => {}); audioCtx = null; }\n  setVoiceUI('idle');\n}\n\n// ── processAndRender ──────────────────────────────────────────────────────────\nasync function processAndRender(bundles, depthPromises, isSaved = false, scanId = null) {\n  teardownViewer();\n  showScreen('screen-processing');\n  
setProgress(20);\n  await new Promise(r => setTimeout(r, 50));\n\n  currentBundles = bundles;\n  const meshData = await buildMesh(bundles, depthPromises.center);\n  setProgress(70);\n  await new Promise(r => setTimeout(r, 50));\n\n  showScreen('screen-viewer');\n  const viewerEl = document.getElementById('screen-viewer');\n  viewerEl.classList.toggle('is-saved', isSaved);\n  if (isSaved) renderScanList(document.getElementById('viewer-scan-list'));\n  const canvas  = document.getElementById('webgpu-canvas');\n  canvas.width  = window.innerWidth;\n  canvas.height = window.innerHeight;\n\n  renderer = await createRenderer(canvas, meshData);\n  deformer = createDeformer(meshData, renderer);\n  currentMeshData = meshData;\n  currentScanId = scanId;\n  conversation = createConversation();\n  visemePlayer = null;\n  if (scanId) {\n    const vd = await loadVoiceData(scanId);\n    if (vd) {\n      visemePlayer = createVisemePlayer(vd.viseme_shapes, deformer);\n      setVoiceUI('calibrated');\n    } else {\n      setVoiceUI('idle');\n    }\n  } else {\n    setVoiceUI('idle');\n  }\n  const ctrl = createMotionController();\n\n  bindDeviceOrientation(ctrl);\n  bindMouseDrag(ctrl, canvas);\n\n  stopGaze = () => {};\n  startGazeTracking(\n    (pitch, yaw, landmarks) => {\n      ctrl.setInput(pitch, yaw, 'gaze');\n      if (landmarks) {\n        if (voiceState !== 'thinking' && voiceState !== 'speaking') {\n          deformer.applyLandmarks(landmarks);\n        }\n        if (window._calibOnFrame) window._calibOnFrame(landmarks);\n      }\n    },\n    document.getElementById('gaze-preview'),\n  ).then(stop => { stopGaze = stop; }).catch(console.error);\n\n  let lastTime = performance.now();\n  function frame() {\n    const now = performance.now();\n    const dt  = (now - lastTime) / 1000;\n    lastTime  = now;\n    ctrl.tick(dt);\n    renderer.draw(ctrl.getModelMatrix());\n    rafHandle = requestAnimationFrame(frame);\n  }\n  setProgress(100);\n  rafHandle = 
requestAnimationFrame(frame);\n}\n\n// ── Scan list rendering ───────────────────────────────────────────────────────\nconst PAGE_SIZE = 5;\nlet scanPage = 0;\n\nasync function renderScanList(container = document.getElementById('scan-list')) {\n  const all = await listScans();\n  const total = all.length;\n\n  if (total === 0) {\n    container.innerHTML = '';\n    return;\n  }\n\n  const slice = all.slice(scanPage * PAGE_SIZE, (scanPage + 1) * PAGE_SIZE);\n\n  container.innerHTML = '<div class=\"scan-list-label\">saved scans</div>';\n\n  for (const scan of slice) {\n    const item = document.createElement('div');\n    item.className = 'scan-item';\n    item.dataset.id = scan.id;\n\n    const nameEl = document.createElement('span');\n    nameEl.className = scan.name ? 'scan-item-name' : 'scan-item-name untitled';\n    nameEl.textContent = scan.name || 'untitled';\n\n    const dateEl = document.createElement('span');\n    dateEl.className = 'scan-item-date';\n    dateEl.textContent = formatDate(scan.ts);\n\n    const arrowEl = document.createElement('span');\n    arrowEl.className = 'scan-item-arrow';\n    arrowEl.textContent = '›';\n\n    const delEl = document.createElement('span');\n    delEl.className = 'scan-item-del';\n    delEl.dataset.del = scan.id;\n    delEl.textContent = '×';\n\n    item.append(nameEl, dateEl, arrowEl, delEl);\n    container.appendChild(item);\n  }\n\n  const hasPrev = scanPage > 0;\n  const hasNext = (scanPage + 1) * PAGE_SIZE < total;\n  if (total > PAGE_SIZE) {\n    const paginationHTML = `<div class=\"scan-pagination\">\n      <span id=\"scan-prev\" class=\"${hasPrev ? '' : 'disabled'}\">‹ prev</span>\n      <span id=\"scan-next\" class=\"${hasNext ? 
'' : 'disabled'}\">next ›</span>\n    </div>`;\n    container.insertAdjacentHTML('beforeend', paginationHTML);\n  }\n\n  // Attach event listeners\n  container.querySelectorAll('.scan-item').forEach(el => {\n    el.addEventListener('click', e => {\n      // Don't load if the delete button was clicked\n      if (e.target.dataset.del) return;\n      if (captureInProgress) return;\n      loadScanById(el.dataset.id);\n    });\n  });\n\n  container.querySelectorAll('.scan-item-del').forEach(el => {\n    el.addEventListener('click', async e => {\n      e.stopPropagation();\n      if (!confirm('Delete this scan?')) return;\n      await deleteScan(el.dataset.del);\n      // Stay on current page or step back if it becomes empty\n      const remaining = total - 1;\n      if (scanPage * PAGE_SIZE >= remaining) scanPage = Math.max(0, scanPage - 1);\n      renderScanList();\n    });\n  });\n\n  const prevBtn = container.querySelector('#scan-prev');\n  const nextBtn = container.querySelector('#scan-next');\n  if (prevBtn) prevBtn.addEventListener('click', () => { scanPage--; renderScanList(); });\n  if (nextBtn) nextBtn.addEventListener('click', () => { scanPage++; renderScanList(); });\n}\n\n// ── Load saved scan by id ─────────────────────────────────────────────────────\nasync function loadScanById(id) {\n  try {\n    const saved = await loadScan(id);\n    if (!saved) return;\n    await processAndRender(saved.bundles, saved.depthPromises, true, id);\n  } catch (err) {\n    teardownViewer();\n    console.error(err);\n    alert(`Error loading scan: ${err.message ?? 
err}`);\n    showScreen('screen-home');\n  }\n}\n\n// ── New scan — circle click ───────────────────────────────────────────────────\ndocument.getElementById('camera-frame-wrapper').addEventListener('click', async () => {\n  if (captureInProgress) return;\n  captureInProgress = true;\n  document.getElementById('camera-hint').style.display = 'none';\n\n  const videoEl       = document.getElementById('video');\n  const overlayCanvas = document.getElementById('guide-canvas');\n\n  try {\n    const { bundles, depthPromises } = await runCapture(videoEl, overlayCanvas, (poseIdx, countdown) => {\n      setInstruction(POSE_INSTRUCTIONS[POSE_SEQUENCE[poseIdx]]);\n      setPoseArrow(POSE_SEQUENCE[poseIdx], countdown);\n      if (countdown === 0) triggerFlash();\n    });\n\n    // Clear capture UI before transitioning\n    setInstruction('');\n\n    await processAndRender(bundles, depthPromises);\n\n  } catch (err) {\n    teardownViewer();\n    setInstruction('');\n    console.error(err);\n    alert(`Error: ${err.message ?? 
err}\\n\\nMake sure you are on HTTPS or localhost and using a supported browser (Chrome 113+).`);\n    showScreen('screen-home');\n  } finally {\n    captureInProgress = false;\n  }\n});\n\n// ── Home button — tear down viewer ───────────────────────────────────────────\ndocument.getElementById('btn-home').addEventListener('click', () => {\n  teardownViewer();\n  document.getElementById('screen-viewer').classList.remove('debug-visible');\n  showScreen('screen-home');\n  document.getElementById('video').play().catch(() => {});\n  renderScanList();  // refresh list in case a save just happened\n  document.getElementById('camera-hint').style.display = '';\n});\n\n// ── Save flow ─────────────────────────────────────────────────────────────────\nconst savePrompt = document.getElementById('save-prompt');\nconst saveNameInput = document.getElementById('save-name');\n\ndocument.getElementById('btn-save').addEventListener('click', () => {\n  saveNameInput.value = '';\n  savePrompt.classList.remove('hidden');\n  saveNameInput.focus();\n});\n\ndocument.getElementById('save-cancel').addEventListener('click', () => {\n  savePrompt.classList.add('hidden');\n});\n\nconst saveConfirmBtn = document.getElementById('save-confirm');\nsaveConfirmBtn.addEventListener('click', async () => {\n  if (!currentBundles || saveConfirmBtn.disabled) return;\n  const name = saveNameInput.value.trim();\n  saveConfirmBtn.disabled = true;\n  saveConfirmBtn.textContent = '[ saving… ]';\n  try {\n    await saveScan(currentBundles, name);\n    savePrompt.classList.add('hidden');\n    const btnSave = document.getElementById('btn-save');\n    btnSave.textContent = '[ saved ]';\n    setTimeout(() => { btnSave.textContent = '[ save ]'; }, 1500);\n    scanPage = 0;\n    renderScanList();\n  } catch (err) {\n    console.error('Save failed:', err);\n    alert(`Save failed: ${err.message ?? 
err}`);\n  } finally {\n    saveConfirmBtn.disabled = false;\n    saveConfirmBtn.textContent = '[ save ]';\n  }\n});\n\n// Also confirm on Enter in the name field\nsaveNameInput.addEventListener('keydown', e => {\n  if (e.key === 'Enter')  document.getElementById('save-confirm').click();\n  if (e.key === 'Escape') document.getElementById('save-cancel').click();\n});\n\n// ── Debug toggle (d key on viewer screen) ────────────────────────────────────\ndocument.addEventListener('keydown', e => {\n  if (e.key === 'd'\n    && document.getElementById('screen-viewer').classList.contains('active')\n    && e.target.tagName !== 'INPUT'\n    && e.target.tagName !== 'TEXTAREA') {\n    document.getElementById('screen-viewer').classList.toggle('debug-visible');\n  }\n});\n\n// ── Invoke button — voice state machine ───────────────────────────────────────\ndocument.getElementById('btn-invoke').addEventListener('click', async () => {\n  if (voiceState === 'recording-incantation') {\n    if (calibStop) { calibStop(); calibStop = null; }\n    return;\n  }\n\n  if (voiceState === 'recording-question') {\n    const rec = questionRec;\n    if (!rec) return; // VAD already handling\n    questionRec = null;\n    const blob = await rec.stop();\n    await handleQuestion(blob);\n    return;\n  }\n\n  if (voiceState === 'idle') {\n    if (!currentMeshData) return;\n\n    if (!currentScanId) {\n      if (!currentBundles) return;\n      try {\n        const id = await saveScan(currentBundles);\n        currentScanId = id;\n        const viewerEl = document.getElementById('screen-viewer');\n        viewerEl.classList.add('is-saved');\n        renderScanList(document.getElementById('viewer-scan-list'));\n      } catch (err) {\n        console.error('Auto-save failed:', err);\n        return;\n      }\n    }\n\n    const incant = document.getElementById('incantation-text');\n    incant.innerHTML = `<span id=\"incant-label\">to invoke the mirror, speak the 
incantation:</span><br>\"${INCANTATION}\"`;\n    incant.classList.remove('hidden');\n    setVoiceUI('recording-incantation');\n\n    const { onFrame, stop } = await startCalibration({\n      scanId:      currentScanId,\n      meshData:    currentMeshData,\n      onAutoStop:  () => setVoiceUI('invoking'),\n      onDone: () => {\n        calibStop = null;\n        window._calibOnFrame = null;\n        const capturedScanId   = currentScanId;\n        const capturedDeformer = deformer;\n        loadVoiceData(capturedScanId).then(vd => {\n          if (vd && deformer === capturedDeformer)\n            visemePlayer = createVisemePlayer(vd.viseme_shapes, deformer);\n        });\n        setVoiceUI('calibrated');\n      },\n      onError: err => {\n        calibStop = null;\n        window._calibOnFrame = null;\n        console.error('Calibration failed:', err);\n        setVoiceUI('idle');\n      },\n    });\n    calibStop = stop;\n    window._calibOnFrame = onFrame;\n    return;\n  }\n\n  if (voiceState === 'calibrated') {\n    if (!currentScanId || !visemePlayer) return;\n    setVoiceUI('recording-question');\n    questionRec = createRecorder({\n      onSilence: async () => {\n        if (voiceState !== 'recording-question' || !questionRec) return;\n        const rec = questionRec;\n        questionRec = null;\n        const blob = await rec.stop();\n        await handleQuestion(blob);\n      },\n      silenceMs: 1500,\n    });\n    await questionRec.start();\n  }\n});\n\nasync function handleQuestion(blob) {\n  const capturedScanId = currentScanId; // capture before any await\n  setVoiceUI('thinking');\n  try {\n    const voiceData    = await loadVoiceData(capturedScanId);\n    if (!voiceData) { setVoiceUI('calibrated'); return; }  // Fix 3: null guard\n    const question     = await transcribe(blob);\n    if (!question.trim()) { setVoiceUI('calibrated'); return; }\n    if (currentScanId !== capturedScanId) return; // torn down — bail silently\n    const responseText = 
await conversation.send(question);\n    if (!audioCtx) audioCtx = new AudioContext();\n    const { audioBuffer, alignment } = await synthesize(responseText, voiceData.voice_id, audioCtx);\n    const phonemes = textToPhonemes(responseText);\n    const timeline = buildTimeline(phonemes, alignment);\n    if (currentScanId !== capturedScanId) return; // torn down — bail silently\n    setVoiceUI('speaking');\n    visemePlayer.play(timeline, audioBuffer, audioCtx);\n    setTimeout(() => {\n      if (currentScanId === capturedScanId) setVoiceUI('calibrated');\n    }, audioBuffer.duration * 1000 + 200);\n  } catch (err) {\n    console.error('Question failed:', err);\n    if (currentScanId === capturedScanId) setVoiceUI('calibrated');\n  }\n}\n\n// ── Init: start camera + populate scan list ───────────────────────────────────\nrenderScanList();\n\n// Start camera feed on load so the circle shows live video immediately\ntry {\n  navigator.mediaDevices.getUserMedia({ video: { facingMode: 'user' }, audio: false })\n    .then(stream => { document.getElementById('video').srcObject = stream; })\n    .catch((err) => {\n      const h = document.getElementById('camera-hint');\n      if (h) h.textContent = 'Camera async: ' + err.name + ' – ' + err.message;\n    });\n} catch (err) {\n  const h = document.getElementById('camera-hint');\n  if (h) h.textContent = 'Camera sync: ' + (err.message || err);\n}\n// build 1774876912\n\n[Command: cat /home/workspace/mirrormirror/start.sh]\n#!/bin/bash\nsource /root/.zo_secrets 2>/dev/null\nexec uvicorn server.main:app --host 0.0.0.0 --port 8051","stderr":"","returncode":0},"tool_call_id":"toolu_01Ay3pNSqVhtmxaGAex4kTpc","metadata":null,"timestamp":"2026-03-31T12:29:41.163117+00:00","outcome":"success","part_kind":"tool-return"}],"kind":"request","metadata":{"zmid":"ks9m1nbe5sn1"}},{"parts":[{"tool_name":"run_parallel_cmds","args":"{\"cmd_list\": [\"cat /home/workspace/mirrormirror/server/depth.py\",\"cat 
/home/workspace/mirrormirror/server/voice.py\",\"cat /home/workspace/mirrormirror/server/chat.py\",\"cat /home/workspace/mirrormirror/src/capture.js\",\"cat /home/workspace/mirrormirror/src/reconstruction.js\",\"cat /home/workspace/mirrormirror/src/gaze.js\",\"cat /home/workspace/mirrormirror/src/deform.js\",\"cat /home/workspace/mirrormirror/src/phoneme.js\",\"cat /home/workspace/mirrormirror/src/viseme.js\",\"cat /home/workspace/mirrormirror/src/tts.js\"]}","tool_call_id":"toolu_01DmDuzqgGJGuxMs5DUkrkd8","id":null,"provider_name":null,"provider_details":null,"part_kind":"tool-call"}],"model_name":"zo:anthropic/claude-opus-4-6","input_tokens":45228,"output_tokens":252,"cache_read_tokens":37186,"cache_write_tokens":8041,"timestamp":"2026-03-31T12:29:43.871760+00:00","kind":"response","persona_id":null,"persona_name":null,"persona_image":null,"metadata":{"zmid":"hphn3wkil5mn"}},{"parts":[{"tool_name":"run_parallel_cmds","content":["(showing head+tail; 346 lines omitted — full output at /home/.z/workspaces/con_S88A6IG033XsHoZk/read_webpage/bash_cat~~20~~2fhome~~2fworkspace.txt, use read_file for all)","[Command: cat /home/workspace/mirrormirror/server/depth.py]\n# server/depth.py\nimport io\nimport numpy as np\nfrom PIL import Image\n\n_model = None\n_processor = None\n\n\ndef load_model() -> None:\n    \"\"\"Load Depth Anything v2 Small. Called once at server startup.\"\"\"\n    global _model, _processor\n    from transformers import AutoImageProcessor, AutoModelForDepthEstimation\n    _processor = AutoImageProcessor.from_pretrained(\n        \"depth-anything/Depth-Anything-V2-Small-hf\"\n    )\n    _model = AutoModelForDepthEstimation.from_pretrained(\n        \"depth-anything/Depth-Anything-V2-Small-hf\"\n    )\n    _model.eval()\n\n\ndef _infer(jpeg_bytes: bytes) -> bytes:\n    \"\"\"Run depth inference. 
Returns 256x256 8-bit grayscale PNG bytes.\"\"\"\n    import torch\n\n    image = Image.open(io.BytesIO(jpeg_bytes)).convert(\"RGB\")\n    inputs = _processor(images=image, return_tensors=\"pt\")\n\n    with torch.no_grad():\n        outputs = _model(**inputs)\n\n    depth = outputs.predicted_depth.squeeze().numpy()\n\n    # Normalise to [0, 255]; guard against flat depth maps\n    d_min, d_max = depth.min(), depth.max()\n    if d_max > d_min:\n        depth = (depth - d_min) / (d_max - d_min)\n    else:\n        depth = np.zeros_like(depth)\n\n    depth_uint8 = (depth * 255).astype(np.uint8)\n    img = Image.fromarray(depth_uint8, mode='L').resize((256, 256), Image.BILINEAR)\n\n    buf = io.BytesIO()\n    img.save(buf, format='PNG')\n    return buf.getvalue()\n\n[Command: cat /home/workspace/mirrormirror/server/voice.py]\nimport logging\nimport os\nimport time\n\nimport httpx\n\nELEVENLABS_BASE = \"https://api.elevenlabs.io/v1\"\nlog = logging.getLogger(\"voice\")\n\n\ndef _api_key() -> str:\n    key = os.environ.get(\"ELEVENLABS_API_KEY\", \"\")\n    if not key:\n        raise RuntimeError(\"ELEVENLABS_API_KEY not set\")\n    return key\n\n\nasync def clone_voice(audio_bytes: bytes, name: str = \"3dme-voice\") -> str:\n    \"\"\"Upload an audio sample and return the new ElevenLabs voice_id.\"\"\"\n    log.info(\"ElevenLabs clone_voice → POST /voices/add (%.1f KB)\", len(audio_bytes) / 1024)\n    t0 = time.monotonic()\n    async with httpx.AsyncClient() as client:\n        try:\n            r = await client.post(\n                f\"{ELEVENLABS_BASE}/voices/add\",\n                headers={\"xi-api-key\": _api_key()},\n                files={\"files\": (\"sample.webm\", audio_bytes, \"audio/webm\")},\n                data={\"name\": name},\n                timeout=30.0,\n            )\n            r.raise_for_status()\n            voice_id = r.json()[\"voice_id\"]\n            log.info(\"ElevenLabs clone_voice ✓ %d ms — voice_id=%s\", int((time.monotonic() - t0) * 
1000), voice_id)\n            return voice_id\n        except Exception as e:\n            log.error(\"ElevenLabs clone_voice ✗ %d ms — %s\", int((time.monotonic() - t0) * 1000), e)\n            raise\n\n\nasync def tts_with_timestamps(text: str, voice_id: str) -> dict:\n    \"\"\"Call ElevenLabs /with-timestamps; return {audio_base64, alignment}.\"\"\"\n    log.info(\"ElevenLabs tts → POST /text-to-speech/%s/with-timestamps (%d chars)\", voice_id, len(text))\n    t0 = time.monotonic()\n    async with httpx.AsyncClient() as client:\n        try:\n            r = await client.post(\n                f\"{ELEVENLABS_BASE}/text-to-speech/{voice_id}/with-timestamps\",\n                headers={\"xi-api-key\": _api_key(), \"Content-Type\": \"application/json\"},\n                json={\"text\": text, \"model_id\": \"eleven_multilingual_v2\"},\n                timeout=30.0,\n            )\n            r.raise_for_status()\n            log.info(\"ElevenLabs tts ✓ %d ms\", int((time.monotonic() - t0) * 1000))\n            return r.json()\n        except Exception as e:\n            log.error(\"ElevenLabs tts ✗ %d ms — %s\", int((time.monotonic() - t0) * 1000), e)\n            raise\n\n\nasync def transcribe_audio(audio_bytes: bytes) -> str:\n    \"\"\"Call ElevenLabs /speech-to-text; return transcript string.\"\"\"\n    log.info(\"ElevenLabs transcribe → POST /speech-to-text (%.1f KB)\", len(audio_bytes) / 1024)\n    t0 = time.monotonic()\n    async with httpx.AsyncClient() as client:\n        try:\n            r = await client.post(\n                f\"{ELEVENLABS_BASE}/speech-to-text\",\n                headers={\"xi-api-key\": _api_key()},\n                files={\"file\": (\"audio.webm\", audio_bytes, \"audio/webm\")},\n                data={\"model_id\": \"scribe_v1\"},\n                timeout=60.0,\n            )\n            r.raise_for_status()\n            text = r.json()[\"text\"]\n            log.info(\"ElevenLabs transcribe ✓ %d ms — %d chars\",\n                
     int((time.monotonic() - t0) * 1000), len(text))\n            return text\n        except Exception as e:\n            log.error(\"ElevenLabs transcribe ✗ %d ms — %s\",\n                      int((time.monotonic() - t0) * 1000), e)\n            raise\n\n[Command: cat /home/workspace/mirrormirror/server/chat.py]\nimport logging\nimport os\nimport time\n\nimport httpx\n\nlog = logging.getLogger(\"chat\")\n\nANTHROPIC_BASE = \"https://api.anthropic.com/v1\"\n\nDEFAULT_SYSTEM = (\n    \"You are the reflection in this mirror, speaking back to the face looking at you. \"\n    \"Respond in one short sentence. Be direct, honest, and a little surprising — \"\n    \"like something your own face might actually say. Not poetic, not cryptic. \"\n    \"Funny is good. Uncomfortable is good. Evasive is not.\"\n)\n\n\ndef _api_key() -> str:\n    key = os.environ.get(\"ANTHROPIC_API_KEY\", \"\")\n    if not key:\n        raise RuntimeError(\"ANTHROPIC_API_KEY not set\")\n    return key\n\n\nasync def chat(messages: list[dict], system: str = DEFAULT_SYSTEM) -> str:\n    \"\"\"Send messages to Claude, return the response text.\"\"\"\n    log.info(\"Anthropic chat → POST /messages (%d turn(s))\", len(messages))\n    t0 = time.monotonic()\n    async with httpx.AsyncClient() as client:\n        try:\n            r = await client.post(\n                f\"{ANTHROPIC_BASE}/messages\",\n                headers={\n                    \"x-api-key\": _api_key(),\n                    \"anthropic-version\": \"2023-06-01\",\n                    \"content-type\": \"application/json\",\n                },\n                json={\n                    \"model\": \"claude-opus-4-6\",\n                    \"max_tokens\": 80,\n                    \"system\": system,\n                    \"messages\": messages,\n                },\n                timeout=30.0,\n            )\n            r.raise_for_status()\n            text = r.json()[\"content\"][0][\"text\"]\n            log.info(\"Anthropic 
chat ✓ %d ms — %d chars\", int((time.monotonic() - t0) * 1000), len(text))\n            return text\n        except Exception as e:\n            log.error(\"Anthropic chat ✗ %d ms — %s\", int((time.monotonic() - t0) * 1000), e)\n            raise\n\n[Command: cat /home/workspace/mirrormirror/src/capture.js]\n// src/capture.js\nimport { FaceLandmarker, FilesetResolver } from '@mediapipe/tasks-vision';\n\nconst POSE_SEQUENCE = ['center', 'left', 'right', 'up', 'down'];\nconst YAW_THRESHOLD   = 0.35;  // ~20°\nconst PITCH_THRESHOLD = 0.25;  // ~14°\nconst ALIGN_HOLD_MS   = 600;   // must hold pose for this long before capture\n\n/**\n * Classify current pose from MediaPipe result.\n * Returns 'center' | 'left' | 'right' | 'up' | 'down' | null\n */\nexport async function imageDataToJpeg(imageData) {\n  const c = new OffscreenCanvas(imageData.width, imageData.height);\n  c.getContext('2d').putImageData(imageData, 0, 0);\n  return c.convertToBlob({ type: 'image/jpeg', quality: 0.85 });\n}\n\nexport function detectPose(result) {\n  if (!result.faceLandmarks?.length) return null;\n  const matrices = result.facialTransformationMatrixes;\n  if (!matrices?.length) return 'center';  // no matrix = assume center\n\n  const m = matrices[0].data;\n  // Extract yaw and pitch from column-major rotation matrix\n  // Row 2, col 0 = -sin(yaw); Row 1, col 2 = -sin(pitch)\n  const yaw   = Math.atan2(-m[2],  m[10]);\n  const pitch = Math.asin( Math.max(-1, Math.min(1, m[6])));\n\n  if (Math.abs(yaw) < YAW_THRESHOLD && Math.abs(pitch) < PITCH_THRESHOLD) return 'center';\n  if (yaw   >  YAW_THRESHOLD)   return 'left';\n  if (yaw   < -YAW_THRESHOLD)   return 'right';\n  if (pitch < -PITCH_THRESHOLD) return 'up';\n  if (pitch >  PITCH_THRESHOLD) return 'down';\n  return null;\n}\n\nexport function isPoseAligned(detectedPose, targetPose) {\n  return detectedPose === targetPose;\n}\n\n/**\n * Fire a depth POST request for a given JPEG blob promise.\n * Returns a Promise that resolves to a Blob 
(depth map) or null on failure.\n * @param {Promise<Blob>} jpegPromise\n */\nfunction fetchDepth(jpegPromise) {\n  const controller = new AbortController();\n  const depthTimeout = setTimeout(() => controller.abort(), 20_000);\n  return jpegPromise\n    .then(jpeg => fetch('/depth', {\n      method: 'POST',\n      headers: { 'Content-Type': 'image/jpeg' },\n      body: jpeg,\n      signal: controller.signal,\n    }))\n    .then(r => {\n      if (!r.ok) return null;   // 4xx/5xx → fallback\n      return r.blob();\n    })\n    .catch(() => null)           // network error / abort → fallback\n    .finally(() => clearTimeout(depthTimeout));\n}\n\n/**\n * Full capture flow. Returns Promise<{ center, left, right, up, down }>.\n * @param {HTMLVideoElement} videoEl\n * @param {HTMLCanvasElement} overlayCanvas\n * @param {(poseIndex: number, countdown: number|null) => void} onProgress\n */\nexport async function runCapture(videoEl, overlayCanvas, onProgress) {\n  // Camera access\n  const stream = await navigator.mediaDevices.getUserMedia({\n    video: { facingMode: 'user', width: { ideal: 640 }, height: { ideal: 480 } },\n  });\n  videoEl.srcObject = stream;\n  await new Promise(r => videoEl.addEventListener('loadeddata', r, { once: true }));\n\n  try {\n    // MediaPipe setup\n    const filesetResolver = await FilesetResolver.forVisionTasks(\n      'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm'\n    );\n    const landmarker = await FaceLandmarker.createFromOptions(filesetResolver, {\n      baseOptions: {\n        modelAssetPath: 'https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task',\n      },\n      outputFacialTransformationMatrixes: true,\n      runningMode: 'VIDEO',\n      numFaces: 1,\n    });\n\n    const bundles = {};\n    const ctx = overlayCanvas.getContext('2d');\n\n    for (let poseIdx = 0; poseIdx < POSE_SEQUENCE.length; poseIdx++) {\n      const targetPose = 
POSE_SEQUENCE[poseIdx];\n      onProgress(poseIdx, null);\n\n      await captureOnePose(\n        videoEl, overlayCanvas, ctx, landmarker,\n        targetPose, poseIdx, onProgress\n      ).then(bundle => {\n        bundles[targetPose] = bundle;\n      });\n    }\n\n    const depthPromises = {};\n    for (const pose of POSE_SEQUENCE) {\n      depthPromises[pose] = fetchDepth(imageDataToJpeg(bundles[pose].imageData));\n    }\n    return { bundles, depthPromises };\n  } finally {\n    stream.getTracks().forEach(t => t.stop());\n  }\n}\n\nasync function captureOnePose(videoEl, canvas, ctx, landmarker, pose, poseIdx, onProgress) {\n  return new Promise(resolve => {\n    let holdStart = null;\n    let captured = false;\n\n    // Set canvas size once (not per-frame)\n    canvas.width = canvas.offsetWidth;\n    canvas.height = canvas.offsetHeight;\n\n    function tick() {\n      if (videoEl.readyState < 2) { requestAnimationFrame(tick); return; }\n      if (captured) return;\n\n      const result = landmarker.detectForVideo(videoEl, performance.now());\n      const detected = detectPose(result);\n      const aligned  = isPoseAligned(detected, pose);\n\n      // Draw oval guide\n      drawGuide(ctx, canvas, aligned, pose);\n\n      if (aligned) {\n        if (!holdStart) holdStart = performance.now();\n        const held = performance.now() - holdStart;\n\n        if (held >= ALIGN_HOLD_MS) {\n          captured = true;\n          // Capture frame immediately on hold\n          const frameCanvas = document.createElement('canvas');\n          frameCanvas.width  = videoEl.videoWidth;\n          frameCanvas.height = videoEl.videoHeight;\n          const fctx = frameCanvas.getContext('2d');\n          fctx.scale(-1, 1);\n          fctx.drawImage(videoEl, -frameCanvas.width, 0);\n          const imageData = fctx.getImageData(0, 0, frameCanvas.width, frameCanvas.height);\n          onProgress(poseIdx, 0);  // signal capture (triggers flash)\n          resolve({\n            
landmarks:       result.faceLandmarks[0],\n            imageData,\n            transformMatrix: result.facialTransformationMatrixes?.[0]?.data ?? null,\n          });\n          return;\n        }\n      } else {\n        holdStart = null;\n      }\n\n      requestAnimationFrame(tick);\n    }\n\n    requestAnimationFrame(tick);\n  });\n}\n\nconst POSE_INSTRUCTIONS = {\n  center: '',\n  left:   'Turn left',\n  right:  'Turn right',\n  up:     'Look up',\n  down:   'Look down',\n};\n\nfunction drawGuide(ctx, canvas, aligned, pose) {\n  ctx.clearRect(0, 0, canvas.width, canvas.height);\n}\n\nexport { POSE_INSTRUCTIONS, POSE_SEQUENCE };\n\n[Command: cat /home/workspace/mirrormirror/src/reconstruction.js]\n// src/reconstruction.js\nimport { FACE_TRIANGLES, EYE_FILL_TRIANGLES, FACE_OVAL_INDICES } from './face-topology.js';\n\n// Invert a rigid-body 4×4 matrix (rotation + translation only, column-major).\n// inverse([R|t; 0|1]) = [R^T | -R^T·t; 0 | 1]\nfunction mat4InvertRigid(m) {\n  const out = new Float32Array(16);\n  // Transpose the 3×3 rotation block\n  out[0] = m[0]; out[1] = m[4]; out[2] = m[8];\n  out[4] = m[1]; out[5] = m[5]; out[6] = m[9];\n  out[8] = m[2]; out[9] = m[6]; out[10] = m[10];\n  // Translation: -R^T · t\n  out[12] = -(m[0]*m[12] + m[1]*m[13] + m[2]*m[14]);\n  out[13] = -(m[4]*m[12] + m[5]*m[13] + m[6]*m[14]);\n  out[14] = -(m[8]*m[12] + m[9]*m[13] + m[10]*m[14]);\n  out[15] = 1;\n  return out;\n}\n\n// Multiply two 4×4 column-major matrices: a · b\nfunction mat4Mul(a, b) {\n  const out = new Float32Array(16);\n  for (let col = 0; col < 4; col++) {\n    for (let row = 0; row < 4; row++) {\n      let s = 0;\n      for (let k = 0; k < 4; k++) s += a[row + k*4] * b[k + col*4];\n      out[row + col*4] = s;\n    }\n  }\n  return out;\n}\n\n// Transform a 3D point by a 4×4 column-major matrix; returns [x, y, z].\nfunction mat4MulPt(m, x, y, z) {\n  const w = m[3]*x + m[7]*y + m[11]*z + m[15];\n  return [\n    (m[0]*x + m[4]*y + m[8]*z  + m[12]) / w,\n    
(m[1]*x + m[5]*y + m[9]*z  + m[13]) / w,\n    (m[2]*x + m[6]*y + m[10]*z + m[14]) / w,\n  ];\n}\n\nconst ATLAS_SIZE = 512;\nconst ATLAS_W    = ATLAS_SIZE * 2;   // 1024 — full atlas width\nconst PANEL_W    = 128;              // width of each side panel in the atlas\nconst PANEL_X    = { left: 512, right: 640, up: 768, down: 896 };\nconst SIDE_POSES = ['left', 'right', 'up', 'down'];\n\n/**\n * Sample an 8-bit grayscale depth map at UV position (u, v) ∈ [0,1].\n * Returns depth in [0, 1]. Clamps out-of-bounds coordinates.\n * Reads the red channel (R=G=B in a grayscale PNG).\n */\nexport function sampleDepthMap(imageData, u, v) {\n  const x = Math.max(0, Math.min(Math.floor(u * imageData.width),  imageData.width  - 1));\n  const y = Math.max(0, Math.min(Math.floor(v * imageData.height), imageData.height - 1));\n  return imageData.data[(y * imageData.width + x) * 4] / 255;\n}\n\nasync function resolveDep\n\n... [346 lines truncated] ...\n\n];\n        const cw = dotsCanvas.width, ch = dotsCanvas.height;\n        dotsCtx.clearRect(0, 0, cw, ch);\n\n        // CSS video is mirrored (scaleX(-1)); x-flip dots to match.\n        allLandmarks(dotsCtx, lm, cw, ch,\n          x => (1 - x) * cw,\n          y => y * ch);\n\n        const dp = rawPitch - calibPitch, dy = rawYaw - calibYaw;\n        statusBar(dotsCtx,\n          calibrated\n            ? 
`p:${fmt(dp)} y:${fmt(dy)} ✓`\n            : `cal ${stableCount}/${CALIB_FRAMES}`,\n          calibrated, cw, ch);\n      }\n\n      if (calibrated) onPose(\n        -(rawPitch - calibPitch),\n        rawYaw - calibYaw,\n        result.faceLandmarks[0].slice(0, 474),\n      );\n\n    } catch (err) {\n      console.warn('[gaze] tick error:', err);\n    }\n    requestAnimationFrame(tick);\n  }\n\n  requestAnimationFrame(tick);\n\n  return function stop() {\n    stopped = true;\n    stream.getTracks().forEach(t => t.stop());\n    if (offScreenVideo && video.parentNode) video.parentNode.removeChild(video);\n    if (capCanvas.parentNode) capCanvas.parentNode.removeChild(capCanvas);\n    landmarker.close();\n  };\n}\n\n// ── Helpers ───────────────────────────────────────────────────────────────────\n\nfunction allLandmarks(ctx, lm, cw, ch, fx, fy) {\n  if (lm[1])   dot(ctx, fx(lm[1].x),   fy(lm[1].y),   'rgba(255,80,80,0.9)',    4);\n  for (const i of [33, 133, 263, 362]) {\n    if (lm[i]) dot(ctx, fx(lm[i].x),   fy(lm[i].y),   'rgba(255,220,50,0.85)', 3);\n  }\n  if (lm[468]) dot(ctx, fx(lm[468].x), fy(lm[468].y), '#00ff88', 5);\n  if (lm[473]) dot(ctx, fx(lm[473].x), fy(lm[473].y), '#00ff88', 5);\n}\n\nfunction dot(ctx, x, y, color, r) {\n  ctx.beginPath();\n  ctx.arc(x, y, r, 0, Math.PI * 2);\n  ctx.fillStyle = color;\n  ctx.fill();\n}\n\nfunction statusBar(ctx, statusLine, calibrated, cw, ch) {\n  const fs    = Math.max(10, Math.round(cw / 26));\n  const lineH = fs + 3;\n  const boxH  = lineH + 4;\n\n  ctx.font = `bold ${fs}px monospace`;\n  ctx.fillStyle = 'rgba(0,0,0,0.65)';\n  ctx.fillRect(0, ch - boxH, cw, boxH);\n  ctx.fillStyle = calibrated ? 'rgba(0,255,136,0.95)' : 'rgba(255,220,50,0.95)';\n  ctx.fillText(statusLine, 5, ch - boxH + lineH);\n}\n\nfunction fmt(n) { return (n >= 0 ? 
'+' : '') + n.toFixed(2); }\n\n[Command: cat /home/workspace/mirrormirror/src/deform.js]\n// src/deform.js\n\n// Anchor landmark indices — bony, expression-stable.\nconst ANCHOR_INDICES = [6, 351, 133, 362, 33, 263, 1, 10];\n\n/**\n * Solve a 2D similarity transform T (translate + rotate + uniform scale) such that\n * T(live_anchors) ≈ rest_anchors, in closed-form Procrustes.\n *\n * @param {Array<{x,y,z}>} live  — live anchor landmarks (any array, indexed by position)\n * @param {Array<{x,y,z}>} rest  — rest anchor landmarks (same length as live)\n * @returns {{ apply(x, y): [number, number] }}\n */\nexport function solveProcrustes2D(live, rest) {\n  const n = live.length;\n\n  // Centroids\n  let clx = 0, cly = 0, crx = 0, cry = 0;\n  for (let i = 0; i < n; i++) {\n    clx += live[i].x; cly += live[i].y;\n    crx += rest[i].x; cry += rest[i].y;\n  }\n  clx /= n; cly /= n; crx /= n; cry /= n;\n\n  // Cross-covariance and variance of live\n  let a = 0, b = 0, D = 0;\n  for (let i = 0; i < n; i++) {\n    const lx = live[i].x - clx, ly = live[i].y - cly;\n    const rx = rest[i].x - crx, ry = rest[i].y - cry;\n    a += lx * rx + ly * ry;\n    b += lx * ry - ly * rx;\n    D += lx * lx + ly * ly;  // sum of squared norms — NOT divided by n\n  }\n\n  const s = D > 0 ? Math.sqrt(a * a + b * b) / D : 1;\n  const theta = Math.atan2(b, a);\n  const cosT = Math.cos(theta), sinT = Math.sin(theta);\n\n  // Translation: t = rest_centroid − s·R·live_centroid\n  const tx = crx - s * (cosT * clx - sinT * cly);\n  const ty = cry - s * (sinT * clx + cosT * cly);\n\n  return {\n    apply(x, y) {\n      return [\n        s * (cosT * x - sinT * y) + tx,\n        s * (sinT * x + cosT * y) + ty,\n      ];\n    },\n  };\n}\n\n/**\n * Compute landmark-space deltas for a single frame of live landmarks vs rest.\n * Returns a Float32Array(N*3) — same shape as the deformer's smoothedLmDelta.\n * No smoothing applied. 
Used during calibration to extract viseme shapes.\n *\n * @param {Array<{x,y,z}>} restLandmarks\n * @param {Array<{x,y,z}>} liveLandmarks\n * @returns {Float32Array}\n */\nexport function computeLandmarkDelta(restLandmarks, liveLandmarks) {\n  const N = restLandmarks.length;\n  const delta = new Float32Array(N * 3);\n  if (!liveLandmarks || liveLandmarks.length < N) return delta;\n\n  const restAnchors = ANCHOR_INDICES.map(i => restLandmarks[i]);\n  const liveAnchors = ANCHOR_INDICES.map(i => liveLandmarks[i]);\n  if (liveAnchors.some(a => !a)) return delta;\n\n  const T = solveProcrustes2D(liveAnchors, restAnchors);\n\n  for (let i = 0; i < N; i++) {\n    const lm = liveLandmarks[i];\n    if (!lm) continue;\n    const [ax, ay] = T.apply(lm.x, lm.y);\n    const rl = restLandmarks[i];\n    const j  = i * 3;\n    delta[j]   = ax - rl.x;\n    delta[j+1] = ay - rl.y;\n    delta[j+2] = lm.z - rl.z;\n  }\n\n  return delta;\n}\n\nconst ALPHA = 0.35;  // EMA smoothing factor\n\n/**\n * createDeformer(meshData, renderer) → { applyLandmarks(liveLandmarks), destroy() }\n *\n * Each call to applyLandmarks:\n *   1. Guards against short arrays\n *   2. Extracts anchor subsets, solves Procrustes T\n *   3. Applies T (x,y) to all 474 live landmarks\n *   4. Computes landmark-space deltas vs restLandmarks\n *   5. EMA-smooths the deltas\n *   6. Converts to vertex-space, adds to restVertices → newPositions\n *   7. 
Calls renderer.updateVertices(newPositions)\n */\nexport function createDeformer(meshData, renderer) {\n  const { restVertices, restLandmarks, imgAspect, faceScale } = meshData;\n  const N = restLandmarks.length;           // 474\n  const smoothedLmDelta = new Float32Array(N * 3);  // init to 0 = rest pose\n  const newPositions    = new Float32Array(N * 3);\n\n  // Pre-extract rest anchors (subset of restLandmarks at ANCHOR_INDICES)\n  const restAnchors = ANCHOR_INDICES.map(idx => restLandmarks[idx]);\n\n  function applyLandmarks(live) {\n    if (!live || live.length < N) return;\n\n    // 1. Extract live anchors\n    const liveAnchors = ANCHOR_INDICES.map(idx => live[idx]);\n\n    // Guard against null/undefined anchor elements\n    if (liveAnchors.some(a => !a)) return;\n\n    // 2. Solve Procrustes (x,y only)\n    const T = solveProcrustes2D(liveAnchors, restAnchors);\n\n    // 3–5. Align, compute delta, EMA smooth\n    for (let i = 0; i < N; i++) {\n      const lm = live[i];\n      if (!lm) continue;\n      const [ax, ay] = T.apply(lm.x, lm.y);\n      const rl = restLandmarks[i];\n\n      const dlx = ax - rl.x;\n      const dly = ay - rl.y;\n      const dlz = lm.z - rl.z;   // z unaligned — minor noise, see spec\n\n      const j = i * 3;\n      smoothedLmDelta[j]   = ALPHA * dlx + (1 - ALPHA) * smoothedLmDelta[j];\n      smoothedLmDelta[j+1] = ALPHA * dly + (1 - ALPHA) * smoothedLmDelta[j+1];\n      smoothedLmDelta[j+2] = ALPHA * dlz + (1 - ALPHA) * smoothedLmDelta[j+2];\n    }\n\n    // 6. Convert to vertex space and write newPositions\n    for (let i = 0; i < N; i++) {\n      const j = i * 3;\n      newPositions[j]   = restVertices[j]   + smoothedLmDelta[j]   *  imgAspect * faceScale;\n      newPositions[j+1] = restVertices[j+1] + smoothedLmDelta[j+1] * -faceScale;\n      newPositions[j+2] = restVertices[j+2] + smoothedLmDelta[j+2] * -faceScale;\n    }\n\n    // 7. 
Push to renderer\n    renderer.updateVertices(newPositions);\n  }\n\n  function applyDelta(delta) {\n    // delta: Float32Array(N*3) landmark-space deltas — applied directly, no Procrustes.\n    for (let i = 0; i < N; i++) {\n      const j = i * 3;\n      newPositions[j]   = restVertices[j]   + delta[j]   *  imgAspect * faceScale;\n      newPositions[j+1] = restVertices[j+1] + delta[j+1] * -faceScale;\n      newPositions[j+2] = restVertices[j+2] + delta[j+2] * -faceScale;\n    }\n    renderer.updateVertices(newPositions);\n  }\n\n  return {\n    applyLandmarks,\n    applyDelta,\n    destroy() {},  // no external resources to release\n  };\n}\n\n[Command: cat /home/workspace/mirrormirror/src/phoneme.js]\n// Rule-based English text → phoneme sequence.\n// Returns Array<{phoneme: string, charStart: number, charEnd: number}>.\n// phoneme values match VISEME_MAP keys in viseme.js.\n\nexport const INCANTATION =\n  'Speak to me, spirit of glass — breathe through me, breathe and pass';\n\n// Digraph rules checked before single-char rules (order matters).\nconst DIGRAPHS = [\n  ['th', 'TH'], ['sh', 'SH'], ['ch', 'SH'], ['ph', 'FV'],\n  ['wh', 'W'],  ['gh', 'KG'], ['ck', 'KG'], ['ng', 'KG'],\n  ['qu', 'KG'], ['oo', 'W'],  ['ee', 'EE'], ['ea', 'EE'],\n  ['ai', 'AH'], ['ay', 'AH'], ['oa', 'OH'], ['ou', 'W'],\n  ['ow', 'OH'], ['ew', 'W'],  ['ie', 'EE'],\n];\n\nconst SINGLES = {\n  a: 'AH', e: 'EE', i: 'AH', o: 'OH', u: 'W',\n  m: 'MBP', b: 'MBP', p: 'MBP',\n  f: 'FV',  v: 'FV',\n  t: 'TDNL', d: 'TDNL', n: 'TDNL', l: 'TDNL',\n  s: 'SZ',  z: 'SZ',\n  r: 'R',\n  k: 'KG',  g: 'KG',  c: 'KG',  q: 'KG',  x: 'KG',\n  w: 'W',   y: 'EE',\n  h: 'REST', j: 'SH',\n};\n\nexport function textToPhonemes(text) {\n  const result = [];\n  const lower = text.toLowerCase();\n  let i = 0;\n\n  while (i < lower.length) {\n    const ch = lower[i];\n\n    // Spaces and punctuation → REST\n    if (/[\\s\\p{P}]/u.test(ch)) {\n      result.push({ phoneme: 'REST', charStart: i, charEnd: i + 1 });\n      
i++;\n      continue;\n    }\n\n    // Try digraph\n    let matched = false;\n    if (i + 1 < lower.length) {\n      const pair = lower.slice(i, i + 2);\n      for (const [digraph, phoneme] of DIGRAPHS) {\n        if (pair === digraph) {\n          result.push({ phoneme, charStart: i, charEnd: i + 2 });\n          i += 2;\n          matched = true;\n          break;\n        }\n      }\n    }\n\n    if (!matched) {\n      const phoneme = SINGLES[ch] ?? 'REST';\n      result.push({ phoneme, charStart: i, charEnd: i + 1 });\n      i++;\n    }\n  }\n\n  return result;\n}\n\n[Command: cat /home/workspace/mirrormirror/src/viseme.js]\nexport const VISEME_NAMES = [\n  'rest', 'mbp', 'fv', 'th', 'tdnl', 'sz', 'sh', 'r', 'kg', 'w', 'oh', 'ah', 'ee',\n];\n\n// Maps phoneme token → viseme name\nexport const VISEME_MAP = {\n  REST: 'rest',\n  MBP:  'mbp',\n  FV:   'fv',\n  TH:   'th',\n  TDNL: 'tdnl',\n  SZ:   'sz',\n  SH:   'sh',\n  R:    'r',\n  KG:   'kg',\n  W:    'w',\n  OH:   'oh',\n  AH:   'ah',\n  EE:   'ee',\n};\n\n/**\n * Merge a phoneme array with ElevenLabs character-level timestamps.\n *\n * @param {Array<{phoneme, charStart, charEnd}>} phonemes - from textToPhonemes()\n * @param {{characters, character_start_times_seconds, character_end_times_seconds}} alignment\n * @returns {Array<{viseme: string, startSec: number, endSec: number}>}\n */\nexport function buildTimeline(phonemes, alignment) {\n  const { character_start_times_seconds: starts, character_end_times_seconds: ends } = alignment;\n  const timeline = [];\n\n  for (const { phoneme, charStart, charEnd } of phonemes) {\n    if (charStart >= starts.length || charEnd - 1 >= ends.length) continue;\n    const startSec = starts[charStart];\n    const endSec   = ends[charEnd - 1];\n    const viseme   = VISEME_MAP[phoneme] ?? 
'rest';\n    timeline.push({ viseme, startSec, endSec });\n  }\n\n  return timeline;\n}\n\n/**\n * Create a viseme playback controller that drives deformer.applyDelta() in sync with audio.\n *\n * @param {Object.<string, Float32Array>} visemeShapes — from calibration (IDB)\n * @param {object} deformer — must have applyDelta(delta: Float32Array)\n * @returns {{ play(timeline, audioBuffer, audioCtx): void, stop(): void }}\n */\nexport function createVisemePlayer(visemeShapes, deformer) {\n  let rafHandle = null;\n  let source = null;\n  let startTime = 0;\n\n  const restShape = visemeShapes['rest'] ?? new Float32Array(\n    (Object.values(visemeShapes)[0]?.length ?? 0)\n  );\n  const N3 = restShape.length;\n  const blended = new Float32Array(N3);\n\n  function findViseme(timeline, timeSec) {\n    for (let i = timeline.length - 1; i >= 0; i--) {\n      if (timeSec >= timeline[i].startSec && timeSec <= timeline[i].endSec)\n        return timeline[i].viseme;\n    }\n    return 'rest';\n  }\n\n  function play(timeline, audioBuffer, audioCtx) {\n    stop();\n    blended.fill(0);  // ← reset to rest state before each play\n    if (!N3) return; // visemeShapes was empty, nothing to animate\n    startTime = audioCtx.currentTime;\n    source = audioCtx.createBufferSource();\n    source.buffer = audioBuffer;\n    source.connect(audioCtx.destination);\n    source.start();\n\n    function frame() {\n      const elapsed = audioCtx.currentTime - startTime;\n      const visemeName = findViseme(timeline, elapsed);\n      const shape = visemeShapes[visemeName] ?? 
restShape;\n\n      // Lerp from current blend toward target shape (t=0.7: snappy but smooth)\n      const t = 0.7;\n      for (let i = 0; i < N3; i++) {\n        blended[i] = blended[i] * (1 - t) + shape[i] * t;\n      }\n      deformer.applyDelta(blended);\n\n      if (elapsed < audioBuffer.duration + 0.1) {\n        rafHandle = requestAnimationFrame(frame);\n      } else {\n        stop();\n      }\n    }\n\n    rafHandle = requestAnimationFrame(frame);\n  }\n\n  function stop() {\n    if (rafHandle) { cancelAnimationFrame(rafHandle); rafHandle = null; }\n    if (source) { try { source.stop(); } catch {} source = null; }\n    // Fade back to rest\n    blended.fill(0);\n    deformer.applyDelta(blended);\n  }\n\n  return { play, stop };\n}\n\n[Command: cat /home/workspace/mirrormirror/src/tts.js]\n/**\n * Clone a voice from an audio blob.\n * @param {Blob} audioBlob — WebM or WAV recording from MediaRecorder\n * @returns {Promise<string>} voice_id\n */\nexport async function cloneVoice(audioBlob) {\n  const r = await fetch('/api/clone', {\n    method: 'POST',\n    headers: { 'Content-Type': audioBlob.type || 'audio/webm' },\n    body: audioBlob,\n  });\n  if (!r.ok) throw new Error(`Clone failed: ${r.status}`);\n  const { voice_id } = await r.json();\n  return voice_id;\n}\n\n/**\n * Synthesize speech with character-level timestamps.\n * @param {string} text\n * @param {string} voiceId\n * @param {AudioContext} audioCtx\n * @returns {Promise<{audioBuffer: AudioBuffer, alignment: object}>}\n */\nexport async function synthesize(text, voiceId, audioCtx) {\n  const r = await fetch('/api/tts', {\n    method: 'POST',\n    headers: { 'Content-Type': 'application/json' },\n    body: JSON.stringify({ text, voice_id: voiceId }),\n  });\n  if (!r.ok) throw new Error(`TTS failed: ${r.status}`);\n  const { audio_base64, alignment } = await r.json();\n\n  const binary = atob(audio_base64);\n  const bytes   = new Uint8Array(binary.length);\n  for (let i = 0; i < binary.length; 
i++) bytes[i] = binary.charCodeAt(i);\n  const audioBuffer = await audioCtx.decodeAudioData(bytes.buffer);\n\n  return { audioBuffer, alignment };\n}\n\n/**\n * Transcribe an audio recording via the backend STT proxy.\n * @param {Blob} audioBlob\n * @returns {Promise<string>} transcribed text\n */\nexport async function transcribe(audioBlob) {\n  const r = await fetch('/api/transcribe', {\n    method: 'POST',\n    headers: { 'Content-Type': audioBlob.type || 'audio/webm' },\n    body: audioBlob,\n  });\n  if (!r.ok) throw new Error(`Transcribe failed: ${r.status}`);\n  const { text } = await r.json();\n  return text;\n}"],"tool_call_id":"toolu_01DmDuzqgGJGuxMs5DUkrkd8","metadata":null,"timestamp":"2026-03-31T12:29:49.256222+00:00","outcome":"success","part_kind":"tool-return"}],"kind":"request","metadata":{"zmid":"n23or72ru6uv"}},{"parts":[{"tool_name":"run_parallel_cmds","args":"{\"cmd_list\": [\"cat /home/workspace/mirrormirror/src/renderer.js\",\"cat /home/workspace/mirrormirror/src/calibrate.js\",\"cat /home/workspace/mirrormirror/src/conversation.js\",\"cat /home/workspace/mirrormirror/src/recorder.js\",\"cat /home/workspace/mirrormirror/src/storage.js\",\"cat /home/workspace/mirrormirror/src/motion.js\"]}","tool_call_id":"toolu_011jp5ncMeV2uDoxqJrjDYtw","id":null,"provider_name":null,"provider_details":null,"part_kind":"tool-call"}],"model_name":"zo:anthropic/claude-opus-4-6","input_tokens":56613,"output_tokens":172,"cache_read_tokens":45227,"cache_write_tokens":11385,"timestamp":"2026-03-31T12:29:52.752027+00:00","kind":"response","persona_id":null,"persona_name":null,"persona_image":null,"metadata":{"zmid":"olcmdhdz9715"}},{"parts":[{"tool_name":"run_parallel_cmds","content":{"stdout":"[Command: cat /home/workspace/mirrormirror/src/renderer.js]\n// src/renderer.js\nimport vertexShaderSource   from './shaders/face.vert.wgsl?raw';\nimport fragmentShaderSource from './shaders/face.frag.wgsl?raw';\n\n/**\n * Initialize WebGPU renderer and return a draw 
function.\n * @param {HTMLCanvasElement} canvas\n * @param {{ vertices, normals, uvs, indices, textureAtlas }} meshData\n * @returns {{ draw(modelMatrix: Float32Array): void, updateVertices(positions: Float32Array): void, destroy(): void }}\n */\nexport async function createRenderer(canvas, meshData) {\n  if (!navigator.gpu) throw new Error('WebGPU is not supported in this browser.');\n\n  const adapter = await navigator.gpu.requestAdapter();\n  if (!adapter) throw new Error('No WebGPU adapter found.');\n  const device = await adapter.requestDevice();\n\n  const context = canvas.getContext('webgpu');\n  const format  = navigator.gpu.getPreferredCanvasFormat();\n  context.configure({ device, format, alphaMode: 'premultiplied' });\n\n  // --- Interleave vertex data: [pos(3), normal(3), uv(2)] ---\n  const N = meshData.vertices.length / 3;\n  const storedUVs        = meshData.uvs.slice();  // copy — see Fix 2\n  const _workNormals     = new Float32Array(N * 3);\n  const _workInterleaved = new Float32Array(N * 8);\n  const interleaved = new Float32Array(N * 8);\n  for (let i = 0; i < N; i++) {\n    interleaved[i*8+0] = meshData.vertices[i*3+0];\n    interleaved[i*8+1] = meshData.vertices[i*3+1];\n    interleaved[i*8+2] = meshData.vertices[i*3+2];\n    interleaved[i*8+3] = meshData.normals[i*3+0];\n    interleaved[i*8+4] = meshData.normals[i*3+1];\n    interleaved[i*8+5] = meshData.normals[i*3+2];\n    interleaved[i*8+6] = meshData.uvs[i*2+0];\n    interleaved[i*8+7] = meshData.uvs[i*2+1];\n  }\n\n  const vertexBuffer = device.createBuffer({\n    size:  interleaved.byteLength,\n    usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,\n  });\n  device.queue.writeBuffer(vertexBuffer, 0, interleaved);\n\n  const indexBuffer = device.createBuffer({\n    size:  meshData.indices.byteLength,\n    usage: GPUBufferUsage.INDEX | GPUBufferUsage.COPY_DST,\n  });\n  device.queue.writeBuffer(indexBuffer, 0, meshData.indices);\n\n  const indexFormat = meshData.indices instanceof 
Uint32Array ? 'uint32' : 'uint16';\n\n  // --- Depth texture ---\n  let depthTexture = device.createTexture({\n    size:   [canvas.width, canvas.height],\n    format: 'depth24plus',\n    usage:  GPUTextureUsage.RENDER_ATTACHMENT,\n  });\n\n  // --- Face texture from atlas ---\n  const imageBitmap = await createImageBitmap(meshData.textureAtlas);\n  const gpuTexture = device.createTexture({\n    size:   [imageBitmap.width, imageBitmap.height, 1],\n    format: 'rgba8unorm',\n    usage:  GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT,\n  });\n  device.queue.copyExternalImageToTexture(\n    { source: imageBitmap },\n    { texture: gpuTexture },\n    [imageBitmap.width, imageBitmap.height],\n  );\n  imageBitmap.close();  // release CPU-side decoded image data\n  const sampler = device.createSampler({ magFilter: 'linear', minFilter: 'linear' });\n\n  // --- Uniform buffer (128 bytes: MVP + Model) ---\n  const uniformBuffer = device.createBuffer({\n    size:  128,\n    usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,\n  });\n\n  // --- Pipeline ---\n  const pipeline = device.createRenderPipeline({\n    layout: 'auto',\n    vertex: {\n      module:     device.createShaderModule({ code: vertexShaderSource }),\n      entryPoint: 'vs_main',\n      buffers: [{\n        arrayStride: 32,\n        attributes: [\n          { shaderLocation: 0, offset: 0,  format: 'float32x3' },\n          { shaderLocation: 1, offset: 12, format: 'float32x3' },\n          { shaderLocation: 2, offset: 24, format: 'float32x2' },\n        ],\n      }],\n    },\n    fragment: {\n      module:     device.createShaderModule({ code: fragmentShaderSource }),\n      entryPoint: 'fs_main',\n      targets: [{ format }],\n    },\n    primitive:    { topology: 'triangle-list', cullMode: 'none' },\n    depthStencil: { format: 'depth24plus', depthWriteEnabled: true, depthCompare: 'less' },\n  });\n\n  const bindGroup = device.createBindGroup({\n    layout: 
pipeline.getBindGroupLayout(0),\n    entries: [\n      { binding: 0, resource: { buffer: uniformBuffer } },\n      { binding: 1, resource: gpuTexture.createView() },\n      { binding: 2, resource: sampler },\n    ],\n  });\n\n  // --- Matrices ---\n  function perspective(fovY, aspect, near, far) {\n    const f = 1 / Math.tan(fovY / 2);\n    return new Float32Array([\n      f/aspect, 0, 0, 0,\n      0, f, 0, 0,\n      0, 0, far/(near-far), -1,\n      0, 0, (near*far)/(near-far), 0,\n    ]);\n  }\n\n  function lookAt(eye, center, up) {\n    const z = norm(sub(eye, center));\n    const x = norm(cross(up, z));\n    const y = cross(z, x);\n    return new Float32Array([\n      x[0], y[0], z[0], 0,\n      x[1], y[1], z[1], 0,\n      x[2], y[2], z[2], 0,\n      -dot(x,eye), -dot(y,eye), -dot(z,eye), 1,\n    ]);\n  }\n\n  function mul4(a, b) {\n    const out = new Float32Array(16);\n    for (let col = 0; col < 4; col++)\n      for (let row = 0; row < 4; row++)\n        for (let k = 0; k < 4; k++)\n          out[col*4+row] += a[k*4+row] * b[col*4+k];\n    return out;\n  }\n\n  const sub   = (a,b) => [a[0]-b[0], a[1]-b[1], a[2]-b[2]];\n  const dot   = (a,b) => a[0]*b[0]+a[1]*b[1]+a[2]*b[2];\n  const cross = (a,b) => [a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0]];\n  const norm  = v => { const l = Math.sqrt(dot(v,v)); return [v[0]/l, v[1]/l, v[2]/l]; };\n\n  let proj = perspective(Math.PI / 3, canvas.width / canvas.height, 0.1, 100);\n  const view = lookAt([0, 0, 2.5], [0, 0, 0], [0, 1, 0]);\n\n  // Recreate depth texture and projection on canvas resize\n  const resizeObserver = new ResizeObserver(() => {\n    const w = canvas.clientWidth  || canvas.width;\n    const h = canvas.clientHeight || canvas.height;\n    canvas.width  = w;\n    canvas.height = h;\n    depthTexture.destroy();\n    depthTexture = device.createTexture({\n      size:   [canvas.width, canvas.height],\n      format: 'depth24plus',\n      usage:  GPUTextureUsage.RENDER_ATTACHMENT,\n    });\n 
   proj = perspective(Math.PI / 3, canvas.width / canvas.height, 0.1, 100);\n  });\n  resizeObserver.observe(canvas);\n\n  function updateVertices(positions) {\n    // positions: Float32Array of length N*3\n    // Recompute normals into _workNormals\n    _workNormals.fill(0);\n    for (let t = 0; t < meshData.indices.length; t += 3) {\n      const a = meshData.indices[t], b = meshData.indices[t+1], c = meshData.indices[t+2];\n      const ax = positions[a*3], ay = positions[a*3+1], az = positions[a*3+2];\n      const bx = positions[b*3], by = positions[b*3+1], bz = positions[b*3+2];\n      const cx = positions[c*3], cy = positions[c*3+1], cz = positions[c*3+2];\n      const e1x = bx-ax, e1y = by-ay, e1z = bz-az;\n      const e2x = cx-ax, e2y = cy-ay, e2z = cz-az;\n      const nx = e1y*e2z - e1z*e2y;\n      const ny = e1z*e2x - e1x*e2z;\n      const nz = e1x*e2y - e1y*e2x;\n      for (const i of [a, b, c]) {\n        _workNormals[i*3]   += nx;\n        _workNormals[i*3+1] += ny;\n        _workNormals[i*3+2] += nz;\n      }\n    }\n    for (let i = 0; i < N; i++) {\n      const len = Math.sqrt(_workNormals[i*3]**2 + _workNormals[i*3+1]**2 + _workNormals[i*3+2]**2);\n      if (len > 0) { _workNormals[i*3] /= len; _workNormals[i*3+1] /= len; _workNormals[i*3+2] /= len; }\n      else _workNormals[i*3+2] = 1;\n    }\n    // Re-interleave into _workInterleaved\n    for (let i = 0; i < N; i++) {\n      _workInterleaved[i*8+0] = positions[i*3+0];\n      _workInterleaved[i*8+1] = positions[i*3+1];\n      _workInterleaved[i*8+2] = positions[i*3+2];\n      _workInterleaved[i*8+3] = _workNormals[i*3+0];\n      _workInterleaved[i*8+4] = _workNormals[i*3+1];\n      _workInterleaved[i*8+5] = _workNormals[i*3+2];\n      _workInterleaved[i*8+6] = storedUVs[i*2+0];\n      _workInterleaved[i*8+7] = storedUVs[i*2+1];\n    }\n    device.queue.writeBuffer(vertexBuffer, 0, _workInterleaved);\n  }\n\n  function draw(modelMatrix) {\n    const mvp = mul4(mul4(proj, view), modelMatrix);\n    
device.queue.writeBuffer(uniformBuffer, 0,  mvp);\n    device.queue.writeBuffer(uniformBuffer, 64, modelMatrix);\n\n    const encoder = device.createCommandEncoder();\n    const pass = encoder.beginRenderPass({\n      colorAttachments: [{\n        view:       context.getCurrentTexture().createView(),\n        loadOp:     'clear',\n        clearValue: { r: 0.04, g: 0.04, b: 0.04, a: 1 },\n        storeOp:    'store',\n      }],\n      depthStencilAttachment: {\n        view:              depthTexture.createView(),\n        depthLoadOp:       'clear',\n        depthClearValue:   1.0,\n        depthStoreOp:      'store',\n      },\n    });\n    pass.setPipeline(pipeline);\n    pass.setBindGroup(0, bindGroup);\n    pass.setVertexBuffer(0, vertexBuffer);\n    pass.setIndexBuffer(indexBuffer, indexFormat);\n    pass.drawIndexed(meshData.indices.length);\n    pass.end();\n    device.queue.submit([encoder.finish()]);\n  }\n\n  function destroy() {\n    resizeObserver.disconnect();\n    vertexBuffer.destroy();\n    indexBuffer.destroy();\n    uniformBuffer.destroy();\n    gpuTexture.destroy();\n    depthTexture.destroy();\n  }\n\n  return { draw, updateVertices, destroy };\n}\n\n[Command: cat /home/workspace/mirrormirror/src/calibrate.js]\nimport { textToPhonemes, INCANTATION } from './phoneme.js';\nimport { VISEME_NAMES, VISEME_MAP }    from './viseme.js';\nimport { computeLandmarkDelta }         from './deform.js';\nimport { cloneVoice }                   from './tts.js';\nimport { saveVoiceData }                from './storage.js';\nimport { createRecorder }               from './recorder.js';\n\nfunction makeFallbackShapes(N) {\n  const shapes = {};\n  for (const name of VISEME_NAMES) shapes[name] = new Float32Array(N * 3);\n  return shapes;\n}\n\n/**\n * Start calibration: voice recording (VAD auto-stop) + landmark capture.\n *\n * @param {string}   scanId      — IDB scan id to attach voice data to\n * @param {object}   meshData    — from buildMesh(); needs 
.restLandmarks\n * @param {function} onAutoStop  — called synchronously when stop is triggered (before async work)\n * @param {function} onDone      — called on successful save\n * @param {function} onError     — called with Error on failure\n * @returns {{ onFrame: function, stop: function }}\n */\nexport async function startCalibration({ scanId, meshData, onAutoStop, onDone, onError }) {\n  const { restLandmarks } = meshData;\n  const N = restLandmarks.length;\n  let stopped = false;\n\n  const startTime = performance.now();\n  const frames = [];\n\n  function onFrame(liveLandmarks) {\n    if (stopped) return;\n    const timeSec = (performance.now() - startTime) / 1000;\n    const delta = computeLandmarkDelta(restLandmarks, liveLandmarks);\n    frames.push({ timeSec, delta });\n  }\n\n  async function doStop() {\n    if (stopped) return;\n    stopped = true;\n    onAutoStop?.();   // synchronously signals caller (e.g. to show \"invoking…\")\n    const durationSec = (performance.now() - startTime) / 1000;\n    const audioBlob = await rec.stop();\n\n    try {\n      const [voiceId, visemeShapes] = await Promise.all([\n        cloneVoice(audioBlob),\n        extractVisemeShapes(frames, durationSec, N),\n      ]);\n      await saveVoiceData(scanId, voiceId, visemeShapes);\n      onDone?.();\n    } catch (err) {\n      onError?.(err);\n    }\n  }\n\n  const rec = createRecorder({\n    onSilence: () => doStop().catch(onError),\n    silenceMs: 1500,\n  });\n  await rec.start();\n\n  return { onFrame, stop: doStop };\n}\n\nfunction extractVisemeShapes(frames, durationSec, N) {\n  if (!frames.length) return makeFallbackShapes(N);\n\n  const phonemes = textToPhonemes(INCANTATION);\n  const totalPhonemes = phonemes.length;\n  const secPerPhoneme = durationSec / totalPhonemes;\n  const visemeShapes = {};\n  const fallbacks = makeFallbackShapes(N);\n\n  for (const visemeName of VISEME_NAMES) {\n    const collected = [];\n    phonemes.forEach((p, i) => {\n      if 
((VISEME_MAP[p.phoneme] ?? 'rest') !== visemeName) return;\n      const startSec = i * secPerPhoneme;\n      const endSec   = startSec + secPerPhoneme;\n      for (const f of frames) {\n        if (f.timeSec >= startSec && f.timeSec < endSec) collected.push(f.delta);\n      }\n    });\n\n    if (!collected.length) {\n      visemeShapes[visemeName] = fallbacks[visemeName];\n      continue;\n    }\n    const mid = Math.floor(collected.length / 2);\n    visemeShapes[visemeName] = collected[mid].slice();\n  }\n\n  return visemeShapes;\n}\n\n[Command: cat /home/workspace/mirrormirror/src/conversation.js]\nexport function createConversation() {\n  let history = [];\n\n  async function send(userText) {\n    const response = await fetch('/api/chat', {\n      method: 'POST',\n      headers: { 'Content-Type': 'application/json' },\n      body: JSON.stringify({ messages: [...history, { role: 'user', content: userText }] }),\n    });\n\n    if (!response.ok) {\n      throw new Error(`Chat request failed with status ${response.status}`);\n    }\n\n    const data = await response.json();\n    history.push({ role: 'user', content: userText });\n    history.push({ role: 'assistant', content: data.text });\n    return data.text;\n  }\n\n  function reset() {\n    history = [];\n  }\n\n  return { send, reset };\n}\n\n[Command: cat /home/workspace/mirrormirror/src/recorder.js]\n/**\n * createRecorder — MediaRecorder with Web Audio VAD (voice activity detection).\n *\n * @param {object}   opts\n * @param {function} [opts.onSilence]        — fires after silenceMs of quiet following speech\n * @param {number}   [opts.silenceMs=1500]   — ms of sustained quiet before onSilence\n * @param {number}   [opts.silenceThreshold=0.01] — RMS threshold for speech vs silence\n * @returns {{ start: () => Promise<void>, stop: () => Promise<Blob> }}\n */\nexport function createRecorder({\n  onSilence,\n  silenceMs = 1500,\n  silenceThreshold = 0.01,\n} = {}) {\n  let mediaRecorder = null;\n  let chunks  
      = [];\n  let stream        = null;\n  let audioCtx      = null;\n  let rafId         = null;\n  let silenceTimer  = null;\n  let hasSpeech     = false;\n\n  async function start() {\n    stream   = await navigator.mediaDevices.getUserMedia({ audio: true, video: false });\n    audioCtx = new AudioContext();\n\n    const source   = audioCtx.createMediaStreamSource(stream);\n    const analyser = audioCtx.createAnalyser();\n    analyser.fftSize = 512;\n    source.connect(analyser);\n    const buf = new Float32Array(analyser.fftSize);\n\n    function tick() {\n      analyser.getFloatTimeDomainData(buf);\n      let sum = 0;\n      for (let i = 0; i < buf.length; i++) sum += buf[i] * buf[i];\n      const rms = Math.sqrt(sum / buf.length);\n\n      if (rms > silenceThreshold) {\n        hasSpeech = true;\n        if (silenceTimer) { clearTimeout(silenceTimer); silenceTimer = null; }\n      } else if (hasSpeech && !silenceTimer) {\n        silenceTimer = setTimeout(() => onSilence?.(), silenceMs);\n      }\n\n      rafId = requestAnimationFrame(tick);\n    }\n    tick();\n\n    mediaRecorder = new MediaRecorder(stream);\n    chunks = [];\n    mediaRecorder.ondataavailable = e => { if (e.data.size) chunks.push(e.data); };\n    mediaRecorder.start(100);\n  }\n\n  async function stop() {\n    cancelAnimationFrame(rafId);\n    if (silenceTimer) { clearTimeout(silenceTimer); silenceTimer = null; }\n\n    let blob;\n    if (mediaRecorder && mediaRecorder.state !== 'inactive') {\n      blob = await new Promise(resolve => {\n        const finish = () => resolve(new Blob(chunks, { type: 'audio/webm' }));\n        mediaRecorder.onstop = finish;\n        mediaRecorder.stop();\n        // If stop() transitions synchronously to inactive (e.g. 
in test mocks),\n        // resolve immediately so we don't hang waiting for the onstop callback.\n        if (mediaRecorder.state === 'inactive') finish();\n      });\n    } else {\n      blob = new Blob(chunks, { type: 'audio/webm' });\n    }\n\n    stream?.getTracks().forEach(t => t.stop());\n    audioCtx?.close()?.catch(() => {});\n    stream = null;\n    audioCtx = null;\n    mediaRecorder = null;\n    return blob;\n  }\n\n  return { start, stop };\n}\n\n[Command: cat /home/workspace/mirrormirror/src/storage.js]\n// src/storage.js — persist/restore scan bundles via IndexedDB\nconst DB_NAME    = 'threedme';\nconst STORE      = 'scans';\nconst POSE_NAMES = ['center', 'left', 'right', 'up', 'down'];\n\n// Fire a depth request from a JPEG blob; resolves to a depth PNG Blob or null on any failure.\nfunction fetchDepth(jpeg) {\n  const controller   = new AbortController();\n  const depthTimeout = setTimeout(() => controller.abort(), 20_000);\n  return fetch('/depth', {\n    method:  'POST',\n    headers: { 'Content-Type': 'image/jpeg' },\n    body:    jpeg,\n    signal:  controller.signal,\n  })\n  .then(r => r.ok ? 
r.blob() : null)\n  .catch(() => null)\n  .finally(() => clearTimeout(depthTimeout));\n}\n\nfunction openDB() {\n  return new Promise((resolve, reject) => {\n    const req = indexedDB.open(DB_NAME, 3);\n\n    req.onupgradeneeded = e => {\n      const db = e.target.result;\n      const tx = e.target.transaction;\n\n      // v1: create the object store (out-of-line keys, no keyPath)\n      if (e.oldVersion < 1) {\n        db.createObjectStore(STORE);\n      }\n\n      // v1 → v2: migrate the single \"latest\" record to a UUID-keyed record\n      if (e.oldVersion >= 1 && e.oldVersion < 2) {\n        const store  = tx.objectStore(STORE);\n        const getReq = store.get('latest');\n        getReq.onsuccess = () => {\n          const old = getReq.result;\n          if (old) {\n            const id = crypto.randomUUID();\n            store.put({ id, name: '', ts: old.ts ?? Date.now(), bundles: old.bundles }, id);\n            store.delete('latest');\n          }\n        };\n      }\n\n      // v2 → v3: create voice object store\n      if (e.oldVersion < 3) {\n        db.createObjectStore('voice');\n      }\n    };\n\n    req.onsuccess = e => resolve(e.target.result);\n    req.onerror   = () => reject(req.error);\n  });\n}\n\nasync function imageDataToJpeg(imageData) {\n  const c = new OffscreenCanvas(imageData.width, imageData.height);\n  c.getContext('2d').putImageData(imageData, 0, 0);\n  return c.convertToBlob({ type: 'image/jpeg', quality: 0.85 });\n}\n\nasync function jpegToImageData(blob) {\n  const bm = await createImageBitmap(blob);\n  const c  = new OffscreenCanvas(bm.width, bm.height);\n  c.getContext('2d').drawImage(bm, 0, 0);\n  return c.getContext('2d').getImageData(0, 0, bm.width, bm.height);\n}\n\n/**\n * Save bundles under an optional name.\n * @param {{ center, left, right, up, down }} bundles\n * @param {string} [name=''] — display name; empty string = untitled\n * @returns {Promise<string>} the new scan id\n */\nexport async function 
saveScan(bundles, name = '') {\n  const entries = Object.entries(bundles);\n  const jpegs = await Promise.all(entries.map(([, b]) => imageDataToJpeg(b.imageData)));\n  const stored = {};\n  entries.forEach(([pose, bundle], i) => {\n    stored[pose] = {\n      landmarks:       bundle.landmarks,\n      jpeg:            jpegs[i],\n      transformMatrix: Array.from(bundle.transformMatrix ?? []),\n    };\n  });\n  const id = crypto.randomUUID();\n  const db = await openDB();\n  return new Promise((resolve, reject) => {\n    const tx = db.transaction(STORE, 'readwrite');\n    tx.objectStore(STORE).put({ id, name, ts: Date.now(), bundles: stored }, id);\n    tx.oncomplete = () => { db.close(); resolve(id); };\n    tx.onerror    = () => { db.close(); reject(tx.error); };\n  });\n}\n\n/**\n * List all saved scans, newest first.\n * @returns {Promise<Array<{id: string, name: string, ts: number}>>}\n */\nexport async function listScans() {\n  const db = await openDB();\n  const records = await new Promise((resolve, reject) => {\n    const tx  = db.transaction(STORE, 'readonly');\n    const req = tx.objectStore(STORE).getAll();\n    req.onsuccess = () => resolve(req.result ?? []);\n    req.onerror   = () => reject(req.error);\n  });\n  db.close();\n  return records\n    .filter(r => r.id)                       // exclude any non-migrated stale records\n    .sort((a, b) => b.ts - a.ts)\n    .map(({ id, name, ts }) => ({ id, name, ts }));\n}\n\n/**\n * Load a specific scan by id.\n * @returns {Promise<{bundles, depthPromises, ts, name} | null>}\n */\nexport async function loadScan(id) {\n  const db = await openDB();\n  const record = await new Promise((resolve, reject) => {\n    const tx  = db.transaction(STORE, 'readonly');\n    const req = tx.objectStore(STORE).get(id);\n    req.onsuccess = () => resolve(req.result ?? 
null);\n    req.onerror   = () => reject(req.error);\n  });\n  db.close();\n  if (!record) return null;\n\n  const data    = record.bundles;\n  const bundles = {};\n  for (const pose of POSE_NAMES) {\n    if (!data[pose]) continue;\n    bundles[pose] = {\n      landmarks:       data[pose].landmarks,\n      imageData:       await jpegToImageData(data[pose].jpeg),\n      transformMatrix: new Float32Array(data[pose].transformMatrix ?? []),\n    };\n  }\n  if (!bundles.center) return null;\n\n  const depthPromises = {};\n  for (const pose of POSE_NAMES) {\n    if (data[pose]) depthPromises[pose] = fetchDepth(data[pose].jpeg);\n  }\n\n  return { bundles, depthPromises, ts: record.ts, name: record.name };\n}\n\n/**\n * Delete a scan by id. No-op if the id does not exist.\n */\nexport async function deleteScan(id) {\n  const db = await openDB();\n  return new Promise((resolve, reject) => {\n    const tx = db.transaction(STORE, 'readwrite');\n    tx.objectStore(STORE).delete(id);\n    tx.oncomplete = () => { db.close(); resolve(); };\n    tx.onerror    = () => { db.close(); reject(tx.error); };\n  });\n}\n\n/**\n * Persist voice_id and viseme_shapes for a scan.\n * @param {string} scanId\n * @param {string} voiceId\n * @param {Object.<string, Float32Array>} visemeShapes\n */\nexport async function saveVoiceData(scanId, voiceId, visemeShapes) {\n  const db = await openDB();\n  return new Promise((resolve, reject) => {\n    const tx = db.transaction('voice', 'readwrite');\n    tx.objectStore('voice').put({ voice_id: voiceId, viseme_shapes: visemeShapes }, scanId);\n    tx.oncomplete = () => { db.close(); resolve(); };\n    tx.onerror    = () => { db.close(); reject(tx.error); };\n  });\n}\n\n/**\n * Load voice data for a scan.\n * @returns {Promise<{voice_id: string, viseme_shapes: Object} | null>}\n */\nexport async function loadVoiceData(scanId) {\n  const db = await openDB();\n  const record = await new Promise((resolve, reject) => {\n    const tx  = 
db.transaction('voice', 'readonly');\n    const req = tx.objectStore('voice').get(scanId);\n    req.onsuccess = () => resolve(req.result ?? null);\n    req.onerror   = () => reject(req.error);\n  });\n  db.close();\n  return record;\n}\n\n[Command: cat /home/workspace/mirrormirror/src/motion.js]\n// src/motion.js\n\nconst LERP_SPEED        = { gaze: 12, device: 8, mouse: 8 };\nconst MOUSE_SENSITIVITY = 0.005;   // radians per pixel\nconst GAZE_TIMEOUT      = 1.0;    // seconds before gaze is considered lost\nconst PRIORITY          = { gaze: 3, device: 2, mouse: 1 };\n\nexport function createMotionController() {\n  let rotX = 0, rotY = 0;         // current (smoothed)\n  let targetX = 0, targetY = 0;   // target\n  let lerpSpeed = LERP_SPEED.mouse;\n  let activeSource = 'mouse';\n\n  let gazeActive = false;\n  let gazeStaleSec = 0;\n  let gazeLossCallback = null;\n\n  function setInput(pitch, yaw, source) {\n    if (gazeActive && source !== 'gaze') return;\n\n    if (source === 'gaze') {\n      gazeActive = true;\n      gazeStaleSec = 0;\n      targetX = pitch;\n      targetY = yaw;\n      lerpSpeed = LERP_SPEED.gaze;\n      activeSource = 'gaze';\n      return;\n    }\n\n    if (PRIORITY[source] >= PRIORITY[activeSource]) {\n      targetX = pitch;\n      targetY = yaw;\n      lerpSpeed = LERP_SPEED[source];\n      activeSource = source;\n    }\n  }\n\n  function tick(dt) {\n    const alpha = 1 - Math.exp(-lerpSpeed * dt);\n    rotX += (targetX - rotX) * alpha;\n    rotY += (targetY - rotY) * alpha;\n\n    if (gazeActive) {\n      gazeStaleSec += dt;\n      if (gazeStaleSec >= GAZE_TIMEOUT) {\n        gazeActive = false;\n        activeSource = 'mouse';\n        lerpSpeed = LERP_SPEED.mouse;\n        if (gazeLossCallback) gazeLossCallback(rotX, rotY);\n      }\n    }\n  }\n\n  function setGazeLossCallback(fn) {\n    gazeLossCallback = fn;\n  }\n\n  // Column-major mat4 for WebGPU: Ry(rotY) * Rx(rotX)\n  function getModelMatrix() {\n    const cx = Math.cos(rotX), sx 
= Math.sin(rotX);\n    const cy = Math.cos(rotY), sy = Math.sin(rotY);\n    // Combined Ry * Rx, column-major\n    return new Float32Array([\n       cy,       0,  -sy,      0,\n       sx * sy,  cx,  sx * cy, 0,\n       cx * sy, -sx,  cx * cy, 0,\n       0,        0,   0,       1,\n    ]);\n  }\n\n  return { setInput, tick, getModelMatrix, setGazeLossCallback };\n}\n\n// Hook device orientation (call once in main.js)\nexport function bindDeviceOrientation(ctrl) {\n  window.addEventListener('deviceorientation', (e) => {\n    if (e.beta == null || e.gamma == null) return;\n    // beta: front-back tilt (-180 to 180), gamma: left-right (-90 to 90)\n    const pitch = ((e.beta - 90) * Math.PI) / 180;\n    const yaw   = (e.gamma * Math.PI) / 180;\n    ctrl.setInput(pitch, yaw, 'device');\n  });\n}\n\n// Hook mouse drag (call once in main.js, pass the canvas element)\nexport function bindMouseDrag(ctrl, element) {\n  let dragging = false, lastX = 0, lastY = 0;\n  let accumX = 0, accumY = 0;\n\n  ctrl.setGazeLossCallback((lastPitch, lastYaw) => {\n    accumX = lastPitch;\n    accumY = lastYaw;\n  });\n\n  function onMove(dx, dy) {\n    accumY += dx * MOUSE_SENSITIVITY;\n    accumX += dy * MOUSE_SENSITIVITY;\n    ctrl.setInput(accumX, accumY, 'mouse');\n  }\n\n  element.addEventListener('mousedown', e => { dragging = true; lastX = e.clientX; lastY = e.clientY; });\n  window.addEventListener('mouseup', () => { dragging = false; });\n  window.addEventListener('mousemove', e => {\n    if (!dragging) return;\n    onMove(e.clientX - lastX, e.clientY - lastY);\n    lastX = e.clientX; lastY = e.clientY;\n  });\n\n  // Touch support\n  element.addEventListener('touchstart', e => { dragging = true; lastX = e.touches[0].clientX; lastY = e.touches[0].clientY; }, { passive: true });\n  window.addEventListener('touchend', () => { dragging = false; });\n  window.addEventListener('touchmove', e => {\n    if (!dragging) return;\n    onMove(e.touches[0].clientX - lastX, e.touches[0].clientY - 
lastY);\n    lastX = e.touches[0].clientX; lastY = e.touches[0].clientY;\n  }, { passive: true });\n}","stderr":"","returncode":0},"tool_call_id":"toolu_011jp5ncMeV2uDoxqJrjDYtw","metadata":null,"timestamp":"2026-03-31T12:29:55.025744+00:00","outcome":"success","part_kind":"tool-return"}],"kind":"request","metadata":{"zmid":"qv6mrmkfr7ak"}},{"parts":[{"content":"Here's the full inventory of AI features, models, and techniques powering Mirror Mirror end-to-end:\n\n---\n\n## 1. Face Detection & Landmark Tracking — **MediaPipe Face Landmarker**\n- **Model**: `face_landmarker.task` (float16), loaded from `@mediapipe/tasks-vision`\n- **What it does**: Detects 478 3D face landmarks in real-time from the webcam. Used in two places:\n  - **Capture phase** — guides the user through 5 poses (center, left, right, up, down) and extracts landmark geometry + facial transformation matrices per pose.\n  - **Gaze tracking** — runs continuously during the viewer to track the user's head pose (pitch/yaw) and drive the 3D model rotation + expression deformation in real time.\n\n## 2. Monocular Depth Estimation — **Depth Anything V2 Small**\n- **Model**: `depth-anything/Depth-Anything-V2-Small-hf` via HuggingFace Transformers + PyTorch\n- **What it does**: Takes each captured JPEG frame, infers a per-pixel depth map, normalizes it to an 8-bit 256×256 grayscale PNG. This gives the 3D mesh actual Z-depth so the face isn't flat — vertices are displaced in Z by sampling this depth map.\n\n## 3. 3D Face Reconstruction — Custom multi-pose fusion pipeline\n- **Technique**: Combines MediaPipe's 478 landmark positions from the center pose with the depth map to produce a 3D vertex mesh. UV coordinates are computed from the face's image-space position, and a texture atlas is stitched from all 5 captured angles (center front panel + 4 side panels for left/right/up/down). 
Triangulation uses MediaPipe's canonical face topology (`FACE_TRIANGLES` + `EYE_FILL_TRIANGLES`).\n- **Math**: Rigid-body 4×4 matrix inversion, matrix multiplication, and point transformation to un-project landmarks into 3D world space.\n\n## 4. Real-time 3D Rendering — **WebGPU**\n- Custom WGSL vertex/fragment shaders (`face.vert.wgsl`, `face.frag.wgsl`) rendering the textured face mesh with per-vertex normals and lighting.\n- Supports dynamic vertex updates every frame for expression deformation.\n\n## 5. Expression Deformation — **Procrustes Alignment + Landmark Deltas**\n- **Technique**: 2D Procrustes analysis (closed-form similarity transform: translate + rotate + uniform scale) aligns live landmarks to rest-pose landmarks using 8 bony anchor points. Per-landmark deltas are computed, EMA-smoothed (α=0.35), then converted to vertex-space displacements. This drives real-time facial expressions on the 3D model from webcam input.\n\n## 6. Voice Cloning — **ElevenLabs Voice Add API**\n- **API**: `POST /v1/voices/add`\n- **What it does**: During calibration, the user speaks an incantation. The recorded audio is uploaded to ElevenLabs, which creates a cloned voice profile and returns a `voice_id`. This voice is then used for all TTS responses — the mirror literally speaks back in *your* voice.\n\n## 7. Text-to-Speech with Timestamps — **ElevenLabs Multilingual V2**\n- **Model**: `eleven_multilingual_v2`\n- **API**: `POST /v1/text-to-speech/{voice_id}/with-timestamps`\n- **What it does**: Synthesizes the mirror's response text into audio using the cloned voice, and returns character-level start/end timestamps. These timestamps are critical for lip-sync.\n\n## 8. Speech-to-Text — **ElevenLabs Scribe V1**\n- **Model**: `scribe_v1`\n- **API**: `POST /v1/speech-to-text`\n- **What it does**: Transcribes the user's spoken question into text so it can be sent to the LLM.\n\n## 9. 
Conversational AI — **Claude (Anthropic)**\n- **Model**: `claude-opus-4-6`\n- **API**: Anthropic Messages API (`POST /v1/messages`)\n- **System prompt**: *\"You are the reflection in this mirror, speaking back to the face looking at you. Respond in one short sentence. Be direct, honest, and a little surprising — like something your own face might actually say.\"*\n- **What it does**: Generates the mirror's spoken response. Maintains multi-turn conversation history so the mirror has context.\n\n## 10. Viseme-based Lip Sync — Custom phoneme→viseme pipeline\n- **Technique**: Rule-based English text-to-phoneme conversion (digraph + single-char rules → 13 phoneme classes: REST, MBP, FV, TH, TDNL, SZ, SH, R, KG, W, OH, AH, EE). Each phoneme maps to a viseme (mouth shape). During calibration, the incantation's known phoneme sequence is aligned to the captured landmark frames by splitting the recording duration evenly across the phonemes; for each viseme, the middle captured frame is kept as its representative deformation delta, and a viseme with no captured frames falls back to a zero (rest) shape. At playback, ElevenLabs' character timestamps are merged with the phoneme sequence to produce a timed viseme timeline, and the face mesh is deformed frame-by-frame with lerp blending (t=0.7) to animate the mouth in sync with audio.\n\n## 11. Voice Activity Detection (VAD) — **Web Audio API + RMS threshold**\n- **Technique**: RMS energy is computed in real time from the time-domain samples that an `AnalyserNode` exposes for the microphone stream. Once speech has been detected (RMS > 0.01), 1.5s of sustained quiet triggers an automatic recording stop. This gives natural conversational turn-taking — the mirror knows when you've stopped speaking.\n\n## 12. Motion Control — Multi-source input fusion\n- **Technique**: Fuses gaze tracking (highest priority), device orientation (gyroscope), and mouse/touch drag to control the 3D model's rotation. Uses exponential smoothing (`1 - e^(-speed·dt)`) for fluid interpolation, with per-source lerp speeds and priority arbitration.\n\n## 13. 
Persistence — **IndexedDB**\n- Scans (landmark bundles, JPEG textures, per-pose transform matrices) and voice data (cloned voice IDs, per-viseme deformation shapes) are persisted client-side in IndexedDB, with schema migrations across 3 versions. Depth maps are not stored; they are re-requested from the `/depth` endpoint whenever a scan is loaded.\n\n---\n\n**In summary**: The user looks into their camera. MediaPipe tracks their face. Depth Anything gives it depth. WebGPU renders a 3D replica. They speak an incantation — ElevenLabs clones their voice while Procrustes-aligned landmark deltas capture their mouth shapes. They ask a question — Scribe transcribes it, Claude answers, ElevenLabs speaks the answer in their own cloned voice, and the 3D face lip-syncs using the captured viseme shapes. The mirror talks back.","id":null,"provider_name":null,"provider_details":null,"part_kind":"text"}],"model_name":"zo:anthropic/claude-opus-4-6","input_tokens":66809,"output_tokens":1686,"cache_read_tokens":56612,"cache_write_tokens":10196,"timestamp":"2026-03-31T12:29:58.017004+00:00","kind":"response","persona_id":null,"persona_name":null,"persona_image":null,"metadata":{"zmid":"i34g4w7vra6t"}}],"referral_url":"https://zo-computer.cello.so/ZbdQqb2NLw0"}