Forge · Raise Your Little Wukong

★★★★★ Refinement

📋 Key Takeaways

Central thesis: the fuller the script, the more the agent does what you want — S5 fullness is not lip service, it is one real .md actually fed into your KB
88% of AI agents die at production — three walls: prompt engineering is not enough / context engineering is still not enough / harness engineering requires a human in the loop
Four things move together: 元悟空 (the LLM) + platform (WukongDojo) + script (your design) + you (human in the loop). Miss any one and the agent will not grow
Four calibration tiers: Open 70/30 (brainstorm) · Blend 40/60 (daily) · Strict 10/90 (compliance/medical) · Cite 0/100 (legal KB) — free where you should, contained where you must
Three exemplars: Workflow (process closure) · Constraint (4-gate hard rules) · Knowledge (in-domain proprietary + out-of-domain clean refusal) — all three are 95/95 student work
Chant the harness well · raise the Little Wukong well — forge one cut this week, S6 we walk it toward launch
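The four calibration tiers above can be written down as a small lookup table. This is a minimal sketch, not WukongDojo's actual configuration format: it assumes the first number in each pair is the "free" share and the second is the "contained" share, and the names `Tier`, `TIERS`, and `pick_tier` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    free_pct: int       # assumed meaning: share left to open-ended generation
    contained_pct: int  # assumed meaning: share held to the script / KB
    use_case: str


# Values copied from the takeaway list; the keys are illustrative.
TIERS = {
    "open": Tier("Open", 70, 30, "brainstorm"),
    "blend": Tier("Blend", 40, 60, "daily"),
    "strict": Tier("Strict", 10, 90, "compliance/medical"),
    "cite": Tier("Cite", 0, 100, "legal KB"),
}


def pick_tier(key: str) -> Tier:
    """Look up a tier by name (case-insensitive) and check the pair sums to 100."""
    tier = TIERS[key.lower()]
    if tier.free_pct + tier.contained_pct != 100:
        raise ValueError(f"{tier.name}: percentages must sum to 100")
    return tier


if __name__ == "__main__":
    t = pick_tier("Strict")
    print(f"{t.name}: {t.free_pct}/{t.contained_pct} ({t.use_case})")
```

Keeping the tiers in one table makes it easy to state, in your forge log, which tier a given script change moved the agent toward.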

🎬 Video Replay

Full: ▶ Watch now (HD full version)

YouTube: https://youtu.be/RK633HqYMNs

Bilibili: https://www.bilibili.com/video/BV17r5Z6hEuq/

🔧 Materials

✏️ S5 Homework Guide (submit · Golden Hoop 5 · pick one cut, change it, run it) ↗

✏️ Homework

HW (submit · Golden Hoop 5 · Refinement): pick ONE place in your S4 script that most needs changing, ACTUALLY change it on WukongDojo (edit the script or feed the KB), run it ONCE to see the effect, and write a forge log.

⚠️ Keep the S4-declared category (A closer / B Q&A / C companion).

⚠️ S5 is ONE cut, not a full rewrite of S4. Forging one cut deep beats grazing seven shallow.

① WHAT I CHANGED: edit the script (write before → after) or feed the KB (filename + at least 3 entries).

② HOW IT WENT: run one real scenario, paste 2-3 dialog lines, and describe the effect.

③ NEXT STEP: what to change before S6 + one specific question for Bill.

⏰ Deadline: 5/23 (Saturday) 9:00 AM PT

→ Submit homework (S5-FORGE)

▶ 🤖 AI Grading Criteria

Dimension	Weight	Description
What I changed	30%	Name the cut + concrete change. Script edit must show before → after; KB feed must show filename + at least 3 entries. "Ran it on WukongDojo" without specifics = low. Filename without content = low.
How it went	35%	Run one REAL scenario (not a test question), paste 2-3 dialog lines, write the effect (better / same / worse) + why in one sentence. Honesty beats "everything improved". Not running = low.
Honest observations	15%	Honest reflection on the effect. If it got worse, that observation is valuable — you learned the cost of constraints. "Everything improved" suggests soft scenarios or lack of honesty.
Next step	20%	What to change before S6 (concrete: which .md / which step) + one specific question for Bill. "Keep polishing" = low.

90-100 Concrete change + ran with dialog excerpt + honest observation + concrete next step · 60-89 Concrete change + ran but observations occasionally hand-wavy · 40-59 Says polished but vague on change / did not run · 20-39 Barely touched the agent · 1-19 Nearly empty
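To see how the four weights combine, here is a minimal sketch of the arithmetic. It assumes each dimension is scored 0-100 and the total is a plain weighted sum; the page does not say how the AI grader actually combines them, and the names `WEIGHTS`, `BANDS`, `weighted_score`, and `band` are hypothetical.

```python
# Weights in percent, taken from the criteria table (they sum to 100).
WEIGHTS = {
    "what_i_changed": 30,
    "how_it_went": 35,
    "honest_observations": 15,
    "next_step": 20,
}

# Lower bound of each score band, highest first, from the band line above.
BANDS = [
    (90, "90-100"),
    (60, "60-89"),
    (40, "40-59"),
    (20, "20-39"),
    (1, "1-19"),
]


def weighted_score(scores: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores (each assumed to be 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def band(total: float) -> str:
    """Return the label of the first band whose lower bound the total reaches."""
    for floor, label in BANDS:
        if total >= floor:
            return label
    return "below 1"


if __name__ == "__main__":
    example = {
        "what_i_changed": 80,
        "how_it_went": 90,
        "honest_observations": 70,
        "next_step": 60,
    }
    total = weighted_score(example)
    print(total, band(total))  # 78.0 60-89
```

Because "How it went" carries the largest weight, a submission that never ran a real scenario loses more than one that is thin on next steps.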
