Explainability — “why this task”
Why explainability matters
Rule #1 for teacher trust:
If I don’t understand why the system picks problem 147 for Ivan — I won’t rely on it.
Explainability matters for external reviewers too: without a crisp “why this task,” the pitch collapses into generic model talk. Explainability isn’t polish — it’s core product.
What we don’t do
We don’t ask LLMs (Claude / GPT) to narrate numeric facts.
Models sometimes hallucinate:
- wrong micro-skills named;
- confuse with ;
- invent “reasons” unsupported by data.
When explanations must be exact, that is unacceptable. So we assemble the facts deterministically from templates.
What we do do
Templates stitch sentences from BKT state with simple rules:
- Identify the task’s weakest skill — primary reason for recommendation.
- Identify the strongest skill — guarantees the student won’t drown.
- Surface the solve probability (pSolve) as a ZPD indicator.
- Mention rare-skill emphasis when rareSkillBonus applies.
Example explanation (Estonian)
Ülesanne T-147 Ivanile
Põhjus: kõige nõrgem mikrooskus on "sulgude avamine" (P=0.41). Tugevaim — "aritmeetika märkidega" (P=0.82). Ülesanne treenib just nõrka kohta, kuid ei jää aritmeetika peale kinni. Lahenduse tõenäosus ≈ 0.55 — see on parajalt keeruline.
English gloss:
“Task T-147 for Ivan. Reason: the weakest micro-skill is expanding parentheses (P=0.41). Strongest: arithmetic with signs (P=0.82). The exercise trains exactly the weak spot without getting stuck on arithmetic. Solve probability ≈ 0.55, which is appropriately challenging.”
Implementation
In web/lib/explain.ts (stretch goal; stable baseline):

    export function explainRecommendation(
      scored: ScoredTask,
      microskills: Record<MicroSkillId, MicroSkill>,
      lang: 'et' | 'ru' | 'en' = 'et'
    ): string {
      // Sort skills by mastery probability, ascending: weakest first.
      const sorted = Object.entries(scored.perSkillPL)
        .sort(([, a], [, b]) => a - b);
      const [weakest, weakP] = sorted[0];
      const [strongest, strongP] = sorted[sorted.length - 1];
      // Skill titles are always read from title_et, whatever `lang` is.
      const targetSkill = microskills[weakest].title_et;
      const supportSkill = microskills[strongest].title_et;
      return T[lang]({
        targetSkill,
        weakP: weakP.toFixed(2),
        supportSkill,
        strongP: strongP.toFixed(2),
        pSolve: scored.pSolve.toFixed(2),
      });
    }

    // One template per language; each receives pre-formatted strings.
    const T = {
      et: ({ targetSkill, weakP, supportSkill, strongP, pSolve }) =>
        `Põhjus: nõrgim — "${targetSkill}" (P=${weakP}). Tugevaim — ` +
        `"${supportSkill}" (P=${strongP}). Lahenduse tõenäosus ≈ ${pSolve}.`,
      ru: ...,
      en: ...,
    };

Optional: Claude as stylist
You may pass the filled template through Claude only for tone:

    You are a MATx assistant. Rewrite the following for a teacher in 1–2 friendly
    Estonian sentences. Do not change numbers or skill names.

    [template]

Guardrails:
- numbers and skills remain fixed — Claude reads but shouldn’t alter facts;
- hallucination risk stays low because facts are provided;
- tone feels human, not database dump.
Hybrid recipe: facts from us, wording polish optional.
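The guardrail "numbers and skills remain fixed" can also be checked in code instead of being trusted to the prompt. The following is a minimal sketch, not part of the existing codebase: `ExplanationFacts`, `factsPreserved`, and `chooseExplanation` are hypothetical names, and the check assumes numbers are formatted with `toFixed(2)` as in the template. If the styled text drops a fact or introduces a decimal number the template did not contain, the deterministic template is used instead.

```typescript
// Hypothetical guard (assumption, not in the source): accept the styled text
// only if every fact from the template survives verbatim.
export interface ExplanationFacts {
  numbers: string[];    // formatted exactly as in the template, e.g. "0.41"
  skillNames: string[]; // skill titles exactly as in the template
}

export function factsPreserved(styled: string, facts: ExplanationFacts): boolean {
  // Every required number and skill name must appear unchanged.
  const required = [...facts.numbers, ...facts.skillNames];
  if (!required.every((fact) => styled.includes(fact))) return false;

  // No decimal number may appear that the template did not contain.
  // Decimal commas are normalised to dots before comparing.
  const found: string[] = styled.match(/\d+[.,]\d+/g) ?? [];
  return found.every((n) => facts.numbers.includes(n.replace(",", ".")));
}

export function chooseExplanation(
  template: string,
  styled: string | null,
  facts: ExplanationFacts
): string {
  // Fall back to the deterministic template when styling failed or broke a fact.
  return styled !== null && factsPreserved(styled, facts) ? styled : template;
}
```

With this in place the stylist stays optional in the strict sense: a failed or unfaithful rewrite costs only tone, never correctness.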
Teacher UI fields
| Field | Source | Purpose |
|---|---|---|
| Task name | task.id | identification |
| Top-2 skills with P | mastery vector | target vs support |
| Student solve probability (pSolve) | scoreTaskForStudent | ZPD indicator |
| 1–2 prose sentences | explainRecommendation | human-readable reason |
| Alternatives (rest of the top 3) | recommend()[1..2] | backup choices |
Edge cases
- Single-skill task → no “strongest contrast.” Template: “Target skill X (P=Y). .”
- All skills strong () → why recommend? Template admits reinforcement: “Skills mostly mastered — consolidation.”
- All skills weak () → frustration zone. Template warns: “Risk of overload — consider easier alternate.”
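The three edge cases above can be detected before a template is chosen. The sketch below is one possible shape, under stated assumptions: `classifyCase`, `EdgeThresholds`, and `EDGE_NOTE_EN` are hypothetical names, and the thresholds are passed in by the caller because the actual cut-offs are not given here. The note wording paraphrases the templates in the list.

```typescript
// Hypothetical thresholds (assumption): the real cut-offs are not stated in the source.
export interface EdgeThresholds {
  strong: number; // every skill at or above this value counts as "all strong"
  weak: number;   // every skill at or below this value counts as "all weak"
}

export type ExplanationCase = "single-skill" | "all-strong" | "all-weak" | "standard";

export function classifyCase(
  perSkillPL: Record<string, number>,
  t: EdgeThresholds
): ExplanationCase {
  const values = Object.values(perSkillPL);
  if (values.length === 0) throw new Error("task has no skills");
  // A single skill has no strongest/weakest contrast, so it is checked first.
  if (values.length === 1) return "single-skill";
  if (values.every((p) => p >= t.strong)) return "all-strong";
  if (values.every((p) => p <= t.weak)) return "all-weak";
  return "standard";
}

// Extra sentence appended to the explanation, or null when none is needed.
export const EDGE_NOTE_EN: Record<ExplanationCase, string | null> = {
  "single-skill": null,
  "all-strong": "Skills mostly mastered: consolidation.",
  "all-weak": "Risk of overload: consider an easier alternative.",
  "standard": null,
};
```

Keeping the classification separate from the wording means each language template only has to supply one sentence per case.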
Talking points: templates & LLMs
“Explanations are generated deterministically from BKT state, so the numbers stay trustworthy with no LLM hallucination risk in the math. An optional stylistic pass keeps the tone friendly without touching facts.”
Next: not only “what’s wrong” but “how to practice”
Today the template answers “why this task” and “where the student is weakest.” The next step is to add one more line: how to practice. Not only “Ivan’s ,” but also “try this: five expansion drills, then one with a minus in front of the parentheses.” The teacher screen turns from a diagnosis into a diagnosis plus prescription, on the same page.
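One way the extra line could be attached, kept as deterministic as the rest of the explanation, is a lookup from the weakest skill to a hand-written practice hint. This is only a sketch of the idea: `withPracticeLine` and the skill id `expand_parens` are hypothetical, and the hint text is the example from the paragraph above.

```typescript
// Hypothetical lookup (assumption): skill id -> one hand-written practice line.
export type PracticeHints = ReadonlyMap<string, string>;

export function withPracticeLine(
  explanation: string,
  weakestSkillId: string,
  hints: PracticeHints
): string {
  const hint = hints.get(weakestSkillId);
  // Without a hint for this skill the explanation is returned unchanged.
  return hint === undefined ? explanation : `${explanation} ${hint}`;
}

// Usage with an illustrative skill id and the example prescription.
const exampleHints: PracticeHints = new Map([
  [
    "expand_parens",
    "Try this: five expansion drills, then one with a minus in front of the parentheses.",
  ],
]);
console.log(withPracticeLine("Lahenduse tõenäosus ≈ 0.55.", "expand_parens", exampleHints));
```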
Which hint works better
When several hints exist for the same mistake, it’s useful to know which of them helps. Plain version: after hint A the student solved the next task 60% of the time; after hint B, 75%. So B is better and should be shown more often. No formulas, just a counter of “how often it helped.” In the larger product this becomes automatic selection of the best hint.
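The counter described above could look roughly like this. It is a sketch under assumptions: `HintCounter` and its method names are hypothetical, and "helped" is taken to mean that the student solved the next task after seeing the hint.

```typescript
// Hypothetical counter (assumption): tallies how often each hint was followed
// by a solved next task.
export interface HintStats {
  shown: number;
  helped: number;
}

export class HintCounter {
  private stats = new Map<string, HintStats>();

  // Record one showing of a hint and whether the next task was solved.
  record(hintId: string, solvedNext: boolean): void {
    const s = this.stats.get(hintId) ?? { shown: 0, helped: 0 };
    s.shown += 1;
    if (solvedNext) s.helped += 1;
    this.stats.set(hintId, s);
  }

  // Share of showings followed by a solved task; undefined if never shown.
  successRate(hintId: string): number | undefined {
    const s = this.stats.get(hintId);
    return s && s.shown > 0 ? s.helped / s.shown : undefined;
  }

  // Among the candidate hints, return the one with the highest success rate.
  best(hintIds: string[]): string | undefined {
    let bestId: string | undefined;
    let bestRate = -1;
    for (const id of hintIds) {
      const rate = this.successRate(id);
      if (rate !== undefined && rate > bestRate) {
        bestRate = rate;
        bestId = id;
      }
    }
    return bestId;
  }
}
```

Feeding it the numbers from the example (A helps in 12 of 20 showings, B in 15 of 20) makes `best(["A", "B"])` return `"B"`.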