Numerical example — Ivan and parentheses

Goal: track how $P(L)$ for the skill “expand parentheses” moves for Ivan after each task.

Ivan opens his first parentheses problem. He hasn’t seen this topic before.

  • $P(L_0) = 0.2$: prior (“probably doesn’t know”).
  • $P(S) = 0.1$, $P(G) = 0.2$, $P(T) = 0.1$: slip, guess, and learning rates, set to literature defaults.
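
The two updates applied after a correct answer can be written in general form as follows. This is the standard BKT formulation, restated here as a reading aid and assumed to match the definitions from the earlier chapters; the index $n$ for the task number is notation introduced for this sketch.

```latex
\begin{aligned}
P_{\text{post}} &= \frac{P(L_{n-1})\,(1 - P(S))}{P(L_{n-1})\,(1 - P(S)) + (1 - P(L_{n-1}))\,P(G)} \\
P(L_n) &= P_{\text{post}} + (1 - P_{\text{post}})\,P(T)
\end{aligned}
```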

Task 1: correct. Posterior after a correct answer:

$$P_{\text{post}} = \frac{0.2 \cdot 0.9}{0.2 \cdot 0.9 + 0.8 \cdot 0.2} = \frac{0.18}{0.34} \approx 0.529$$

Learning step:

$$P(L_1) = 0.529 + (1 - 0.529) \cdot 0.1 \approx 0.576$$
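
The first update can be checked in a few lines of Python. This is a minimal sketch, not the code in web/lib/bkt.ts: the function name `update_after_correct` and its parameter names are hypothetical, and the defaults are the parameter values listed above.

```python
def update_after_correct(
    p_known: float,
    p_slip: float = 0.1,
    p_guess: float = 0.2,
    p_learn: float = 0.1,
) -> tuple[float, float]:
    """Return (posterior, next P(L)) after one correct answer."""
    # Correct because the skill is known and there was no slip.
    knew_and_no_slip = p_known * (1.0 - p_slip)
    # Correct because of a lucky guess without knowing the skill.
    guessed = (1.0 - p_known) * p_guess
    posterior = knew_and_no_slip / (knew_and_no_slip + guessed)
    # Learning step: some chance the skill was acquired on this task.
    p_next = posterior + (1.0 - posterior) * p_learn
    return posterior, p_next


posterior, p_l1 = update_after_correct(0.2)
print(f"posterior = {posterior:.3f}, P(L1) = {p_l1:.3f}")
# prints: posterior = 0.529, P(L1) = 0.576
```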

0.20 → 0.58. Confidence nearly triples after one task — appropriate: we knew almost nothing, now we have a strong positive signal.

Task 2: correct. Now $P(L) = 0.576$.

$$P_{\text{post}} = \frac{0.576 \cdot 0.9}{0.576 \cdot 0.9 + 0.424 \cdot 0.2} = \frac{0.518}{0.518 + 0.085} \approx 0.859$$

$$P(L_2) = 0.859 + (1 - 0.859) \cdot 0.1 \approx 0.873$$

0.58 → 0.87. Another correct answer — the model is nearly convinced.

Task 3: incorrect. $P(L) = 0.873$. Wrong-answer posterior:

$$P_{\text{post}} = \frac{0.873 \cdot 0.1}{0.873 \cdot 0.1 + 0.127 \cdot 0.9} = \frac{0.0873}{0.0873 + 0.1143} \approx 0.433$$

$$P(L_3) = 0.433 + (1 - 0.433) \cdot 0.1 \approx 0.490$$

0.87 → 0.49. A noticeable drop, but not to zero. At high confidence, an error reads partly as a slip:

“Was at 0.87, so this could be a slip. Down to 0.49; waiting for more data.”

Task 4: correct. $P(L) = 0.490$.

$$P_{\text{post}} = \frac{0.490 \cdot 0.9}{0.490 \cdot 0.9 + 0.510 \cdot 0.2} = \frac{0.441}{0.441 + 0.102} \approx 0.812$$

$$P(L_4) = 0.812 + (1 - 0.812) \cdot 0.1 \approx 0.831$$

0.49 → 0.83. Recovery.

Task 5 — incorrect (second mistake after rebound)

$P(L) = 0.831$.

$$P_{\text{post}} = \frac{0.831 \cdot 0.1}{0.831 \cdot 0.1 + 0.169 \cdot 0.9} = \frac{0.0831}{0.0831 + 0.1521} \approx 0.353$$

$$P(L_5) = 0.353 + (1 - 0.353) \cdot 0.1 \approx 0.418$$

0.83 → 0.42. Down again.

Task 6 — incorrect (third mistake in six tries)

$P(L) = 0.418$.

$$P_{\text{post}} = \frac{0.418 \cdot 0.1}{0.418 \cdot 0.1 + 0.582 \cdot 0.9} = \frac{0.0418}{0.0418 + 0.5238} \approx 0.074$$

$$P(L_6) = 0.074 + (1 - 0.074) \cdot 0.1 \approx 0.166$$

0.42 → 0.17. Three errors — the model confidently says the skill isn’t mastered; slip explains one miss, not three.

This is the difference between a random slip and a real gap.
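
To see why the same wrong answer is treated so differently in Task 3 and Task 6, it can help to isolate the wrong-answer posterior and evaluate it at the two priors. The sketch below is illustrative only: the function name and the `p_wrong_if_unknown` parameter are hypothetical, and its default of 0.9 is simply the factor that appears in this page's arithmetic (the textbook BKT likelihood would be $1 - P(G) = 0.8$, which shifts the numbers but not the pattern).

```python
def posterior_after_wrong(
    p_known: float,
    p_slip: float = 0.1,
    p_wrong_if_unknown: float = 0.9,
) -> float:
    """Bayes update on a wrong answer, before the learning step.

    p_wrong_if_unknown is an assumption: 0.9 is the factor used in this
    page's arithmetic; textbook BKT would use 1 - P(G) = 0.8.
    """
    # Wrong although the skill is known: a slip.
    slipped = p_known * p_slip
    # Wrong because the skill is not known: a real gap.
    did_not_know = (1.0 - p_known) * p_wrong_if_unknown
    return slipped / (slipped + did_not_know)


for prior in (0.873, 0.418):
    post = posterior_after_wrong(prior)
    print(f"prior {prior:.3f} -> posterior {post:.3f} ({post / prior:.0%} retained)")
```

With these defaults, the high prior keeps about half of its mass (0.873 → 0.433), while the lower prior keeps less than a fifth (0.418 → 0.074).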

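| Step | Answer | $P(L)$ before | Posterior | $P(L)$ after |
|---|---|---|---|---|
| 0 | (start) | | | 0.200 |
| 1 | correct | 0.200 | 0.529 | 0.576 |
| 2 | correct | 0.576 | 0.859 | 0.873 |
| 3 | incorrect | 0.873 | 0.433 | 0.490 |
| 4 | correct | 0.490 | 0.812 | 0.831 |
| 5 | incorrect | 0.831 | 0.353 | 0.418 |
| 6 | incorrect | 0.418 | 0.074 | 0.166 |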
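
The whole sequence can be replayed in one loop. This is a sketch under stated assumptions rather than the code in web/lib/bkt.ts: `step`, `rows`, and the constant names are hypothetical, and `P_WRONG_IF_UNKNOWN = 0.9` mirrors the factor in this page's wrong-answer arithmetic (textbook BKT would use $1 - P(G)$). Because the loop carries full floating-point precision instead of rounding after every task, its values can differ from the table by about 0.001.

```python
P_SLIP = 0.1
P_GUESS = 0.2
P_LEARN = 0.1
P_WRONG_IF_UNKNOWN = 0.9  # assumption: factor used in this page's arithmetic


def step(p_known: float, correct: bool) -> tuple[float, float]:
    """Return (posterior, next P(L)) for one observed answer."""
    if correct:
        known = p_known * (1 - P_SLIP)
        unknown = (1 - p_known) * P_GUESS
    else:
        known = p_known * P_SLIP
        unknown = (1 - p_known) * P_WRONG_IF_UNKNOWN
    posterior = known / (known + unknown)
    return posterior, posterior + (1 - posterior) * P_LEARN


# Ivan's six answers, in order: tasks 3, 5 and 6 were wrong.
answers = [True, True, False, True, False, False]

p = 0.2  # P(L_0)
rows = []
for n, correct in enumerate(answers, start=1):
    posterior, p_next = step(p, correct)
    rows.append((n, correct, p, posterior, p_next))
    p = p_next

for n, correct, before, posterior, after in rows:
    label = "correct" if correct else "incorrect"
    print(f"{n}  {label:9}  {before:.3f}  {posterior:.3f}  {after:.3f}")
```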

[Plot: $P(L)$ (y-axis, 0.0 to 1.0) against step (x-axis, 0 to 6).]

  1. Runs of correct answers (1–2) lift confidence quickly.
  2. Single mistake at high confidence → partly slip — no meltdown.
  3. Runs of mistakes (5–6) pull estimates down decisively — slip can’t explain everything.
  4. Volatility around $P(L) \approx 0.5$ is expected: the model is genuinely uncertain there, so each answer moves the estimate a long way.
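
The last point can be made concrete by measuring, for a given prior, how far apart the posteriors after a correct and a wrong answer land. The sketch below is illustrative: `posterior` and `swing` are hypothetical names, and `p_wrong_if_unknown = 0.9` again follows this page's arithmetic rather than the textbook $1 - P(G)$.

```python
def posterior(
    p_known: float,
    correct: bool,
    p_slip: float = 0.1,
    p_guess: float = 0.2,
    p_wrong_if_unknown: float = 0.9,  # assumption: factor used on this page
) -> float:
    """Bayes update for one answer, before the learning step."""
    if correct:
        known = p_known * (1 - p_slip)
        unknown = (1 - p_known) * p_guess
    else:
        known = p_known * p_slip
        unknown = (1 - p_known) * p_wrong_if_unknown
    return known / (known + unknown)


def swing(p_known: float) -> float:
    """Gap between the correct-answer and wrong-answer posteriors."""
    return posterior(p_known, True) - posterior(p_known, False)


for prior in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"prior {prior:.1f}: swing {swing(prior):.2f}")
```

The swing is widest for mid-range priors and narrower near 0.1 and 0.9, which is the volatility seen in Ivan's trajectory.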

In chapter 8 we’ll choose Ivan’s next task from this $P(L)$ history: most likely something simpler on the same skill to consolidate, rather than advancing blindly.

These numbers match web/lib/bkt.ts. For a full Python replay with plots, see “Notebook 1 — BKT from scratch”.