Numerical example — Ivan and parentheses

Goal: track how $P(L)$ for the skill “expand parentheses” moves for Ivan after each task.

Starting state

Ivan opens his first parentheses problem. He hasn’t seen this topic before.

$P(L_0) = 0.2$: the prior (“probably doesn’t know”).
$P(S) = 0.1$ (slip), $P(G) = 0.2$ (guess), $P(T) = 0.1$ (learning transition): literature defaults.
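
For reference, these are the two steps applied after every answer, written out under the assumption that the model is the standard BKT formulation with slip $P(S)$, guess $P(G)$ and transition $P(T)$. First the Bayes posterior for the observed answer:

$$
P(L \mid \text{correct}) = \frac{P(L)\,(1 - P(S))}{P(L)\,(1 - P(S)) + (1 - P(L))\,P(G)}
$$

$$
P(L \mid \text{incorrect}) = \frac{P(L)\,P(S)}{P(L)\,P(S) + (1 - P(L))\,(1 - P(G))}
$$

Then the learning step, where $P_{\text{post}}$ is whichever posterior applies:

$$
P(L_n) = P_{\text{post}} + (1 - P_{\text{post}})\,P(T)
$$

With the values above, a student who knows the skill answers correctly with probability $1 - P(S) = 0.9$, and a student who does not know it answers incorrectly with probability $1 - P(G) = 0.8$.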

Task 1 — correct

Posterior for correct:

$$
P_{\text{post}} = \frac{0.2 \cdot 0.9}{0.2 \cdot 0.9 + 0.8 \cdot 0.2} = \frac{0.18}{0.34} \approx 0.529
$$

Learning step:

$$
P(L_1) = 0.529 + (1 - 0.529) \cdot 0.1 \approx 0.576
$$

0.20 → 0.58. Confidence nearly triples after one task — appropriate: we knew almost nothing, now we have a strong positive signal.
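
The same two steps can be sketched in Python. This is a minimal illustration, not the code in web/lib/bkt.ts: the function name `update_after_correct` and the argument names `p_slip`, `p_guess` and `p_transit` are assumptions chosen for readability.

```python
def update_after_correct(p_l: float, p_slip: float = 0.1,
                         p_guess: float = 0.2,
                         p_transit: float = 0.1) -> tuple[float, float]:
    """Return (posterior, P(L) after the learning step) for a correct answer."""
    known_and_right = p_l * (1 - p_slip)       # knew the skill and did not slip
    unknown_but_right = (1 - p_l) * p_guess    # did not know it, guessed right
    posterior = known_and_right / (known_and_right + unknown_but_right)
    # Learning step: part of the remaining doubt converts to knowledge.
    p_next = posterior + (1 - posterior) * p_transit
    return posterior, p_next


posterior, p_l1 = update_after_correct(0.2)
print(f"{posterior:.3f} {p_l1:.3f}")  # 0.529 0.576
```

The defaults mirror the starting state of this example; any other skill would pass its own parameters.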

Task 2 — correct

Now $P(L) = 0.576$.

$$
P_{\text{post}} = \frac{0.576 \cdot 0.9}{0.576 \cdot 0.9 + 0.424 \cdot 0.2} = \frac{0.518}{0.518 + 0.085} \approx 0.859
$$

$$
P(L_2) = 0.859 + (1 - 0.859) \cdot 0.1 \approx 0.873
$$

0.58 → 0.87. Another correct answer — the model is nearly convinced.

Task 3 — incorrect

$P(L) = 0.873$. Wrong-answer posterior (a student who doesn’t know the skill misses with probability $1 - P(G) = 0.8$):

$$
P_{\text{post}} = \frac{0.873 \cdot 0.1}{0.873 \cdot 0.1 + 0.127 \cdot 0.8} = \frac{0.0873}{0.0873 + 0.1016} \approx 0.462
$$

$$
P(L_3) = 0.462 + (1 - 0.462) \cdot 0.1 \approx 0.516
$$

0.87 → 0.52. A noticeable drop, but not to zero. At high confidence the model attributes an error partly to a slip:

Was at 0.87, so this could be a slip. Down to 0.52; waiting for more data.

Task 4 — correct

$P(L) = 0.516$.

$$
P_{\text{post}} = \frac{0.516 \cdot 0.9}{0.516 \cdot 0.9 + 0.484 \cdot 0.2} = \frac{0.4644}{0.4644 + 0.0968} \approx 0.828
$$

$$
P(L_4) = 0.828 + (1 - 0.828) \cdot 0.1 \approx 0.845
$$

0.52 → 0.85. Recovery.

Task 5 — incorrect (second mistake after rebound)

$P(L) = 0.845$.

$$
P_{\text{post}} = \frac{0.845 \cdot 0.1}{0.845 \cdot 0.1 + 0.155 \cdot 0.8} = \frac{0.0845}{0.0845 + 0.1240} \approx 0.405
$$

$$
P(L_5) = 0.405 + (1 - 0.405) \cdot 0.1 \approx 0.465
$$

0.85 → 0.47. Down again.

Task 6 — incorrect (third mistake in six tries)

$P(L) = 0.465$.

$$
P_{\text{post}} = \frac{0.465 \cdot 0.1}{0.465 \cdot 0.1 + 0.535 \cdot 0.8} = \frac{0.0465}{0.0465 + 0.4280} \approx 0.098
$$

$$
P(L_6) = 0.098 + (1 - 0.098) \cdot 0.1 \approx 0.188
$$

0.47 → 0.19. After three errors in six tasks the model is fairly confident the skill isn’t mastered: a slip can explain one miss, not three.

That is the difference between a random slip and a real gap.

All six steps in one table

Step	Answer	$P(L)$ before	Posterior	$P(L)$ after
0	—	—	—	0.200
1	✓	0.200	0.529	0.576
2	✓	0.576	0.859	0.873
3	✗	0.873	0.462	0.516
4	✓	0.516	0.828	0.845
5	✗	0.845	0.405	0.465
6	✗	0.465	0.098	0.188

Step chart

P(L), rounded to the nearest 0.1
 1.0 ┤
 0.9 ┤        ●
 0.8 ┤              ●
 0.7 ┤
 0.6 ┤     ●
 0.5 ┤           ●     ●
 0.4 ┤
 0.3 ┤
 0.2 ┤  ●                 ●
 0.1 ┤
 0.0 ┴──┬──┬──┬──┬──┬──┬──┬──
        0  1  2  3  4  5  6
                step
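
The whole sequence can be replayed in a few lines of Python. This is a sketch under the standard BKT update, not a copy of web/lib/bkt.ts; the names `bkt_update` and `replay` are assumptions made for this example. It keeps full floating-point precision instead of rounding after every step, so its output need not match hand-rounded values digit for digit.

```python
def bkt_update(p_l, correct, p_slip=0.1, p_guess=0.2, p_transit=0.1):
    """One BKT step: Bayes posterior for the answer, then the learning step."""
    if correct:
        num = p_l * (1 - p_slip)                # knew it, did not slip
        den = num + (1 - p_l) * p_guess         # or did not know it, guessed
    else:
        num = p_l * p_slip                      # knew it, slipped
        den = num + (1 - p_l) * (1 - p_guess)   # or did not know it, missed
    posterior = num / den
    return posterior, posterior + (1 - posterior) * p_transit


def replay(answers, p_l0=0.2):
    """Return one (step, correct, before, posterior, after) row per answer."""
    rows, p_l = [], p_l0
    for step, correct in enumerate(answers, start=1):
        posterior, p_next = bkt_update(p_l, correct)
        rows.append((step, correct, p_l, posterior, p_next))
        p_l = p_next
    return rows


# Ivan's six answers: correct, correct, wrong, correct, wrong, wrong.
ivan = [True, True, False, True, False, False]
for step, correct, before, posterior, after in replay(ivan):
    mark = "✓" if correct else "✗"
    print(f"{step}\t{mark}\t{before:.3f}\t{posterior:.3f}\t{after:.3f}")
```

Changing `ivan` or the default parameters is a quick way to see how sensitive the trajectory is to the order of answers.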

Takeaways

Runs of correct answers (tasks 1–2) raise confidence quickly.
A single mistake at high confidence is read partly as a slip, so the estimate drops without collapsing.
Runs of mistakes (tasks 5–6) pull the estimate down decisively: a slip can’t explain all of them.
Large swings around $P(L) \approx 0.5$ are expected, because that is where the model is least certain.
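
One way to check the last point is to compare, for several priors, where $P(L)$ lands after a correct answer and after an incorrect one. The sketch below assumes the standard BKT update with this section’s parameters; `bkt_update` and `swing` are illustrative names, not functions from the project.

```python
def bkt_update(p_l, correct, p_slip=0.1, p_guess=0.2, p_transit=0.1):
    """Return P(L) after one answer (posterior plus learning step)."""
    if correct:
        num = p_l * (1 - p_slip)
        den = num + (1 - p_l) * p_guess
    else:
        num = p_l * p_slip
        den = num + (1 - p_l) * (1 - p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * p_transit


def swing(p_l):
    """Gap between P(L) after a correct and after an incorrect answer."""
    return bkt_update(p_l, True) - bkt_update(p_l, False)


# Columns: prior, P(L) after correct, P(L) after incorrect, gap.
for p_l in (0.1, 0.3, 0.5, 0.7, 0.9):
    up, down = bkt_update(p_l, True), bkt_update(p_l, False)
    print(f"{p_l:.1f}\t{up:.3f}\t{down:.3f}\t{swing(p_l):.3f}")
```

With these parameters the gap is widest for priors in the middle of the range and narrower near 0.1 or 0.9, which is why a single answer moves a mid-range estimate so much.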

In chapter 8 we’ll choose Ivan’s next task from this $P(L)$ history — likely something simpler on the same skill to consolidate, not advance blindly.

Want to verify yourself?

These numbers match web/lib/bkt.ts. Full Python replay with plots — Notebook 1 — BKT from scratch.