Knowledge as probability
Hidden vs observed
A classic mistake is trying to directly decide whether the student “knows” the topic. That’s impossible: skill lives in the head; we only see outcomes (correct / incorrect).
In BKT we give up on certainty and say:
Student state is a hidden variable. We don’t know it exactly. But we can estimate the probability that the student has mastered it.
That probability is the central quantity:
P(L), where L stands for “learned”. It’s just a number between 0 and 1.
- P(L) ≈ 0 — we’re confident they don’t know.
- P(L) ≈ 0.5 — “no idea.”
- P(L) ≈ 1 — confident they know.
In practice P(L) is rarely exactly 0 or 1 — the model keeps doubt on purpose. That’s correct: one correct answer isn’t proof; one mistake isn’t a verdict.
Where the starting value comes from
When a student first meets a skill we set a prior — our guess before any observations.
The literature default is P(L₀) = 0.2:
“Probably doesn’t know yet, but might have heard something.”
If other skills look strong, you can raise it. If the topic is brand new, leave 0.2. This parameter can later be fit from real data (see Notebook 3 — EM fitting).
What the model does after each answer
In plain words:
- Correct answer → P(L) increases.
- Incorrect answer → P(L) decreases.
- Not by “whole steps” — the shift depends on prior confidence and slip/guess parameters (see chapter 4).
The point: P(L) moves smoothly. That saves the model from two common failures:
- Panic on one mistake (“they know nothing!”).
- Euphoria on one correct answer (“genius!”).
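The two-step update behind those smooth moves can be sketched in Python: first a Bayes update on the observed answer, then a learning transition. This is a minimal illustration, not the simulator’s actual code; the parameter names and their defaults (p_slip = 0.1, p_guess = 0.2, p_transit = 0.1) are the common textbook values, which chapter 4 defines properly.

```python
def bkt_update(p_l, correct, p_slip=0.1, p_guess=0.2, p_transit=0.1):
    """One BKT step: Bayes update on the answer, then a learning transition."""
    if correct:
        # P(L | correct): either mastered and didn't slip, or not mastered but guessed
        posterior = (p_l * (1 - p_slip)) / (p_l * (1 - p_slip) + (1 - p_l) * p_guess)
    else:
        # P(L | incorrect): either mastered but slipped, or not mastered and didn't guess
        posterior = (p_l * p_slip) / (p_l * p_slip + (1 - p_l) * (1 - p_guess))
    # the student may also have learned the skill during this attempt
    return posterior + (1 - posterior) * p_transit

p = 0.2                      # prior P(L0)
p = bkt_update(p, True)      # one correct answer lifts p to roughly 0.58
p = bkt_update(p, False)     # one mistake pulls it back down, but not to zero
```

Note that the posterior never reaches 0 or 1: the guess term keeps a correct answer from being proof, and the slip term keeps a mistake from being a verdict.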
Graphically
Model confidence scale:

```
0 ────────●────────────────────────────────── 1
doesn't know   P(L)=0.2 (start)          knows
```

After one correct task:

```
0 ─────────────────────●───────────────────── 1
                   P(L)=0.58
```

After two correct:

```
0 ─────────────────────────────────●───────── 1
                              P(L)=0.87
```

The numbers come from real BKT updates with default parameters; the numerical example walks through the details.
One counter per micro-skill
Important nuance: we don’t store one P(L) per student. We store a vector:
```
Ivan:
  expand_brackets:     0.42
  distributive_law:    0.81
  signs:               0.66
  move_across_equals:  0.55
  ...
```

So each skill has its own trajectory. Ivan can be strong in arithmetic and weak on parentheses — and the model sees it.
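In code, that vector is just a nested map from student to skill to P(L). A minimal sketch, where the names and numbers mirror the example above and unseen skills fall back to the default prior of 0.2:

```python
from collections import defaultdict

P_L0 = 0.2  # default prior for a brand-new skill

# mastery[student][skill] -> current P(L); unseen skills start at the prior
mastery = defaultdict(lambda: defaultdict(lambda: P_L0))

mastery["Ivan"]["distributive_law"] = 0.81  # strong here...
mastery["Ivan"]["expand_brackets"] = 0.42   # ...weaker here

print(mastery["Ivan"]["expand_brackets"])   # 0.42
print(mastery["Ivan"]["signs"])             # never seen -> prior 0.2
```

Each skill’s entry is read and updated independently, which is exactly why the model can see a student who is strong in one skill and weak in another.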
Try it: P(L) moves with answers
Below is a real BKT simulator. Press ✓ or ✗ and watch P(L) climb on correct answers or fall on mistakes; the predicted P(correct) follows automatically.
BKT parameters
Notice:
- P(L) never hits exactly 0 or 1 — BKT always keeps residual uncertainty;
- after a correct answer P(L) jumps up, after an error it drops, but not symmetrically;
- P(correct) is always “tighter”: it stays between P(guess) and 1 − P(slip), because it folds in guess and slip noise.
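That “tighter” range can be checked directly. The predicted chance of a correct answer mixes mastery with slip and guess noise; a sketch using the same common default parameters as above (p_slip = 0.1, p_guess = 0.2), not the simulator’s own code:

```python
def p_correct(p_l, p_slip=0.1, p_guess=0.2):
    """Predicted chance of a correct answer:
    mastered-and-no-slip plus not-mastered-but-guessed."""
    return p_l * (1 - p_slip) + (1 - p_l) * p_guess

# however far P(L) moves, the prediction stays between p_guess and 1 - p_slip
print(p_correct(0.0))  # 0.2  -- even a non-master can guess
print(p_correct(1.0))  # 0.9  -- even a master can slip
```

Since p_correct is a weighted average of p_guess and 1 − p_slip, it can never leave that interval, no matter what P(L) does.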
Why we store a vector of P(L) per skill, and why one overall “math level” is a bad idea — next chapter.