
Bayes’ rule step by step

If you see the intimidating formula

$$P(L \mid \text{correct}) = \frac{P(L) \cdot (1 - P(S))}{P(L) \cdot (1 - P(S)) + (1 - P(L)) \cdot P(G)}$$

and want to run — pause. We’ll unpack it. It’s common sense encoded as a fraction.

Bayes answers:

I thought event A had some probability. Then I saw observation B. How should I revise my belief about A?

Here:

  • A = “student mastered the skill”
  • B = “student solved the problem correctly”

We want $P(A \mid B)$ — the probability of A given B.

Imagine we don’t know whether the student knows the skill. Split learners into two boxes:

```mermaid
%%{init: {'theme': 'base','flowchart': {'nodeSpacing': 96,'rankSpacing': 108,'padding': 40,'curve': 'basis','useMaxWidth': true}}}%%
flowchart LR
    Pop["Population (1.0)"]
    Pop -->|"P(L)"| K["Knows skill"]
    Pop -->|"1 - P(L)"| NK["Doesn't know"]
    K -->|"1 - P(S)"| KR["Knows & correct"]
    K -->|"P(S)"| KW["Knows & wrong (slip)"]
    NK -->|"P(G)"| NKR["Doesn't know & guessed right"]
    NK -->|"1 - P(G)"| NKW["Doesn't know & wrong"]
```

Probability tree: every student lands in one of four cells.

Plugging numbers:

  • $P(L) = 0.2$ → 20% “know,” 80% “don’t know”;
  • $P(S) = 0.1$ → among the “know” group, 90% answer correctly, 10% slip;
  • $P(G) = 0.2$ → among the “don’t know” group, 20% guess right, 80% miss.

Four cells:

|  | Answers correctly | Answers incorrectly |
| --- | --- | --- |
| Knows ($P(L) = 0.2$) | $0.2 \cdot 0.9 = 0.18$ | $0.2 \cdot 0.1 = 0.02$ |
| Doesn’t know ($1 - P(L) = 0.8$) | $0.8 \cdot 0.2 = 0.16$ | $0.8 \cdot 0.8 = 0.64$ |
| **Total** | **0.34** | **0.66** |

Cell probabilities sum to $1.00$ — a full partition.
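The four cells can be checked in a few lines. A minimal sketch in TypeScript, using the numbers above (the variable names here are illustrative, not from `web/lib/bkt.ts`):

```ts
// Parameters from the example above.
const pL = 0.2;     // P(L): prior probability the student knows the skill
const pSlip = 0.1;  // P(S): knows it, but slips
const pGuess = 0.2; // P(G): doesn't know it, but guesses right

// Joint probability of each of the four cells.
const knowsCorrect = pL * (1 - pSlip);       // 0.18
const knowsWrong = pL * pSlip;               // 0.02
const guessCorrect = (1 - pL) * pGuess;      // 0.16
const noKnowWrong = (1 - pL) * (1 - pGuess); // 0.64

// A full partition: the cells must sum to 1.
const total = knowsCorrect + knowsWrong + guessCorrect + noKnowWrong;
console.log(total.toFixed(2)); // "1.00"
```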

Observing “correct” restricts us to the correct column — mass 0.34.

What fraction of that mass are “knowers”?

$$P(\text{knows} \mid \text{correct}) = \frac{0.18}{0.34} \approx 0.529$$

That is Bayes — numerator “knows AND correct,” denominator “all correct.”

In symbols:

$$P(L \mid \text{correct}) = \frac{\overbrace{P(L) \cdot (1 - P(S))}^{\text{knew and didn't slip}}}{\underbrace{P(L) \cdot (1 - P(S))}_{\text{from knowers}} + \underbrace{(1 - P(L)) \cdot P(G)}_{\text{from non-knowers who guessed}}}$$

And for “incorrect”:

$$P(L \mid \text{wrong}) = \frac{P(L) \cdot P(S)}{P(L) \cdot P(S) + (1 - P(L)) \cdot (1 - P(G))}$$

Take $P(L) = 0.2$, $P(S) = 0.1$, $P(G) = 0.2$. The student answers correctly.

$$P(L \mid \text{correct}) = \frac{0.2 \cdot 0.9}{0.2 \cdot 0.9 + 0.8 \cdot 0.2} = \frac{0.18}{0.18 + 0.16} = \frac{0.18}{0.34} \approx 0.529$$

Confidence jumps from 0.2 to ~0.529 — almost “toss-up, leaning knows.”

What if they answer incorrectly?

$$P(L \mid \text{wrong}) = \frac{0.2 \cdot 0.1}{0.2 \cdot 0.1 + 0.8 \cdot 0.8} = \frac{0.02}{0.02 + 0.64} = \frac{0.02}{0.66} \approx 0.030$$

Confidence collapses toward zero.
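Both posteriors fit in a short sketch. Note that with $P(G) = 0.2$ the wrong-answer denominator uses $1 - P(G) = 0.8$ (the variable names below are illustrative):

```ts
// Parameters from the worked example.
const pL = 0.2, pSlip = 0.1, pGuess = 0.2;

// Posterior after a correct answer: 0.18 / 0.34.
const pAfterCorrect =
  (pL * (1 - pSlip)) / (pL * (1 - pSlip) + (1 - pL) * pGuess);

// Posterior after a wrong answer: 0.02 / 0.66.
const pAfterWrong =
  (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));

console.log(pAfterCorrect.toFixed(3)); // "0.529"
console.log(pAfterWrong.toFixed(3));   // "0.030"
```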

Note: with $P(L_0) = 0.2$ a mistake drops confidence from 0.2 to ~0.030 — roughly a sevenfold drop. That’s correct:

We barely believed in them; they missed — exactly what we expected from a non-knower.

But at $P(L) = 0.95$ (already confident) one mistake only lowers $P(L)$ to ~0.70 — the model treats it mostly as a slip. Check:

$$P(L \mid \text{wrong}) = \frac{0.95 \cdot 0.1}{0.95 \cdot 0.1 + 0.05 \cdot 0.8} = \frac{0.095}{0.095 + 0.04} = \frac{0.095}{0.135} \approx 0.704$$

That’s calibrated updating.
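The asymmetry is easy to see by running the same wrong-answer update from two different priors (a sketch; names are mine):

```ts
// Wrong-answer Bayes update as a small helper.
const afterWrong = (pL: number, pSlip = 0.1, pGuess = 0.2): number =>
  (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));

// Weak prior: the miss confirms what we already suspected.
console.log(afterWrong(0.2).toFixed(3));  // "0.030"

// Strong prior: the miss is mostly absorbed as a slip.
console.log(afterWrong(0.95).toFixed(3)); // "0.704"
```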

Step 2: allow “learning during the attempt”

After applying Bayes (the posterior) we add one small step:

$$P(L_{\text{new}}) = P_{\text{posterior}} + (1 - P_{\text{posterior}}) \cdot P(T)$$

Meaning:

Even if the posterior says you didn’t know — you still had probability $P(T)$ of learning during this problem.

With $P(T) = 0.1$:

$$P(L_{\text{new}}) = 0.529 + (1 - 0.529) \cdot 0.1 \approx 0.576$$

After an error:

$$P(L_{\text{new}}) = 0.030 + (1 - 0.030) \cdot 0.1 \approx 0.127$$

That’s it. The whole model.

```ts
// Step 1. Posterior via Bayes.
const posterior = correct
  ? (pL * (1 - pSlip)) / (pL * (1 - pSlip) + (1 - pL) * pGuess)
  : (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));
// Step 2. Update accounting for possible learning.
const pL_new = posterior + (1 - posterior) * pTransit;
```

Exactly this appears in web/lib/bkt.ts — function bktUpdate. See the code walk-through.
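A self-contained version of the two steps can be sketched as below. This is an assumption about the shape of the function — the actual `bktUpdate` in `web/lib/bkt.ts` may take its parameters differently:

```ts
// Sketch of a BKT update; the signature is illustrative,
// not necessarily identical to bktUpdate in web/lib/bkt.ts.
function bktUpdate(
  pL: number,       // current P(L)
  correct: boolean, // was the answer correct?
  pSlip = 0.1,      // P(S): slip probability
  pGuess = 0.2,     // P(G): guess probability
  pTransit = 0.1,   // P(T): chance of learning during the attempt
): number {
  // Step 1. Posterior via Bayes.
  const posterior = correct
    ? (pL * (1 - pSlip)) / (pL * (1 - pSlip) + (1 - pL) * pGuess)
    : (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));
  // Step 2. Account for possible learning during the attempt.
  return posterior + (1 - posterior) * pTransit;
}

// Correct answer from P(L) = 0.2: Bayes gives ≈0.529, learning lifts it to ≈0.576.
console.log(bktUpdate(0.2, true).toFixed(3)); // "0.576"
```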

All four cells of the 2×2 table on one square. Drag the sliders and press “correct” or “incorrect”. The highlight shows what mass “survives” the observation, and the formula prints the posterior.

[Interactive widget: «All possible outcomes» — a unit square split into “knows” / “doesn’t know” columns; column heights follow $P(L)$ and $1 - P(L)$, green width marks “correct”, and the cells matching the observation light up while $P(L \mid ✓)$, $P(L \mid ✗)$, the posterior, and its shift are printed alongside.]

Same Bayes story as above — without algebra — you see which area matches the observation.

Next chapter — both formulas side by side; chapter 7 — full numeric walk-through across six tasks.