
Bayes’ rule step by step

If you see the intimidating formula

$$P(L \mid \text{correct}) = \frac{P(L) \cdot (1 - P(S))}{P(L) \cdot (1 - P(S)) + (1 - P(L)) \cdot P(G)}$$

and want to run — pause. We’ll unpack it. It’s common sense encoded as a fraction.

Bayes answers:

I thought event A had some probability. Then I saw observation B. How should I revise my belief about A?

Here:

  • A = “student mastered the skill”
  • B = “student solved the problem correctly”

We want $P(A \mid B)$ — the probability of A given B.

Imagine we don’t know whether the student knows the skill. Split learners into two boxes:

```mermaid
%%{init: {'theme': 'base','flowchart': {'nodeSpacing': 96,'rankSpacing': 108,'padding': 40,'curve': 'basis','useMaxWidth': true}}}%%
flowchart LR
    Pop["Population (1.0)"]
    Pop -->|"P(L)"| K["Knows skill"]
    Pop -->|"1 - P(L)"| NK["Doesn't know"]
    K -->|"1 - P(S)"| KR["Knows & correct"]
    K -->|"P(S)"| KW["Knows & wrong (slip)"]
    NK -->|"P(G)"| NKR["Doesn't know & guessed right"]
    NK -->|"1 - P(G)"| NKW["Doesn't know & wrong"]
```

Probability tree: every student lands in one of four cells.

Plugging numbers:

  • $P(L) = 0.2$ → 20% “know,” 80% “don’t know”;
  • $P(S) = 0.1$ → among the “know” group, 90% answer correctly, 10% slip;
  • $P(G) = 0.2$ → among the “don’t know” group, 20% guess right, 80% miss.

Four cells:

|  | Answers correctly | Answers incorrectly |
| --- | --- | --- |
| Knows ($P(L) = 0.2$) | $0.2 \cdot 0.9 = 0.18$ | $0.2 \cdot 0.1 = 0.02$ |
| Doesn’t know ($1 - P(L) = 0.8$) | $0.8 \cdot 0.2 = 0.16$ | $0.8 \cdot 0.8 = 0.64$ |
| **Total** | **0.34** | **0.66** |

Cell probabilities sum to $1.00$ — a full partition.
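The four cells can be checked in a few lines. A minimal sketch in TypeScript, using the numbers above (the variable names here are illustrative, not from `web/lib/bkt.ts`):

```ts
// Parameters from the example above.
const pL = 0.2;     // P(L): prior probability the student knows the skill
const pSlip = 0.1;  // P(S): knows it, but slips
const pGuess = 0.2; // P(G): doesn't know it, but guesses right

// Joint probability of each of the four cells.
const knowsCorrect = pL * (1 - pSlip);       // 0.18
const knowsWrong = pL * pSlip;               // 0.02
const guessCorrect = (1 - pL) * pGuess;      // 0.16
const noKnowWrong = (1 - pL) * (1 - pGuess); // 0.64

// A full partition: the cells must sum to 1.
const total = knowsCorrect + knowsWrong + guessCorrect + noKnowWrong;
console.log(total.toFixed(2)); // "1.00"
```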

Observing “correct” restricts us to the correct column — mass 0.34.

What fraction of that mass are “knowers”?

$$P(\text{knows} \mid \text{correct}) = \frac{0.18}{0.34} \approx 0.529$$

That is Bayes — numerator “knows AND correct,” denominator “all correct.”

In symbols:

$$P(L \mid \text{correct}) = \frac{\overbrace{P(L) \cdot (1 - P(S))}^{\text{knew and didn't slip}}}{\underbrace{P(L) \cdot (1 - P(S))}_{\text{from knowers}} + \underbrace{(1 - P(L)) \cdot P(G)}_{\text{from non-knowers who guessed}}}$$

And for “incorrect”:

$$P(L \mid \text{wrong}) = \frac{P(L) \cdot P(S)}{P(L) \cdot P(S) + (1 - P(L)) \cdot (1 - P(G))}$$

Take $P(L) = 0.2$, $P(S) = 0.1$, $P(G) = 0.2$. The student answers correctly.

$$P(L \mid \text{correct}) = \frac{0.2 \cdot 0.9}{0.2 \cdot 0.9 + 0.8 \cdot 0.2} = \frac{0.18}{0.18 + 0.16} = \frac{0.18}{0.34} \approx 0.529$$

Confidence jumps from 0.2 to ~0.529 — almost “toss-up, leaning knows.”

What if they answer incorrectly?

$$P(L \mid \text{wrong}) = \frac{0.2 \cdot 0.1}{0.2 \cdot 0.1 + 0.8 \cdot 0.8} = \frac{0.02}{0.02 + 0.64} = \frac{0.02}{0.66} \approx 0.030$$

Confidence collapses toward zero.
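Both posteriors fit in a short sketch. Note that with $P(G) = 0.2$ the wrong-answer denominator uses $1 - P(G) = 0.8$ (the variable names below are illustrative):

```ts
// Parameters from the worked example.
const pL = 0.2, pSlip = 0.1, pGuess = 0.2;

// Posterior after a correct answer: 0.18 / 0.34.
const pAfterCorrect =
  (pL * (1 - pSlip)) / (pL * (1 - pSlip) + (1 - pL) * pGuess);

// Posterior after a wrong answer: 0.02 / 0.66.
const pAfterWrong =
  (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));

console.log(pAfterCorrect.toFixed(3)); // "0.529"
console.log(pAfterWrong.toFixed(3));   // "0.030"
```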

Note: with $P(L_0) = 0.2$ a mistake drops confidence from 0.2 to ~0.030 — roughly a sevenfold drop. That’s correct:

We barely believed in them; they missed — exactly what we expected from a non-knower.

But at $P(L) = 0.95$ (already confident) one mistake only lowers $P(L)$ to ~0.70 — the model treats it mostly as a slip. Check:

$$P(L \mid \text{wrong}) = \frac{0.95 \cdot 0.1}{0.95 \cdot 0.1 + 0.05 \cdot 0.8} = \frac{0.095}{0.095 + 0.04} = \frac{0.095}{0.135} \approx 0.704$$

That’s calibrated updating.
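The asymmetry is easy to see by running the same wrong-answer update from two different priors (a sketch; names are mine):

```ts
// Wrong-answer Bayes update as a small helper.
const afterWrong = (pL: number, pSlip = 0.1, pGuess = 0.2): number =>
  (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));

// Weak prior: the miss confirms what we already suspected.
console.log(afterWrong(0.2).toFixed(3));  // "0.030"

// Strong prior: the miss is mostly absorbed as a slip.
console.log(afterWrong(0.95).toFixed(3)); // "0.704"
```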

Step 2: allow “learning during the attempt”

After applying Bayes (the posterior) we add one small step:

$$P(L_{\text{new}}) = P_{\text{posterior}} + (1 - P_{\text{posterior}}) \cdot P(T)$$

Meaning:

Even if the posterior says you didn’t know — you still had probability $P(T)$ of learning during this problem.

With $P(T) = 0.1$:

$$P(L_{\text{new}}) = 0.529 + (1 - 0.529) \cdot 0.1 \approx 0.576$$

After an error:

$$P(L_{\text{new}}) = 0.030 + (1 - 0.030) \cdot 0.1 \approx 0.127$$

That’s it. The whole model.

```ts
// Step 1. Posterior via Bayes.
const posterior = correct
  ? (pL * (1 - pSlip)) / (pL * (1 - pSlip) + (1 - pL) * pGuess)
  : (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));
// Step 2. Update accounting for possible learning.
const pL_new = posterior + (1 - posterior) * pTransit;
```

Exactly this appears in web/lib/bkt.ts — function bktUpdate. See the code walk-through.
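A self-contained version of the two steps can be sketched as below. This is an assumption about the shape of the function — the actual `bktUpdate` in `web/lib/bkt.ts` may take its parameters differently:

```ts
// Sketch of a BKT update; the signature is illustrative,
// not necessarily identical to bktUpdate in web/lib/bkt.ts.
function bktUpdate(
  pL: number,       // current P(L)
  correct: boolean, // was the answer correct?
  pSlip = 0.1,      // P(S): slip probability
  pGuess = 0.2,     // P(G): guess probability
  pTransit = 0.1,   // P(T): chance of learning during the attempt
): number {
  // Step 1. Posterior via Bayes.
  const posterior = correct
    ? (pL * (1 - pSlip)) / (pL * (1 - pSlip) + (1 - pL) * pGuess)
    : (pL * pSlip) / (pL * pSlip + (1 - pL) * (1 - pGuess));
  // Step 2. Account for possible learning during the attempt.
  return posterior + (1 - posterior) * pTransit;
}

// Correct answer from P(L) = 0.2: Bayes gives ≈0.529, learning lifts it to ≈0.576.
console.log(bktUpdate(0.2, true).toFixed(3)); // "0.576"
```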

All four cells of the 2×2 table on one square. Drag the sliders and press “correct” or “incorrect”. The highlight shows what mass “survives” the observation, and the formula prints the posterior.

[Interactive widget: «All possible outcomes» — a unit square split into “knows” / “doesn’t know” columns; column heights follow $P(L)$ and $1 - P(L)$, green width marks “correct”, and the cells matching the observation light up while $P(L \mid ✓)$, $P(L \mid ✗)$, the posterior, and its shift are printed alongside.]

Same Bayes story as above — without algebra — you see which area matches the observation.

Next chapter — both formulas side by side; chapter 7 — full numeric walk-through across six tasks.