The four BKT numbers
BKT is just four numbers per skill. Understand them and you’ve grasped half the model.
Parameters
| Symbol | Name | Meaning | Default |
|---|---|---|---|
| P(L0) | Initial knowledge | prior confidence the student already knows the skill before the first attempt | 0.2 |
| P(T) | Transit | probability of learning in one attempt (even if incorrect) | 0.1 |
| P(S) | Slip | “knew but slipped through carelessness” | 0.1 |
| P(G) | Guess | “didn’t know but guessed / intuitively correct” | 0.2 |
Literature defaults (Corbett–Anderson) usually sit in the 0.1–0.2 band — the model stays stable.
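In code, the whole per-skill model is just these four floats. A minimal sketch with our own names (not tied to any particular library):

```python
from dataclasses import dataclass

@dataclass
class BKTParams:
    """The four per-skill BKT parameters, initialized to the literature defaults."""
    p_init: float = 0.2     # P(L0): prior that the skill is already known
    p_transit: float = 0.1  # P(T): chance of learning during one attempt
    p_slip: float = 0.1     # P(S): knew the skill but answered wrong
    p_guess: float = 0.2    # P(G): didn't know the skill but answered right

defaults = BKTParams()                 # one default set for everyone (demo mode)
strong_prior = BKTParams(p_init=0.45)  # e.g. when prerequisites look strong
```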
Why these four
Each number answers one specific question; without it the model breaks.
P(L0) — where to place the student at start
Before the first answer we know little about this particular student. Pick a moderately low value (they’ve just started the topic): P(L0) = 0.2.
You can raise it toward 0.4–0.5 if prerequisites look strong, or drop it toward 0.1 if they’re fresh from seventh grade. On demos we keep one default for everyone — it keeps tagging simpler.
P(T) — can the student learn while solving
Critical. Without P(T) the model freezes: after 100 tasks there’s no growth; the only way to lift P(L) is a streak of correct answers.
With P(T) the model says:
Even if you were wrong — you still had a chance to learn during the attempt. So I’ll bump confidence in mastery a bit.
P(T) = 0.1 is the literature default — ~10% chance per attempt to move toward “knows.”
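A sketch of just this transit step in isolation (the full update that combines it with the observed answer is the next chapter’s topic; the helper name is ours):

```python
def apply_transit(p_know: float, p_transit: float = 0.1) -> float:
    """Transit step: after any attempt the student may have learned,
    so part of the remaining 'doesn't know' mass moves to 'knows'."""
    return p_know + (1 - p_know) * p_transit

# Ten attempts, ignoring the answers entirely, still drift the estimate upward,
# because every attempt is a chance to learn.
p = 0.2
for _ in range(10):
    p = apply_transit(p)
print(round(p, 3))  # 0.721
```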
P(S) — slip — prevents panic
Without slip a single mistake nukes confidence:
Wrong answer ⇒ doesn’t know.
That’s false. Everyone sometimes:
- flips a sign;
- misreads the prompt;
- rushes mental math;
- loses focus after four tasks in a row.
P(S) = 0.1 tells the model:
~10% of “knowing” attempts still miss. Don’t panic on one wrong answer.
So confidence drops — but not to zero.
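To put numbers on that, here is the standard BKT posterior after a wrong answer, as a small self-contained sketch (helper name is ours):

```python
def update_after_wrong(p_know: float, p_slip: float = 0.1, p_guess: float = 0.2) -> float:
    """Bayes posterior P(knows | wrong answer): a wrong answer may just be a slip."""
    knew_and_slipped = p_know * p_slip
    didnt_know_and_didnt_guess = (1 - p_know) * (1 - p_guess)
    return knew_and_slipped / (knew_and_slipped + didnt_know_and_didnt_guess)

print(round(update_after_wrong(0.6), 3))              # 0.158: drops, but not to zero
print(round(update_after_wrong(0.6, p_slip=0.0), 3))  # 0.0: without slip, one mistake nukes it
```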
P(G) — guess — prevents naivety
Mirror of slip:
- four-option MC gives ~25% blind guess odds;
- sometimes the right number appears “by accident”;
- some tasks allow computing an answer without understanding.
P(G) = 0.2 tells the model:
~20% of correct answers might not reflect real mastery. Don’t celebrate one lucky guess.
So confidence rises — but not to 1.0.
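The mirror-image posterior after a correct answer, again a sketch with our own helper name:

```python
def update_after_correct(p_know: float, p_slip: float = 0.1, p_guess: float = 0.2) -> float:
    """Bayes posterior P(knows | correct answer): a correct answer may be a lucky guess."""
    knew_and_answered = p_know * (1 - p_slip)
    guessed = (1 - p_know) * p_guess
    return knew_and_answered / (knew_and_answered + guessed)

print(round(update_after_correct(0.6), 3))               # 0.871: rises, but not to 1.0
print(round(update_after_correct(0.6, p_guess=0.0), 3))  # 1.0: without guess, one lucky hit looks like mastery
```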
Why these exact numbers
They’re literature defaults from classic BKT work since 1995 (Corbett & Anderson, User Modeling and User-Adapted Interaction). They behave reasonably across domains (math, grammar, programming) when you lack bespoke data.
In production you should tune them automatically from answer histories via EM (Expectation–Maximization). That’s covered in Notebook 3 — EM fitting.
If asked about the defaults
“Why these values and not, say, 0.15?”
Ready answer:
These are literature defaults from foundational BKT papers. On real data we fit them with EM — it’s on our roadmap. For MVP we anchor sensible baselines — moving within ~0.05–0.15 doesn’t change the qualitative story.
What happens if parameters are way off
Full sensitivity study — Notebook 2 — Parameter sensitivity. Short version:
- P(G) too high — the model distrusts correct answers; P(L) barely rises. Students loop drills forever.
- P(S) too high — the model distrusts mistakes; P(L) barely falls. Weak students get tasks that are too hard.
- P(T) too low — no learning during attempts; only demonstrating existing knowledge is modeled.
- P(T) too high — unrealistically fast mastery in a handful of tasks — plausible only for trivial adult skills.
Defaults 0.2 / 0.1 / 0.1 / 0.2 hit the sweet spot.
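A quick way to see these failure modes is to run one update with the defaults and with an inflated parameter side by side. This sketch combines the pieces shown above (all names are ours):

```python
def bkt_step(p_know, correct, p_transit=0.1, p_slip=0.1, p_guess=0.2):
    """One full BKT update: Bayes on the observed answer, then the transit step."""
    if correct:
        posterior = p_know * (1 - p_slip) / (p_know * (1 - p_slip) + (1 - p_know) * p_guess)
    else:
        posterior = p_know * p_slip / (p_know * p_slip + (1 - p_know) * (1 - p_guess))
    return posterior + (1 - posterior) * p_transit

# One answer, defaults vs. an inflated parameter:
print(round(bkt_step(0.8, correct=False), 3))              # 0.4   (a mistake pulls P(L) down)
print(round(bkt_step(0.8, correct=False, p_slip=0.4), 3))  # 0.7   (high slip: barely falls)
print(round(bkt_step(0.3, correct=True), 3))               # 0.693 (a correct answer lifts P(L))
print(round(bkt_step(0.3, correct=True, p_guess=0.6), 3))  # 0.452 (high guess: barely rises)
```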
Try it: how P(S) and P(G) shape P(solve)
Drag the sliders: P(G) sets the left end of the line (at P(L) = 0), 1 − P(S) the right end (at P(L) = 1). In between, the slope equals 1 − P(S) − P(G).
A straight line. P(solve) equals P(G) at P(L)=0 and 1−P(S) at P(L)=1. Slope = 1−P(S)−P(G).
If P(S) + P(G) approaches 1 (e.g. 0.5 / 0.5) the line flattens — correctness carries almost no information about P(L). BKT stops working.
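The same line in code: P(solve) = P(L)(1 − P(S)) + (1 − P(L))P(G), which matches the endpoints and slope above. A small sketch (function name is ours):

```python
def p_solve(p_know, p_slip=0.1, p_guess=0.2):
    """P(correct answer): knows and doesn't slip, or doesn't know and guesses."""
    return p_know * (1 - p_slip) + (1 - p_know) * p_guess

for p_know in (0.0, 0.5, 1.0):
    print(p_know, round(p_solve(p_know), 2), round(p_solve(p_know, 0.5, 0.5), 2))
# defaults:  0.2, 0.55, 0.9  -- a steep line, answers carry information
# 0.5 / 0.5: 0.5, 0.5,  0.5  -- flat line, answers say nothing about P(L)
```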
Now we have state (P(L)) and parameters (P(L0), P(T), P(S), P(G)). Time for the update rule — BKT’s core math — in the next chapter.