
# Tasks with multiple skills

Real textbook items rarely test one micro-skill. Example:

$$2(x - 3) = 10$$

simultaneously requires:

  • linear_eq.expand_brackets,
  • arith.signs,
  • linear_eq.move_to_one_side,
  • linear_eq.divide_by_coefficient.

If Ivan has zero mastery of any one of them, the task fails even if the others sit at 0.95.

## Combining P(solve) across skills

We have a per-skill P(solveᵢ). What is the joint P(solve) for the whole task? The obvious first candidate is the arithmetic mean:

$$P_{\text{AM}} = \frac{1}{n} \sum_{i=1}^{n} P(\text{solve}_i)$$

Three skills: P₁ = 0.95, P₂ = 0.95, P₃ = 0.20.

$$P_{\text{AM}} = \frac{0.95 + 0.95 + 0.20}{3} = 0.70$$

Looks perfect: the selector screams ZPD. Yet Ivan will almost certainly fail, because skill 3 sits at 0.20.

The arithmetic mean hides the weak link. The geometric mean does not:

$$P_{\text{GM}} = \exp\left(\frac{1}{n} \sum_{i=1}^{n} \log P(\text{solve}_i)\right) = \sqrt[n]{\prod_{i=1}^{n} P(\text{solve}_i)}$$

Same example:

$$P_{\text{GM}} = \sqrt[3]{0.95 \cdot 0.95 \cdot 0.20} = \sqrt[3]{0.1805} \approx 0.565$$

The GM of 0.565 is much lower than the AM of 0.70. The selector sees the task is not really in the ZPD (|0.565 − 0.7| is still sizable) and can pick something better.

If any factor is tiny, the whole product shrinks, and the geometric mean inherits that. Chain analogy: one weak link snaps the whole chain.
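The contrast fits in a few lines. A standalone sketch (the helper names `arithmeticMean` / `geometricMean` are ours, not exports of `web/lib/bkt.ts`):

```typescript
// Arithmetic mean: strong skills can mask a weak one.
function arithmeticMean(ps: number[]): number {
  return ps.reduce((a, b) => a + b, 0) / ps.length;
}

// Geometric mean: one tiny factor drags the whole score down.
function geometricMean(ps: number[]): number {
  return Math.exp(ps.reduce((a, p) => a + Math.log(p), 0) / ps.length);
}

const ps = [0.95, 0.95, 0.2]; // two strong skills, one weak
console.log(arithmeticMean(ps).toFixed(3)); // "0.700"
console.log(geometricMean(ps).toFixed(3));  // "0.565"
```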

| P₁ | P₂ | P₃ | AM | GM |
| --- | --- | --- | --- | --- |
| 0.50 | 0.50 | 0.50 | 0.500 | 0.500 |
| 0.70 | 0.70 | 0.70 | 0.700 | 0.700 |
| 0.90 | 0.90 | 0.30 | 0.700 | 0.624 |
| 0.95 | 0.95 | 0.20 | 0.700 | 0.565 |
| 0.99 | 0.99 | 0.10 | 0.693 | 0.461 |
| 0.50 | 0.50 | 0.05 | 0.350 | 0.232 |

The wider the gap between strong and weak skills, the further GM drops below AM — exactly what we want.

Slide three P(solve) sliders side by side. Especially fun: keep P₁ = P₂ = 0.95 and sweep P₃ from 0.05 to 1.0. GM exposes the weak skill; AM hides it.

A task with three micro-skills. Move the sliders and watch how the different aggregation methods behave.
Default readout: arith. mean 0.700 · geom. mean (ours) 0.565 · min 0.200.

When even one P is low, GM drops sharply but AM doesn’t. min is too harsh. GM is the «weakest-link» compromise we need.
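Why min is too harsh is easy to demonstrate: it throws away all information about the strong skills, so it cannot distinguish a task with one weak skill from a task where everything is weak. A small sketch (the `gm` helper is ours):

```typescript
// Geometric mean of per-skill solve probabilities.
function gm(ps: number[]): number {
  return Math.exp(ps.reduce((a, p) => a + Math.log(p), 0) / ps.length);
}

const allWeak = [0.2, 0.2, 0.2];   // everything at 0.2
const oneWeak = [0.95, 0.95, 0.2]; // two strong skills, one at 0.2

// min can't tell these tasks apart…
console.log(Math.min(...allWeak), Math.min(...oneWeak)); // 0.2 0.2

// …but GM can: strong skills cushion the weak one a little.
console.log(gm(allWeak).toFixed(3));  // "0.200"
console.log(gm(oneWeak).toFixed(3));  // "0.565"
```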

In web/lib/bkt.ts we compute the GM via a sum of logs for numerical stability (products of small probabilities underflow):

```typescript
const perSkillPL: Record<MicroSkillId, number> = {};
let logSum = 0;
for (const skillId of task.microskills) {
  const pL = state.mastery[skillId] ?? params.pInit;
  perSkillPL[skillId] = pL;
  logSum += Math.log(Math.max(1e-6, pSolve(pL, params)));
}
const pSolveJoint = Math.exp(logSum / task.microskills.length);
```

Note: the Math.max(1e-6, ...) clamp guards against log(0) = −∞. In practice P(solve) stays above zero thanks to P(G) > 0, but the clamp is cheap insurance.
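The underflow the log-sum avoids is real, if exaggerated here for effect: with enough tiny factors, a raw double-precision product rounds to exactly 0, and no root can recover it. A standalone demonstration (the 200-skill task is deliberately unrealistic):

```typescript
// 200 skills, each at P(solve) = 0.01. The true GM is exactly 0.01.
const probs: number[] = Array(200).fill(0.01);

// Naive route: multiply first, take the n-th root after.
// 0.01^200 = 1e-400 is far below the smallest double, so the product is 0.
const naive = Math.pow(probs.reduce((a, p) => a * p, 1), 1 / probs.length);

// Stable route: average the logs, exponentiate once.
const stable = Math.exp(probs.reduce((a, p) => a + Math.log(p), 0) / probs.length);

console.log(naive);  // 0 — the product underflowed before the root
console.log(stable); // ≈ 0.01
```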

| Method | Idea | Why not |
| --- | --- | --- |
| Minimum, min Pᵢ | failure follows the weakest skill | Too harsh; in reality strong skills slightly cushion weak ones |
| Product, ∏ Pᵢ | independence story | Same as GM raised to the nth power; incomparable across different n |
| Harmonic mean | also suppresses weak links | Harder to pitch; similar effect to GM |
| Multidimensional logistic | regress on skill pairs | Needs pairwise parameters, a heavy data appetite |

The GM is the simple, interpretable, mathematically motivated compromise.

More skills ⇒ stronger “scatter penalty.” Fine for elementary items; for long multi-step problems (10+ skills) GM can get overly pessimistic.

Hackathon plan: cap tasks at 2–4 micro-skills. If an item needs 7+, split it into steps (one micro-skill each) or admit it is too large for adaptive modeling.
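The cap is easy to enforce at authoring time. A hypothetical guard, not part of `web/lib/bkt.ts` (the `Task` shape and `validateTask` are ours):

```typescript
// Minimal task shape assumed for this sketch.
interface Task {
  id: string;
  microskills: string[];
}

const MAX_MICROSKILLS = 4; // hackathon cap: 2–4 micro-skills per task

// Reject tasks that are too large for the GM-based selector.
function validateTask(task: Task): void {
  if (task.microskills.length > MAX_MICROSKILLS) {
    throw new Error(
      `Task ${task.id} uses ${task.microskills.length} micro-skills; ` +
      `split it into steps of at most ${MAX_MICROSKILLS} each.`
    );
  }
}

validateTask({ id: "expand-and-solve", microskills: ["a", "b", "c"] }); // ok
```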

“We aggregate with geometric mean across micro-skills — so the task fails if any required skill fails. That matches reality better than plain averaging, and corresponds to treating failures as independent events.”

The same calculation runs against the real matrix on the Progression matrix → Selector simulator page — 9 micro-skills and 20 tasks from the «Defineerimine» topic (curated by Andri Suga). Click “Strong +/−, weak ×/÷” and watch the geometric mean drag down exactly the tasks where the weakened skill is one of the core ones.

Companion project: Tom Kabel’s MATx. Once a student has the modeling down (our 9-microskill defining-skills matrix), equations like 2(x − 3) = 10 still need to be solved; that is a different domain, and Tom has already built a tool for it. The 9↔9 mapping is on the Bridge to MATx page.