Sergey Kopanev: you sleep — agents ship

Go Back
B2C Funnel Systems · Part 9

Personalization Worked. Then Overfit.


Personalization produced lift in week one.

Week three looked random.

Week four looked dangerous.

I saw this after early personalization wins in a long health funnel. The lift was real. Stability was not.

The initial boost was strong.

Then Facebook delivery algorithms shifted. Traffic quality degraded hard.

What stopped working was not personalization logic in isolation. The input stream changed.

What Personalization Actually Was

It was not a giant recommendation model.

It was sequenced messaging.

One screen asks a question.

The next screen uses that answer directly.

“For users who selected X, do Y.”

Then the funnel repeats that pattern across the flow.

Answer on step N becomes context on step N+1.

We used real inputs from the weight-loss goal flow.

Gender.

Starting weight.

Target weight.

Time horizon.

Declared constraints from earlier screens.

The next screen reflected those values directly.

That made the experience feel specific. Conversion moved fast.

Why Early Lift Happens

Any relevant segmentation beats one-size-fits-all.

So first results look great.

Confidence rises.

Then teams add more branches, more rules, more model weight.

Complexity grows faster than signal.

Where Overfit Starts

Overfit shows up as fragility.

Small traffic shifts create large outcome swings.

A variant that won last week loses hard this week.

Nobody can explain why without writing fan fiction.

In this case, traffic source drift accelerated the failure.

Segment behavior changed with the channel quality.

Rules tuned on yesterday’s Facebook profile started missing today’s one.

Guardrails I Use

Personalization runs inside limits.

  • minimum sample floor per segment
  • rollback trigger on retention drop
  • global fallback path always available
  • weekly review of segment stability, not only conversion

This keeps wins while reducing random blowups.

What Changed

We removed low-signal branches.

We kept only segments with repeatable behavior.

The system looked less clever.

It made more money.

Trade-Off

You give up flashy decision trees.

You gain reliability.

More personalization logic means more sensitivity to channel drift.

Reliability compounds. Flash does not.

Takeaway

Personalization is a tool, not an identity.

Constrain it. Measure stability. Keep only repeatable lift.

If the traffic source degrades, revisit assumptions first. The model may be fine. The input is not.


Next: How I’d Start Today If I Had to Start From Zero..