User Intent Prediction #8: Two Tasks at Once. Forced Learning. Still Failed.
I tried to cure the model’s stupidity by giving it ADHD. Spoiler: it just became an overworked idiot.
If a student fails math, give them a physics test too. Maybe the stress will make them smarter.
That was my logic for Multi-Task Learning. Sequence models failed, direction models failed… maybe overloading the model will work?
My model couldn’t predict purchases. So I forced it to predict the next screen at the same time.
The Theory: To predict the next screen, the model must understand the funnel structure. This “structural knowledge” should help it predict purchases.
The Reality: It became an expert at navigation and a failure at sales.
The Experiment
44K sequences, 112 screens, 1 stupid idea.
I built a model with two heads:
- Head A: Predicts “Will they buy?” (Binary classification)
- Head B: Predicts “What is the next screen?” (Multi-class classification)
I trained them together. The loss function was Loss = Loss_Purchase + Loss_NextScreen.
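For the curious, here's roughly what that setup looks like. A minimal PyTorch sketch: the encoder type, layer sizes, and names are my illustration here, not the exact production model.

```python
import torch
import torch.nn as nn

NUM_SCREENS = 112  # screens in the funnel

class TwoHeadModel(nn.Module):
    """Shared encoder over the screen sequence, one head per task."""
    def __init__(self, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(NUM_SCREENS, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.purchase_head = nn.Linear(hidden, 1)               # Head A: will they buy?
        self.next_screen_head = nn.Linear(hidden, NUM_SCREENS)  # Head B: what screen comes next?

    def forward(self, screen_ids):
        x = self.embed(screen_ids)    # (batch, seq_len, emb_dim)
        _, h = self.encoder(x)        # final hidden state: (1, batch, hidden)
        h = h.squeeze(0)
        return self.purchase_head(h).squeeze(-1), self.next_screen_head(h)

model = TwoHeadModel()
bce = nn.BCEWithLogitsLoss()   # purchase: binary classification
ce = nn.CrossEntropyLoss()     # next screen: multi-class classification

def combined_loss(screen_ids, bought, next_screen):
    purchase_logit, next_screen_logits = model(screen_ids)
    loss_purchase = bce(purchase_logit, bought.float())
    loss_next_screen = ce(next_screen_logits, next_screen)
    return loss_purchase + loss_next_screen   # equal weights, no tuning
```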
The Result
- Next Screen Accuracy: 92% (Amazing!)
- Purchase Prediction: 0.54 (Useless)
And again: with a 1.5% conversion rate, the purchase task is brutally imbalanced, so being 92% accurate at navigation does nothing to help find the rare buyers.
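Quick sanity math on synthetic data (the only real number here is the 1.5% conversion rate, and I'm reading the 0.54 as an AUC-style score): a model that never predicts a purchase is already ~98.5% "accurate", and random guessing scores ~0.5. That's the bar the purchase head barely cleared.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_users = 44_000
conversion_rate = 0.015                 # 1.5% of sequences end in a purchase
bought = rng.random(n_users) < conversion_rate

# Baseline 1: predict "nobody ever buys" -- accuracy looks great, finds zero buyers.
always_no = np.zeros(n_users, dtype=bool)
print("accuracy of 'nobody buys':", (always_no == bought).mean())        # ~0.985

# Baseline 2: random scores -- roughly what an AUC near 0.5 means in practice.
random_scores = rng.random(n_users)
print("AUC of random guessing:", roc_auc_score(bought, random_scores))   # ~0.50
```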
The model learned the funnel perfectly.
- It knew that “Quiz” leads to “Weight.”
- It knew that “Plan” leads to “Paywall.”
It got an A+ in Geography. It got an F in Economics.
The “Map vs Money” Problem
Knowing where a user is going doesn’t tell you if they will pay.
- User A: Clicks “Next” to see the price. (Curious).
- User B: Clicks “Next” to buy. (Committed).
They both follow the same path to the Paywall, but only one pulls out the card.
The model correctly predicted the screen for both. But it couldn’t tell them apart.
Structure is not Intent. I proved (again) that you can understand the app perfectly and still not understand the user.
The Lesson
Don’t distract your model.
I thought adding a second task would “force” the model to learn better features. Instead, the model optimized for the easier task (predicting the next screen) and ignored the hard task (predicting the sale).
Neural networks are lazy. Multi-task learning just gave this one more ways to avoid thinking.
They always take the path of least resistance. Predicting the next screen is easy (it's fixed by the app logic). Predicting a sale is hard (it's human psychology).
The model chose the easy path. Of course it did. Every ML system is a teenager: do the easy homework, ignore the hard one.
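To make the laziness concrete, here's the kind of diagnostic I'd reach for (continuing the TwoHeadModel sketch above, on a fake batch; this is not output from my real training run): compare how hard each loss term pulls on the shared encoder. If the next-screen gradient dwarfs the purchase gradient, the shared features are being shaped almost entirely by the easy navigation task.

```python
# Continuing the sketch above: which task actually trains the shared encoder?
screen_ids = torch.randint(0, NUM_SCREENS, (32, 10))   # 32 fake sequences, 10 screens each
bought = torch.rand(32) < 0.015                        # ~1.5% positives
next_screen = torch.randint(0, NUM_SCREENS, (32,))

purchase_logit, next_screen_logits = model(screen_ids)
loss_p = bce(purchase_logit, bought.float())
loss_n = ce(next_screen_logits, next_screen)

shared = list(model.embed.parameters()) + list(model.encoder.parameters())
grad_p = torch.autograd.grad(loss_p, shared, retain_graph=True)
grad_n = torch.autograd.grad(loss_n, shared)

total_norm = lambda grads: torch.sqrt(sum(g.pow(2).sum() for g in grads))
print("purchase gradient norm:   ", total_norm(grad_p).item())
print("next-screen gradient norm:", total_norm(grad_n).item())
# If the second number is much larger, the encoder is learning navigation,
# and the purchase head is just along for the ride.
```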
Which brings us to the next experiment: feeding the model a time series of motivation. If the model is lazy, maybe we should just feed it the answer directly?