

User Intent Prediction #11: Model Export Conversion.



I had a working model. I just needed to put it on the website. How hard could it be?

Narrator: It was extremely hard.

I entered Dependency Hell. I met the Devil. His name was tf2onnx.

The Three Paths to Failure

I tried the three “standard” ways to deploy a TensorFlow model to the web.

Path 1: TensorFlow.js

  • Promise: “Run TF models in the browser!”
  • Reality: 2MB bundle size. Slow startup. Crashed on mobile Safari.
  • Error: WebGL context lost. Retrying... (It never retried).
  • Verdict: Too fat.

Path 2: TFLite

  • Promise: “Lightweight mobile inference!”
  • Reality: Requires a custom runtime. Documentation is a maze of broken links.
  • Error: Unsupported op: GatherV2 (My model used a basic gather operation).
  • Verdict: Too complex.

Path 3: ONNX (The Standard)

  • Promise: “Universal format!”
  • Reality: tf2onnx failed. Version mismatch. Opset 13 vs Opset 18.
  • Error: ValueError: Cannot convert model with opset version 18. Supported: 13.
  • Verdict: Broken.

Dependency Hell

I spent 3 days fighting version numbers.

pip install tensorflow==2.15
# → breaks numpy
pip install numpy==1.26
# → breaks pandas
pip install pandas==2.1
# → breaks tensorflow

I created a virtual environment. I created another virtual environment. I created a Docker container.

Nothing worked.

I wasn’t doing ML. I was doing plumbing.

The Solution: Ignore the Tools

I realized something. My model was just a matrix multiplication:

Output = Input * Weights + Bias

Why was I installing 2GB of libraries to do multiplication?
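In plain numpy, that entire “model” fits on a napkin. A minimal sketch (the shapes and random values here are placeholders; the real W and b come out of the trained model):

import numpy as np

# The whole model: one dense layer.
# Placeholder shapes and values; the real W and b are extracted
# from the trained TensorFlow model.
W = np.random.rand(16, 4).astype(np.float32)  # weights: (features, classes)
b = np.random.rand(4).astype(np.float32)      # bias: (classes,)

def predict(x):
    # Output = Input * Weights + Bias
    return x @ W + b

print(predict(np.ones((1, 16), dtype=np.float32)))  # shape (1, 4)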

I wrote a custom script in 50 lines of Python (sketched after the list):

  1. Load the TensorFlow model.
  2. Extract the weight matrices.
  3. Write them to a raw ONNX file manually (using the onnx library directly).
  4. Run it with onnxruntime (minimal runtime, 12MB).
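Roughly like the sketch below. To hedge: it assumes the model boils down to a single dense layer, and the file names, layer index, and shapes are placeholders, not the exact script.

import numpy as np
import onnx
import tensorflow as tf
from onnx import helper, numpy_helper, TensorProto

# Steps 1–2: load the trained model and extract the weight matrices.
# (Assumes the final layer is the one dense layer we care about.)
keras_model = tf.keras.models.load_model("intent_model.keras")
weights, bias = keras_model.layers[-1].get_weights()
in_dim, out_dim = weights.shape

# Step 3: write the ONNX graph by hand: one Gemm node computing x @ W + b.
graph = helper.make_graph(
    nodes=[helper.make_node("Gemm", ["input", "W", "b"], ["output"])],
    name="intent_predictor",
    inputs=[helper.make_tensor_value_info("input", TensorProto.FLOAT, [1, in_dim])],
    outputs=[helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, out_dim])],
    initializer=[
        numpy_helper.from_array(weights.astype(np.float32), name="W"),
        numpy_helper.from_array(bias.astype(np.float32), name="b"),
    ],
)

# Pin the opset explicitly: no converter, no version negotiation.
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)
onnx.save(model, "intent_model.onnx")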

It worked instantly.
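Step 4 is just as small. Loading and running the file with onnxruntime (the input name matches the sketch above; the feature width is illustrative):

import numpy as np
import onnxruntime as ort

# Load the hand-written ONNX file and run one prediction on CPU.
session = ort.InferenceSession("intent_model.onnx", providers=["CPUExecutionProvider"])
x = np.random.rand(1, 16).astype(np.float32)  # one feature vector
(output,) = session.run(None, {"input": x})
print(output)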

  • Size: 9KB (vs 2MB for TF.js).
  • Dependencies: One (onnxruntime).
  • Speed: Blazing (27μs per prediction).

The Lesson

Tools are fragile. Math is solid.

The “standard stack” is bloated and brittle. When the tools fail, go back to the math. A matrix multiplication doesn’t have version conflicts.

I stopped being a library user and started being an engineer.


Next up: smart discounts and regret. Now that the model is live, let’s see if it actually makes money.