Sergey Kopanev - Entrepreneur & Systems Architect

Recording macOS Without Root Access


I needed to record Zoom calls.

The easy way: install Soundflower.

The problem: Soundflower is dead.

What Soundflower Did

It was simple.

Install a kernel extension. Create a virtual audio device. Route your system audio through it. Done.

Every audio tool on macOS used it.

Then macOS Catalina deprecated kernel extensions.

Apple decided your audio routing was a security threat.

The old way stopped working.

What You Actually Need

Two audio streams. At the same time.

Your microphone: Your voice.

System audio: The other person’s voice. The YouTube video. The notification ping.

Mix them. Save to disk.

Sounds trivial.

It’s not.

The One Stream Problem

macOS gives each app one audio stream.

Your microphone can’t be opened by two apps at once.

Try it:

Open QuickTime. Start recording audio.

Open Audacity. Try to record from the same mic.

“Device is already in use.”

So how do Zoom + Slack + Discord all record your mic simultaneously?

They don’t.

They ask the operating system for audio.

The OS handles the mixing.

Your job: convince macOS to give you both mic and system audio without locking anyone else out.

The API Apple Hid

The answer isn’t AVAudioRecorder.

It’s not AVCaptureSession.

It’s ScreenCaptureKit.

The screen recording API.

Which also captures audio.

ScreenCaptureKit shipped in macOS 12.3. Audio capture arrived quietly in macOS 13. No migration guide.

You use the screen capture API. You ignore the screen. You take the audio.

It works.

Two Streams, Two Problems

ScreenCaptureKit gives you system audio.

AVAudioEngine gives you the microphone.

Both arrive in separate callbacks.

Both at different sample rates.

Mic: 48kHz, every 85ms.

System: Could be 44.1kHz, 48kHz, or 96kHz depending on what app is playing. Every 100ms.

Mix them directly?

After 10 minutes, they’re 3 seconds out of sync.

You need to resample everything to a common format. Then align by timestamp.

Then mix.
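
Here is a minimal sketch of the resampling step. It is written in Python for readability, not in the Swift and CoreAudio code a real recorder would use. It assumes mono float samples and uses linear interpolation, which is the simplest resampler there is and not a production-quality one. The name resample_linear is made up for illustration.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample mono float samples from src_rate to dst_rate (linear interpolation)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    # Output length that covers the same duration at the new rate.
    out_len = len(samples) * dst_rate // src_rate
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(out_len):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the final sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# 100 ms of system audio at 44.1 kHz becomes 100 ms at 48 kHz.
chunk_44k = [0.0] * 4410
chunk_48k = resample_linear(chunk_44k, 44_100, 48_000)
print(len(chunk_48k))  # 4800
```

Same duration, different sample count. Only after this step do the mic and system buffers describe time in the same units.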

The Drift

I thought I had it working.

Recorded a 30-minute Zoom call.

Played it back.

First 5 minutes: perfect sync.

Minute 10: slight echo.

Minute 20: I’m responding to things said 2 seconds ago.

Minute 30: complete gibberish.

The buffers were drifting.

I was mixing based on arrival time, not audio timestamp.
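
The difference is easier to see in code. This is a sketch of the idea, not the recorder's actual mixer: every buffer carries its own start time, and it lands in the mix at that time, no matter when the callback delivered it. It assumes both streams are already at the same sample rate and share a clock. The name mix_by_timestamp and the tuple layout are hypothetical.

```python
def mix_by_timestamp(buffers, rate):
    """Sum buffers into one timeline, placing each at its own timestamp.

    buffers: iterable of (start_seconds, samples), from any stream, in any order.
    """
    buffers = list(buffers)
    if not buffers:
        return []
    origin = min(start for start, _ in buffers)
    # Convert each start time to a sample offset from the earliest buffer.
    placed = [(round((start - origin) * rate), samples) for start, samples in buffers]
    length = max(offset + len(samples) for offset, samples in placed)
    timeline = [0.0] * length
    for offset, samples in placed:
        for i, value in enumerate(samples):
            timeline[offset + i] += value
    return timeline


RATE = 1000  # tiny rate so the numbers are easy to follow
mic = [(0.000, [1.0] * 10), (0.010, [1.0] * 10)]
system = [(0.005, [0.5] * 10)]

# Arrival order is scrambled on purpose. The result does not care.
mixed = mix_by_timestamp([system[0], mic[1], mic[0]], RATE)
print(mixed[0], mixed[5], mixed[15])  # 1.0 1.5 1.0
```

A real mixer works on a sliding window instead of holding the whole recording in memory, and has to decide what to do with gaps and late buffers. The placement rule stays the same.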

Fixed it. Recorded again.

This time: perfect sync for 30 minutes.

But now the CPU usage spiked.

Timestamp alignment is expensive.

The Normalization Problem

Your mic is quiet.

System audio is loud.

One Zoom participant whispers. Another yells.

You can’t just mix the streams.

You need real-time loudness normalization.

Not peak normalization. That doesn’t work for speech.

Loudness normalization. EBU R128. The broadcast standard.

Every frame: measure perceived loudness. Adjust gain. Apply.

Now your whisper and your yell both sound reasonable.
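
A sketch of that measure, adjust, apply loop. To be clear about what it is not: this swaps real EBU R128 for plain RMS level in dBFS. R128 measures loudness with K-weighting and gating, and this does neither. The -23 target borrows R128's -23 LUFS number, but RMS dBFS is not LUFS. The class name, the smoothing factor, and the gain limit are all assumptions for illustration.

```python
import math


def rms_dbfs(frame):
    """RMS level of a frame of float samples, in dB relative to full scale (1.0)."""
    if not frame:
        return float("-inf")
    mean_square = sum(s * s for s in frame) / len(frame)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


class GainNormalizer:
    """Per-frame gain control toward a target RMS level (simplified, not R128)."""

    def __init__(self, target_db=-23.0, max_gain_db=20.0, smoothing=0.2):
        self.target_db = target_db
        self.max_gain_db = max_gain_db
        self.smoothing = smoothing  # fraction of the correction applied per frame
        self.gain_db = 0.0

    def process(self, frame):
        level = rms_dbfs(frame)
        if level != float("-inf"):  # leave the gain alone during silence
            wanted = self.target_db - level
            wanted = max(-self.max_gain_db, min(self.max_gain_db, wanted))
            # Move part of the way each frame so the gain does not jump.
            self.gain_db += self.smoothing * (wanted - self.gain_db)
        gain = 10.0 ** (self.gain_db / 20.0)
        # Hard clip as a last resort; a real chain would use a limiter.
        return [max(-1.0, min(1.0, s * gain)) for s in frame]


normalizer = GainNormalizer()
whisper = [0.01, -0.01] * 240  # about -40 dBFS
for _ in range(50):
    out = normalizer.process(whisper)
print(round(rms_dbfs(out), 1))  # -23.0
```

The smoothing is the part that matters for speech. Without it the gain jumps every frame and the recording pumps.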

But you’re burning 8% CPU on an M1 just to normalize in real-time.

Permission Hell

You ask for microphone permission.

User clicks “Allow.”

You ask for screen recording permission (for system audio).

User clicks “Allow.”

You try to record.

Error: AVAudioSessionErrorCodeInsufficientPriority

What?

Because macOS has three permission layers:

  1. App entitlements
  2. User permission prompt
  3. TCC database (the internal record)

Sometimes they desync.

The user said yes. The system says no.

The fix:

tccutil reset Microphone com.yourapp.bundle
tccutil reset ScreenCapture com.yourapp.bundle

Relaunch. Ask again.

Works now.

No explanation why it failed the first time.

What You Get

Three files per recording:

~/nbp-data/{uuid}/
├── raw_mic.ogg      # 42 MB
├── raw_system.ogg   # 38 MB
└── audio_mix.ogg    # 45 MB

OGG Opus. ~1 MB per minute.
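
A back-of-the-envelope check on that figure, as a sketch. It assumes decimal megabytes and takes the ~1 MB per minute number at face value. The function names are made up.

```python
def average_bitrate_kbps(megabytes, minutes):
    """Average bitrate in kbit/s for a file of the given size and duration."""
    bits = megabytes * 1_000_000 * 8  # decimal megabytes, 8 bits per byte
    return bits / (minutes * 60) / 1000


def approx_minutes(megabytes, mb_per_minute=1.0):
    """Rough recording length implied by a file size at a given MB-per-minute rate."""
    return megabytes / mb_per_minute


print(round(average_bitrate_kbps(1, 1)))  # 133
print(approx_minutes(42))                 # 42.0
```

So ~1 MB per minute is roughly 133 kbit/s, and a 42 MB mic file is roughly a 42-minute recording.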

Open in VLC. Plays immediately.

No proprietary format. No custom player.

The Trade-Off

This works without root.

But you pay:

Latency: 80-120ms from mic to disk.

CPU: 8% on M1 for mixing + normalization.

Complexity: 600 lines of CoreAudio glue code.

Compare to Soundflower:

Latency: <10ms.

CPU: 2%.

Complexity: Install DMG, done.

But Soundflower required a kernel extension.

Which Apple deprecated in Catalina.

And has been locking down further ever since.

The Reality

This took two weeks.

CoreAudio documentation is terrible.

Sample rates drift for no reason.

Buffers arrive out of order.

Permissions fail silently.

But once it works?

It keeps working.

No kernel panics.

No “this extension is incompatible with macOS 15.”

No security prompts on every OS update.

Just audio. Two streams. Mixed. Saved.

Without asking for root.


Next: Why I store recordings as files, not database rows.