Sergey Kopanev: you sleep — agents ship

NBP Follow-ups · Part 1

Auto-Record That Asks First


I keep starting calls and forgetting to record.

By the time I remember, three minutes are gone. Sometimes ten. Sometimes the whole call.

The fix everyone reaches for first: silent auto-record. App detects a call, app records.

I refused to ship that. NBP exists because I did not want meeting data leaving my disk. Recording without asking is the same betrayal in a different costume.

What NBP Is

NBP records on macOS and processes audio locally. Whisper on-device. Files on disk. Zero network. The first article explains the why.

The desktop app handles capture. Click record. Click stop. Pipeline runs.

The recording I missed was the recording I did not click in time.

Mic Activates, App Knocks

The detection is cheap.

Core Audio calls a registered property listener when the system input device goes “alive”, meaning some process has started capturing from it. Zoom opens the mic. Teams opens the mic. FaceTime, Discord, Slack huddles, plain phone calls through the iPhone: all of them flip the same bit.

The detector listens for that bit.

const CALL_APP_PROCESSES: &[(&str, &str)] = &[
    ("zoom.us", "Zoom"),
    ("CptHost", "Zoom"),
    ("Microsoft Teams", "Teams"),
    ("FaceTime", "FaceTime"),
    ("avconferenced", "FaceTime/Phone"),
    ("callservicesd", "Phone Call"),
    ("Slack", "Slack"),
    ("Slack Helper", "Slack"),
    ("Webex", "Webex"),
    ("Discord", "Discord"),
    ("Skype", "Skype"),
];

Mic goes live. Walk the process list. Match a name. Fire a notification.

Zoom — Call Detected
Click to start recording
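
The "walk the process list, match a name" step can be sketched as a pure function. This is a minimal sketch, not NBP's actual code: `detect_call_app` is a hypothetical helper, the table is an abbreviated copy of the one above, and it assumes exact name comparison and a process list already collected as strings.

```rust
/// Abbreviated copy of the article's table: (process name, display label).
const CALL_APP_PROCESSES: &[(&str, &str)] = &[
    ("zoom.us", "Zoom"),
    ("Microsoft Teams", "Teams"),
    ("FaceTime", "FaceTime"),
    ("Slack", "Slack"),
];

/// Hypothetical helper: return the label of the first running process
/// whose name exactly matches an entry in the table, or None.
pub fn detect_call_app<'a, I>(running: I) -> Option<&'static str>
where
    I: IntoIterator<Item = &'a str>,
{
    running.into_iter().find_map(|name| {
        CALL_APP_PROCESSES
            .iter()
            .find(|(process, _)| *process == name)
            .map(|(_, label)| *label)
    })
}
```

Returning `None` when nothing matches is the point: a live mic with no known call app behind it produces no notification.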

The user clicks. The window pops. The recording starts.

The app does not record on its own. It announces. The user decides.

That is the whole product decision. Everything below is the engineering it took to make that one click reliable.

The First Flag

Initial version was sloppy.

I set a global AtomicBool when the notification fired. The window-focus handler checked the flag and started recording when focus came in.

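use std::sync::atomic::{AtomicBool, Ordering};

// Set to true when a call notification fires. Nothing ever expires it.
static PENDING_CALL: AtomicBool = AtomicBool::new(false);

// Read and clear the flag in one atomic step.
pub fn take_pending_call() -> bool {
    PENDING_CALL.swap(false, Ordering::SeqCst)
}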

It worked the first three times.

Then I clicked the menubar to change a pipeline. The window focused. Recording started.

I had not been on a call in two hours. The flag had been sitting there since the last notification I dismissed.

A flag with no expiry is a trap waiting for the user to walk into it.

TTL on the Trap

Quick fix: stamp the flag with a timestamp. Reject anything older than 30 seconds.

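use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Epoch seconds when the last notification fired; 0 means nothing pending.
static PENDING_CALL_AT: AtomicU64 = AtomicU64::new(0);
const PENDING_CALL_TTL: u64 = 30; // seconds

// Assumed helper: whole seconds since the Unix epoch.
fn now_epoch() -> u64 {
    SystemTime::now().duration_since(UNIX_EPOCH).map_or(0, |d| d.as_secs())
}

pub fn take_pending_call() -> bool {
    let ts = PENDING_CALL_AT.swap(0, Ordering::SeqCst);
    ts > 0 && now_epoch().saturating_sub(ts) <= PENDING_CALL_TTL
}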

Now stale flags die. A notification I did not click is forgotten 30 seconds later. The window-focus path stops triggering on flags that have nothing to do with calls.
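
To see exactly where the 30-second boundary falls, the freshness check can be pulled out with the clock passed in as an argument. This is a minimal sketch; `is_fresh` is a hypothetical name, not something from NBP.

```rust
/// Maximum age of a pending-call stamp, in seconds (30, as in the article).
const PENDING_CALL_TTL: u64 = 30;

/// Hypothetical helper: same comparison as the TTL check in the article,
/// with `now` supplied by the caller instead of read from the system clock.
/// A stamp of 0 means "nothing pending".
pub fn is_fresh(stamp: u64, now: u64) -> bool {
    stamp > 0 && now.saturating_sub(stamp) <= PENDING_CALL_TTL
}
```

One property worth noticing: because of `saturating_sub`, a clock that jumps backwards yields an age of zero, so the stamp counts as fresh. That is one more way the guess can be wrong.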

Better. Still wrong.

The whole architecture is “detect a click by inferring it from a focus event that may or may not have come from the click.” That is a guess dressed up as logic.

The Real Click Signal

The OS knows when a notification was clicked. The OS will tell you. I just was not using the channel that says so.

Switched to mac-notification-sys with wait_for_click(true). Spawn a thread, fire the notification, the thread blocks until the user clicks the banner. No flags. No timestamps. No focus-handler heuristics.

thread::spawn(move || {
    let response = send_notification(&title, ..., wait_for_click(true));
    if response.clicked() {
        show_window(&app);
        start_recording(&app);
    }
});

Click detection became deterministic.

Then the deprecation warning showed up. NSUserNotificationCenter — the API mac-notification-sys rides on — has been deprecated for a while. macOS still tolerates it. macOS will stop tolerating it.

Final swap: UNUserNotificationCenter via objc2-user-notifications. Modern API. Click handled by a real UNUserNotificationCenterDelegate. The delegate gets called by the system. No threads waiting. No flags expiring. No deprecation countdown.

The call from the OS to the app is now: notification was clicked, here is the identifier, here is the action. The app reacts.
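
The decision the delegate has to make once it receives that identifier and action can be sketched without any Objective-C bindings. This is a minimal sketch, not NBP's code: `Reaction`, `react`, and the `call-detected:` identifier prefix are hypothetical. The two action strings are the values of Apple's `UNNotificationDefaultActionIdentifier` and `UNNotificationDismissActionIdentifier` constants.

```rust
/// Action the system reports when the user clicks the notification itself.
pub const DEFAULT_ACTION: &str = "com.apple.UNNotificationDefaultActionIdentifier";
/// Action the system reports when the user dismisses the notification.
pub const DISMISS_ACTION: &str = "com.apple.UNNotificationDismissActionIdentifier";

/// Hypothetical prefix this sketch assumes on call-detection notification ids.
pub const CALL_NOTIFICATION_PREFIX: &str = "call-detected:";

#[derive(Debug, PartialEq, Eq)]
pub enum Reaction {
    ShowWindowAndRecord,
    Ignore,
}

/// Record only when a call-detection notification was actually clicked.
pub fn react(notification_id: &str, action_id: &str) -> Reaction {
    let from_call = notification_id.starts_with(CALL_NOTIFICATION_PREFIX);
    if from_call && action_id == DEFAULT_ACTION {
        Reaction::ShowWindowAndRecord
    } else {
        Reaction::Ignore
    }
}
```

Everything that is not a click on a call notification falls through to `Ignore`, which is the opposite default from the focus-handler version.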

That is what the first version was trying to fake.

Tray, Not Dock

The window-on-close behavior changed too.

The user closes the window mid-day. Most apps die at that point. NBP needs to keep listening.

WindowEvent::CloseRequested { api, .. } => {
    api.prevent_close();
    let _ = window.hide();
}

The window hides. The tray icon stays. The detector keeps watching the mic.

[●] NBP
   Quick start
     ├ summarize
     ├ extract-actions
     ├ daily-standup
   Show window
   Quit

Tray-only mode is what makes the auto-detect feature useful. If the user has to keep the window open to get notifications, the feature does not survive the first inbox check.

The Trade-Off

Cost: a notification delegate to wire up, a Core Audio listener that runs full-time, one tray menu to maintain, and a settings toggle to turn the whole thing off for private calls.

Benefit: I stopped missing the start of meetings. The first three minutes — the part where someone says what the meeting is actually about — get captured now.

Bigger cost the first version paid: trust. The flag-with-no-expiry started a recording I did not ask for. Once. The fix was technical. The lesson was not.

A privacy-first app cannot have surprise recording moments. Not even one. The whole architecture has to make “record” impossible without an explicit click that came from a notification that came from a real call.
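
One way to push that rule into the type system is a consent token: the recording function takes a value that only the click handler is supposed to create. This is a minimal sketch of the idea under that assumption, not how NBP is built; `ClickConsent`, `from_notification_click`, and this `start_recording` signature are all hypothetical.

```rust
mod consent {
    /// Proof that the user clicked a call notification.
    /// The field is private, so code outside this module cannot
    /// build a ClickConsent with a struct literal.
    pub struct ClickConsent {
        app_label: String,
    }

    impl ClickConsent {
        pub fn app_label(&self) -> &str {
            &self.app_label
        }
    }

    /// Hypothetical entry point, meant to be called only from the
    /// notification-click handler.
    pub fn from_notification_click(app_label: &str) -> ClickConsent {
        ClickConsent { app_label: app_label.to_string() }
    }
}

use consent::ClickConsent;

/// Takes the token by value, so one click authorizes one recording.
pub fn start_recording(consent: ClickConsent) -> String {
    format!("recording started for {}", consent.app_label())
}
```

The compiler does not stop other code from calling `from_notification_click`, so this is an audit aid rather than a guarantee: there is exactly one function to search for when checking who can start a recording.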

Takeaway

Auto-detect is fine. Auto-record is not.

Detect the call. Show the user. Wait for the click. Use the OS click signal, not a flag you set and hope to clean up later.

The user stays in charge. The mic stays off until they say so.


Next: The Night Realtime Stopped Talking.