Methodology

How we measure detection accuracy, what those numbers mean, and what we publish honestly.

Quiet disclosure • No clinical claims • Trust earned over time

What We Measure

AwareFlow uses one binary classifier per habit. Right now there are two: a sniff classifier and a throat clearing classifier. Each one is trained on labeled audio examples to recognize a single specific sound.

Detection runs entirely on your device using the Apple Neural Engine. Audio is analyzed in real time and immediately discarded. Nothing is recorded or sent anywhere. The numbers below are about how reliably the models recognize their target sound, not about anything stored from you.

What We Currently Report

The figures below come from CreateML training, which evaluates each model against held-out portions of the same labeled dataset used to train it.

These numbers reflect how each model performs on held-out splits of its own labeled dataset. That is internally meaningful, but it is not field accuracy, which depends on your environment, your microphone, the distance to your phone, and your individual sound profile.
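To make "performance on a held-out split" concrete, here is a minimal sketch of how such metrics can be computed for one binary classifier. It is an illustration only, not AwareFlow's evaluation code: the function name `evaluate_held_out`, the `HeldOutMetrics` type, and the choice of accuracy, precision, and recall as the reported metrics are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class HeldOutMetrics:
    accuracy: float
    precision: float
    recall: float


def evaluate_held_out(labels, predictions):
    """Compare true labels with model predictions on a held-out split.

    Both arguments are equal-length sequences of booleans, where True means
    "the target sound (for example, a sniff) is present". These names are
    hypothetical and exist only for this sketch.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("held-out split is empty")

    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # correctly detected
    fp = sum(1 for y, p in pairs if not y and p)      # false alarms
    fn = sum(1 for y, p in pairs if y and not p)      # missed sounds
    tn = sum(1 for y, p in pairs if not y and not p)  # correctly ignored

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return HeldOutMetrics(accuracy, precision, recall)
```

Because the held-out examples come from the same labeled dataset as the training examples, a high score here says the model generalizes within that dataset. It says nothing about a microphone or room the dataset never covered.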

What These Numbers Don't Mean

A training metric is not a clinical claim. AwareFlow is not a medical device and does not measure or treat any condition.

Real-world accuracy varies. Quiet rooms behave differently from cafes. AirPods behave differently from a phone microphone on a desk. A throat clear from someone with a cold has different acoustics from a healthy throat clear during a meeting. The training metrics above don't capture any of that.

The Calibration Lab personalizes the detection threshold for each user, not the model itself. Your experience improves because the system learns what is yours, not because the underlying classifier changed.
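The difference between tuning a threshold and changing a model can be sketched in a few lines. This is a hypothetical illustration, not the Calibration Lab's actual logic: the `UserCalibration` class, the starting threshold, the step size, the bounds, and the feedback-driven update rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class UserCalibration:
    """Per-user decision threshold; the classifier itself is never modified.

    All default values below are assumed for illustration.
    """
    threshold: float = 0.5  # assumed starting point
    step: float = 0.05      # assumed adjustment per piece of feedback
    low: float = 0.1        # assumed lower bound
    high: float = 0.95      # assumed upper bound

    def is_detection(self, score: float) -> bool:
        # `score` is the fixed classifier's confidence in [0, 1].
        return score >= self.threshold

    def record_false_alarm(self) -> None:
        # The user rejected a detection: require more confidence next time.
        self.threshold = min(self.high, self.threshold + self.step)

    def record_missed_event(self) -> None:
        # The user reported a miss: accept lower confidence next time.
        self.threshold = max(self.low, self.threshold - self.step)


# Usage: the same score can flip from "detected" to "ignored" for one user
# while the model that produced the score stays exactly the same.
calibration = UserCalibration()
score = 0.52
before = calibration.is_detection(score)
calibration.record_false_alarm()
after = calibration.is_detection(score)
```

In this sketch only `threshold` changes between the two calls. The score of 0.52 comes from an unchanged classifier, which is the sense in which personalization happens per user rather than per model.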

How AwareFlow Communicates Trust

AwareFlow does not show users a static accuracy percentage in the app. No single number describes your personal experience, so showing one would be misleading.

Instead, AwareFlow uses a Detection Maturity system: every habit moves through three states as the app gathers your feedback.

Trust is earned through use, not claimed up front. The state label is visible in the app so you always know what stage you're in.

What We're Working On

Two longer-term measurement efforts are planned. They live here so the methodology stays honest as the product grows.

Both efforts are about giving you, and the broader research community, a defensible way to evaluate how well AwareFlow does what it claims to do.