Do Mentalyc, Upheal, and Blueprint actually upload session audio?

Yes. Each is a cloud SaaS, and the cloud-SaaS architecture necessarily requires the audio to be transmitted to the vendor's servers in order for the vendor's transcription and drafting pipeline to run. The relevant question is not whether the audio is uploaded but where it is stored, which subprocessors handle it, how long it is retained, and which BAA terms apply to that storage and processing.

Do these vendors use the audio or notes to train AI models?

All three publicly state that they do not use customer PHI for model training without explicit opt-in, and many cloud AI scribes have begun front-loading this language in their marketing because clinician concern has made it a buying criterion. The contractual statement is real and meaningful. It is also a contractual rather than architectural guarantee, and it depends on every subprocessor in the chain honoring the same restriction.

Which third parties touch the data once a vendor receives it?

Cloud AI scribes typically use a chain of subprocessors: cloud infrastructure (AWS, GCP, or Azure), a transcription engine (often a third-party speech-to-text API), a large-language-model provider (Anthropic, OpenAI, Google, or a self-hosted model on a cloud GPU), payment processing (Stripe), product analytics, error monitoring, and sometimes customer-support tooling. Each vendor's BAA and subprocessor list spells out which of these apply.

How long is the audio retained?

Retention varies materially by vendor and by data type. Some vendors auto-delete raw audio shortly after transcription completes; others retain audio for the life of the account; others give the clinician a configurable retention window. The clinician should read the specific vendor's data-retention clause for each artifact (audio, transcript, draft, edits) separately, because the rules can differ.
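To make the per-artifact point concrete, here is a minimal sketch of tracking retention separately for each artifact. The retention windows in `RETENTION_DAYS` are invented placeholders, not figures from any vendor's policy, and the function name is hypothetical; substitute whatever the vendor's data-retention clause actually states.

```python
from datetime import date, timedelta
from typing import Optional

# HYPOTHETICAL windows in days, for illustration only.
# None stands for "kept for the life of the account".
RETENTION_DAYS = {
    "audio": 30,
    "transcript": 90,
    "draft": None,
    "edits": None,
}


def deletion_date(artifact: str, created: date) -> Optional[date]:
    """Return the scheduled deletion date for one artifact, or None if it is kept indefinitely."""
    days = RETENTION_DAYS[artifact]
    return None if days is None else created + timedelta(days=days)


session_day = date(2026, 1, 1)
schedule = {name: deletion_date(name, session_day) for name in RETENTION_DAYS}
```

Under these placeholder values, the audio from one session would be scheduled for deletion well before its transcript, while the draft and edits would have no deletion date at all. That is the kind of divergence a clause-by-clause read is meant to surface.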

How is on-device different from a vendor that auto-deletes audio?

Auto-deletion is a contractual promise about a vendor's behavior. On-device means there is no vendor copy in the first place — the audio, transcript, and draft were never transmitted, so there is nothing to retain or delete. The threat models are categorically different: a deletion policy can fail (subpoena before deletion completes, backup-system retention, subprocessor caching); the absence of transmission cannot fail in the same way.

Blog · HIPAA · 2026-04-30

The 7 things Mentalyc, Upheal, and Blueprint actually send to their servers

No reputational claims, no leaked-document drama — just a category-by-category read of what a cloud AI scribe necessarily transmits, stores, and processes by virtue of being a cloud AI scribe. The information is sitting in each vendor's own privacy disclosures; this post collects it in the order a private-practice clinician should evaluate it.

TL;DR

A cloud AI scribe — Mentalyc, Upheal, Blueprint, Supanote, Freed, CliniScripts — is, architecturally, a SaaS application. Its servers receive whatever the desktop or mobile client uploads, and the SaaS pipeline cannot run otherwise. Across the public privacy policies of the three best-known therapy-specific scribes (Mentalyc, Upheal, Blueprint), the same seven categories of data move to the vendor: the session audio, the speech-to-text transcript, the AI-generated draft, the clinician's edits to that draft, account and practice metadata, session metadata, and telemetry / usage analytics. None of that is hidden. All of it is disclosed. The interesting question is which categories your specific threat model wants gone — and whether "deleted on a schedule" or "never transmitted in the first place" is the answer your practice needs.

→ Architecture, not contract: see What is a BAA, actually — and what it does NOT cover for why the vendor's promises and the vendor's data flow are different layers of the same problem.

Why this matters in 2026

Two shifts have made the contents of a cloud AI scribe's data flow a buying criterion, not a back-of-the-document footnote. First, the 2024–2026 wave of plaintiff-side discovery against AI vendors has made clear that records held at a vendor are reachable by subpoena directly, on the vendor's timeline, with notification governed by the BAA rather than by the clinician's preference. Second, the steady drumbeat of subprocessor breaches across SaaS in general — none specific to therapy AI, but plenty in adjacent industries — has reframed "we have a BAA" as a layer of liability allocation rather than a layer of physical protection. Both shifts point at the same question: what is actually in this vendor's possession that I would not want produced or exposed?

This post answers that question for the three most-searched cloud therapy-scribe vendors of early 2026 (Mentalyc, Upheal, Blueprint) at the category level. Specific clauses change, retention windows are tweaked, subprocessors get added and removed; the clinician should still read each vendor's own current privacy policy and BAA. What does not change is the architectural shape of a cloud SaaS, and that shape is what forces most of the categories below.

The seven categories

Every cloud AI scribe of this shape transmits and stores the following seven things. The names are ours; the underlying data flows are described in each vendor's public privacy disclosures.

1. The session audio. The recording is uploaded to the vendor either as the session is happening (live capture) or as a file after the fact (post-session upload). The audio contains the client's voice, the clinician's voice, and whatever was said in the room. This is PHI in the strongest, most-protected sense the relevant statutes recognize. All three vendors handle it under a signed BAA. All three have a documented retention window, which differs by vendor and sometimes by plan tier.
2. The speech-to-text transcript. Generated either by the vendor's own ASR or by a subprocessor's speech-to-text engine, the transcript is a verbatim or near-verbatim text representation of the audio. It is materially richer than a clinician's progress note because it contains the raw language: the client's specific words, the affect timestamps, the asides, the pauses. Discovery practitioners in 2025–2026 have specifically learned to ask for transcripts because they expose facts that a summary note would not.
3. The AI-generated draft note. This is the artifact the clinician originally signed up for: a SOAP, DAP, BIRP, or GIRP draft generated by feeding the transcript (and prompt scaffolding) into a large language model. The draft is held server-side at least long enough to deliver it back to the clinician's UI and, in most cases, longer — for revision history, for compliance audit logs, and for clinician access to past notes from any device.
4. The clinician's edits to the draft. When the clinician corrects "client expressed mild anxiety" to "client described escalating panic," the diff travels back to the vendor. Edits are usually retained as part of the version history for the note. Some vendors have, in their published documentation, framed edit telemetry as a quality-improvement signal — the contractual training-on-data restriction usually still applies, but the diff itself is data the vendor has.
5. Account and practice metadata. Clinician name, license number (where supplied), NPI, practice name, billing email, jurisdiction, EHR system, plan tier. This is normal SaaS account data and is rarely the privacy concern, but it ties the clinician's identity to every artifact above.
6. Session metadata. Timestamp, duration, format chosen (SOAP / DAP / BIRP / GIRP), client identifier (initials, MRN-equivalent, or pseudonym depending on how the clinician enters it), template used, target EHR or paste destination, sometimes diagnosis code. Session metadata is what makes the session indexable inside the vendor's UI; it also makes the session findable inside the vendor's database.
7. Telemetry and usage analytics. Page-level events, feature usage, button clicks, error logs, performance traces, sometimes IP address and device fingerprint, often a third-party analytics or observability tool's session replay (with PHI redaction varying by configuration). These are the same kinds of telemetry every modern web app emits; the BAA covers them, but they are still a separate stream of data with its own subprocessors.

What the public disclosures actually say (early 2026 read)

The summaries below cover the three vendor pages a private-practice clinician would most likely treat as the source of truth when reading each product's privacy section carefully:

Mentalyc. Public privacy and security pages describe a HIPAA-aligned posture with a signed BAA, AWS US-region storage, AES-256 at rest and TLS in transit, an audio-retention window that the customer can adjust, and a list of subprocessors that includes a hyperscaler (AWS), at least one third-party LLM provider, payment processing (Stripe), and product-analytics tooling. Mentalyc's marketing has been the loudest of the three in stating that customer audio and notes are not used to train models without opt-in.
Upheal. Public documentation describes the same general shape: cloud SaaS with a BAA, support for both audio-only and video sessions, US and EU data-region options, third-party LLM and ASR subprocessors, and standard SaaS analytics. Upheal's positioning includes a stronger video-recording surface than the audio-only competitors, which means the data category list extends to video frames in addition to audio for clinicians using that workflow.
Blueprint. Public privacy disclosures describe the pay-per-session pricing model layered on the same SaaS architecture: audio uploaded, transcribed, drafted, returned. Blueprint's per-session pricing means a billing event accompanies each note, and that billing data — session count, frequency, plan tier — sits with the payment processor in addition to the vendor.

None of these descriptions are reputational claims. They are restatements of what each vendor publishes about itself. The operator of any cloud SaaS that processes PHI is expected, by regulation and by industry custom, to disclose this kind of information, and these vendors do.

The shape these seven categories produce together

Read the seven categories as a stack rather than a list. The audio is the bottom of the stack — the rawest, most identifying form of the session. The transcript is one level up, computationally derived from the audio but still verbatim. The draft is one level up from the transcript, an LLM-generated summary that nonetheless contains material drawn directly from what was said. Edits are the diff layer. Metadata is the index. Telemetry is the periphery. A subpoena, breach, or policy change can reach into any single layer of that stack independently of the others, and the clinician's mitigation options differ at each layer.

The contractual answer to all of this is the BAA. The BAA distributes liability, names subprocessors, and articulates what the vendor will and will not do with each category. It is a real protection. It is also a layer of paper that sits one degree removed from the data itself: when the vendor's subprocessor's storage ends up in a discovery instrument or in a security incident, the BAA tells you who pays and who reports, but the data was already where it was. The architectural-vs-contractual distinction is not a marketing line; it is the difference between a promise about behavior and a fact about location.

What changes when the inference happens on the clinician's Mac

An on-device AI scribe collapses the seven-category stack. There is no server-side audio because the audio never opened a network socket. There is no server-side transcript because whisper.cpp ran locally on the M-series chip. There is no server-side draft because the draft was generated by a quantized 14-billion-parameter model on the same machine. There are no clinician edits to ship back because the edits never left the editor. Account metadata still exists for license activation, but it is decoupled from the session contents. Session metadata stays in the local hash-chained inference log, where the clinician can show it to a court on demand. Telemetry, in the on-device case, is structurally minimal — there is no SaaS that needs analytics to understand its users.
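For readers unfamiliar with the term, a hash-chained log is one in which each entry stores a hash that covers both its own contents and the previous entry's hash, so that altering any earlier entry breaks every link after it. The sketch below illustrates the general technique only. It is not TherapyDraft's actual log format: the field names (`prev`, `payload`, `hash`), the choice of SHA-256, and the sample session fields are all assumptions made for illustration.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> None:
    """Append a new entry linked to the current tail of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry makes this return False."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical session-metadata entries (no session content is stored here).
log = []
append_entry(log, {"session": 1, "format": "SOAP", "duration_min": 50})
append_entry(log, {"session": 2, "format": "DAP", "duration_min": 45})
intact = verify(log)

# Simulate after-the-fact tampering on a copy of the log.
tampered = copy.deepcopy(log)
tampered[0]["payload"]["duration_min"] = 5
tamper_detected = not verify(tampered)
```

The design point is that verification needs nothing but the log itself: anyone handed the file can recompute the chain and see whether it still holds together.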

The tradeoff is honest. On-device inference requires a recent Mac (M-series, macOS 14+), pins the workflow to a single device, and shifts model maintenance to the user when a new local model becomes available. Many clinicians do not need this level of architectural certainty and are well-served by a cloud scribe with a strong BAA. The position TherapyDraft takes is that some clinicians, in some practice contexts — boutique private practice, cash-pay clientele, populations with elevated subpoena exposure, supervisors in training programs — have a threat model that the cloud SaaS shape cannot reach, no matter how good the BAA is. TherapyDraft is the tool for that subset.

How to read this in practice

For each vendor under consideration — TherapyDraft included — write the seven categories down the left side of a page and put one of three answers next to each: transmitted and retained, transmitted and deleted on a schedule, or not transmitted. Read the vendor's privacy policy, BAA, and subprocessor list with that grid in front of you. The questions to ask of each row are: (a) under what circumstances would this category be produced in response to a subpoena, (b) which third party would the vendor have to depend on to honor any deletion or non-training commitment, and (c) does the vendor's notification clause guarantee you find out about a request before the production happens.
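For clinicians who prefer a spreadsheet or script to a sheet of paper, the grid can be sketched as a small data structure. This is one possible encoding, not a prescribed tool: the names `Handling`, `build_grid`, and `rows_on_vendor_servers` are hypothetical, and the two example grids are illustrative placeholders rather than findings about any real vendor.

```python
from enum import Enum


class Handling(Enum):
    """The three possible answers for each row of the grid."""
    RETAINED = "transmitted and retained"
    SCHEDULED_DELETE = "transmitted and deleted on a schedule"
    NOT_TRANSMITTED = "not transmitted"


# The seven categories, in the order the post lists them.
CATEGORIES = (
    "session audio",
    "speech-to-text transcript",
    "AI-generated draft note",
    "clinician edits",
    "account and practice metadata",
    "session metadata",
    "telemetry and usage analytics",
)


def build_grid(answers: dict) -> dict:
    """Return a complete seven-row grid; refuse to guess at unanswered rows."""
    missing = [c for c in CATEGORIES if c not in answers]
    if missing:
        raise ValueError(f"unanswered rows: {missing}")
    return {c: answers[c] for c in CATEGORIES}


def rows_on_vendor_servers(grid: dict) -> list:
    """List the categories that reach the vendor at all, retained or not."""
    return [c for c, h in grid.items() if h is not Handling.NOT_TRANSMITTED]


# Placeholder answers for an imaginary cloud vendor (not a real vendor's policy).
example_cloud = build_grid({
    **{c: Handling.RETAINED for c in CATEGORIES},
    "session audio": Handling.SCHEDULED_DELETE,
})

# Placeholder answers for an imaginary on-device tool that keeps only account data.
example_on_device = build_grid({
    **{c: Handling.NOT_TRANSMITTED for c in CATEGORIES},
    "account and practice metadata": Handling.RETAINED,
})
```

Calling `rows_on_vendor_servers` on each grid yields the list of rows to which questions (a) through (c) apply. Rows marked "not transmitted" drop out of that list, because there is no vendor copy to subpoena, delete, or be notified about.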

The grid takes ten minutes to fill in for any given vendor. The clinician who does it once becomes capable of evaluating the next AI-scribe vendor in five minutes. The clinician who does not will, when the relevant subpoena or incident lands, be asking a customer-support agent what just happened to their client's data, which is the worst possible time to discover that a SaaS product is, mechanically, a SaaS product.

An honest close

The point of this post is not to claim that Mentalyc, Upheal, or Blueprint do anything wrong. They do exactly what their published policies say they do, with the BAAs and security postures that the relevant statutes require. The point is to make legible what every cloud AI scribe necessarily holds, so that a clinician choosing a tool can compare like with like rather than comparing marketing surface area. Once the comparison is at the data-category level, the choice is no longer "cloud or local" framed as a tribal preference; it is a specific question about which categories the practice's threat model says should not exist on someone else's server, and whether contractual deletion or architectural absence is the answer the practice wants.

A clinician who finishes the seven-row grid and concludes that a cloud scribe is the right tool for their practice should buy one and use it confidently. A clinician who finishes the grid and finds that two or three of the seven rows feel intolerable on a vendor's server is the clinician for whom on-device exists. The information is the protection; the binary opinion was never the protection.

Run the five-question gap check

Our BAA Coverage Gap Quiz turns the seven-row grid into five questions you can answer in sixty seconds. The quiz runs entirely in your browser; nothing is sent to us.

Run the BAA Coverage Gap Quiz

Try TherapyDraft

The private beta is free for ten sessions — no credit card, no upload. Install the signed .dmg, grant microphone access, draft your first note on the laptop that already holds your calendar and your EHR login. If the workflow doesn't improve on your current setup, uninstall. Because nothing was ever shipped anywhere, there is nothing to retrieve.

Join the private beta

This post describes the architecture of cloud AI scribes for therapists at the category level, as it stood in early 2026, drawn from publicly available vendor disclosures. Vendor policies, retention windows, subprocessor lists, and BAA terms change. Verify each vendor's current published documentation directly before relying on this post for a procurement decision. This is general information, not legal or compliance advice.