Privacy Architecture

How ClarityLift listens to work conversations without creating a surveillance tool.

The product exists in a category where the default assumption is intrusive. This page is the technical spine of how it is actually built, so you can verify the boundary before you adopt it.

1. Architecture

How the ingest pipeline is built

ClarityLift connects to Slack or Microsoft Teams via standard OAuth. The connection is scoped to channel read permissions only. Direct-message scopes are never requested. The moment an event lands at our webhook, it passes through four filters before anything else happens.

Filter 1. DM rejection at ingest

Every inbound webhook carries a channel type indicator. If the channel type is a DM or a group DM, the handler returns HTTP 200 and drops the event before any processing runs. This is enforced on the server, not in the classifier — a misconfigured classifier cannot expose DMs because the DMs never reached the classifier. Code lives in src/app/api/events/slack/route.ts and its Teams equivalent.

Filter 2. Channel opt-in

A channel is analyzed only if an admin has explicitly enabled it in the ClarityLift dashboard. The opt-in defaults to off. Connecting a workspace does not analyze anything until channels are picked. Disabling a channel stops analysis within 30 seconds on the next ingest cycle, and deletes pending jobs in the durable queue.
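The opt-in behavior can be sketched as a small registry. `OptInRegistry` and its method names are illustrative placeholders, not the production implementation; the point is that the default is off and disabling also clears queued work.

```typescript
// Hypothetical sketch of Filter 2: explicit per-channel opt-in.
class OptInRegistry {
  private enabled = new Set<string>();
  private pendingJobs = new Map<string, string[]>(); // channelId -> queued job ids

  enable(channelId: string): void {
    this.enabled.add(channelId);
  }

  // Disabling both stops future analysis and deletes pending queued work.
  disable(channelId: string): void {
    this.enabled.delete(channelId);
    this.pendingJobs.delete(channelId);
  }

  shouldAnalyze(channelId: string): boolean {
    return this.enabled.has(channelId); // default is off: unknown channels skip
  }

  queueJob(channelId: string, jobId: string): void {
    if (!this.shouldAnalyze(channelId)) return; // never queue un-opted-in work
    const jobs = this.pendingJobs.get(channelId) ?? [];
    jobs.push(jobId);
    this.pendingJobs.set(channelId, jobs);
  }

  pendingFor(channelId: string): number {
    return this.pendingJobs.get(channelId)?.length ?? 0;
  }
}
```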

Filter 3. In-memory processing, retention-zero

The message body is held in a request-scoped variable long enough to run the fast classifier (regex pattern matching, no I/O) and, if the message matches a pattern of interest, a single call to the LLM provider. The provider call uses retention-zero request options (OpenAI {store: false}, Azure OpenAI inside Microsoft's boundary, Anthropic ZDR contract). The moment the classifier returns, the message body drops out of scope and is garbage-collected. No row in our database ever contains the message text.
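A simplified, synchronous sketch of Filter 3. The real pipeline is asynchronous, and the pattern list and `callLLM` signature here are placeholders, not the actual classifier. What the sketch shows is the scoping guarantee: only the verdict leaves the function, never the text.

```typescript
// Hypothetical sketch of Filter 3. The message body exists only inside this
// function's scope; the returned verdict carries no message text.
const PATTERNS_OF_INTEREST = [/blocked/i, /deadline/i, /burn(ed|t)? out/i]; // placeholders

type Verdict = { matched: boolean; signalType?: string };

function classifyInMemory(
  body: string,
  callLLM: (text: string, opts: { store: false }) => string // retention-zero options
): Verdict {
  const hit = PATTERNS_OF_INTEREST.some((p) => p.test(body)); // fast pass, no I/O
  if (!hit) return { matched: false }; // most messages never reach the LLM
  const signalType = callLLM(body, { store: false }); // e.g. OpenAI's store: false
  return { matched: true, signalType };
  // `body` goes out of scope here; it is never written to a database row.
}
```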

Filter 4. Ten-person minimum for every output

Signal rows carry a channelId and a teamId, never a user id. Before any team-level score is computed, a database subscriber checks that the associated team has at least 10 members. Teams below that floor are dropped from the scoring output entirely. The floor is enforced at the database layer, not the UI layer — it cannot be bypassed by a misbehaving client. Per-channel admins can raise the floor above 10 but not lower it.
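The floor logic can be sketched as follows. Function names are illustrative; per the text, the real enforcement lives at the database layer, not in application code.

```typescript
// Hypothetical sketch of Filter 4: the ten-person minimum for any team score.
const GLOBAL_FLOOR = 10;

// Admins may raise the floor above 10 but never lower it:
// a configured value below the global floor is ignored, not honored.
function effectiveFloor(adminConfiguredFloor?: number): number {
  return Math.max(GLOBAL_FLOOR, adminConfiguredFloor ?? GLOBAL_FLOOR);
}

function mayScoreTeam(memberCount: number, adminConfiguredFloor?: number): boolean {
  return memberCount >= effectiveFloor(adminConfiguredFloor);
}
```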

What lands in the database is small. A row per detected signal with seven fields: organization id, team id, channel id, signal type, severity, confidence, and a detection timestamp. That is the entire persisted shape. No message id that resolves to a user, no message text, no author, no list of who reacted.
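The persisted shape reads naturally as a type. This is an illustrative TypeScript rendering of the seven fields described above, not the actual schema definition; note what is absent as much as what is present.

```typescript
// Illustrative rendering of the persisted signal row. There is no user id,
// no message id, no message text, no author field.
type SignalRow = {
  organizationId: string;
  teamId: string;
  channelId: string;
  signalType: string;  // e.g. "friction" (illustrative value)
  severity: number;
  confidence: number;  // 0.0 - 1.0
  detectedAt: Date;
};
```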

2. What employees see

The transparency surface

ClarityLift is a tool the admin has to be willing to tell the team about. The product is designed so the admin can actually have that conversation without having to walk it back later.

Which channels are connected is visible

Any employee can ask their admin which channels are analyzed. ClarityLift's settings page shows the list. We recommend admins share that list with the team at the time of the announcement, and update it publicly when channels are added or removed.

The bot is visible in the channel

When Slack or Teams auth completes, the ClarityLift bot user is present. An employee clicking its profile sees the tool's name, the scopes it requested, and a link to this page. There is no way for ClarityLift to analyze a channel without leaving that footprint.

No output ever names an individual

Every dashboard view, every email, every API response is scoped to a team of 10+ people. An admin viewing the dashboard cannot ask it “who said what.” The answer is not hidden behind a permission — it is not a question the schema can answer. See the “cannot do” section below.

Opt-in is the default posture

ClarityLift does not analyze a channel unless the admin has enabled it. We do not provide an “analyze everything by default” mode. If an employee asks their admin to remove a specific channel from analysis, the admin can disable it with one click and signals stop flowing within 30 seconds.

3. What the product cannot do

Negative capabilities, listed explicitly

The category this product sits in is crowded with tools that claim these limits as policy. For ClarityLift they are structural — baked into the schema and the ingest pipeline.

  • Cannot identify individual speakers in any output.

    No output row carries a user id. The signal schema has no such column. If an admin wanted to retroactively attribute a signal to a specific person, they could not — the data to do it was never persisted.

  • Cannot analyze direct messages or group DMs.

    DM scopes are never requested during OAuth. If Slack or Teams somehow delivered a DM to our webhook anyway, the handler rejects it before any classifier call.

  • Cannot retrieve deleted messages.

    We do not store message text at all, deleted or not. If a message is deleted before our fast classifier processes it, we never see it. If it is deleted after, we have no body to recover.

  • Cannot export message content.

DSAR exports (data subject access requests) include every persisted row. The message-text row does not exist. The export contains signal counts, pillar scores, insights, and audit entries. Nothing traceable back to a conversation.

  • Cannot score teams below 10 members.

    The compliance floor is a pre-insert database check. A team of 9 produces no team-level score — even if the admin asked for one, even if every signal confidence was 1.0.

  • Cannot be used for hiring, firing, performance review, or compensation decisions.

    This is prohibited in the Acceptable Use Policy every admin acknowledges at onboarding. It is also why outputs never include individual attribution — they could not be used that way even if a rogue admin wanted to.

4. Data handling

Retention, encryption, subprocessors, deletion

Retention

Message text: zero retention. Processed in memory, discarded.
Signal rows, scores, and insights: retained indefinitely by default; per-org retention policies are on the enterprise roadmap.
Audit log: 7-year minimum (SOC 2 compliance floor), enforced at the database.

Encryption

In transit: TLS 1.2+ for every connection, including the webhook ingress, the dashboard, the LLM-provider egress, and the Key Vault calls. At rest: Azure SQL transparent data encryption plus column-level encryption on every OAuth refresh token using CL_ENCRYPTION_KEY (stored in Azure Key Vault, never in the repo).

Processing location

Primary region: Azure East US. Databases, application servers, Key Vault, Redis cache, and Azure OpenAI deployments all live inside that region. LLM provider traffic either stays inside Microsoft's boundary (Azure OpenAI) or rides a contracted ZDR agreement (Anthropic). OpenAI direct is a Phase 0 legacy path with 30-day retention; production workspaces are Azure OpenAI by default.

Subprocessors

Four today: Microsoft Azure (infrastructure + Azure OpenAI), Anthropic (optional ZDR classification), Azure Communication Services (transactional email), and PostHog for authenticated-dashboard product analytics only. PostHog is explicitly not loaded on marketing pages — the browser SDK initialises only inside /dashboard/*, autocapture is off, and session recording is disabled. The marketing site uses our own self-hosted event capture. No Google Analytics, no Segment, no ad tech, no session-replay tool anywhere on the product.
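The route gate described above can be sketched as a simple path check. `shouldLoadAnalytics` is a hypothetical helper, not PostHog API; it only illustrates the boundary between dashboard routes and marketing pages.

```typescript
// Hypothetical sketch: the analytics SDK initialises only on authenticated
// dashboard routes; marketing pages never load it.
function shouldLoadAnalytics(pathname: string): boolean {
  return pathname === "/dashboard" || pathname.startsWith("/dashboard/");
}
```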

Deletion

A workspace admin can request full deletion from the settings page. The request enters a 30-day grace window during which ingestion pauses and cancellation is possible. After 30 days, every signal, score, insight, and connection record is physically deleted. The Organization row survives as an anonymized stub so audit log foreign keys remain valid — nothing else.
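The grace-window logic can be sketched as follows. `DeletionRequest` and `purgeDue` are illustrative names, not the real deletion job; the sketch shows only the timing rule: no purge inside the window, no purge ever for a cancelled request.

```typescript
// Hypothetical sketch of the 30-day deletion grace window.
const GRACE_DAYS = 30;

type DeletionRequest = { requestedAt: Date; cancelled: boolean };

// True only when the window has fully elapsed and the request still stands.
function purgeDue(req: DeletionRequest, now: Date): boolean {
  if (req.cancelled) return false;
  const elapsedMs = now.getTime() - req.requestedAt.getTime();
  return elapsedMs >= GRACE_DAYS * 24 * 60 * 60 * 1000;
}
```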

5. Implementation

What the first 30 days look like

  1. Admin creates the workspace and acknowledges four consent documents: DPA, Privacy Notice, Data Processing Terms, and AUP. All four document versions are recorded with timestamps. Acknowledgment is org-level, not per-employee.
  2. Admin connects Slack or Teams via OAuth. Scopes are read-only, no DM access. The bot user is now visible in the workspace.
  3. Admin picks the channels to analyze. Defaults to none. We recommend picking the standing team channels you actually want visibility into — #engineering, #product, #sales-floor, #customer-success — and skipping #random, #off-topic, #hr-investigations.
  4. Admin announces to the team. The deployment playbook provides a pre-announcement Slack message, all-hands speaking notes, a written FAQ, and a hard-questions prep doc: four launch assets, with a fifth, the 30-day follow-up, completing the announcement arc.
  5. Days 1 to 3: the baseline builds. Signals start flowing within minutes, but trend analysis and anomaly detection activate at day 3. Early scores carry a “warming up” caveat on the dashboard so nothing is overinterpreted.
  6. Days 3 to 7: first insights. The daily scoring job produces pillar scores. When a pillar dips below its threshold, rule-based insights fire. Admins can mark an insight “not useful” and the nightly threshold-tuner adjusts that org's cutoffs over time.
  7. Day 7: first weekly digest. Monday 9am in the org's local timezone, a scannable email with the week's shifts lands in every user's inbox. Users can opt out at any time from settings.
  8. Day 30: full baseline. Anomaly detection reaches full confidence. The 30-day follow-up step of the announcement playbook lands: the admin re-confirms the boundary to the team and shares what was learned without naming individuals.
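The ramp in steps 5 through 8 can be sketched as a stage function. The stage names and the helper itself are illustrative, not product terminology beyond the “warming up” caveat mentioned above.

```typescript
// Hypothetical sketch of the baseline ramp: which analysis stage applies
// on a given day since the channel was connected.
function baselineStage(daysSinceConnect: number): string {
  if (daysSinceConnect < 3) return "warming-up";        // signals flow, scores carry a caveat
  if (daysSinceConnect < 30) return "partial-baseline"; // trends and insights active
  return "full-baseline";                               // anomaly detection at full confidence
}
```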

Ready to talk about what you'd measure?

The architecture is the first conversation. Once it is settled, the second conversation is what signals would matter for your team.