ClarityLift · Methodology · v1.0
How a message becomes a team-level signal.
Public methodology, written so a procurement reviewer or security engineer can verify each step. Not the source code. The methodology. Hidden methodologies are how shortcuts get hidden.
The pipeline
Every inbound message follows the same path. The path runs in this order, every time. Each step is described below.
- Webhook arrives at our handler.
- DM rejection at ingest. Direct messages drop here.
- Channel opt-in check. Disabled channels drop here.
- Per-employee consent gate. Non-consenting senders drop here.
- Silence-baseline counter increments. Metadata only.
- Cross-channel dedup. Reposts of the same text drop here.
- Fast classifier. Regex-based, rule-only, no I/O.
- LLM classifier. Runs only if the fast classifier marks the message as signal-worthy.
- Floor check at write. Sub-10 teams produce no row.
- HealthSignal row persisted with seven fields. No text.
Every step has a deliberate purpose and a deliberate place in the order. Reordering changes the privacy posture. The order itself is part of the methodology.
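The early drop gates can be sketched as an ordered list that is walked front to back, so the first gate that rejects a message decides its fate. This is a minimal illustration, not the production handler: the `Message` fields, the gate names, and the drop-reason strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    # Hypothetical message shape, for illustration only.
    channel_id: str
    sender_id: str
    is_dm: bool
    text: str


# A gate returns a drop reason, or None to let the message continue.
Gate = Callable[[Message], Optional[str]]


def make_gates(enabled_channels: set, consenting: set) -> list:
    """Build the first three drop gates in the documented order."""
    return [
        ("dm_rejection", lambda m: "dm" if m.is_dm else None),
        ("channel_opt_in",
         lambda m: None if m.channel_id in enabled_channels else "channel-disabled"),
        ("consent_gate",
         lambda m: None if m.sender_id in consenting else "no-consent"),
    ]


def run_gates(msg: Message, gates: list) -> Optional[str]:
    """Return '<gate>:<reason>' for the first gate that drops, else None."""
    for name, gate in gates:
        reason = gate(msg)
        if reason is not None:
            return f"{name}:{reason}"
    return None
```

Because the list is ordered, a direct message from a non-consenting sender is dropped by the DM gate and never reaches the consent lookup, which is the property the fixed order is meant to protect.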
The classifier path
Webhook arrives at our handler.
DM rejection at ingest.
Channel opt-in check.
Per-employee consent gate.
/my-data. Every flip on the consent gate is audit-logged.
Silence-baseline counter increments.
Cross-channel dedup.
Fast classifier.
LLM classifier (only when needed).
- Azure OpenAI. Default. Prompts stay inside Microsoft's Azure boundary; OpenAI as a company does not receive the data. 30-day abuse-monitor retention only when safety systems flag a prompt; near zero in practice for workplace conversation.
- Anthropic. Available under a contracted zero-data-retention agreement. Used when the customer prefers Anthropic and the ZDR contract is in force.
- OpenAI direct. Phase-0 legacy path. 30-day default retention. Production workspaces are not on this path.
store: false). The moment the classifier returns, the message body drops out of scope and is garbage-collected. No row in our database ever contains the message text.
Floor check at write.
MIN_GROUP_SIZE is altered.
HealthSignal row persisted.
The six signal types
The LLM classifier emits one of six signal types. Each answers a different question about the team.
Friction
Recurring cross-team tension, escalation frequency, blame language. Computed at the team level only; never per-individual.
Disengagement
Declining participation across channels that team members normally engage in, withdrawal from strategic conversations, response shortening. Aggregated across the team, not per person.
Communication health
Cross-functional dialogue frequency, information bottlenecks, response-time degradation, siloing patterns.
Culture drift
Tone-pattern shifts, values-alignment signals, psychological-safety indicators. Drift over time relative to a team's own baseline, not a peer comparison.
Retention signals
Team-level stability indicators based on communication-pattern aggregates. Surfaces dynamics that historically correlate with team-level turnover. Never an individual flight-risk score; the schema cannot represent one.
Alignment
Direction of team activity relative to admin-defined organizational goals, classified as reinforce, contradict, or neutral. Opt-in feature; off by default until an admin defines goals at /dashboard/strategy.
Severity for each is one of low / medium / high. Confidence is a 0-1 score from the classifier. Both fields land on the HealthSignal row.
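The three fields named above can be sketched as a small typed record. This is an assumption-laden illustration: it models only the signal type, severity, and confidence described here plus a hypothetical `team_id`, and it does not attempt to reproduce the full seven-field HealthSignal row, whose remaining fields are not listed in this section.

```python
from dataclasses import dataclass
from enum import Enum


class SignalType(Enum):
    FRICTION = "friction"
    DISENGAGEMENT = "disengagement"
    COMMUNICATION_HEALTH = "communication_health"
    CULTURE_DRIFT = "culture_drift"
    RETENTION = "retention"
    ALIGNMENT = "alignment"


class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class SignalReading:
    # team_id is a hypothetical field name; note there is no text field
    # and no per-person identifier anywhere in this record.
    team_id: str
    signal_type: SignalType
    severity: Severity
    confidence: float  # classifier score in the range 0 to 1

    def __post_init__(self) -> None:
        # Reject scores outside the documented 0-1 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
```

Keeping the record free of any text or individual identifier means the type itself cannot carry a per-person score, which mirrors the schema-level guarantee described for retention signals.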
Aggregation rules
Every output the dashboard surfaces is bound by the same aggregation rules. These are the floor commitments that the classifier path's step 9 enforces at write time.
- Minimum team size of 10. Teams below 10 produce no signal at all. Not a hidden output. Not a dimmed reading. No row.
- Cohort filtering re-applies the floor. If an admin filters the dashboard by HRIS metadata (department, team type, tenure bucket), the resulting cohort must still be ≥ 10 members. Cohorts below the floor render “below floor” with no aggregate values.
- Cross-customer aggregation is opt-in only. By default, your data does not contribute to platform-level benchmarks. See /transparency § 5 for the disclosure.
- Customer-level floor of 10 applies to platform-level benchmarks. No benchmark publishes from fewer than 10 opted-in customers. Same k-anonymity principle, applied at the customer tier.
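The floor rules above can be sketched as two small checks, one at the team or cohort tier and one at the customer tier. This is a sketch under stated assumptions: the function names, the returned dictionary shape, `MIN_CUSTOMER_COUNT`, and the use of a simple mean for the benchmark are all hypothetical; only the threshold of 10 and the "below floor" rendering come from the rules themselves.

```python
from typing import Optional

MIN_GROUP_SIZE = 10      # team and cohort floor
MIN_CUSTOMER_COUNT = 10  # hypothetical name for the customer-tier floor


def render_cohort(member_count: int, aggregate: float) -> dict:
    """Re-apply the floor after any dashboard filter narrows the cohort."""
    if member_count < MIN_GROUP_SIZE:
        # No aggregate values are exposed for a cohort below the floor.
        return {"status": "below floor"}
    return {"status": "ok", "aggregate": aggregate}


def publish_benchmark(opted_in_values: list) -> Optional[float]:
    """Publish a benchmark only when enough customers have opted in.

    The mean is an illustrative choice of aggregate, not the documented one.
    """
    if len(opted_in_values) < MIN_CUSTOMER_COUNT:
        return None
    return sum(opted_in_values) / len(opted_in_values)
```

For example, `render_cohort(9, 0.4)` returns only the "below floor" status, while `render_cohort(10, 0.4)` includes the aggregate value.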
Calibration
New channels enter a 30-day calibration window. During calibration, the per-message pipeline runs end to end (the DM gate and the floor still apply) but signals are not persisted. Calibration is how the classifier learns the channel's baseline for tone, response time, and participation distribution. Without calibration, the first week of signals would all read as anomalies relative to nothing.
After calibration completes, the channel transitions to active and signals start firing. Admins can pause a channel at any time, which suspends signal generation but preserves historical scores.
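The channel lifecycle described above can be sketched as a state derived from the enable date and a pause flag. The names here are hypothetical, and the sketch assumes that a pause takes precedence over calibration and that the window is measured from the moment the channel is enabled; neither detail is specified in this section.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum

CALIBRATION_WINDOW = timedelta(days=30)


class ChannelState(Enum):
    CALIBRATING = "calibrating"
    ACTIVE = "active"
    PAUSED = "paused"


def channel_state(enabled_at: datetime, now: datetime,
                  paused: bool = False) -> ChannelState:
    """Derive the channel state; pause wins over calibration (assumption)."""
    if paused:
        return ChannelState.PAUSED
    if now - enabled_at < CALIBRATION_WINDOW:
        return ChannelState.CALIBRATING
    return ChannelState.ACTIVE


def should_persist(state: ChannelState) -> bool:
    """Signals are written only for active channels."""
    return state is ChannelState.ACTIVE


# Usage: a channel enabled 29 days ago is still calibrating.
enabled = datetime(2026, 1, 1, tzinfo=timezone.utc)
state = channel_state(enabled, enabled + timedelta(days=29))
```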
Retention
- Message text: zero retention. Processed in memory during the LLM call. Discarded the moment the classifier returns.
- HealthSignal rows: retained indefinitely by default. Per-org retention policies are on the enterprise roadmap.
- Audit log: 7-year retention floor. Survives customer offboarding (with disclosure to customer at offboard).
- Consent records: 7-year retention floor. Same survival rule as audit log.
- Channel-volume metadata (silence baseline): retained as long as the channel is enabled, plus 30 days.
Failure modes
When something in the pipeline fails, the failure is visible, not hidden.
- LLM call fails or times out: the message drops. The fast-classifier output is logged with `skip:llm-fail` and no signal is persisted. The signal does NOT default to “verified” or any positive state.
- Floor check throws (membership count unresolvable): the insert is rejected. We treat the signal as if the team were below floor. Fail-closed.
- Provider boundary breach (e.g., a sub-processor is found storing prompts in violation of the published terms): the integration is cut within 7 days per /service-standards. Until the integration cutover lands, the affected path is disabled.
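The first two failure modes above can be sketched as a fail-closed write path: any exception results in no row, never in a default reading. The function and parameter names are hypothetical, and the `skip:floor-unresolved` tag is invented for the example; only `skip:llm-fail` appears in the rules above.

```python
from typing import Callable, Optional

MIN_GROUP_SIZE = 10


def handle_signal(
    team_id: str,
    fast_label: str,
    llm_classify: Callable[[], dict],
    member_count: Callable[[str], int],
    log: list,
    rows: list,
) -> Optional[dict]:
    """Persist a row only if both the LLM call and the floor check succeed."""
    try:
        signal = llm_classify()
    except Exception:
        # LLM failure or timeout: log the fast-classifier output, drop the message.
        log.append({"tag": "skip:llm-fail", "fast": fast_label})
        return None
    try:
        count = member_count(team_id)
    except Exception:
        # Membership unresolvable: treat the team as below floor (fail-closed).
        log.append({"tag": "skip:floor-unresolved"})  # hypothetical tag
        return None
    if count < MIN_GROUP_SIZE:
        return None
    row = {"team_id": team_id, **signal}
    rows.append(row)
    return row
```

The design choice worth noting is that both `except` branches return before any write happens, so there is no code path on which an error produces a persisted signal.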
Version
Current version: v1.0
Last updated: 2026-04-25
Methodology changes ship with a version bump on this page and a paired entry in the changelog below. The CI rule HIGH-SV-METHODOLOGY-VERSIONED blocks merges where this page is updated without a Version and Changelog header.
Changelog
- v1.0 (2026-04-25). Initial publication. Closes the methodology gap on INTEGRITY.md. Documents the 10-step classifier path, the six signal types, the aggregation rules, calibration, retention, and failure modes.