Kashi — Model-Boundary Perspective
Technical research memo for developers
Date: 2026-04-21

Purpose

Turn the “model-boundary perspective” into a concrete technical decision memo for the Kashi team. This is not a fixed plan. It is a critical architecture recommendation plus explicit decision points, required controls, and acceptance criteria.

================================================================================
0. Executive conclusion
================================================================================

Kashi’s current materials already contain one strong boundary claim: the live production detector is deterministic and does not use Claude in live operation. Claude is currently described as being used for seed authoring, structured classification, and reasoning-heavy tasks, not for the production detector path.

But the current materials also contain a real contradiction: the deck repeatedly says “metadata only / no content / none read content,” while some named detectors already imply transcript interpretation or text-derived semantics (unanswered-question rate, topic-credit ignored-turns, agreement-asymmetry), and the live deck explicitly says topic-credit ignored-turns is “deterministic; similarity via embedding distance.”

So the real issue is not “AI or no AI.” The real issue is boundary discipline.

Brutal version:
- “No live LLM” is not the same thing as “no model.”
- “Deterministic” is not the same thing as “metadata only.”
- “No content-level harassment classification” is not the same thing as “no text-derived processing.”

If Kashi leaves this muddy, buyers, workers, security reviewers, and eventually devs themselves will all read different systems into the same words.

Recommended technical direction:
1. Keep a strict structural production core as the default path.
2. Split all text-derived features into a separately named semantic lane.
3. Split all generative or assistive features into a separately named generative-assist lane.
4. Keep internal authoring / experimentation in a fully separate non-production lane.
5. Make each lane independently disable-able at tenant level.
6. Stop using blanket “metadata only / never transcribe for analysis” language unless the product truly removes all transcript-semantic logic from runtime.

Near-term recommendation:
For pilot survivability, the safest default is:
- Structural core ON
- Semantic lane OFF by default (explicit tenant opt-in)
- Generative-assist lane OFF by default for institution-facing flows
- Internal authoring/testing completely segregated from customer production data

================================================================================
1. What the current materials actually support
================================================================================

1.1 What the docs say clearly
- The production detection path is presented as deterministic and not Claude-backed in live operation.
- The stack says all detector outputs are pre-baked at build time in the demo, with no live LLM calls on the demo path.
- Claude is described as used for seed authoring and structured classification, plus reasoning-heavy tasks.
- The architecture is framed as explainable, review-oriented, and non-generative in core detection.
- The procurement/security memo explicitly says the right buyer-facing sentence is: the production signal path is deterministic, auditable, and role-bounded, and any optional model-assisted feature should be separately governed and documented.

1.2 What the docs also say, which creates the contradiction
- The deck says “patterns, not content, not affect” and “store only structural metadata.”
- The same deck also says the system takes meeting transcripts and speaker attribution as input.
- The named detectors include:
  - unanswered-question rate
  - topic-credit ignored-turns
  - agreement-asymmetry / position shift logic
- These are not cleanly explainable as pure turn-timing logic.
- One live deck section explicitly states: “Topic-credit ignored-turns (deterministic; similarity via embedding distance).”
- The measurement-science memo is blunt: if Kashi keeps these detectors, it must either become truly structural-only in MVP or become honest that it is a constrained hybrid system.

1.3 Why this matters technically
This is not only a messaging problem. If the team does not split these lanes technically, then all of the following become muddled:
- data flow
- subprocessor disclosure
- tenant controls
- region guarantees
- audit logging
- retention policy
- model-provider contracts
- acceptance testing
- procurement answers
- failure analysis
- contestability workflows

================================================================================
2. Answering the core questions directly
================================================================================

2.1 What is deterministic?
Deterministic should mean: given the same input artifact and same config, the system returns the same output, with no probabilistic generation.

In Kashi, the following can plausibly sit inside the deterministic structural core:
- transcript file parsing / normalization
- speaker/timestamp ingestion
- turn segmentation and ordering
- speaking time per speaker
- turn counts, turn durations
- interruption detection based on overlap + truncation rules
- response latency from turn boundaries
- speaking-share inequality / floor-time Gini
- dyadic interruption continuity across windows
- baseline drift against prior comparable meetings
- thresholding / score composition rules
- abstention rules
- RBAC presentation logic
- audit-event generation

Important: deterministic does NOT automatically mean “metadata only.” A text-derived embedding similarity calculation can also be deterministic.

So Kashi needs two separate labels:
- deterministic structural
- deterministic text-derived

2.2 What is LLM-assisted?
LLM-assisted should mean: a feature or internal workflow where transcript text or derived text is sent to a generative or reasoning model that produces a probabilistic or constrained response.

Based on the current materials, the following belong here:
- seed scenario authoring
- synthetic transcript/test-data generation
- structured classification for research or evaluation
- reasoning-heavy internal analysis
- any future victim-explainer narrative generation
- any future analyst-side summary or writing assistance
- any future optional content-help feature

These must not be described as part of the core production detector unless they actually are.

2.3 What is only for internal authoring vs production runtime?

Internal authoring / R&D only
- seed scenario generation
- synthetic transcript creation
- red-team / edge-case generation
- taxonomy drafting
- internal structured annotation help
- prompt prototyping for future assistive features
- memo writing / deck drafting / scenario authoring

Production runtime
- actual customer-data ingestion
- detector execution
- scoring / thresholding / event generation
- user-facing or admin-facing UI outputs
- private employee views
- manager mirrors
- executive aggregates
- evidence-vault actions
- review workflows

Rule: internal authoring must never quietly become production runtime through convenience. If a production service calls a model using customer transcript text, that is no longer “internal authoring.” It is runtime model use and must be documented as such.

2.4 If embeddings or semantic similarity are used, where do they run and on whose infrastructure?
Current-state truth: the docs do not give a clean enough answer yet.
They say topic-credit ignored-turns uses embedding distance, but they do not fully document:
- which model
- where it runs
- whether it is external
- which region
- what retention terms apply
- whether the customer can disable it
- whether it can run without leaving the pinned data region

That is exactly the gap the procurement/security memo calls out.

Technical recommendation:
If Kashi keeps any embedding or semantic-similarity detector in runtime, it should live in an explicitly named semantic lane with one of these deployment patterns:

Best pattern for strict buyers:
A. self-hosted or dedicated non-generative embedding service inside the region-pinned data plane

Acceptable but heavier disclosure path:
B. external embedding/model provider with:
  - explicit provider name
  - explicit region/path
  - retention terms
  - ZDR eligibility status
  - tenant opt-in
  - audit events for each call

Bad pattern:
C. vague “deterministic similarity” language with no provider/region/logging answer

2.5 Can customers disable model-assisted lanes?
They need to be able to. Not as a future nice-to-have. As a first-class product boundary.

Minimum control surface:
- org-level toggle: structural core only
- org-level toggle: enable semantic lane
- org-level toggle: enable generative-assist lane
- per-feature toggle for each optional lane
- visible admin page showing what is enabled
- audit log for any config change
- exportable configuration state for procurement/security review

If Kashi cannot disable semantic/generative lanes independently, then the boundary is not real. It is just deck language.

================================================================================
3. Recommended boundary model: five explicit lanes (Lane 0 to Lane 4)
================================================================================

Lane 0 — Ingestion and normalization lane

Purpose
Take customer-provided meeting artifacts and convert them into a normalized internal representation.

Inputs
- transcript text
- timestamps
- diarization / speaker labels
- meeting metadata
- calendar/platform metadata if in scope

Allowed logic
- parsing
- normalization
- validation
- quality scoring
- diarization-confidence tagging
- language tagging
- meeting-type classification only if deterministic and documented, otherwise keep as separate lane

Notes
This lane is not “harmless” just because it is not generative. It already touches sensitive content and must be region-controlled, logged, and retention-scoped.

Lane 1 — Structural detection core

Purpose
Run the defensible, buyer-safe, deterministic production detector path.

Allowed feature types
- overlap-based interruption
- turn truncation
- speaking share
- Gini
- latency
- baseline drift
- dyadic continuity
- exposure sufficiency
- abstention logic
- confidence objects driven by input quality and exposure

Characteristics
- deterministic
- non-generative
- no external LLM calls
- preferably no text semantics beyond normalized turns and timestamps
- can be the product’s default ON path

This is the lane Kashi can safely center in buyer messaging.

Lane 2 — Semantic deterministic lane

Purpose
Handle text-derived but non-generative logic.

Examples
- unanswered-question “substantive response” logic
- topic-credit ignored-turns via embedding similarity
- agreement-shift or position-shift logic
- semantic cluster continuity across turns
- lexical/topic carry-over

Characteristics
- may still be deterministic
- but not metadata-only
- touches transcript text meaningfully
- must be disclosed separately
- must be independently disable-able
- should be labeled “text-derived” everywhere internally

This lane is the honest place for constrained hybrid logic.

Lane 3 — Generative-assist lane

Purpose
Assistive features that produce generated or model-authored text.

Examples
- victim-explainer narrative
- optional review support summary
- analyst-facing draft note
- support-side summarization
- internal triage drafting

Characteristics
- probabilistic or prompt-driven even if schema-constrained
- must never be confused with the core production detector
- should never silently feed institutional decisions
- should be private by default when user-facing
- should be off by default in institution-facing flows unless explicitly governed

Lane 4 — Internal authoring / experimentation lane

Purpose
R&D, red-team work, authoring, scenario generation, testing.

Examples
- seed transcript authoring
- synthetic scenarios
- benchmark scenario generation
- failure-mode exploration
- copy drafting

Characteristics
- not part of customer runtime
- ideally no production customer transcripts at all
- if any production data is ever used for debugging, it must be separately approved, minimized, logged, and policy-bound

================================================================================
4. Critical technical diagnosis by detector
================================================================================

Below is the brutal classification Kashi should do immediately.

4.1 Clear structural-core candidates
These belong in Lane 1 unless implementation secretly uses transcript semantics:
- intrusive interruption
- floor-time Gini / speaking-share inequality
- raw response latency
- dyadic interruption continuity
- speaker baseline drift
- chilling delta, but only if the trigger event itself is structural and the downstream delta is based on observable participation change rather than content interpretation

4.2 Likely semantic-lane detectors
These should not be marketed as “metadata only” if they remain:
- unanswered-question rate, if “substantive response” requires semantic judgment
- topic-credit ignored-turns, because the docs already say similarity via embedding distance
- agreement-asymmetry, if it infers position or stance shift from transcript meaning
- any “directive concentration” logic if it depends on semantic or pragmatic classification
- any takeover / redirection logic if it moves beyond raw sequencing into discourse interpretation

4.3 Why this split matters
If Kashi keeps 4.2 in the same silent bucket as 4.1, three bad things happen:
1. Procurement answers become misleading.
2. Dev architecture stays muddy and hard to test cleanly.
3. Future disputes become uglier because a user will rightly ask: “Did this come from timings, from transcript semantics, or from a model?” And the system must be able to answer exactly.

================================================================================
5. Recommended product position for the next stage
================================================================================

5.1 Best near-term technical posture
For pilotable clarity, Kashi should adopt this rule:

Default production mode = structural core only.
Optional hybrid mode = structural core + semantic deterministic lane.
Optional assistive mode = structural core + semantic lane + generative assist lane, each separately governed.
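
The three modes in 5.1 can be read as nested sets of lanes. The following is a minimal sketch of that reading, not Kashi's implementation: the names `Lane`, `Mode`, `MODE_LANES`, and `enabled_lanes` are hypothetical, and the per-tenant `disabled` set is an assumed way to express the independent disable controls this memo asks for.

```python
from enum import Enum


class Lane(Enum):
    # Hypothetical lane identifiers; only the lane concepts come from the memo.
    STRUCTURAL = "structural_core"
    SEMANTIC = "semantic_deterministic"
    GENERATIVE_ASSIST = "generative_assist"


class Mode(Enum):
    DEFAULT = "default"      # structural core only
    HYBRID = "hybrid"        # structural core + semantic lane
    ASSISTIVE = "assistive"  # structural core + semantic + generative assist


# Each mode is a superset of the previous one, mirroring the rule in 5.1.
MODE_LANES = {
    Mode.DEFAULT: frozenset({Lane.STRUCTURAL}),
    Mode.HYBRID: frozenset({Lane.STRUCTURAL, Lane.SEMANTIC}),
    Mode.ASSISTIVE: frozenset(
        {Lane.STRUCTURAL, Lane.SEMANTIC, Lane.GENERATIVE_ASSIST}
    ),
}


def enabled_lanes(mode, disabled=()):
    """Return the lanes allowed to run: the mode's lanes minus any lanes
    the tenant has switched off (assumed per-tenant override)."""
    return MODE_LANES[mode] - frozenset(disabled)


if __name__ == "__main__":
    # A tenant in assistive mode that has disabled only the semantic lane.
    lanes = enabled_lanes(Mode.ASSISTIVE, disabled={Lane.SEMANTIC})
    print(sorted(lane.value for lane in lanes))
```

Keeping the mode-to-lane mapping in one table means a detector can check a single place before it runs, which makes "structural-only" a testable state rather than a wording choice.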

5.2 Recommended decision for the team

Near-term:
- Ship and defend the structural core as the primary product.
- Keep semantic detectors behind explicit feature flags.
- Treat semantic detectors as beta/controlled until the boundary memo, audit path, region path, and disable controls exist.
- Keep generative assistance private-side first, not institution-side first.

Long-term:
- Kashi probably does need constrained hybrid capability to become genuinely useful for harder phenomena like topic credit, ignored questions, and agreement shifts.
- But it should earn that lane honestly, not smuggle it under “metadata only.”

5.3 Strategic reason
This gives Kashi the best trade:
- buyer-safe default story
- honest technical story
- future product depth still preserved
- easier enterprise procurement later
- cleaner dev architecture now
- less internal confusion

================================================================================
6. What must change in wording immediately
================================================================================

6.1 Lines that should be removed or narrowed
Avoid saying:
- “metadata only” if text-derived semantic features run in production
- “never transcribe for analysis” if transcripts are ingested and semantically processed
- “none read content” if embeddings / semantic similarity are used
- “deterministic” as if it proves “no text-derived model use”
- “no AI” as a blanket statement

6.2 Safer replacements
Use:
- “The default production detector path is deterministic, non-generative, and structural-first.”
- “Optional text-derived features, if enabled, are separately governed and documented.”
- “Optional assistive AI features are segregated from the core detector and can be disabled by customer.”
- “Kashi distinguishes structural signals, text-derived signals, and generated assistive outputs.”
- “No employer-facing content-classification or emotion inference in the core detector.”

6.3 Procurement-safe short version
“The core production detector is deterministic and non-generative. Optional text-derived or model-assisted features, if enabled, run in separately governed lanes with explicit customer controls, audit logs, and documented data-flow boundaries.”

================================================================================
7. Technical controls Kashi should implement
================================================================================

7.1 Detector registry and boundary tags
Every detector and feature should be registered with:
- detector_id
- boundary_type
  - STRUCTURAL_DETERMINISTIC
  - TEXT_DERIVED_DETERMINISTIC
  - GENERATIVE_ASSIST
  - INTERNAL_ONLY
- input classes used
- transcript text required? yes/no
- external provider required? yes/no
- provider name
- region
- retention path
- output audience
- tenant toggle key
- confidence objects emitted
- contestability surface required

Without this registry, the boundary will decay.

7.2 Data-flow segregation
- Keep regulated transcript content in region-pinned data plane.
- Keep Vercel out of the role of system-of-record for sensitive content.
- Ensure logs, traces, and error monitoring do not accidentally persist transcript snippets.
- If semantic lane runs, run it near the data plane, not as loose frontend/web-app traffic.

7.3 Auditability
At minimum log:
- any enable/disable of semantic or generative lane
- every external model call
- actor, tenant, feature, timestamp
- what data class was touched
- purpose code
- success/failure
- review/export events
- break-glass admin access
- deletion / retention override events

And these logs must be exportable at app layer, not only visible inside hosting platforms.
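
The registry in 7.1 and the lane-toggle audit event in 7.3 could be sketched as plain data records. This is an illustrative sketch under stated assumptions, not a prescribed schema: `DetectorEntry`, `validate`, and `lane_toggle_event` are hypothetical names, only a subset of the 7.1 fields is modelled, and the two validation rules are one possible reading of "no external LLM calls" in Lane 1 and of the provider/region disclosure requirement in 2.4.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class BoundaryType(str, Enum):
    # The four tags listed in 7.1.
    STRUCTURAL_DETERMINISTIC = "STRUCTURAL_DETERMINISTIC"
    TEXT_DERIVED_DETERMINISTIC = "TEXT_DERIVED_DETERMINISTIC"
    GENERATIVE_ASSIST = "GENERATIVE_ASSIST"
    INTERNAL_ONLY = "INTERNAL_ONLY"


@dataclass(frozen=True)
class DetectorEntry:
    # Subset of the 7.1 registry fields; the rest are omitted for brevity.
    detector_id: str
    boundary_type: BoundaryType
    transcript_text_required: bool
    external_provider_required: bool
    tenant_toggle_key: str
    provider_name: Optional[str] = None
    region: Optional[str] = None


def validate(entry):
    """Return a list of boundary problems for one registry entry.
    Both rules are assumptions drawn from this memo's lane descriptions."""
    problems = []
    if (entry.boundary_type is BoundaryType.STRUCTURAL_DETERMINISTIC
            and entry.external_provider_required):
        problems.append("structural detector must not require an external provider")
    if entry.external_provider_required and not (entry.provider_name and entry.region):
        problems.append("external provider requires provider_name and region")
    return problems


def lane_toggle_event(actor, tenant, feature, enabled, now=None):
    """Build an app-layer audit record for a lane enable/disable action."""
    timestamp = now or datetime.now(timezone.utc)
    return {
        "event": "lane_toggle",
        "actor": actor,
        "tenant": tenant,
        "feature": feature,
        "enabled": enabled,
        "timestamp": timestamp.isoformat(),
    }


if __name__ == "__main__":
    # Hypothetical entry: identifiers and toggle key are illustrative only.
    entry = DetectorEntry(
        detector_id="floor_time_gini",
        boundary_type=BoundaryType.STRUCTURAL_DETERMINISTIC,
        transcript_text_required=False,
        external_provider_required=False,
        tenant_toggle_key="structural.floor_time_gini",
    )
    print(validate(entry))
    print(json.dumps(lane_toggle_event("admin-1", "tenant-1", "semantic_lane", False)))
```

Because the audit record is a plain dictionary serialized at the application layer, it can be exported without depending on a hosting platform's log viewer, which is the property 7.3 asks for.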

7.4 Tenant controls
Tenant admin should be able to:
- force structural-only mode
- disable all external model calls
- disable only generative-assist lane
- disable only semantic lane
- set region-pinning policy where architecture supports it
- review enabled subprocessors/features
- export current boundary config

7.5 UI disclosure
Every user-visible output should carry lane-aware labeling.

Examples:
- “Structural signal”
- “Text-derived signal”
- “Generated explanatory draft”

Do not mix them into one silent composite without provenance.

7.6 Contestability hooks
If a result depends on transcript semantics or embeddings, the dispute workflow should know that.

Different challenges need different remediation:
- transcript wrong
- speaker attribution wrong
- meeting type misclassified
- semantics misread
- confidence too low
- wrong window / missing context

================================================================================
8. Recommended acceptance criteria
================================================================================

The model boundary is not done when the deck sounds cleaner. It is done when these are true.

A. Architecture / registry
- Every detector and user-facing feature has an explicit boundary tag.
- There is a machine-readable registry for lane, provider, region, and data class.
- Structural and semantic lanes can run independently.

B. Runtime controls
- Structural core remains functional when all model-assisted lanes are disabled.
- Semantic lane can be disabled per tenant without code changes.
- Generative-assist lane can be disabled per tenant without code changes.

C. Data flow
- No transcript text is required to pass through Vercel logs for core detection.
- Any external model call path is documented end-to-end.
- Provider, region, retention, and ZDR status are recorded for every optional model lane.

D. Auditability
- Model-lane enablement and use are logged at app layer.
- Logs are exportable.
- Break-glass access is separately logged and reviewable.

E. UX / truthfulness
- UI distinguishes structural signals from text-derived signals from generated assistive outputs.
- Contract/deck/governance copy no longer uses blanket “metadata only” if semantic lane exists.
- “No live LLM in the core detector” is preserved and accurately stated.

F. Procurement readiness
- A model/data boundary memo exists.
- A shared-responsibility matrix exists for Supabase / Vercel / any model provider / Kashi.
- Customer can receive an exact answer to: what touches text, where it runs, whether it is optional, and whether it can be disabled.

G. Validation
- Determinism tests exist for structural lane.
- Determinism tests exist for any deterministic semantic detector.
- Input-quality gating exists for transcript confidence / diarization confidence / overlap quality.
- Abstention logic exists where evidence is weak or confounded.

================================================================================
9. Suggested implementation sequence
================================================================================

Phase 1 — Boundary cleanup
- Create detector registry
- Tag every detector/feature
- Remove false blanket language in code comments, docs, and UI
- Add feature flags for semantic/generative lanes

Phase 2 — Structural-safe core
- Make sure interruption, Gini, baseline drift, continuity, latency, and abstention run cleanly
- Verify structural-only tenant mode works end to end
- Add audit events and config export

Phase 3 — Honest hybrid lane
- Move text-derived detectors into separate service/module
- Document provider/region/retention
- Add per-tenant enablement
- Add provenance labels in UI

Phase 4 — Assistive lane
- Keep victim-side or analyst-side generated features separate from detector logic
- Add warning/provenance
- Make them easy to disable
- Keep them out of HR/discipline-facing flows by default

================================================================================
10. Final judgment
================================================================================

The strongest technical answer is not “Kashi has no AI.” That is already too sloppy.

The strongest technical answer is:
Kashi has a deterministic, non-generative structural core. Any text-derived semantic logic belongs in a separately governed hybrid lane. Any generative help belongs in a separately governed assistive lane. Internal authoring belongs outside production.

If Kashi does that, the system becomes much easier to defend, build, test, sell, and challenge.
If Kashi does not do that, the architecture and the pitch will keep contradicting each other.

================================================================================
11. Source basis used for this memo
================================================================================

Primary internal sources consulted
- Kashi — Progress & Project Overview (2026-04-21)
- Kashi - Procurement / Security-Buyer Readiness Memo
- Kashi Measurement-Science Research Memo
- Kashi product wedge research synthesis