Kashi: Security / Enterprise-Assurance Perspective
Technical research memo for developers

Prepared: 2026-04-21
Format: plain-text internal working memo
Purpose: convert the current Kashi concept and shipped architecture into a security / enterprise-assurance view that is actually usable by engineering while the product plan is still being fixed.

Important scope note
This is not a fixed implementation plan and not formal legal advice. It is a technical decision memo: what must be true for Kashi to survive enterprise security review, what is still weak, what should be treated as a hard architectural boundary, and which acceptance criteria must be cleared before a serious pilot.

============================================================
0. Bottom line
============================================================

Kashi’s near-term blocker is not lack of primitives. It already has decent primitives: Supabase-backed Postgres/Auth/RLS, a multi-tenant schema, role-bounded presentation, auditability intent, four-tier retention language, and a no-live-LLM core detector posture. The blocker is that these primitives are not yet packaged into a buyer-grade assurance model.

Bluntly:
- “We use Supabase + Vercel + encryption” is not a serious answer.
- “We have RLS” is not tenant-isolation evidence.
- “We don’t use Claude in the live detector” is not yet a model-boundary memo.
- “We plan an evidence vault” is not yet a key-lifecycle design.
- “We have logs” is not yet an exportable audit trail.

The right technical posture is:
1) region-pinned data plane,
2) minimal sensitive-data exposure to the app-delivery layer,
3) database-enforced tenant isolation with evidence,
4) explicit model/data lanes,
5) first-party app-layer audit exports,
6) incident response and deletion semantics written down before pilot,
7) user-protective visibility controls treated as enterprise architecture, not “ethics copy”.

If Kashi does this, it stops looking like a clever governance demo and starts looking like a system a CISO / procurement team can actually review.

============================================================
1. Direct answers to the buyer/security questions
============================================================

1.1 Where is the primary region?

Technical answer Kashi needs:
- The primary region must be stated per data class, not as one vague “we are in Japan” sentence.
- For Japan-facing installs, the regulated data plane should be pinned to a Japan region by default.
- Supabase currently supports one primary region per project and offers Tokyo (ap-northeast-1) as a specific region. Generic APAC defaults to Singapore, so “APAC” is not good enough if the sales story is Japan residency.
- Vercel Functions default to Washington, D.C. (iad1) for new projects unless explicitly configured otherwise.
- Vercel also states that customer data may be transferred to and in the United States and elsewhere where Vercel or its service providers operate.

What this means in practice:
- “Primary region” must be split into:
  A. system-of-record region,
  B. compute region(s),
  C. backup location(s),
  D. log location(s),
  E. support-access geography,
  F. optional AI/subprocessor geography.
- For strict Japan-residency buyers, the cleanest architecture is:
  - regulated meeting data stored in a Japan-pinned data plane,
  - compute that touches sensitive payloads explicitly pinned near that data plane,
  - app delivery and CDN treated as separate from the regulated system of record,
  - no claim that “nothing ever leaves Japan” unless every subprocessor path, failover path, support path, and logging path has actually been verified.

Decision rule:
- Kashi may say “customer meeting data can be stored in a Tokyo primary region.”
- Kashi should not say “all Kashi processing is Japan-only” unless it is fully true.

1.2 What sensitive data touches Vercel?

The honest answer should be: “As little as possible.”
This needs to become a hard architectural rule, not a vibe.

Vercel is acceptable as an app-delivery layer if Kashi is disciplined about what flows through it. It becomes a problem when raw meeting artifacts, transcript fragments, evidence snippets, or private concern-formation signals leak into:
- default US-region functions,
- runtime logs,
- request logs,
- preview deployments,
- analytics,
- observability traces,
- unredacted exception payloads,
- environment-variable misuse,
- build artifacts,
- third-party integrations attached to the frontend/runtime path.

Recommended boundary:

Allowed on Vercel
- static UI assets,
- non-sensitive rendering logic,
- session/UI routing state,
- redacted identifiers,
- pre-aggregated non-sensitive view models,
- public marketing site,
- synthetic/demo datasets.

Forbidden or heavily restricted on Vercel
- full transcripts,
- raw evidence snippets,
- private employee vault ciphertext plus identifying metadata in the same place,
- detailed review-worthy event context with named identity where not strictly necessary,
- unredacted support/admin tooling,
- private-awareness telemetry (pattern-page opens, vault creation, draft creation, confound marking),
- raw AI prompts containing meeting content,
- preview deployments connected to production sensitive data.

Stronger rule for strict buyers:
- Sensitive ingest and sensitive detector APIs should be moved off the generic web-app runtime path and into a region-pinned backend plane, or at minimum isolated to explicitly configured non-default function regions with strong logging and redaction controls.

1.3 How is tenant isolation evidenced?

This is the difference between “multi-tenant SaaS exists” and “we can pass review.”

Good answer:
- every customer-scoped table has explicit org_id / tenant_id,
- RLS is enabled on every exposed table in every exposed schema,
- access is deny-by-default,
- every policy is reviewed and testable,
- privileged service roles are minimized and audited,
- service keys are never exposed to browsers,
- support/admin paths are separate from customer paths,
- negative tests prove cross-tenant reads/writes fail,
- eventual pen testing explicitly includes tenant-escape scenarios.

Important technical truth:
Supabase/Postgres RLS is powerful, but it is not proof by itself. Also, service keys and bypass roles can bypass RLS if misused. That means “we use RLS” is only half the story. The other half is privileged-path discipline.

Recommended evidence pack:
- RLS coverage inventory (table-by-table),
- policy source-controlled and reviewed,
- test suite proving:
  - tenant A cannot read tenant B rows,
  - tenant A cannot enumerate tenant B IDs through side channels,
  - aggregates and counts do not leak suppressed tenants,
  - support/admin impersonation cannot occur without a break-glass flow,
  - service-role tokens are absent from client code and frontend bundles,
- design note on privileged jobs:
  - which jobs run with elevated rights,
  - why,
  - what data they can touch,
  - what audit event is emitted.

1.4 What is the incident response path?

Right now this is one of the clearest gaps.

Enterprise reviewers will expect:
- incident severity definitions,
- who is paged,
- who owns containment,
- who decides customer notice,
- how evidence is preserved,
- how access is restricted during incident handling,
- how cross-tenant exposure is triaged,
- how misuse of admin/support access is handled,
- post-incident corrective-action workflow,
- customer notification commitments,
- tabletop proof that the team has actually rehearsed this.

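Sketch for the 1.3 evidence pack (negative isolation tests). The block below is an in-memory stand-in, not Postgres RLS: it only illustrates the shape of deny-by-default access plus the negative assertions a reviewer would want to see. The names Actor, Row, can_read, and the break_glass flag are hypothetical and do not come from the Kashi codebase; a real test pack would run equivalent assertions against the actual database policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    actor_id: str
    org_id: str
    # Hypothetical flag: set only by an approved, logged break-glass flow.
    break_glass: bool = False


@dataclass(frozen=True)
class Row:
    row_id: str
    org_id: str


def can_read(actor: Actor, row: Row) -> bool:
    """Deny by default; allow same-tenant reads or approved break-glass access."""
    if actor.org_id == row.org_id:
        return True
    if actor.break_glass:
        return True
    return False


def read_rows(actor: Actor, table: list) -> list:
    """Return only the rows the actor is allowed to see."""
    return [row for row in table if can_read(actor, row)]


def visible_count(actor: Actor, table: list) -> int:
    """Counts are computed after filtering, so they cannot reveal other tenants."""
    return len(read_rows(actor, table))


TABLE = [Row("r1", "tenant_a"), Row("r2", "tenant_a"), Row("r3", "tenant_b")]
tenant_a_user = Actor("u1", "tenant_a")
support_user = Actor("s1", "kashi_support")
support_break_glass = Actor("s1", "kashi_support", break_glass=True)

# Negative tests: each assertion states something that must NOT be possible.
assert all(row.org_id == "tenant_a" for row in read_rows(tenant_a_user, TABLE))
assert visible_count(tenant_a_user, TABLE) == 2
assert read_rows(support_user, TABLE) == []
# Break-glass access works, which is why it must be audited and time-bound.
assert visible_count(support_break_glass, TABLE) == 3
```

The point of the sketch is the test style: the evidence is the failing path (cross-tenant read returns nothing, support without break-glass sees nothing), not the existence of the policy.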
Recommended Kashi incident model:

SEV-1
- confirmed or strongly suspected unauthorized access / cross-tenant exposure / decryption-boundary failure / deletion failure with customer impact.

SEV-2
- security control bypass attempt, high-risk logging leak, serious auth/configuration error, sensitive preview-environment exposure, model-boundary leak without confirmed broad exposure.

SEV-3
- contained vulnerability, limited policy misconfiguration, low-impact operational issue.

SEV-4
- suspicious but unconfirmed event, policy deviation without exposure.

Minimum response phases:
1. Detect and classify
2. Contain
3. Preserve evidence
4. Scope blast radius
5. Customer and stakeholder communication
6. Remediate
7. Validate fix
8. Lessons learned / CAPA
9. Update assurance materials if the control statement changed

Technical must-haves:
- immutable incident timeline,
- list of impacted tenants/users,
- queryable audit trail,
- ability to freeze destructive retention jobs under incident/legal-hold conditions,
- kill switch for optional AI/model-assisted lanes,
- no ad hoc screenshot-only forensics.

Important architectural distinction:
Vendor incident != Kashi incident.
If Vercel/Supabase has an incident, Kashi still needs its own customer-facing response process because shared responsibility does not absolve the product layer.

1.5 What is the model/data boundary?

This must be a memo, not a sentence.

Kashi’s strongest current answer is good:
- the core production detection path is deterministic and not Claude-backed in live operation.

But the security/assurance version of that answer must become much more exact:
- Which detector steps are fully deterministic?
- Which steps touch transcript text?
- Which steps use embeddings, similarity, summarization, or classification?
- Which steps occur at build time, internal authoring time, evaluation time, optional analyst time, victim-side explanation time, or live customer runtime?
- Which model provider is used, in which lane, under what retention terms, with what ability to disable it?

Recommended lane split:

Lane A: core live detector
- no external model calls
- deterministic structural computation only
- required for all customers

Lane B: offline/internal authoring or eval
- may touch synthetic or approved internal data
- never silently implied to buyers as part of the live product path

Lane C: optional AI-assisted customer features
Examples:
- analyst assistance,
- victim-side explanation enrichment,
- draft generation,
- semantic clustering.
These must be separately governed and disableable.

Lane D: future semantic detectors
If Kashi eventually uses embeddings or semantic similarity on customer text, it must stop pretending the product is purely structural-only and must document exactly where that processing runs.

Critical Anthropic nuance:
- Anthropic says commercial-product data is processed on behalf of the customer and is not used to train models by default unless the customer opts into the Development Partner Program.
- But zero data retention is not the default general state; it applies only to eligible APIs and products using the customer’s commercial organization API key, subject to Anthropic approval, and even then certain safety signals may still be retained.

So:
- do not oversell “Anthropic means zero retention”,
- do not oversell “no training” into “no storage at all”,
- decide whether Kashi’s optional AI lanes use Kashi-owned provider access or customer-controlled keys,
- document the consequences either way.

1.6 What audit logs are exportable?

The correct answer is:
Vendor logs exist, but Kashi still needs first-party app-layer audit logs.

Current situation:
- Supabase Platform Audit Logs exist, but currently have no dashboard export, no platform-audit log drain, and retention depends on plan.
- Supabase product logs are more usable operationally and can be exported from the Logs Explorer, but they are not a substitute for domain-specific Kashi audit semantics.
- Vercel Enterprise Audit Logs are exportable to CSV, and Vercel supports custom SIEM log streaming on Enterprise plans.
- None of that, by itself, is enough to answer “who saw which worker-related object, for what reason, under what workflow state, and who approved it?”

Kashi therefore needs its own audit event model.

Minimum app-layer audit schema:
- timestamp
- actor_id
- actor_role
- actor_tenant
- target_object_type
- target_object_id
- target_tenant
- action
- reason_code
- workflow_state
- before_state_hash / after_state_hash where relevant
- request_id / trace_id
- auth context
- support/admin elevation flag
- source IP or network context where lawful/appropriate
- customer-visible vs internal-only flag

Minimum events to capture:
- login / session elevation
- object creation/update/delete for all sensitive classes
- raw-context access
- exports
- retention-policy change
- legal-hold create/release
- RLS/policy/admin configuration change
- support or break-glass access
- optional AI lane enabled/disabled
- model-boundary touching events
- evidence-share events
- review approval / override / suppression / contest outcome

Export requirement:
- tenant-admin export,
- internal security export,
- SIEM/object-store streaming,
- retention independent of dashboard limits,
- machine-readable format,
- tamper-evident storage or hash chaining for high-sensitivity events.

============================================================
2. Recommended target architecture
============================================================

2.1 Split the system into explicit planes

Kashi should stop talking as if there is one flat app stack. Use at least these planes:

A. Presentation plane
   - web UI, static assets, low-sensitivity rendering.

B. Sensitive application plane
   - ingest APIs, detector jobs, policy enforcement, evidence packaging, audit emission.

C. Regulated data plane
   - transcripts, structured meeting records, review-worthy events, retention metadata, legal-hold state, vault metadata.

D. Private evidence plane
   - most sensitive retained snippets, separately encrypted, access-minimized.

E. Security/assurance plane
   - audit exports, policy registry, access reviews, incident evidence, subprocessor/region map, control documentation.

Why this matters:
- region claims become precise,
- logging policy becomes enforceable,
- support access becomes more controllable,
- buyer explanations stop sounding hand-wavy.

2.2 Data classification model

The data classes should be explicit at schema and policy level:

Class 1: raw meeting artifacts
Examples:
- transcript files,
- timestamped turns,
- speaker labels,
- original upload artifacts.

Class 2: derived analytics
Examples:
- interruption counts,
- directional graphs,
- baselines,
- confidence objects,
- longitudinal aggregates.

Class 3: review-worthy event objects
Examples:
- event IDs,
- severity/repetition/directionality/confidence,
- references to bounded windows.

Class 4: legal-hold / investigation material
Examples:
- frozen event packages,
- retained context windows,
- case metadata.

Class 5: private evidence-vault material
Examples:
- user-selected encrypted snippets,
- user-private notes or packaged references if ever added.

Each class needs:
- owner,
- default region,
- default retention,
- deletion behavior,
- exportability,
- allowed viewers,
- legal-hold behavior,
- backup caveat.

2.3 Vercel boundary design

Recommended stance:
- do not let Vercel become the de facto regulated content store.
- do not let preview deployments see real customer-sensitive data.
- do not stream transcripts into runtime logs or error traces.
- do not rely on default region behavior.
- set function regions explicitly for any route that touches data.
- pin sensitive routes near the data source or remove them from the generic function layer entirely.

Specific engineering controls:
- structured request/response logging with redaction,
- no transcript text in thrown exceptions,
- no transcript text in analytics events,
- disable or strictly gate preview-env secrets,
- prod data inaccessible from previews,
- content-security review for observability tooling,
- no third-party frontend analytics on private worker pages unless event payloads are aggressively minimized.

2.4 Tenant-isolation design

Must-have technical rules:
- org_id on every customer-scoped row,
- RLS on every exposed table,
- deny by default,
- no “temporary” tables outside policy,
- no support tooling that bypasses tenancy without break-glass,
- separate roles for:
  - ingest job,
  - detector job,
  - support tooling,
  - customer admin,
  - security admin,
- rotate privileged credentials,
- no long-lived support superuser shared across team members.

Strong recommendation:
- create an isolation proof pack before pilot:
  1. schema map,
  2. policy map,
  3. service-account map,
  4. negative test results,
  5. privileged-path review,
  6. third-party penetration-test scope statement.

2.5 Key management and evidence-vault design

The evidence-vault concept is one of Kashi’s strongest ideas, but it is not mature enough to be marketed loosely as end-to-end encryption. Do not say “E2EE” unless the lifecycle is fully defensible.

Questions the design must answer:
- Who generates keys?
- Where are keys stored?
- Is recovery possible?
- Is escrow allowed?
- Can ciphertext be rewrapped to a new key?
- What happens if a user leaves the company?
- What happens if the same user uses multiple devices?
- Can support ever access plaintext?
- What is the policy for lost keys?
- Does legal hold freeze ciphertext only, or also key-wrapping metadata?

Recommended posture:
- call it “user-controlled encryption boundary” or “client-held decryption boundary” until finalized.
- keep the vendor incapable of routine plaintext recovery by default.
- if any recovery or escrow exists, say so explicitly and narrow it with procedure.

Minimum lifecycle document:
- generation
- storage
- backup
- rotation
- revocation
- recovery
- offboarding
- re-wrapping
- destruction
- escrow / no-escrow rule

2.6 Logging, telemetry, and anti-retaliation protection

This is both security architecture and trust architecture.

Kashi should treat private awareness and concern formation as protected states. That means:
- opening one’s own pattern page,
- revisiting one’s dashboard,
- marking confounds,
- creating a vault,
- drafting a package,
must not leak into employer-facing analytics surfaces.

Security telemetry can still exist, but it must be segregated from business/manager/HR visibility and protected from casual internal access.

Required rule:
- security logs may record sensitive workflow events for fraud/abuse protection,
- but those logs must not become operational dashboards that reveal private concern formation.

This is not soft ethics. It is retaliation-risk containment by telemetry design.

2.7 Retention, deletion, and backup semantics

Deletion is not one thing.

Kashi needs deletion semantics per class:
- active-store deletion,
- search-index deletion,
- cache invalidation,
- backup expiry,
- legal-hold exception,
- audit-log tombstone retention,
- export behavior.

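The per-class deletion semantics above can be captured as a small, checkable structure so that an undefined item shows up as a gap rather than as silence. This is a minimal sketch, assuming one record per data class with one field per semantic from the list; DeletionSpec, missing_semantics, undefined_by_class, and the placeholder values are hypothetical names for illustration, not existing Kashi policy.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class DeletionSpec:
    """One field per deletion semantic; None means 'not yet defined'."""
    active_store: Optional[str] = None
    search_index: Optional[str] = None
    cache_invalidation: Optional[str] = None
    backup_expiry: Optional[str] = None
    legal_hold_exception: Optional[str] = None
    audit_tombstone: Optional[str] = None
    export_behavior: Optional[str] = None


def missing_semantics(spec: DeletionSpec) -> list[str]:
    """Return the names of deletion semantics that are still undefined."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


def undefined_by_class(specs: dict[str, DeletionSpec]) -> dict[str, list[str]]:
    """Map each data class to its gaps; an empty result means the spec is complete."""
    gaps = {name: missing_semantics(spec) for name, spec in specs.items()}
    return {name: missing for name, missing in gaps.items() if missing}


# Placeholder values only: the real windows and rules are policy decisions.
SPECS = {
    "raw_meeting_artifacts": DeletionSpec(
        active_store="placeholder",
        search_index="placeholder",
        cache_invalidation="placeholder",
        backup_expiry="placeholder",
        legal_hold_exception="placeholder",
        audit_tombstone="placeholder",
        export_behavior="placeholder",
    ),
    # Only one semantic decided so far, so six gaps should be reported.
    "private_vault_material": DeletionSpec(active_store="placeholder"),
}

for data_class, gaps in undefined_by_class(SPECS).items():
    print(f"{data_class}: undefined -> {', '.join(gaps)}")
```

A check like this could gate the retention/deletion spec: the pilot package is not ready while any class still reports gaps.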
Example structure:

Raw meeting artifacts
- short retention by default
- deletable absent hold
- backups expire per platform + policy

Derived analytics
- longer retention
- recalculated or deleted by tenant-admin policy
- linked to provenance so deletion impact is knowable

Review-worthy events
- narrower but more durable
- must support contestability and audit

Legal-hold material
- hold freezes deletion
- release is logged

Private vault material
- user-private by default
- deletion semantics must be exact
- if only ciphertext remains in backups for a period, say so

Critical buyer question:
“What does delete mean?”

Kashi needs a non-hand-wavy answer:
- deleted from active system immediately or within X window,
- removed from customer-visible interfaces,
- removed from processing pipelines,
- backups retained only until normal backup expiry unless hold/legal obligations apply,
- any immutable audit record retains only the minimum necessary tombstone metadata.

============================================================
3. Critical weaknesses in the current Kashi state
============================================================

3.1 The security story is still implementation-first, control-model-second

Current state:
- good primitives,
- weak packaging.

Fix:
- convert stack choices into control statements,
- publish residual risks,
- separate inherited vendor assurance from Kashi’s own assurance.

3.2 The “Japan residency” story is fragile unless the planes are separated

Current state:
- possible to store in Tokyo,
- but easy to accidentally run sensitive routes in default US-region functions,
- easy to leak sensitive data into app-delivery logs or preview paths.

Fix:
- explicit region map by data class and service,
- explicit function-region config,
- strong “no sensitive payload in preview/log/analytics” rule,
- strict-buyer architecture path ready.

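The region-map fix in 3.2 can be made checkable rather than aspirational. The sketch below assumes a hand-maintained map of storage region per data class and execution region per route; the Route type, the route paths, and the choice of "hnd1" as the allowed compute region are illustrative assumptions (the Vercel Tokyo region ID in particular should be verified against Vercel's region list). The "iad1" default and the Supabase Tokyo region "ap-northeast-1" are taken from section 1.1.

```python
from dataclasses import dataclass
from typing import Optional

VERCEL_DEFAULT_REGION = "iad1"              # default for new projects (section 1.1)
REQUIRED_STORAGE_REGION = "ap-northeast-1"  # Supabase Tokyo (section 1.1)
ALLOWED_COMPUTE_REGIONS = {"hnd1"}          # assumption: verify the Tokyo region ID

# Hypothetical storage map; None means no region has been named yet.
STORAGE_REGION_BY_CLASS = {
    "raw_meeting_artifacts": "ap-northeast-1",
    "derived_analytics": "ap-northeast-1",
    "review_worthy_events": "ap-northeast-1",
    "legal_hold_material": "ap-northeast-1",
    "private_vault_material": None,
}


@dataclass(frozen=True)
class Route:
    path: str
    touches_sensitive_data: bool
    region: Optional[str]  # None = not configured, so the platform default applies


# Hypothetical routes for illustration only.
ROUTES = [
    Route("/api/ingest", True, None),
    Route("/api/detect", True, "hnd1"),
    Route("/api/export", True, "iad1"),
    Route("/marketing", False, None),
]


def storage_gaps(region_by_class: dict, required: str) -> list:
    """Data classes whose storage region is unnamed or not the required one."""
    return sorted(name for name, region in region_by_class.items() if region != required)


def route_violations(routes: list, allowed: set) -> list:
    """Sensitive routes that rely on the default region or run outside the allowed set."""
    problems = []
    for route in routes:
        if not route.touches_sensitive_data:
            continue
        if route.region is None:
            problems.append(
                f"{route.path}: no explicit region (would default to {VERCEL_DEFAULT_REGION})"
            )
        elif route.region not in allowed:
            problems.append(f"{route.path}: region {route.region} is outside the allowed set")
    return problems


print(storage_gaps(STORAGE_REGION_BY_CLASS, REQUIRED_STORAGE_REGION))
print(route_violations(ROUTES, ALLOWED_COMPUTE_REGIONS))
```

Run in CI against the real map, a non-empty result would block the release, which keeps the sales language tied to the deployed configuration.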
3.3 “RLS + multi-tenant schema” is directionally good but not evidence

Current state:
- promising,
- not yet buyer-grade.

Fix:
- policy inventory,
- negative tests,
- privileged-path restrictions,
- eventual external test scope.

3.4 The incident response path is underwritten by nobody yet

Current state:
- auditability language exists,
- but there is no clear severity model, communication path, or preservation workflow.

Fix:
- publish a concise incident response summary before pilot,
- run a tabletop.

3.5 The key story is conceptually strong and operationally incomplete

Current state:
- evidence-vault idea is strong,
- lifecycle unresolved.

Fix:
- key lifecycle note first,
- marketing language second.

3.6 Auditability is underspecified at the product layer

Current state:
- vendor logs exist,
- no Kashi-native exportable audit schema is defined.

Fix:
- app-layer audit event design,
- tenant-admin export,
- SIEM-ready stream,
- abnormal-access alerts.

3.7 The model-boundary story is strong in spirit and weak in memo form

Current state:
- “no live Claude in core detector” is a good start,
- but optional and future lanes are not yet precisely governed.

Fix:
- formal model/data boundary memo with lane-by-lane truth.

3.8 The rollout trust logic is still too detachable from the security architecture

This matters technically:
- worker-private views,
- anti-retaliation telemetry suppression,
- procedurally gated raw access,
- contestability for transcript/speaker/context errors,
are not just governance notes. They affect:
- access control,
- logging,
- audit trails,
- notification behavior,
- storage design,
- export controls.

============================================================
4. Acceptance criteria by workstream
============================================================

4.1 Data residency / geography

This workstream is “done enough for pilot” only if:
- every data class has a named storage region,
- every compute path touching sensitive content has a named execution region,
- default-region behavior is overridden or consciously accepted,
- Vercel-sensitive-route placement is explicit,
- logs, backups, and optional AI/subprocessor geography are mapped,
- sales language matches technical truth.

4.2 Tenant isolation

Done enough only if:
- all exposed tables have RLS,
- privileged roles are enumerated,
- service keys are confined to private environments,
- negative isolation tests pass,
- support/admin break-glass path exists and is logged,
- there is an isolation evidence note.

4.3 Model/data boundary

Done enough only if:
- detector pipeline is split into formal lanes,
- each lane states whether customer text enters a model,
- retention/processor terms are documented per lane,
- customers can disable optional AI lanes,
- future semantic features cannot quietly appear without updating the memo and governance pack.

4.4 Audit and visibility

Done enough only if:
- Kashi app-layer audit events are defined,
- sensitive object access is logged with reason codes,
- exports exist outside vendor dashboard limits,
- affected-user visibility and admin visibility rules are explicit,
- protected private states do not leak into employer-facing surfaces.

4.5 Key lifecycle / evidence vault

Done enough only if:
- lifecycle document exists,
- recovery policy is explicit,
- no-escrow vs escrow rule is explicit,
- offboarding behavior is explicit,
- multi-device behavior is explicit,
- support plaintext access is either impossible or procedurally exceptional and documented,
- marketing language has been reviewed against technical reality.

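The 4.5 criteria lend themselves to an explicit decision record in which "not decided" is a visible state. The sketch below is one possible shape, assuming one field per criterion; KeyLifecycleDecision, blockers, and permitted_term are hypothetical names, and the rule for when the stronger term becomes a candidate (no escrow, support plaintext access impossible) is an illustrative assumption, not a settled Kashi policy.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Interim wording recommended in section 2.5 until the lifecycle is finalized.
INTERIM_TERM = "user-controlled encryption boundary"
STRONG_TERM = "end-to-end encryption (candidate, pending review)"


@dataclass(frozen=True)
class KeyLifecycleDecision:
    """One field per 4.5 criterion; None means 'not decided yet'."""
    lifecycle_document_exists: Optional[bool] = None
    recovery_policy: Optional[str] = None
    escrow_allowed: Optional[bool] = None
    offboarding_behavior: Optional[str] = None
    multi_device_behavior: Optional[str] = None
    # Expected values: "impossible" or "exceptional_documented".
    support_plaintext_access: Optional[str] = None
    marketing_reviewed: Optional[bool] = None


def blockers(decision: KeyLifecycleDecision) -> list:
    """Criteria that are undecided, plus required items explicitly marked False."""
    found = [f.name for f in fields(decision) if getattr(decision, f.name) is None]
    if decision.lifecycle_document_exists is False:
        found.append("lifecycle_document_exists")
    if decision.marketing_reviewed is False:
        found.append("marketing_reviewed")
    return found


def permitted_term(decision: KeyLifecycleDecision) -> str:
    """Hypothetical gate: stay on the interim wording unless every item is settled."""
    if blockers(decision):
        return INTERIM_TERM
    if decision.escrow_allowed or decision.support_plaintext_access != "impossible":
        return INTERIM_TERM
    return STRONG_TERM


draft = KeyLifecycleDecision(lifecycle_document_exists=True, escrow_allowed=False)
print(blockers(draft))
print(permitted_term(draft))
```

Keeping the wording gate next to the decision record is the design choice being illustrated: marketing language is derived from the recorded decisions instead of being chosen independently.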
4.6 Incident response

Done enough only if:
- severity model exists,
- on-call / escalation roles exist,
- customer-notice workflow exists,
- destructive jobs can be paused,
- incident evidence can be preserved,
- tabletop completed and recorded.

4.7 Retention and deletion

Done enough only if:
- each class has retention,
- deletion semantics are documented,
- backup caveats are explicit,
- legal-hold workflow exists,
- admin controls exist or are clearly staged for later,
- exports respect deletion/hold rules.

============================================================
5. Recommended 30 / 60 / 90 day sequence
============================================================

30 days: make the assurance story coherent
Deliverables:
- region + subprocessor map
- shared-responsibility matrix
- model/data boundary memo
- do-not-say list for sales/founders
- draft incident response summary
- first-pass audit event schema
- strict rule: no sensitive data in previews/logs/analytics

60 days: make the controls evidenced
Deliverables:
- tenant-isolation test pack
- privileged-role/service-account inventory
- exportable app-audit pipeline
- retention/deletion spec
- key lifecycle decision note
- preview/prod separation controls
- initial CAIQ / SIG-lite answer set

90 days: make the pilot package reviewable
Deliverables:
- Security & Assurance Pack v1
- tabletop exercise notes
- customer-facing security FAQ
- subprocessor/geography page
- retention and deletion one-pager
- buyer-facing tenant-isolation note
- buyer-facing model-boundary note
- deployment-governance addendum with access reasons and challenge path

============================================================
6. Strong recommendations for the actual architecture
============================================================

If Kashi wants the cleanest serious-enterprise path, it should adopt these as design defaults:

1. Sensitive-data minimization at the web-app layer
   The web framework should not be the default storage or logging surface for transcripts.

2. Region-selected data plane first
   Do not build the security story around generic “cloud” language. Build it around named regions and named flows.

3. First-party app audit plane
   Do not treat vendor dashboards as the system audit trail.

4. Break-glass support only
   No casual support browsing of sensitive tenant data. Every privileged access should be:
   - justified,
   - logged,
   - time-bound,
   - reviewable.

5. Preview environments must be sterile
   No real sensitive customer data in previews. No production secrets in previews unless intentionally and narrowly required.

6. Protected private states
   Concern formation must not become soft employer-visible metadata.

7. Explicit boundary around optional AI
   Optional AI lanes should be off by default for strict buyers and separately governable.

8. Honesty over slogan
   Do not let product copy make claims the architecture cannot currently support.

============================================================
7. “Do not say this” list
============================================================

Do not say:
- “We’re in Japan” if only some data is in Japan.
- “Nothing leaves Japan” unless every path is verified.
- “We use RLS, so tenant isolation is solved.”
- “We are SOC 2 because our vendors are.”
- “We have encryption” when the key lifecycle is undefined.
- “Anthropic means zero retention.”
- “We never process content” if any optional or future semantic path exists.
- “This is not monitoring” in an absolute sense.
- “We have audit logs” if they are not exportable and reason-coded at the product layer.
- “End-to-end encryption” unless the term is truly deserved.

Safer replacements:
- “Region-pinned primary data store with documented subprocessor and transfer map.”
- “Database-enforced tenant separation with policy tests and privileged-path controls.”
- “User-controlled encryption boundary for the most sensitive evidence.”
- “Non-generative live detector path; optional AI lanes separately governed.”
- “Exportable app-layer audit trail with reason-coded sensitive access events.”
- “Narrow, bounded, procedurally governed workplace analytics rather than broad surveillance.”

============================================================
8. Final judgment
============================================================

The hard truth is simple:

Kashi does not need a completely different stack. It needs a much more disciplined control story and a few real architectural boundaries.

The highest-value moves are not flashy:
- pin the data plane,
- constrain Vercel exposure,
- prove tenant isolation,
- write the model-boundary memo,
- implement app-native audit export,
- finish the key lifecycle,
- define incident response,
- state deletion semantics honestly,
- treat worker-protective visibility rules as enterprise architecture.

That is enough to materially raise the seriousness of the product.

Without those moves, Kashi remains a sharp concept with procurement-grade weak points.
With those moves, it becomes a pilotable system with a believable enterprise-assurance posture.

============================================================
9. Sources used for this memo
============================================================

Internal Kashi materials
- Kashi - Progress & Project Overview (2026-04-21)
- Kashi - Procurement / Security-Buyer Readiness Memo
- Kashi trust / anti-surveillance research memo
- Kashi retaliation-risk memo
- Kashi legal/procedural fairness memo
- Kashi measurement-science memo
- Kashi labor-relations / worker-representation memos

Current official / primary external sources checked
- Supabase docs:
  - Available regions
  - Platform Audit Logs
  - Shared Responsibility Model
  - Row Level Security
  - Database Backups
  - Security / SOC 2 guidance
- Vercel docs:
  - Shared Responsibility Model
  - Security & Compliance Measures
  - Configuring Regions for Vercel Functions
  - Regions
  - Audit Logs
- Anthropic Privacy Center:
  - Data processor/controller for commercial products
  - Zero data retention applicability
- Cloud Security Alliance:
  - Cloud Controls Matrix and CAIQ v4.1
  - AI-CAIQ
  - STAR for AI / AICM materials
- NIST:
  - SP 800-61r3 incident-response update / project