Kashi — Adversarial / Gaming Perspective
Technical research memo for developers
Prepared: 2026-04-21
Purpose: turn the adversarial/gaming lens into concrete product, data, and architecture decisions for Kashi.

==================================================
0. Executive judgment
==================================================

Bottom line:
Assume adaptation. Not as an edge case. As default operating reality.

If Kashi becomes legible inside an organization, at least some actors will optimize the visible metric surface instead of improving the underlying behavior. The product therefore cannot treat cleaner dashboards as equivalent to cleaner power dynamics.

The core technical implication is brutal but simple:
1) single-metric improvement is weak evidence;
2) absence of in-meeting signal is not evidence of absence of pressure;
3) anti-gaming must be built into the scoring, UI wording, role architecture, and post-deployment monitoring layer;
4) Manager Mirror becomes dangerous if it can be used as compliance theater, private evidence accumulation, or a fake “I improved” shield.

This memo is not arguing that Kashi is broken. It is arguing that Kashi gets more credible if it explicitly designs for:
- metric substitution
- channel displacement
- hierarchical laundering
- polite structural exclusion
- symbolic compliance / compliance theater
- meeting-type camouflage
- anti-inference and anti-retaliation constraints

The internal Kashi materials already point in this direction:
- the deck refuses the company-wide health bar and warns that visible numbers get falsified;
- the adversarial memo explicitly recommends an “adaptation watch” layer;
- the retaliation memo warns that Manager Mirror can become covert evidence accumulation;
- the measurement-science memo says refusing one bad metric does not remove anti-gaming needs;
- the meeting-type memo warns that unsupported contexts must fall back to observational mode.

==================================================
1. Current Kashi position: strengths and hard limit
==================================================

Current strengths:
- Kashi is longitudinal, not single-event.
- Kashi already refuses binary abuse labels, affect inference, performance use, and company-wide relationship scoring.
- It already has role-based access, k-anonymity, differential privacy, audit trails, and retention tiers.
- The current live detector path is deterministic and explainable.
- The product already frames itself as review support, not machine judgment.

Hard limit:
Kashi sees only a bounded slice of work: recorded meetings and their derived signals.

That means anti-gaming cannot mean “observe more and more channels until nothing escapes.” That would collapse the product into surveillance infrastructure and destroy its political and trust thesis.

So the correct anti-gaming strategy is not universal coverage. It is disciplined interpretation.

==================================================
2. The main adversarial routes Kashi should assume
==================================================

2.1 Metric substitution

Visible behavior improves; underlying power relation persists through a neighboring mechanism.

Typical example:
- interruptions go down
- but unanswered questions rise
- or idea-burial / topic-credit capture rises
- or one person’s proposals are acknowledged later and by someone else
- or participation stays formally balanced while one participant’s actionable uptake collapses

What this means technically:
- Kashi must not interpret improvement in interruption metrics alone as behavioral correction.
- Improvement should be treated as provisional unless adjacent metrics move in the same direction.

Recommended rule:
No “improved” state from a single detector family. Require corroboration from multiple adjacent measures.

--------------------------------------------------
2.2 Channel displacement

Monitored group meetings get cleaner, while pressure moves elsewhere.
Likely routes:
- 1:1s
- off-calendar calls
- corridor conversations
- private chat
- delegated feedback through lower-level managers
- staffing / routing / follow-up exclusion outside the meeting

This is not a reason to expand into omnichannel monitoring. It is a reason to hard-code a limitation:
No in-meeting signal != no pressure.

Technical implication:
- Kashi should expose observed-channel coverage and scope boundaries.
- The UI must explicitly distinguish:
  a) no qualifying signal detected in observed meetings
  b) insufficient evidence / unsupported context
  c) healthy

Do not let “healthy” be the default meaning of silence.

--------------------------------------------------
2.3 Hierarchical laundering

The most senior actor learns the metric surface and keeps their own visible behavior clean while pressure is executed by deputies, leads, or senior ICs.

Typical shape:
- senior leader’s mirror looks cleaner after rollout
- team-level chill or selective exclusion persists
- negative dyads migrate downward in the reporting chain
- the organization reads this as improvement because the visible top node looks cleaner

Technical implication:
- do not over-index on clean senior dashboards
- monitor where adverse patterns reappear after intervention
- model chains and clustering, not just isolated dyads

Recommended logic:
After any manager-facing intervention cycle, check whether:
- same target remains chilled
- negative concentration shifts to adjacent actors
- team-level asymmetry remains flat while the original actor’s visible metrics improve

--------------------------------------------------
2.4 Polite structural exclusion

No obvious overlap, no shouting, no sentiment spike — but someone still gets buried.
Typical mechanisms:
- agenda control
- sequencing
- pre-closing
- summarizing formulations that narrow or overwrite prior talk
- repeated “we’ll come back to that”
- selective response / selective non-response
- letting someone speak but not letting their contribution matter

This is the most dangerous case because it looks professional.

Technical implication:
Pure interruption-heavy logic under-calls this mode. If Kashi wants to stay mostly structural, it still needs safe proxies for “input was procedurally neutralized.”

Candidate proxy families:
- proposal acknowledged but no action uptake within bounded turn window
- topic closes disproportionately soon after a specific speaker raises a point
- repeated summary/reframe after one participant’s contribution
- reply asymmetry: some speakers get immediate substantive engagement, others get time-deferring closure
- high ratio of “response without adoption” for one target vs peers

Important model-boundary note:
This is where Kashi cannot stay intellectually sloppy. Some of these signals are not truly “metadata only.” The current deck still claims six structural detectors and says none read content, but the live product also lists topic-credit ignored-turns via embedding distance and other detectors that imply transcript interpretation. Anti-gaming work makes that contradiction harder, not easier. If Kashi wants safe sequencing/agenda-control proxies, it must be honest that at least part of the system is a constrained hybrid, not a pure timestamp machine.

--------------------------------------------------
2.5 Symbolic compliance / compliance theater

The measured person learns the rules of the game and performs “improvement” in visible contexts mainly to satisfy the instrument.
Typical signs:
- improvements appear immediately after rollout or training, with implausibly clean drops
- only monitored meeting types improve
- visible metrics improve but contestation, confound-marking, or employee private concern behavior does not
- the manager uses the mirror as proof of innocence: “the numbers are clean now”
- the workflow encourages attestation (“reviewed / acknowledged / completed”) rather than behavior change

Technical implication:
Manager Mirror must not become:
- a self-certification tool
- a hidden disciplinary file
- a “green badge” system
- evidence the manager can export for appraisal or self-defense

Recommended constraints:
- no export from Manager Mirror
- no completion badge
- no global “improved” label without corroboration
- no downstream use in promotion, compensation, discipline, or ranking
- trend language should stay provisional (“observed pattern decreased in this channel”) rather than moral (“behavior improved”)

--------------------------------------------------
2.6 Meeting-type camouflage

Some actors may not “game” by malicious adaptation; the system may simply over-read normal behavior in unsupported contexts.

Examples:
- incident bridge looks aggressive by design
- 1:1 has manager-heavy talk share by design
- executive review concentrates challenge by design
- training has high instructor dominance by design

This matters adversarially because bad-faith defenders can hide behind format ambiguity, and bad-faith critics can claim the system overreaches.

Technical implication:
Meeting type is part of anti-gaming, not just false-positive mitigation.

Required rule:
If meeting_type is unknown, unsupported, or low-confidence:
- compute observations if useful
- do not construct review-worthy events by default
- say so explicitly in UI

==================================================
3. “Suspiciously clean” patterns that should trigger caution
==================================================

These are the cases the product should treat as yellow flags, not success.

A. Interruptions down sharply, but:
- unanswered-question rate flat or worse
- ignored-turn / topic-credit capture flat or worse
- chilling persists
- action uptake asymmetry persists

B. Manager mirror improves, but:
- team-level asymmetry unchanged
- same target still shows sustained baseline drop
- adjacent actors become the new source of concentration

C. Group meetings improve, but:
- 1:1 volume rises
- unsupported meeting types rise
- off-platform / uncaptured channels become more common
- transcript-quality coverage drops after rollout

D. Speaking-share parity improves, but:
- one person’s idea uptake, response quality, or substantive engagement worsens
- discussion closes faster after one person speaks

E. Improvement appears:
- immediately after rollout
- only in measured contexts
- only in public meetings
- without corresponding change in broader review objects

F. The organization begins using “no signal detected” as exoneration.
That is not a quality win. It is a governance failure.

==================================================
4. What this means for product architecture
==================================================

4.1 Multi-metric corroboration engine

Required.

Any “improvement” claim should be computed at the pattern layer, not the single-detector layer.
Suggested logic:
- Detector-level deltas -> normalized by meeting type, role, and baseline
- Pattern-level corroboration score
- Improvement requires:
  - at least 2 relevant detectors moving positively
  - no high-risk adjacent detector worsening materially
  - no confidence downgrade
  - no unsupported-context shift explaining the change

Output labels:
- provisional improvement
- mixed movement
- possible substitution
- insufficient evidence
- no qualifying signal detected

Do not output:
- fixed “healthy”
- “manager improved”
- “issue resolved”

--------------------------------------------------
4.2 Adaptation Watch layer

Should be a first-class system feature, not just governance prose.

V1 logic can be simple:
- detect sharp improvements in one metric family
- compare with adjacent metrics
- compare monitored vs unsupported meeting-type mix
- compare before/after channel coverage
- compare actor cluster movement (did pattern shift to nearby actors?)

V1 output:
- adaptation_watch_flag = true/false
- adaptation_reason_codes = [metric_substitution, channel_shift, unsupported_context_shift, hierarchical_laundering_suspected, confidence_drop]
- adaptation_confidence = low/medium/high

This does not accuse anyone. It forces the system to resist naive triumphalism.

--------------------------------------------------
4.3 Coverage / scope map

The product should expose what it saw and what it did not see.

Suggested metadata per analytic period:
- observed_platforms
- observed_meeting_types
- unsupported_meeting_share
- low-confidence transcript share
- diarization-risk share
- group vs 1:1 share
- internal vs external share

Why:
Without coverage metadata, clean numbers look stronger than they are.
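The corroboration rule in 4.1, part of the V1 reason-code output in 4.2, and the unsupported-share field from 4.3 could fit together roughly as follows. This is a minimal Python sketch, not the Kashi implementation: the names DetectorDelta, Coverage, and assess, the detector family strings, the sign convention for deltas, and all three thresholds are assumptions made for illustration. The channel_shift and hierarchical_laundering_suspected codes are left out because they need inputs this sketch does not model.

```python
from dataclasses import dataclass
from enum import Enum


class Label(str, Enum):
    # Output labels taken from section 4.1.
    PROVISIONAL_IMPROVEMENT = "provisional improvement"
    MIXED_MOVEMENT = "mixed movement"
    POSSIBLE_SUBSTITUTION = "possible substitution"
    INSUFFICIENT_EVIDENCE = "insufficient evidence"
    NO_QUALIFYING_SIGNAL = "no qualifying signal detected"


@dataclass(frozen=True)
class DetectorDelta:
    family: str              # hypothetical family name, e.g. "interruption"
    delta: float             # normalized change vs baseline; negative = fewer adverse events
    high_risk: bool = False  # adjacent detector that must not worsen


@dataclass(frozen=True)
class Coverage:
    unsupported_share_before: float  # unsupported_meeting_share, previous period
    unsupported_share_after: float   # unsupported_meeting_share, current period
    confidence_downgraded: bool = False


# Assumed thresholds; the memo does not specify values.
IMPROVE_AT = -0.20
WORSEN_AT = 0.10
CONTEXT_SHIFT_AT = 0.15


def assess(deltas: list[DetectorDelta], coverage: Coverage) -> dict:
    """Pattern-level label plus adaptation-watch fields for one analytic period."""
    improved = {d.family for d in deltas if d.delta <= IMPROVE_AT}
    worsened = {d.family for d in deltas if d.high_risk and d.delta >= WORSEN_AT}
    context_shift = (
        coverage.unsupported_share_after - coverage.unsupported_share_before
        >= CONTEXT_SHIFT_AT
    )

    # Reason codes are only raised when something looks improved.
    reasons = []
    if improved and worsened:
        reasons.append("metric_substitution")
    if improved and context_shift:
        reasons.append("unsupported_context_shift")
    if improved and coverage.confidence_downgraded:
        reasons.append("confidence_drop")

    if not deltas:
        label = Label.NO_QUALIFYING_SIGNAL
    elif coverage.confidence_downgraded or context_shift:
        label = Label.INSUFFICIENT_EVIDENCE
    elif improved and worsened:
        label = Label.POSSIBLE_SUBSTITUTION
    elif len(improved) >= 2:
        label = Label.PROVISIONAL_IMPROVEMENT
    else:
        # Includes a single-family improvement: never "improved" on its own.
        label = Label.MIXED_MOVEMENT

    return {
        "label": label.value,
        "adaptation_watch_flag": bool(reasons),
        "adaptation_reason_codes": reasons,
    }
```

Under these assumptions, a sharp interruption drop paired with a worsening high-risk neighbor yields "possible substitution" with the metric_substitution code, and a lone interruption drop never reaches "provisional improvement". The function has no path that returns "healthy", which matches the "Do not output" list.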
--------------------------------------------------
4.4 Meeting-type normalization as anti-gaming control

Required fields:
- meeting_type
- meeting_type_confidence
- role schema (chair, facilitator, IC, trainer, presenter, attendee)
- internal_vs_external
- recurrence_type
- decision_mode

Why:
Without this, people can either route pressure into ambiguous contexts or defend harmful patterns as “normal meeting behavior.”

--------------------------------------------------
4.5 Contestability workflow

Adversarial robustness is not only upstream detection. It is also what happens when a user says “this is wrong,” “this context was special,” or “you are missing how this is happening.”

Required states:
- detected
- contested_accuracy
- contested_speaker
- contested_context
- under_review
- upheld
- downgraded
- withdrawn
- preserved_under_hold (if needed)

Once contested, the object must stop behaving like settled truth.

--------------------------------------------------
4.6 Manager Mirror constraints

Manager Mirror should remain, but bounded hard.

Must have:
- self-only view
- no export
- no named subordinate telemetry
- no silent forwarding into HR evaluation files
- no leaderboard / percentile / badge
- no “I reviewed” completion mechanic that can be shown as compliance proof
- no use in performance, promotion, discipline, or compensation

Should have:
- future-oriented reflection prompts rather than static scolding
- provisional language
- explicit statement that cleaner metrics in observed meetings do not prove resolved underlying dynamics
- optional manager note kept separate from evaluative systems

Should not become:
- private shield
- disciplinary shadow file
- symbolic repentance ritual

--------------------------------------------------
4.7 Anti-inference / anti-retaliation constraints

Some anti-gaming “solutions” become retaliation infrastructure if done badly.
Required:
- opening /app/me/pattern is private
- concern formation is private
- vault creation and activity are private
- drafts are private until explicit share
- small-team review outputs require batching / delay / redaction / suppression
- employer-side views remain aggregate-first

Why this belongs in an anti-gaming memo:
If managers can infer who is privately concerned, the system incentivizes quieter, more polished pressure and suppresses use of the worker-protective path. That is not just a privacy bug. It is a gaming amplifier.

==================================================
5. What Kashi should not do in response
==================================================

Do NOT respond to anti-gaming risk by:
- expanding into generic chat/email/browser surveillance
- introducing a company-wide relationship-health bar
- ranking managers
- using mirror outputs for silent HR decisions
- turning adaptation-watch flags into accusation labels
- pretending unsupported channels do not matter
- pretending the system is still “purely metadata-only” if hybrid transcript interpretation is clearly present

Those moves would either:
a) destroy trust, or
b) create new gaming targets, or
c) both.

==================================================
6. Suggested backlog: decision-ready
==================================================

P0 — must add now
1. Add explicit anti-gaming doctrine to deck / governance / product docs.
2. Replace any wording that equates cleaner measured patterns with solved behavior.
3. Add “no in-meeting signal detected” language instead of “healthy” where evidence is limited.
4. Add Manager Mirror use restrictions:
   - no export
   - no appraisal use
   - no subordinate drill-down
   - no completion badge
5. Add coverage metadata and unsupported-context disclosure to every serious view.
6. Add adaptation-watch spec to architecture backlog.
7. Resolve the “metadata-only” contradiction in technical docs.

P1 — build soon
8. Multi-metric corroboration engine.
9. Meeting-type/role normalization matrix.
10. Pattern-state machine with contested/downgraded/withdrawn states.
11. Provisional-language component library for UI.
12. Small-team anti-inference suppression rules in institutional lanes.

P2 — later, careful
13. Safe proxies for sequencing / agenda control / pre-closing without drifting into broad semantic surveillance.
14. Post-intervention shift detection across adjacent actors.
15. Validation track for “suspiciously clean” scenarios in pilot data.

==================================================
7. Concrete technical acceptance criteria
==================================================

AC-1 Multi-metric rule
No dashboard may display “improved” unless at least two relevant detector families improve and no adjacent high-risk detector worsens beyond threshold.

AC-2 No exoneration state
The institutional UI must not map “no qualifying signal detected” to “healthy” or equivalent language.

AC-3 Coverage disclosure
Every review surface must show coverage metadata:
- meeting types included
- unsupported share
- low-confidence transcript share
- observation window

AC-4 Adaptation watch
The system must be able to raise at least these reason codes:
- metric_substitution
- channel_shift
- unsupported_context_shift
- hierarchical_laundering_suspected
- confidence_drop

AC-5 Manager Mirror bounds
Manager Mirror must not support export, badge completion, ranking, or named subordinate behavioral telemetry.

AC-6 Private states protected
Opening pattern pages, marking confounds, and enabling evidence vault must not produce employer-visible events.

AC-7 Contestability
Any review-worthy event must support dispute/annotate/review/resolve states with immutable history.

AC-8 Unsupported context fallback
If meeting_type is unknown or unsupported, Kashi may compute observations but must not over-interpret them into review-worthy events by default.
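AC-8 can be read as a gate that sits in front of event construction. The following is a minimal Python sketch under stated assumptions: the supported-type set, the confidence cutoff, the MeetingContext type, and the two mode strings are all hypothetical, since the memo does not enumerate supported meeting types or name a threshold. The low-confidence branch follows the required rule in section 2.6 rather than the literal AC-8 wording.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical supported set; the memo does not list supported meeting types.
SUPPORTED_TYPES = {"team_sync", "planning", "retrospective"}
# Assumed cutoff for meeting_type_confidence.
MIN_TYPE_CONFIDENCE = 0.7


@dataclass(frozen=True)
class MeetingContext:
    meeting_type: Optional[str]     # None means the type is unknown
    meeting_type_confidence: float  # classifier confidence in [0, 1]


def interpretation_mode(ctx: MeetingContext) -> str:
    """Return "observational" unless the context is known, supported, and confident.

    In observational mode, detector observations may still be computed,
    but no review-worthy event is constructed by default.
    """
    if ctx.meeting_type is None or ctx.meeting_type not in SUPPORTED_TYPES:
        return "observational"
    if ctx.meeting_type_confidence < MIN_TYPE_CONFIDENCE:
        return "observational"
    return "review_eligible"
```

Making "observational" the fall-through for every doubtful case keeps the default conservative: a new or misclassified meeting type cannot produce review-worthy events until someone deliberately adds it to the supported set. The UI disclosure that section 2.6 requires would be driven from the same return value.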
AC-9 Copy discipline
UI and docs must avoid these claims:
- “improvement means behavior improved”
- “clean dashboard means team is fine”
- “no signal means no problem”
- “Kashi is not monitoring at all”
- “all live detection is purely metadata-only” if hybrid detectors remain

==================================================
8. Clean project wording the devs can align around
==================================================

Safe:
- Kashi surfaces structural risk patterns in meeting interaction over time.
- Kashi uses bounded visibility and human review, not automated judgment.
- Improvement in one metric is provisional unless supported by broader pattern movement.
- Absence of in-meeting signal means absence in the observed channel, not proof of absence overall.

Unsafe:
- Clean dashboard = clean team
- Metric improvement = behavior improvement
- No signal = no problem
- Manager Mirror = coaching solved
- Structural-only = no transcript interpretation anywhere, if hybrid detectors remain in scope

==================================================
9. Final judgment
==================================================

The adversarial / gaming perspective does not kill Kashi. It removes the naive version of Kashi.

The serious version is:
Kashi is not a truth engine and not a moral classifier. It is a bounded, contestable, longitudinal meeting-governance system that assumes people react to visibility and therefore interprets its own metrics with caution.

That is the technically stronger version. That is also the deployable one.

==================================================
10. Source notes
==================================================

Internal Kashi materials used:
[I1] Kashi — Progress & Project Overview (2026-04-21)
[I2] kashi_adversarial_research_memo_2026-04-21_scrubbed
[I3] Kashi_manager_adoption_research_memo
[I4] kashi_measurement_science_research_memo_scrubbed
[I5] Kashi_Retaliation_Risk_Research_Memo_2026-04-21_scrubbed
[I6] Kashi_Meeting_Type_Normalization_Research_Memo_2026-04-21
[I7] Kashi_Research_Synthesis_Legal_Procedural_Fairness_2026-04-21
[I8] kashi_trust_research_memo_2026-04-21

External sources cross-checking the core claims:
[E1] OECD (2025), Algorithmic management in the workplace: New evidence from an OECD employer survey.
[E2] OECD (2025), How widespread is algorithmic management in workplaces?
[E3] Treem, Barley, Weber, Barbour (2023), Signaling and meaning in organizational analytics: coping with Goodhart’s Law in an era of digitization and datafication.
[E4] Larkin (2014), The Cost of High-Powered Incentives: Employee Gaming in Enterprise Software Sales.
[E5] Yu, Treré, Bonini (2022), The emergence of algorithmic solidarity: unveiling mutual aid practices and resistance among Chinese delivery workers.
[E6] Mawritz et al. (2012), A Trickle-Down Model of Abusive Supervision.
[E7] Meeting-science / conversation-analysis literature on meeting formulations, topic progression, and chair control (as cited in the internal adversarial and meeting-type memos).