# Deck 17 · AI Semantic Conversations (Search-to-Sales)

**Status:** Draft (not yet built)
**Saved:** 2026-06-28 by Jeff (verbal)
**Owner:** Plex / OMX Analytics + Digital
**Phasing:** Internal first → External (Ask Max) second

---

## One-line thesis

**Ask Lens, get the answer.** A semantic conversation layer that lets every OMX staff member ask data questions in plain English — "show me last week's margin vs last year" — and get a real answer with the underlying data. Internal first; external (Ask Max for customers) once the muscle is built.

## The wedge — why now

- OMX has the data in Snowflake (PPSS sources, Lens dbt models, F_AGG_SALES_PERFORMANCE_*, F_CUST_TARGET_*) — but only a handful of analysts know how to ask it
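- Anthropic / OpenAI models are now strong enough at text-to-SQL and semantic-layer translation to build on, provided a semantic layer constrains them (raw LLM-only accuracy remains limited; see Research deepening)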
- Lens is the visualisation layer — the conversation layer sits ON TOP of Lens, not replacing it
- Same engine, two markets: internal (analysts, reps, managers) first; external (customers via Ask Max) once the patterns are proven
- "Search-to-sales" — internal: search-for-answer; external: search-for-product-then-buy

## Phase 1 — Internal (this deck's primary focus)

**Audience:** sales reps, account managers, analysts, planners, ExCo
**Use cases:**
1. "Show me last week's margin vs last year for [customer]"
2. "Which K0-K2 accounts have stopped ordering in the last 60 days?"
3. "What's our top off-range SKU request this month?"
4. "Compare DIFOT this quarter by sector"
5. "Which suppliers are under-delivering on rebate accruals?"
6. "Show me my customer review pack for [account] next week" (QBR pre-fill — links to Deck 18)

**Tech shape:**
- Anthropic Claude (Sonnet 4.6 / Opus 4.7) with a text-to-SQL prompt
- Semantic layer over Snowflake (dbt-driven definitions)
- Citation rule: every answer cites the source dbt model + query
- Memory: previous question context for follow-ups ("now break that by rep")
- Lens widgets as the answer surface where the answer is visual
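
A minimal sketch of how the tech shape above could fit together on the prompt side, assuming the dbt-driven definitions are exported as simple metric records. The names `MetricDef`, `Conversation`, `build_prompt`, the three-turn memory window, and the metric description are hypothetical and only illustrate the idea; the model call itself is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class MetricDef:
    """One entry in the semantic vocabulary (hypothetical shape)."""
    name: str         # metric name exposed to the model
    dbt_model: str    # dbt model the metric is read from
    description: str  # plain-English meaning, e.g. taken from dbt docs


@dataclass
class Conversation:
    """Holds recent turns so follow-ups ("now break that by rep") keep context."""
    max_turns: int = 3  # assumption: only the last few turns are replayed
    turns: list = field(default_factory=list)  # list of (question, sql) pairs

    def remember(self, question: str, sql: str) -> None:
        self.turns.append((question, sql))
        self.turns = self.turns[-self.max_turns:]  # drop the oldest turns


def build_prompt(question: str, metrics: list, convo: Conversation) -> str:
    """Assemble the prompt: instructions, vocabulary, history, then the question."""
    lines = [
        "Translate the question into one read-only Snowflake SQL query.",
        "Use only the metrics listed below and name the dbt model you used.",
        "",
        "Metrics:",
    ]
    for m in metrics:
        lines.append(f"- {m.name} (dbt model {m.dbt_model}): {m.description}")
    if convo.turns:
        lines += ["", "Previous turns:"]
        for prev_question, prev_sql in convo.turns:
            lines += [f"Q: {prev_question}", f"SQL: {prev_sql}"]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


# Illustrative data only; the SQL of the earlier turn is elided on purpose.
metrics = [MetricDef("gross_margin_pct", "F_AGG_SALES_PERFORMANCE_DAILY",
                     "Gross margin as a percentage of sales")]
convo = Conversation()
convo.remember("Show me last week's margin vs last year", "SELECT ...")
print(build_prompt("now break that by rep", metrics, convo))
```

Restricting the prompt to named metrics is what would let the citation rule work: the model is asked to say which dbt model it used, and that name can be checked against the vocabulary before anything is shown to the user.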

## Phase 2 — External (Ask Max — handoff to Deck 01)

Once internal patterns mature, the same engine answers customer queries:
- "Do you have ink for my old HP?"
- "How much did I spend on stationery last quarter?"
- "What's the price for 12 packs of [SKU]?"

This is Ask Max (Deck 01). Phase 2 is the integration; the engine is the same.

## What this deck covers

1. **The opportunity** — most OMX staff can't ask the data; the data team is the bottleneck
2. **Semantic layer foundations** — dbt model definitions become the "vocabulary" the AI uses
3. **Internal use case sweep** — sales/AM/analyst/planner/ExCo use cases by frequency + value
4. **QBR pre-fill** — biggest sales-side win; links to Deck 18 (Company Research)
5. **Architecture** — Claude + Snowflake semantic layer + Lens widgets
6. **Governance** — what AI can and can't answer; citation; PII/sensitive data fence
7. **Phase 2 link to Ask Max** — same engine, external surface

## What this deck explicitly does NOT do

- Not external customer conversations in Phase 1 (owned by Deck 01, Ask Max)
- Not a BI dashboard builder (Lens is the dashboard; this is the conversation OVER Lens)
- Not data engineering (FDL/dbt teams own the underlying models; this layer uses them)

---

## Problem framing (what's broken)

- **Data-team bottleneck** — every reasonable question requires a ticket to the analyst pool
- **Reps don't ask the data** — too slow to be useful in a sales conversation
- **QBR prep takes days** — manually pulling the same view per account, per quarter
- **Tribal knowledge** — only a few people know which dbt model has the right field
- **Lens has answers, but the question hasn't been asked yet** — dashboards answer what was anticipated; conversations cover the unexpected

## Benefits (the value story)

| Lever | Mechanism | Sizing approach |
|---|---|---|
| **Analyst capacity reclaim** | Self-serve answers for the long-tail questions | Reclaim 30-50% of analyst time on routine queries |
| **Sales-rep response time** | Real-time answers in customer conversations | Higher win rate; faster deal cycle |
| **QBR prep speed** | Pre-filled QBR packs from conversation | QBR prep currently takes days per account (memory); could be <1hr |
| **Decision quality** | Faster + broader access to data = better decisions | Strategic |
| **Phase 2 platform reuse** | External Ask Max reuses internal infra | Lower marginal cost for Deck 01 expansion |
| **Audit trail** | Every conversation logged + citation chain | Compliance + learning loop |

---

## Layout candidates from the gold standard

- **Cover** — analyst at laptop typing a question; semantic answer + chart resolves
- **Problem vector grid (4)**: Analyst-bottleneck / Reps-can't-ask / QBR-prep-cost / Tribal-knowledge
- **Use case sweep** — 6-9 example questions with the answer surface
- **Phase 1 vs Phase 2 — same engine** — visual showing internal use today, external Ask Max tomorrow
- **Architecture pipeline** — User question → Claude semantic layer (dbt-aware) → Snowflake query → Lens widget + citation
- **Governance frame** — what's safe to ask; PII fence; citation always shown
- **QBR walkthrough** — pre-fill of one customer pack as the killer demo
- **Roadmap** — Phase 1 internal (90d POC → 6mo broad rollout) → Phase 2 external Ask Max integration (12mo)
- **The ask** — AI API budget + semantic-layer engineering + change-management for staff adoption

---

## Open questions to resolve

1. **Semantic layer definition** — is dbt sufficient, or do we need explicit metric/dimension definitions (Cube.dev, Lightdash, dbt Semantic Layer)?
2. **Source-of-truth pattern** — which dbt models are conversation-ready? Lens 181 reports stay as regression set (memory)
3. **AI provider** — Claude / OpenAI / both; cost per question at scale
4. **Citation UX** — inline citation; "show me the query" affordance for power users
5. **PII fence** — which fields/customer types are off-limits to AI answers?
6. **QBR pack format** — what's the canonical OMX QBR pack? (Connects to Deck 18)
7. **Lens widget integration** — when does the answer render as a chart vs text?
8. **Staff change management** — adoption is the limiting factor; how do we incentivise asking?

## Audience

**Primary:** Chief Digital Officer + Head of Analytics + Sales Director.
**Secondary:** ExCo (expected heavy users of the QBR pre-fill use case).
**Tertiary:** IT / Data team — they're co-build participants.

## Reference

- Memory: **Lens (BI tool replacing Sisense)** — visualisation layer this deck sits above
- Memory: **Lens design language v1** — pattern for the answer-surface UX
- Memory: **OMX dbt models** at `lens/Current/libraries/dbt/models/presented/` — F_AGG_SALES_PERFORMANCE_*, F_CUST_TARGET_*, D_CALENDAR
- Memory: **PR-013** fact-checking gate — citation is non-negotiable
- Memory: **FDL REVIEW_OMX_dbt v2.0** — semantic-layer definitions live in dbt
- Anthropic Claude — Sonnet 4.6 / Opus 4.7 for text-to-SQL + summarisation
- Connection to **Deck 01 Ask Max** (phase 2 external surface), **Deck 18 Company Research** (QBR pre-fill use case), **Deck 14 IBP** (the MBR + Pre-S&OP audience)

---

## Research deepening (background-agent, 2026-06-28)

### Text-to-SQL accuracy — current benchmark state

| System | BIRD execution accuracy | Spider | Notes |
|---|---|---|---|
| **Snowflake Cortex Analyst** | >90% on Snowflake's customer evals; 2x better than baseline LLMs; +14% vs competing solutions | n/a (custom semantic-model harness) | Tightly coupled to semantic model definition |
| **Snowflake Arctic-Text2SQL-R1.5** | Outperforms GPT-5, Claude Sonnet 4.5, Gemini 2.5 Flash on internal Snowflake benchmarks | leader on internal benches | Snowflake-trained, free with Cortex |
| **Claude Opus 4.5** | 66.0% | 76.0% | Zero-shot; strong on complex reasoning |
| **GPT-5.1** | 53.3% | 77.6% | Lower BIRD; closer Spider |
| **OpenAI O3-mini** | 61.3% | 78.8% | Cheaper inference |
| **Gemini 3-Flash-preview** | 66.6% | 87.2% | Top Spider, near-top BIRD |
| **Semantic-model boost** | +20% accuracy on average when structured validation and schema-aware parsing are layered on | n/a | The OMX play |

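**Key finding:** wrapping an LLM in a semantic layer plus structured validation **lifts accuracy by ~20%** on average. This is exactly the dbt-semantic-layer + Lens-widget pattern this deck proposes. In the table above, raw LLM-only text-to-SQL scores 53-67% on BIRD, while Snowflake reports >90% for Cortex Analyst on its own customer evals with a semantic model in place. These are different benchmarks, so the gap is directional evidence rather than a like-for-like lift.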

**Sources:**
- Snowflake Cortex Analyst >90% accuracy + 2x lift — https://www.snowflake.com/en/engineering-blog/cortex-analyst-text-to-sql-accuracy-bi/
- Cortex Analyst semantic-model boost — https://www.snowflake.com/en/engineering-blog/agentic-semantic-model-text-to-sql/
- Real-time text-to-SQL behind Snowflake Intelligence — https://www.snowflake.com/en/blog/engineering/real-time-text-to-sql-snowflake-intelligence/
- Claude Opus 4.5 / GPT-5.1 / Gemini benchmark — https://datost.com/blog/text-to-sql-accuracy-benchmarks
- Mistral 2x text-to-SQL on Snowflake — https://mistral.ai/customers/snowflake/

### Semantic-layer platform comparison

| Platform | Pricing (USD, billing period as stated) | OMX fit |
|---|---|---|
| **dbt Semantic Layer (MetricFlow)** | dbt Cloud Team $100/seat/mo + per-queried-metric meter; Enterprise custom | Best if OMX is already on dbt Cloud; meter hostile to AI-agent workloads |
| **Cube.dev** | Free open-source / Cube Cloud $300-$2.5k+/mo / Enterprise custom | Meters developers + infra hours — benign for AI agent workloads (Ask Lens fits) |
| **Lightdash** | OSS free / Cloud from $29/mo / Enterprise custom | Strong dbt integration; lightest weight |
| **AtScale** | Meters semantic objects — unlimited users/queries | Heaviest enterprise |
| **Custom on Snowflake Cortex Semantic Views** | Snowflake compute only | Tightest integration; native to OMX stack |

**Sources:**
- Semantic-layer buyer's guide — https://davidsj.substack.com/p/semantic-layers-a-buyers-guide
- dbt vs Cube vs Lightdash 2026 — https://www.stackfyi.com/guides/semantic-layer-tools-dbt-cube-metricflow-lightdash-2026
- Cortex Semantic Views — https://docs.snowflake.com/en/user-guide/snowflake-cortex/cortex-analyst

**Recommendation for deck:** because OMX is already on Snowflake + dbt (FDL REVIEW_OMX_dbt v2.0 standard), the lowest-friction stack is **Snowflake Cortex Semantic Views as the metric definitions, dbt as the model layer, and Claude or Cortex Analyst as the conversation engine**. Cube.dev is the credible alternative if OMX wants vendor-portable semantic definitions.

### Cost per question at scale

Claude Sonnet 4.6 at $3 per 1M input tokens and $15 per 1M output tokens:
- Average internal analyst query: ~5k input tokens (schema + history) + 500 output tokens = **$0.02-$0.03/question**
- 500 staff x 10 queries/day x 250 working days = 1.25M queries/yr = **$25k-$40k/yr Claude spend**

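Cortex Analyst credit consumption is published as part of Cortex billing and has not been costed here. The working assumption is that it is equivalent or cheaper for Snowflake-native workloads; this needs confirming against current Snowflake pricing.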
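
The Claude arithmetic in this section can be reproduced with a few lines, which makes it easy to re-run when token counts or prices change. This is a sketch using only the figures quoted in this section; the function names are illustrative.

```python
def cost_per_question(input_tokens: int, output_tokens: int,
                      usd_per_m_in: float = 3.0,
                      usd_per_m_out: float = 15.0) -> float:
    """Cost in USD of one question, given per-million-token prices."""
    return (input_tokens / 1_000_000 * usd_per_m_in
            + output_tokens / 1_000_000 * usd_per_m_out)


def annual_queries(staff: int, queries_per_day: int, working_days: int) -> int:
    """Total questions per year across all staff."""
    return staff * queries_per_day * working_days


per_question = cost_per_question(5_000, 500)   # 0.015 in + 0.0075 out
queries = annual_queries(500, 10, 250)
annual_spend = per_question * queries

print(f"per question: ${per_question:.4f}")    # per question: $0.0225
print(f"queries/yr:   {queries:,}")            # queries/yr:   1,250,000
print(f"annual spend: ${annual_spend:,.0f}")   # annual spend: $28,125
```

The point estimate of about $28k/yr sits inside the quoted $25k-$40k range; the upper end leaves headroom for longer schema context or follow-up history in the input.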

### QBR pre-fill — sizing the killer use case

Memory: QBR prep currently takes days per account. Assume, conservatively, one day per QBR:
- 500 managed accounts x 4 QBRs/yr x 1 day current prep = 2,000 QBRs = **2,000 analyst-days/yr**
- AM rate ~NZ$70/hr fully-loaded x 8hr = NZ$560/QBR x 2,000 QBRs = **NZ$1.12M/yr in analyst+AM time**
- Conversation-driven pre-fill at <1hr per QBR = **>80% reclaim = ~NZ$900k/yr capacity**

This is the single largest immediately-defensible benefit number for the deck.
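
The sizing above can be checked, and stress-tested against other assumptions, with a short calculation. This is a sketch built only from the inputs stated in this section; `qbr_sizing` is an illustrative helper, and the one-hour post-rollout prep time is the upper bound of the "<1hr" target.

```python
def qbr_sizing(accounts: int, qbrs_per_year: int, hours_now: float,
               hours_after: float, hourly_rate: float) -> dict:
    """Annual QBR prep cost today vs with conversation-driven pre-fill."""
    qbrs = accounts * qbrs_per_year
    cost_now = qbrs * hours_now * hourly_rate
    cost_after = qbrs * hours_after * hourly_rate
    return {
        "qbrs_per_year": qbrs,
        "cost_now": cost_now,
        "cost_after": cost_after,
        "reclaimed": cost_now - cost_after,
        "reclaim_share": 1 - hours_after / hours_now,
    }


# 500 accounts, quarterly QBRs, 8 hours of prep today, 1 hour after, NZ$70/hr.
base = qbr_sizing(500, 4, 8, 1, 70)
print(base["qbrs_per_year"])   # 2000
print(base["cost_now"])        # 1120000
print(base["reclaimed"])       # 980000
print(base["reclaim_share"])   # 0.875
```

At exactly one hour per QBR the reclaim is 87.5%, or NZ$980k/yr, so the ~$900k figure corresponds to the more conservative 80% floor (0.8 x 1.12M = 896k). Passing a larger `hours_now` shows how the number scales if prep really does take multiple days per account.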

### Anthropic citation pattern (PR-013 compliant)

Citation MUST be inline for every conversation answer. Pattern:
1. AI generates SQL
2. SQL executes against semantic-view-defined metric
3. Answer surface shows: result + chart + "Source: dbt model F_AGG_SALES_PERFORMANCE_DAILY, metric: gross_margin_pct, filters: customer=X, period=last_week vs LY"
4. "Show me the query" affordance opens the SQL with read-only Snowflake link

---

## Vectors + visuals

### Lucide icons
- **Cover hero:** i-brain (semantic) + i-zap (conversation)
- **Problem grid:** i-life-buoy (analyst bottleneck) / i-coffee (reps can't ask) / i-file-text (QBR prep cost) / i-search (tribal knowledge)
- **Use case sweep:** i-bar-chart (margin) / i-search (stopped ordering) / i-shopping-cart (off-range) / i-truck (DIFOT) / i-shuffle (rebate) / i-sparkles (QBR pre-fill)
- **Phase 1 vs Phase 2:** i-laptop (internal staff) -> i-smartphone (external Ask Max)
- **Architecture:** i-search (question) -> i-brain (Claude + semantic) -> i-file-text (Snowflake) -> i-bar-chart (Lens widget)
- **Governance:** i-life-buoy (PII fence) + i-sparkles (citation)
- **QBR walkthrough:** i-door (one click) + i-file-text (pre-filled pack)

### Image concepts (cover + 5 key slides)
1. **Cover hero** — Photo of an OMX AM at laptop, typing in a chat-like interface, with the answer materialising as a chart on screen-right. Source: OMX internal team photography (with consent) or composited stock. Anchor: "Ask Lens. Get the answer."
2. **Analyst-bottleneck slide** — Stack of 30 paper tickets / data-request emails on a desk. Source: staged photo + real anonymised email subjects ("Can I get last week's margin for...", "Need DIFOT by sector..."). Caption: "Today: every question is a ticket."
3. **Use-case sweep slide** — Six chat-bubble screenshots, each showing one of the 6 sample questions and the resolved answer (chart + numbers). Source: Figma high-fidelity mock of conversation UI. Anchor: "Six questions. Six answers. Six minutes."
4. **QBR pre-fill demo slide** — Side-by-side: today (days of prep, multiple tabs, manual copy-paste) vs tomorrow (one prompt -> full QBR pack PDF in <1hr). Source: composite of real OMX QBR pack snippet (sanitised) + Figma demo.
5. **Architecture slide** — Pure infographic pipeline: User question -> Claude w/ dbt semantic context -> Snowflake query -> Lens widget + citation chain. Show citation explicitly (per PR-013).
6. **Phase 1 -> Phase 2 slide** — Visual showing same engine powering two surfaces: internal Lens chat (analyst at desk) -> external Ask Max (small-business owner on phone). Source: composite photo. Anchor: "Same engine. Two markets."