OMX Innovation · Deck 16 · Product Data Enhancement (PDX → Snowflake)

PDX takes all product data, ranged or not, and we round-trip un-ranged SKUs through Snowflake as the staging layer.

This deck covers the work to get every SKU we sell, quote, or might one day stock into PDX at the quality our search, quote, switching and IBP layers depend on.

01 / 06

Why now

The wedge.

Search relevance (Deck 08), Switching (Deck 09), Quoting (Deck 04), and IBP (Deck 14) are all only as good as the underlying product data
Today: ranged SKUs sit in PDX; off-range / quote-only SKUs sit unmanaged, leading to slow supplier loops and missing spec sheets
PDX is the canonical platform; Snowflake is the round-trip staging for un-ranged data until it earns its way into PDX
Modern AI (Anthropic / OpenAI) makes enriching descriptions, attributes, classifications and images cheap per SKU compared with the historical manual-labour cost

02 / 06

What this covers

What's in scope.

01

PDX as the master

ranged or not, all product data lives there

02

Snowflake staging for un-ranged

when a SKU is quoted, requested, or scraped via Plex-CI, it lands in Snowflake first; promoted into PDX when it qualifies

03

AI-assisted enrichment

descriptions, attributes, dimensions, GTINs, image-presence checks, taxonomic classification — generated/validated by AI then human-reviewed

04

Data-quality scorecard

completeness, accuracy, consistency, age — per SKU, per category, per supplier

05

Spec sheet capture

supplier loop reduction; pull from supplier portals when possible, OCR when not

06

Image normalisation

catalog hero, alt views, scale references — standardised format
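
As an illustration of the data-quality scorecard in item 04, here is a minimal Python sketch of two of its four dimensions (completeness and age) scored per SKU and rolled up per category or supplier. Accuracy and consistency are left out because they need a trusted reference to compare against. The field list, the `SkuRecord` shape and the 365-day freshness window are assumptions for illustration, not the PDX schema.

```python
from dataclasses import dataclass, field
from datetime import date
from statistics import mean

# Hypothetical required attributes; the real list would come from the PDX schema.
REQUIRED_FIELDS = ("description", "gtin", "dimensions", "classification", "hero_image")


@dataclass
class SkuRecord:
    sku: str
    category: str
    supplier: str
    last_verified: date
    attributes: dict = field(default_factory=dict)


def completeness(record: SkuRecord) -> float:
    """Share of required fields that are present and non-empty."""
    filled = sum(1 for name in REQUIRED_FIELDS if record.attributes.get(name))
    return filled / len(REQUIRED_FIELDS)


def freshness(record: SkuRecord, today: date, max_age_days: int = 365) -> float:
    """1.0 when verified today, falling linearly to 0.0 at max_age_days."""
    age_days = (today - record.last_verified).days
    return min(1.0, max(0.0, 1.0 - age_days / max_age_days))


def rollup(records, today: date, key: str) -> dict:
    """Average both scores per group; key is 'category' or 'supplier'."""
    groups = {}
    for record in records:
        groups.setdefault(getattr(record, key), []).append(record)
    return {
        name: {
            "completeness": mean(completeness(r) for r in members),
            "freshness": mean(freshness(r, today) for r in members),
        }
        for name, members in groups.items()
    }
```

Calling `rollup(records, today, "category")` gives one row per category, which is the shape a heat map by category would need.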

03 / 06

The problem

What's broken.

01

Off-range = unmanaged

when the quoting team finds a SKU outside the ranged catalog, it triggers a multi-day supplier loop

02

Spec-sheet hunt

quoting and customer-service reps manually search supplier websites for product information

03

Inconsistent attributes

the same product may be described three different ways across PDX, Pronto, web and supplier feeds

04

Image gaps

many SKUs have no hero, alt views, or scale references

05

No data-quality SLA

no scorecard, no remediation backlog, no clear ownership

06

Snowflake as accidental graveyard

un-ranged SKUs sit there because nobody promotes them into PDX
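
One way to make the graveyard visible is to flag staged SKUs that have waited too long for a promotion decision. The Python sketch below assumes a simple mapping of SKU id to the date it landed in staging and a 90-day threshold. Both are hypothetical, and in practice this would more likely be a query against the Snowflake staging tables.

```python
from datetime import date, timedelta


def stale_staged_skus(staged: dict, today: date, max_days: int = 90) -> list:
    """Return SKU ids staged for longer than max_days, oldest first.

    `staged` maps SKU id -> date the SKU landed in staging (assumed shape).
    """
    cutoff = today - timedelta(days=max_days)
    stale = [(landed, sku) for sku, landed in staged.items() if landed < cutoff]
    return [sku for landed, sku in sorted(stale)]
```

The resulting list could seed the remediation backlog: each stale SKU is either promoted, enriched, or explicitly retired.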

04 / 06

The benefits

The value story.

Lever | Mechanism | Sizing

Quoting speed | Spec sheets + attributes immediately available | Quote turnaround 3-5 days → <1hr (Deck 04 benchmark)

Search relevance | Better attributes = better search results | Conversion lift in Deck 08

Switching enablement | Cross-references to competitor SKUs | Deck 09 funnel works

AI enrichment cost | $0.01-0.10 per SKU enrichment vs $5-50 manual | Tens of thousands of SKUs × $5+ savings = material

Margin lift | Better data = better pricing decisions in PPSS/CI | Indirect; supports Deck 02 / Deck 13
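
To make the AI enrichment cost row concrete, the Python sketch below works through the arithmetic using the per-SKU ranges quoted above. The SKU count of 20,000 is an assumption chosen to sit inside "tens of thousands", not a figure from this deck.

```python
def enrichment_savings(sku_count: int, ai_cost_per_sku: float, manual_cost_per_sku: float) -> float:
    """Total saving from enriching sku_count SKUs with AI instead of manually."""
    return sku_count * (manual_cost_per_sku - ai_cost_per_sku)


SKU_COUNT = 20_000  # assumption for illustration only

# Conservative case: dearest AI cost against the cheapest manual cost.
low = enrichment_savings(SKU_COUNT, ai_cost_per_sku=0.10, manual_cost_per_sku=5.00)
# Optimistic case: cheapest AI cost against the dearest manual cost.
high = enrichment_savings(SKU_COUNT, ai_cost_per_sku=0.01, manual_cost_per_sku=50.00)

print(f"Estimated saving: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the saving ranges from about $98k to about $1.0M, before the cost of human review is counted.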

05 / 06

The ask + roadmap

What we need.

Now

Problem vector grid (4): Off-range-pain / Spec-sheet-hunt / Image-gaps / Attribute-inconsistency

P2

PDX as the centre (round-trip diagram): Ranged SKU → PDX. Un-ranged → Snowflake → enrichment → promote → PDX.

P3

AI-enrichment pipeline: Supplier feed → AI propose → human-validate → PDX commit. Per-SKU economics.

P4

Data-quality scorecard: completeness × accuracy × consistency × age, shown as a visual heat map by category

P5

Before/after a SKU record: today's sparse data vs the enhanced data
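
The round-trip and AI-enrichment flows in P2 and P3 can be read as a small state machine. The Python sketch below is one possible reading: the stage names and the rule that a reviewer can reject a proposal back to staging are assumptions, not something this deck specifies.

```python
from enum import Enum


class Stage(Enum):
    STAGED = "staged in Snowflake"
    PROPOSED = "AI enrichment proposed"
    VALIDATED = "human validated"
    IN_PDX = "committed to PDX"


# Allowed moves between stages. The PROPOSED -> STAGED rejection path is an assumption.
TRANSITIONS = {
    Stage.STAGED: {Stage.PROPOSED},
    Stage.PROPOSED: {Stage.VALIDATED, Stage.STAGED},
    Stage.VALIDATED: {Stage.IN_PDX},
    Stage.IN_PDX: set(),
}


def entry_stage(is_ranged: bool) -> Stage:
    """Ranged SKUs go straight to PDX; un-ranged SKUs start in staging."""
    return Stage.IN_PDX if is_ranged else Stage.STAGED


def advance(current: Stage, target: Stage) -> Stage:
    """Move a SKU to the target stage, refusing any step that skips a gate."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target
```

The point of the explicit transition table is that nothing reaches PDX without passing the human-validation gate: `advance(Stage.STAGED, Stage.IN_PDX)` raises rather than promoting unreviewed data.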

Audience

Primary: Chief Commercial Officer + Master Data lead + Chief Digital Officer. Secondary: Customer Service (quoting team) + Sales (off-range frequent flyers). Tertiary: Buying — data quality drives pricing decisions.

References

Memory: PDX as the master, Snowflake as un-ranged round-trip (Jeff verbal 2026-06-28)
Memory: Deck 05 Global Catalog research — off-range margin leak $1.8M-$4.8M annual; PIM tools $100k-$300k/yr packaged
Memory: OMX dbt models at lens/Current/libraries/dbt/models/presented/ — existing structured product data
Memory: Plex-CI as competitor SKU data source — feeds the cross-reference layer
Memory: FDL REVIEW_OMX_dbt v2.0 standards — landing → ODS/STAGING → DW → PRESENTED; this deck's data flows fit that pattern

06 / 06