OMX innovation · Competitor Intelligence (Plex-CI) · Stella visit · 29 June 2026
Plex-CI is running.
Talk to the size.
59 NZ competitors scraped, sitemap channel unblocked. The numbers below are real — pulled from the 2026-06-27 run, not estimated.
The size

The pipeline exists.
The numbers are real.

545,789
URLs captured across 59 NZ competitors · sitemap channel · run 2026-06-27
471,991
URL change events detected vs. the 2026-04-24 baseline — the change signal that drives every downstream use case
59
NZ competitors in scope · scraped on schedule · 37 OK / 22 FAIL on the 2026-06-27 run
4
Clean direct suppliers (NXP, DiscountOffice, McGreals, Hurdleys) — the verified spine
| Run · 23:05 UTC 2026-06-27 | OK | FAIL | Status |
| Sitemap channel: direct Python → Postgres (bypassed broken SQLite chain) | 37 / 59 | 22 / 59 | Unblocked |
| Products channel: full product detail scrape per competitor | - | - | Parked |
| News channel: competitor PR / blog / announcement capture | - | - | Parked |
| Social channel: LinkedIn / Twitter / Meta surface | - | - | Parked |
Source: scraper run state, MEMORY entry 2026-06-27. The 22 failures and 3 parked channels are the next-stage backlog — not an excuse to bin the size.
Vendor field — what's clean, what's contaminated
Reliability map

Four are clean.
Twelve are contaminated.

Real direct suppliers · use for supplier-side analysis
The 4 clean spine
  • NXP
  • DiscountOffice
  • McGreals
  • Hurdleys
SQL filter: WHERE slug IN ('nxp','discountoffice','mcgreals','hurdleys'). These four are the only competitors in the Shopify cohort where the vendor field reliably maps to a real supplier. Use this set for every supplier-side analysis.
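As a minimal sketch of how the filter above could be applied from Python without pasting slugs into SQL text by hand: the four slugs are taken from the deck, while the helper names (`supplier_filter`, `is_clean_supplier`) and the use of `%s` placeholders (the style psycopg accepts) are assumptions for illustration, not the pipeline's actual code.

```python
# The four slugs come from the deck; the helper names are hypothetical.
CLEAN_SPINE = ("nxp", "discountoffice", "mcgreals", "hurdleys")


def supplier_filter(slugs=CLEAN_SPINE):
    """Return a (sql_fragment, params) pair restricting a query to the given slugs.

    Uses %s placeholders so the slugs travel as query parameters
    instead of being interpolated into the SQL string.
    """
    if not slugs:
        raise ValueError("at least one slug is required")
    placeholders = ", ".join(["%s"] * len(slugs))
    return f"WHERE slug IN ({placeholders})", tuple(slugs)


def is_clean_supplier(slug):
    """True if the competitor's vendor field can be read as a real supplier."""
    return slug.strip().lower() in CLEAN_SPINE


fragment, params = supplier_filter()
print(fragment)
print(params)
```

Keeping the list in one constant means every supplier-side query draws on the same four slugs, so a later change to the classification only has to be made in one place.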
Brand-owner duplicates · contaminated vendor data
The 12 noisy slots
Acquire, eLive, PBTech, JB Hi-Fi, Liquorland, Eels Supplies, Vidak, Whitcoulls and the rest. On these competitors the vendor field holds brand-owner data, not supplier data, so it duplicates the brand and misleads any supplier count. Filter them out of supplier analysis. For brand-owner analysis, use products.brand across all 16 competitors (the 4 clean plus the 12 noisy) instead; that gives the cleaner signal.
Classification confirmed 2026-06-22 from the Plex-CI Shopify cohort review. Use the SQL filter above; do not score the brand-owner slots as suppliers.
Context · what next

Pipeline is paid for.
Productisation is incremental.

What it is today
Plex-CI scrapes 59 NZ competitors on schedule. The sitemap channel was unblocked on 2026-06-27 via a direct Python → Postgres path that bypasses the broken SQLite chain. The raw URLs JSON keeps its legacy-compatible shape so downstream consumers don't break. The change-detection signal is captured against a 2026-04-24 baseline.
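To make the baseline comparison concrete, here is a minimal sketch of URL change detection as a set difference between two snapshots. The deck does not define what counts as a change event, so this assumes an event is a URL that was added or removed since the baseline. The function name `diff_urls` and the example.com URLs are hypothetical, and the real pipeline may also track other signals such as modification dates.

```python
def diff_urls(baseline, current):
    """Compare two URL snapshots and return change events.

    Each event is a (kind, url) tuple, where kind is "added" or "removed".
    Assumption: a change event is a URL that appears in only one snapshot.
    """
    baseline_set, current_set = set(baseline), set(current)
    events = [("added", url) for url in sorted(current_set - baseline_set)]
    events += [("removed", url) for url in sorted(baseline_set - current_set)]
    return events


# Placeholder snapshots for one competitor; not real scraped data.
baseline_urls = [
    "https://example.com/products/a",
    "https://example.com/products/b",
]
current_urls = [
    "https://example.com/products/b",
    "https://example.com/products/c",
]

events = diff_urls(baseline_urls, current_urls)
for kind, url in events:
    print(kind, url)
```

Sorting each side keeps the event list deterministic between runs, which makes it easier to compare one run's output with the next.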
What's next · channels
Sitemap is unblocked. Products, news, and social channels are parked but designed. The same direct-to-Postgres pattern that fixed sitemap is the unblock path for the other three. Sequence: products first (highest value for pricing decisions), news second (commercial early-warning), social third (deeper read).
The ask
The IP, infrastructure, and operational tradecraft are already paid for — productisation is incremental, not a rebuild. Decide the sequencing on the parked channels, confirm the 4-clean / 12-noisy classification holds for downstream use cases, and lock the data into the pricing and competitive workflows that need it.
Deeper read · deck 13 (deck-13-plex-ci-productised) · the productisation thesis
References: MEMORY entries 2026-06-27 (sitemap scraper unblock) and 2026-06-22 (vendor field classification). Plex-CI project lives at D:/20-PROJECTS/officemax/competitor-intelligence/.