This page documents the audit methodology that backs the readiness checklist. It is intentionally reproducible: a merchant team, a consultant, and a platform partner applying it to the same catalog should produce similar scores.
## Principles
- Sample-based. You never need to audit every SKU; you need a statistically meaningful sample per category.
- Weighted. P0 items are blocking; P1 and P2 items contribute progressively smaller weight to the score.
- Observable. Every check produces a concrete pass/fail with evidence (URL, feed row, screenshot, API payload).
- Reproducible. Two auditors running the method on the same data land within 5 percentage points.
## Inputs
- Access to catalog feed (URL and credentials).
- Public PDP URLs.
- Public returns and shipping policy pages.
- Read-only access to PSP configuration (optional but improves transactional scoring).
- Server logs covering the last 90 days (optional but enables observability scoring).
## Sampling
- Identify top 5 categories by revenue.
- For each category, take a random sample of 10 SKUs (stratified by in-stock / out-of-stock if possible).
- For catalogs with >10,000 SKUs, increase sample to 20 per category.
- Exclude end-of-life and draft SKUs.
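The sampling steps above can be sketched as a small helper. The field names (`status`, `in_stock`) are hypothetical; adapt them to your feed's schema.

```python
import random

def sample_skus(skus, catalog_size, per_category=10):
    """Draw the audit sample for one category, stratified by stock status."""
    # Catalogs with >10,000 SKUs get a larger per-category sample.
    n = 20 if catalog_size > 10_000 else per_category
    # Exclude end-of-life and draft SKUs before sampling.
    eligible = [s for s in skus if s["status"] not in ("eol", "draft")]
    in_stock = [s for s in eligible if s["in_stock"]]
    out_stock = [s for s in eligible if not s["in_stock"]]
    # Split the sample across strata, proportional to stratum size.
    k_in = round(n * len(in_stock) / max(len(eligible), 1))
    picks = random.sample(in_stock, min(k_in, len(in_stock)))
    picks += random.sample(out_stock, min(n - len(picks), len(out_stock)))
    return picks
```

Run this once per top-revenue category; keep the random seed in the audit record if you want two auditors to draw the identical sample.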
## The six check categories
| Category | Weight | What it measures |
|---|---|---|
| Identity | 15% | GTIN / MPN / Brand coverage and consistency. |
| Semantics | 25% | Structured data, typed attributes, taxonomy. |
| Freshness | 20% | Price/stock parity, feed cadence, updated_at accuracy. |
| Policy | 15% | Returns/shipping/warranty as data. |
| Discovery | 15% | Feed compliance, canonical URLs, sitemap, AI-crawler access. |
| Transaction | 10% | Agent-pay readiness, lifecycle events. |
## Scoring rules
- Each check returns 0 (fail), 0.5 (partial) or 1 (pass).
- Category score = average of checks in that category.
- Overall score = weighted sum across categories.
- If any P0 check fails, the category score is capped at 0.6.
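A minimal sketch of these rules, using the category weights from the table (the lowercase keys are an assumed naming convention):

```python
WEIGHTS = {
    "identity": 0.15, "semantics": 0.25, "freshness": 0.20,
    "policy": 0.15, "discovery": 0.15, "transaction": 0.10,
}

def category_score(checks):
    """checks: list of (result, is_p0); result is 0, 0.5 or 1."""
    score = sum(result for result, _ in checks) / len(checks)
    # A failed P0 caps the category score at 0.6, whatever the average is.
    if any(result == 0 and is_p0 for result, is_p0 in checks):
        score = min(score, 0.6)
    return score

def overall_score(category_checks):
    """category_checks: dict mapping category name -> list of checks."""
    return sum(WEIGHTS[cat] * category_score(checks)
               for cat, checks in category_checks.items())
```

For example, a category with checks 1, 0 (P0), 1, 1 averages 0.75 but is capped at 0.6 by the failed P0.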
## Evidence requirements
Each check must store one of:
- A URL (for PDP/policy checks), screenshotted at audit date.
- A feed row (anonymized, stored as JSON).
- An API response (headers + body, timestamped).
- A tool output (Rich Results Test, Merchant Center diagnostic, log query).
Audits without evidence are not reproducible and not defensible.
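One way to make evidence capture systematic is to attach a typed record to every check result. This is a sketch, not a prescribed schema; the field names are illustrative.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Evidence:
    """One piece of evidence backing a single check result."""
    kind: str       # "url" | "feed_row" | "api_response" | "tool_output"
    reference: str  # PDP/policy URL, feed identifier, or tool name
    payload: str    # screenshot path, anonymized JSON, or raw tool output
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

@dataclass
class CheckResult:
    check_id: str
    result: float       # 0, 0.5 or 1
    is_p0: bool
    evidence: Evidence  # a result without evidence is not defensible
```

Making `evidence` a required field means a check simply cannot be recorded without it, which enforces the rule above at the data-model level.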
## Deliverables of an audit
- Score card — overall, by category, per SKU sampled.
- Top 10 remediation items — concrete, ranked by impact and effort.
- 90-day remediation plan — who owns what, expected score lift.
- Re-audit schedule — quarterly recommended.
## Common audit pitfalls
- Auditing marketing promises instead of system behaviour. Beautiful returns-page prose means nothing if the JSON-LD is missing.
- Sampling only hero SKUs. Long-tail SKUs often fail checks that hero SKUs pass.
- Trusting the feed alone. Feed and PDP must be cross-checked.
- Ignoring region variants — a catalog can be agent-ready in one region and not in another.
- Grading schema.org markup on presence instead of correctness.
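The feed-versus-PDP pitfall can be caught with a simple parity check: compare the feed price against the `Offer` price in the PDP's JSON-LD. This sketch assumes a single `offers` object (real markup may use a list) and feed fields named `price` and `currency`.

```python
import json

def price_parity(feed_row, pdp_jsonld):
    """True if the feed price matches the PDP JSON-LD offer price and currency."""
    data = json.loads(pdp_jsonld)
    offer = data.get("offers", {})
    pdp_price = float(offer.get("price", "nan"))
    feed_price = float(feed_row["price"])
    same_currency = feed_row["currency"] == offer.get("priceCurrency")
    # NaN comparisons are False, so a missing PDP price fails the check.
    return same_currency and abs(pdp_price - feed_price) < 0.005
```

Run it over every sampled SKU and store both the feed row and the fetched JSON-LD as evidence for the Freshness category.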
## Tools we recommend
- Google Rich Results Test and Schema Markup Validator.
- Google Merchant Center diagnostics.
- A feed linter (Channable, Feedonomics, or custom scripts).
- A headless browser for consent-wall / rendering checks.
- A log analyzer for crawler-traffic segmentation.
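If you have no dedicated log analyzer, crawler-traffic segmentation can start as a script that buckets access-log lines by user agent. The patterns below match published crawler user-agent tokens (Googlebot, GPTBot, ClaudeBot, PerplexityBot), but verify them against each vendor's current documentation before relying on the counts.

```python
import re
from collections import Counter

# User-agent tokens for common search/AI crawlers; check vendor docs for updates.
AGENT_PATTERNS = {
    "googlebot": re.compile(r"Googlebot", re.I),
    "gptbot": re.compile(r"GPTBot", re.I),
    "claudebot": re.compile(r"ClaudeBot", re.I),
    "perplexitybot": re.compile(r"PerplexityBot", re.I),
}

def segment_crawlers(log_lines):
    """Count requests per known crawler from raw access-log lines."""
    counts = Counter()
    for line in log_lines:
        for name, pattern in AGENT_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # one bucket per request
    return counts
```

Note that user agents can be spoofed; for a defensible audit, confirm high-volume crawlers via reverse DNS or the vendor's published IP ranges.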
## Where to go next
- Run the readiness checklist against a 50-SKU sample.
- Apply the principles from the best-practices guide.
- Compare against the target state in product catalogs for AI.