Audit methodology

A transparent, reproducible methodology for auditing a merchant's agent-readiness, so any team can run it internally and any auditor can defend its findings.

Updated: April 2026 · Primary query: agent commerce audit methodology

This page documents the audit methodology that backs the readiness checklist. It is intentionally reproducible: a merchant team, a consultant and a platform partner should produce similar scores when they apply it to the same catalog.

Principles

  • Sample-based. You never need to audit every SKU; you need a statistically meaningful sample per category.
  • Weighted. P0 items are blocking; P1 and P2 weight the score progressively.
  • Observable. Every check produces a concrete pass/fail with evidence (URL, feed row, screenshot, API payload).
  • Reproducible. Two auditors running the method on the same data land within 5 percentage points of each other.
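
The reproducibility principle can be checked mechanically once two auditors have scored the same sample. The sketch below is illustrative only: the function names and the idea of keying results by a check ID are assumptions, and overall scores are assumed to be on a 0-100 scale.

```python
def is_reproducible(overall_a: float, overall_b: float,
                    tolerance_pts: float = 5.0) -> bool:
    """True when two overall scores (0-100) differ by at most tolerance_pts."""
    return abs(overall_a - overall_b) <= tolerance_pts


def disagreements(audit_a: dict[str, float], audit_b: dict[str, float]) -> list[str]:
    """IDs of checks that both auditors ran but scored differently."""
    shared = audit_a.keys() & audit_b.keys()
    return sorted(cid for cid in shared if audit_a[cid] != audit_b[cid])


# Hypothetical check IDs, for illustration only.
auditor_a = {"identity.gtin": 1.0, "policy.returns": 0.5}
auditor_b = {"identity.gtin": 1.0, "policy.returns": 0.0}
print(is_reproducible(72.5, 68.0), disagreements(auditor_a, auditor_b))
```

When the gap exceeds the tolerance, the list of disagreeing checks is usually the quickest place to start comparing evidence.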

Inputs

  1. Access to catalog feed (URL and credentials).
  2. Public PDP URLs.
  3. Public returns and shipping policy pages.
  4. Read-only access to PSP configuration (optional but improves transactional scoring).
  5. Server logs, 90 days (optional but enables observability scoring).

Sampling

  1. Identify top 5 categories by revenue.
  2. For each category, take a random sample of 10 SKUs (stratified by in-stock / out-of-stock if possible).
  3. For catalogs with >10,000 SKUs, increase sample to 20 per category.
  4. Exclude end-of-life and draft SKUs.
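
The sampling steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the SKU field names (`category`, `in_stock`, `status`), the status labels and the proportional allocation between in-stock and out-of-stock strata are all assumptions. A fixed seed keeps the draw repeatable between auditors.

```python
import random

EXCLUDED_STATUSES = {"end_of_life", "draft"}  # assumed status labels


def top_categories(revenue_by_category: dict[str, float], n: int = 5) -> list[str]:
    """Categories ordered by revenue, highest first, truncated to n."""
    return sorted(revenue_by_category, key=revenue_by_category.get, reverse=True)[:n]


def sample_catalog(skus: list[dict], revenue_by_category: dict[str, float],
                   seed: int = 2026) -> dict[str, list[dict]]:
    """Draw a stratified random sample per top category."""
    rng = random.Random(seed)
    # 20 per category for catalogs above 10,000 SKUs, otherwise 10.
    per_category = 20 if len(skus) > 10_000 else 10
    eligible = [s for s in skus if s["status"] not in EXCLUDED_STATUSES]
    sample = {}
    for category in top_categories(revenue_by_category):
        pool = [s for s in eligible if s["category"] == category]
        n = min(per_category, len(pool))
        in_stock = [s for s in pool if s["in_stock"]]
        out_of_stock = [s for s in pool if not s["in_stock"]]
        # Allocate the out-of-stock share in proportion to the pool.
        n_out = round(n * len(out_of_stock) / len(pool)) if pool else 0
        sample[category] = (rng.sample(out_of_stock, n_out)
                            + rng.sample(in_stock, n - n_out))
    return sample
```

Recording the seed alongside the score card lets a second auditor reproduce exactly the same SKU selection.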

Check categories (six)

Category      Weight   What it measures
Identity      15%      GTIN / MPN / Brand coverage and consistency.
Semantics     25%      Structured data, typed attributes, taxonomy.
Freshness     20%      Price/stock parity, feed cadence, updated_at accuracy.
Policy        15%      Returns/shipping/warranty as data.
Discovery     15%      Feed compliance, canonical URLs, sitemap, AI-crawler access.
Transaction   10%      Agent-pay readiness, lifecycle events.

Scoring rules

  • Each check returns 0 (fail), 0.5 (partial) or 1 (pass).
  • Category score = average of checks in that category.
  • Overall score = weighted sum across categories.
  • If any P0 check in a category fails, that category's score is capped at 0.6.
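
As a worked illustration of these rules with the category weights listed earlier, the sketch below computes category and overall scores. The `Check` record and its field names are assumptions, as is the reading that only a result of 0 (not a partial 0.5) counts as a failed P0 check.

```python
from dataclasses import dataclass

# Category weights from the table of check categories.
WEIGHTS = {"Identity": 0.15, "Semantics": 0.25, "Freshness": 0.20,
           "Policy": 0.15, "Discovery": 0.15, "Transaction": 0.10}
P0_CAP = 0.6


@dataclass
class Check:
    category: str
    priority: str   # "P0", "P1" or "P2"
    result: float   # 0 (fail), 0.5 (partial) or 1 (pass)


def category_score(checks: list[Check]) -> float:
    """Average of the checks, capped at 0.6 if any P0 check failed."""
    average = sum(c.result for c in checks) / len(checks)
    p0_failed = any(c.priority == "P0" and c.result == 0 for c in checks)
    return min(average, P0_CAP) if p0_failed else average


def overall_score(checks: list[Check]) -> float:
    """Weighted sum of category scores, on a 0-1 scale."""
    total = 0.0
    for category, weight in WEIGHTS.items():
        in_category = [c for c in checks if c.category == category]
        if not in_category:
            raise ValueError(f"no checks recorded for {category}")
        total += weight * category_score(in_category)
    return total
```

For example, an Identity category with one failed P0 check and three passing P1 checks averages 0.75 but is capped at 0.6; if every other category scores 1, the overall score is 0.15 × 0.6 + 0.85 = 0.94.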

Evidence requirements

Each check must store one of:

  • A URL (for PDP/policy checks), screenshotted at audit date.
  • A feed row (anonymized, stored as JSON).
  • An API response (headers + body, timestamped).
  • A tool output (Rich Results Test, Merchant Center diagnostic, log query).

Audits without evidence are not reproducible and not defensible.

Deliverables of an audit

  1. Score card: overall, by category, and per sampled SKU.
  2. Top 10 remediation items: concrete, ranked by impact and effort.
  3. 90-day remediation plan: who owns what, and the expected score lift.
  4. Re-audit schedule: quarterly is recommended.

Common audit pitfalls

  • Auditing marketing promises instead of system behaviour. Beautiful prose on a returns page means nothing if the JSON-LD is missing.
  • Sampling only hero SKUs. Long-tail SKUs often fail more checks.
  • Trusting the feed alone. Feed and PDP must be cross-checked.
  • Ignoring region variants. A catalog can be agent-ready in one region and not in another.
  • Grading schema.org markup on presence instead of correctness.
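
The feed/PDP cross-check and the presence-versus-correctness pitfall can both be illustrated with a small parity check. This is a sketch under stated assumptions: the feed row uses `price` and `availability` keys with values such as `in_stock`, the PDP exposes a top-level `Product` object in JSON-LD (no `@graph` handling), and the 0 / 0.5 / 1 mapping is one possible grading.

```python
import json
from decimal import Decimal, InvalidOperation
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed JSON-LD blocks from a PDP's HTML."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld, self._buffer = True, []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Markup is present but invalid: treated as missing.


def find_product(blocks):
    """Return the first top-level Product object, or None."""
    for block in blocks:
        for item in block if isinstance(block, list) else [block]:
            if isinstance(item, dict) and item.get("@type") == "Product":
                return item
    return None


def cross_check(feed_row: dict, pdp_html: str):
    """Return (score, problems) comparing a feed row with PDP JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(pdp_html)
    product = find_product(parser.blocks)
    if product is None:
        return 0.0, ["no valid Product JSON-LD on PDP"]
    offer = product.get("offers") or {}
    if isinstance(offer, list):
        offer = offer[0] if offer else {}
    problems = []
    try:
        price_ok = Decimal(str(offer.get("price"))) == Decimal(str(feed_row["price"]))
    except InvalidOperation:
        price_ok = False
    if not price_ok:
        problems.append("price mismatch between feed and PDP")
    # "https://schema.org/InStock" and "in_stock" both normalize to "instock".
    pdp_avail = str(offer.get("availability", "")).rsplit("/", 1)[-1].lower()
    feed_avail = str(feed_row["availability"]).replace("_", "").lower()
    avail_ok = pdp_avail == feed_avail
    if not avail_ok:
        problems.append("availability mismatch between feed and PDP")
    return (price_ok + avail_ok) / 2, problems
```

Because the comparison is on values rather than on the mere existence of a `Product` block, markup that is present but stale or wrong does not pass.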

Tools we recommend

  • Google Rich Results Test and Schema Markup Validator.
  • Google Merchant Center diagnostics.
  • A feed linter (Channable, Feedonomics, or custom scripts).
  • A headless browser for consent-wall / rendering checks.
  • A log analyzer for crawler-traffic segmentation.
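
For the last item, a custom script is often enough to get a first crawler-traffic breakdown. The sketch below assumes access logs in the common "combined" format, where the user agent is the last quoted field. The token list is illustrative and should be checked against each crawler operator's current documentation.

```python
import re
from collections import Counter

# Last double-quoted field on the line (the user agent in combined log format).
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

# Illustrative user-agent tokens; extend or correct for your own audit.
CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot", "bingbot")


def segment_crawlers(log_lines) -> Counter:
    """Count requests per crawler token; everything else is 'other'."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_RE.search(line)
        user_agent = match.group(1).lower() if match else ""
        label = next((token for token in CRAWLER_TOKENS
                      if token.lower() in user_agent), "other")
        counts[label] += 1
    return counts
```

User-agent strings can be spoofed, so counts from this kind of script are a starting point for the observability score rather than proof of crawler identity.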
