How We Designed the GrailData Schema
Schema design for a product catalog sounds simple until you're staring at Jordan 1 variants from six different regions, three different sole materials, and four different release channels, all with slightly different names across sources.
Here's how we approached it.
style_id is the canonical key
Every sneaker has a style_id: the manufacturer's style code, such as DZ5485-612. We use this field as the primary join key across all sources because it is stable across regions and channels. Names vary; style codes don't.
slug is derived from the name for URL-friendly access. Because names vary across sources, it shouldn't be used as a stable identifier.
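To make the difference concrete, here is a minimal sketch of joining records from two sources on style_id. The SourceRecord shape, the groupByStyleId and toSlug helpers, and the product names are all hypothetical and made up for illustration; the actual slug derivation may differ.

```typescript
// Hypothetical record shape: only style_id and the idea of a per-source
// name come from the post. Everything else here is an assumption.
interface SourceRecord {
  style_id: string;
  name: string;
}

// Group records from any number of sources by their style_id.
function groupByStyleId(records: SourceRecord[]): Map<string, SourceRecord[]> {
  const groups = new Map<string, SourceRecord[]>();
  for (const record of records) {
    const existing = groups.get(record.style_id);
    if (existing) {
      existing.push(record);
    } else {
      groups.set(record.style_id, [record]);
    }
  }
  return groups;
}

// Hypothetical slug derivation: lowercase, collapse runs of
// non-alphanumeric characters to "-", trim leading/trailing "-".
function toSlug(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Two sources describing the same shoe under different (made-up) names.
const records: SourceRecord[] = [
  { style_id: "DZ5485-612", name: "Example Sneaker High" },
  { style_id: "DZ5485-612", name: "Example Sneaker Hi (EU)" },
];

const groups = groupByStyleId(records);
const slugs = records.map((r) => toSlug(r.name));

console.log(groups.size); // one group: both records share a style_id
console.log(slugs);       // two different slugs for the same shoe
```

Grouping on style_id merges both records into one entry, while slugs derived from the names would have treated them as two different products.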
Flat over nested
Our API returns flat objects, not nested ones. release_date is a top-level field, not release.date. retail_price is top-level, not pricing.retail.
This decision was driven entirely by developer feedback from the early beta. Nested schemas look clean but create friction at every consumption point: destructuring, optional chaining, type definitions. Flat schemas are more verbose to read but easier to use.
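A small sketch of what that friction looks like on the consumer side. FlatSneaker uses the field names mentioned above; NestedSneaker is the hypothetical alternative we decided against. The field types (ISO date string, numeric price) and the sample values are assumptions for illustration, not real catalog data.

```typescript
// Flat shape, using the field names from the post. Types are assumed.
interface FlatSneaker {
  style_id: string;
  release_date: string | null;
  retail_price: number | null;
}

// Hypothetical nested alternative (release.date, pricing.retail).
interface NestedSneaker {
  style_id: string;
  release?: { date?: string | null };
  pricing?: { retail?: number | null };
}

// Flat: one destructuring pattern, no intermediate objects to guard.
function flatSummary({ style_id, release_date, retail_price }: FlatSneaker): string {
  return `${style_id} | ${release_date ?? "unknown"} | ${retail_price ?? "unknown"}`;
}

// Nested: every access needs optional chaining on the parent object.
function nestedSummary(sneaker: NestedSneaker): string {
  const date = sneaker.release?.date ?? "unknown";
  const retail = sneaker.pricing?.retail ?? "unknown";
  return `${sneaker.style_id} | ${date} | ${retail}`;
}

// Placeholder values, not real release or pricing data.
const flat: FlatSneaker = {
  style_id: "DZ5485-612",
  release_date: "2023-01-01",
  retail_price: 180,
};
const nested: NestedSneaker = {
  style_id: "DZ5485-612",
  release: { date: "2023-01-01" },
  pricing: { retail: 180 },
};

console.log(flatSummary(flat));
console.log(nestedSummary(nested));
```

Both functions produce the same string, but the nested version needs an extra type layer and a guard for each parent object.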
Null handling
We use null consistently for missing data rather than empty strings, 0, or omitted fields. If last_sale_price is null, there are no recorded sales. If it's 0, something went wrong upstream. This distinction matters for analytics.
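Here is a sketch of how a consumer might respect that distinction. The classifyLastSale and averageLastSale helpers, the status labels, and the sample prices are hypothetical; only the meaning of null and 0 for last_sale_price comes from the rule above.

```typescript
// Hypothetical status labels for a last_sale_price value.
type SaleStatus = "no_sales" | "upstream_error" | "ok";

function classifyLastSale(lastSalePrice: number | null): SaleStatus {
  if (lastSalePrice === null) return "no_sales";    // no recorded sales
  if (lastSalePrice === 0) return "upstream_error"; // should not happen; flag it
  return "ok";
}

// Average only over real sales. Returns null when there is nothing to average,
// so the "missing" signal is preserved instead of collapsing to 0.
function averageLastSale(prices: Array<number | null>): number | null {
  const valid = prices.filter(
    (price): price is number => classifyLastSale(price) === "ok"
  );
  if (valid.length === 0) return null;
  return valid.reduce((sum, price) => sum + price, 0) / valid.length;
}

// Made-up sample values for illustration.
const samplePrices: Array<number | null> = [200, null, 0, 300];

// Naive version that coerces null to 0 drags the average down.
const naiveAverage =
  samplePrices.reduce<number>((sum, price) => sum + (price ?? 0), 0) /
  samplePrices.length;

console.log(averageLastSale(samplePrices)); // 250
console.log(naiveAverage);                  // 125
```

Treating null as 0 halves the average in this sample, which is the kind of skew the null convention is meant to prevent.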
What we chose not to include
We left out condition ratings (new/used), authentication status, and individual listing IDs from resale platforms. All of these are platform-specific and would require continuous normalization across incompatible schemas. We focus on market-level data, not individual listings.