What Actually Makes AI Products Work? It's Not the Model You Choose
By Amin Rabinia · Founder, Glissando AI
When an AI product gives inconsistent results, the instinct is almost always the same: try a better model. Upgrade the version. Add more examples to the prompt. Tune the parameters.
Sometimes that helps. Usually it doesn't — because the model was never the bottleneck.
The Problem Nobody Wants to Solve First
We learned this building an AI system for IQ Design, a furniture retailer. The product takes an inspiration photo — a living room someone saw on Instagram, a screenshot from a design magazine — and matches it to real products in their catalog.
The technical challenge sounds like a computer vision problem. And it partly is. But the harder problem was something else entirely: customers describe furniture in emotional, subjective language — "cozy," "warm," "Scandinavian," "the vibe of a cabin" — while a product catalog is organized in technical terms: material, category, dimensions, price.
No model, however capable, can bridge that gap on its own. "Cozy" doesn't map to a SKU. Something has to define what "cozy" means in terms the system can actually use — consistently, the same way, every time.
That something is a taxonomy. And building it is slow, unglamorous, and easy to skip.
What a Taxonomy Actually Is
Strip away the jargon and a taxonomy is just this: an explicit, structured vocabulary for a domain that previously only existed in people's heads.
For IQ Design, that meant defining the actual artifacts being described (silhouette, material, finish, proportion, color palette), the precise definitions of each one, the guidelines for how they relate to each other, and the syntax for how they get expressed in a way a model can use consistently.
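As a rough illustration, those four pieces (artifacts, definitions, guidelines, syntax) could be sketched as a small data structure. The attribute names come from the list above. Every definition, allowed value, and guideline below is a hypothetical placeholder, not IQ Design's actual taxonomy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """One artifact in the taxonomy: a name, a definition, a closed vocabulary."""
    name: str
    definition: str
    allowed_values: tuple[str, ...]


@dataclass
class Taxonomy:
    attributes: dict[str, Attribute] = field(default_factory=dict)
    guidelines: list[str] = field(default_factory=list)

    def add(self, attribute: Attribute) -> None:
        self.attributes[attribute.name] = attribute

    def validate(self, description: dict[str, str]) -> list[str]:
        """Return a list of problems; an empty list means the description follows the syntax."""
        problems = []
        for name, value in description.items():
            attribute = self.attributes.get(name)
            if attribute is None:
                problems.append(f"unknown attribute: {name}")
            elif value not in attribute.allowed_values:
                problems.append(f"invalid value for {name}: {value}")
        return problems


# Hypothetical entries, for illustration only.
taxonomy = Taxonomy()
taxonomy.add(Attribute("material", "Primary visible surface material.",
                       ("wood", "boucle", "leather", "metal")))
taxonomy.add(Attribute("finish", "Surface treatment of the primary material.",
                       ("matte", "gloss", "natural")))
taxonomy.guidelines.append("Describe finish only for the primary material.")

print(taxonomy.validate({"material": "wood", "finish": "matte"}))
print(taxonomy.validate({"material": "plastic", "mood": "cozy"}))
```

The closed vocabulary is the point of the sketch: anything outside it is rejected instead of being quietly interpreted.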
Before this existed, "cozy" meant something different to every designer who used the word, and the AI had no consistent way to translate it into product attributes. After it existed, "cozy" had a specific, structured meaning the system could apply the same way across thousands of photos and thousands of products.
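To make "a specific, structured meaning" concrete, here is a minimal sketch of how a defined term could be applied to catalog items. The definition of "cozy", the SKUs, and the scoring rule are all assumptions made up for this example; a real system would likely weight attributes and handle far more of them.

```python
# Hypothetical definition: each subjective term maps to accepted values per attribute.
TERM_DEFINITIONS = {
    "cozy": {
        "material": {"boucle", "wool", "wood"},
        "finish": {"matte", "natural"},
        "color_palette": {"warm neutral", "earth tone"},
    },
}

# Hypothetical catalog rows, tagged with the same vocabulary.
CATALOG = [
    {"sku": "SOFA-001", "material": "boucle", "finish": "matte", "color_palette": "warm neutral"},
    {"sku": "SOFA-002", "material": "leather", "finish": "gloss", "color_palette": "cool neutral"},
    {"sku": "CHAIR-003", "material": "wood", "finish": "natural", "color_palette": "cool neutral"},
]


def score(product, term):
    """Fraction of the term's attributes that the product satisfies."""
    definition = TERM_DEFINITIONS[term]
    hits = sum(1 for attr, accepted in definition.items() if product.get(attr) in accepted)
    return hits / len(definition)


def match(term, catalog, threshold=0.5):
    """Return SKUs ranked best first, dropping products that score under the threshold."""
    scored = [(score(product, term), product["sku"]) for product in catalog]
    kept = [(s, sku) for s, sku in scored if s >= threshold]
    return [sku for s, sku in sorted(kept, key=lambda pair: (-pair[0], pair[1]))]


print(match("cozy", CATALOG))  # ['SOFA-001', 'CHAIR-003']
```

Because the definition is data rather than a prompt, the same term produces the same ranking on every run, and changing what "cozy" means is an edit to one table.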
This is the difference between tacit knowledge and explicit structure. Tacit knowledge lives in an expert's head — real, valuable, but inconsistent and hard to scale. A taxonomy translates it into something a system can actually rely on.
The Decision That Changed the Results
The turning point for IQ Design wasn't a model upgrade. It was the decision to step back and build a high-level vision of the domain before writing a single prompt: what are the artifacts, what do they mean, what are the guidelines for using them, what's the syntax for expressing them.
That research and refinement phase took real time. It wasn't the part of the project that looked impressive in a demo. But it's the part that made everything downstream work.
This matters most in domains — like aesthetics, taste, or quality — where feelings and subjective description naturally compromise accuracy. If you don't structure that domain explicitly, you're asking the model to guess at a definition that changes depending on who's describing it. No amount of prompt engineering fixes a definition problem.
The Contrast: What Happens Without One
We've seen the other path too. Another development team approached a similar problem by skipping the taxonomy step entirely — going straight to prompt engineering, asking a model to interpret aesthetic descriptions without first defining what those descriptions should mean in a structured way.
The result was inconsistent and unsystematic. The same input photo could produce different matches depending on minor prompt wording changes. There was no shared vocabulary the system was reasoning from — just pattern-matching on the fly, which works occasionally and fails unpredictably.
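One way to put a number on that kind of instability is to run the same photo through several rewordings of a prompt and measure how much the matches overlap. The sketch below assumes a hypothetical `matcher(photo, prompt)` callable that returns a list of SKUs; it is a stand-in for whatever pipeline is under test, not a real API.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets of SKUs: 1.0 means identical, 0.0 means disjoint."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stability(matcher, photo, prompt_variants):
    """Average pairwise overlap of matches across reworded prompts for one photo."""
    results = [matcher(photo, prompt) for prompt in prompt_variants]
    pairs = list(combinations(results, 2))
    if not pairs:
        return 1.0  # Zero or one variant: nothing to disagree with.
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)
```

A score near 1.0 means rewording barely changes the result. A low score suggests the system is leaning on prompt wording rather than on a shared definition.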
This is the trap of treating prompting as the whole job. Prompting isn't product design, and it isn't domain modeling either. A clever prompt can't substitute for a system that actually knows what it's talking about.
Why This Beats Chasing a Better Model
It's tempting to think of model choice as the lever that matters most, because it's the most visible one — you can point to a model name, a benchmark score, a version number. Domain structure is invisible. No one sees the taxonomy; they just see whether the product works.
But a better model given the same ambiguous, unstructured inputs will still produce ambiguous, unstructured outputs, just with more confidence. Foundation models are extraordinarily capable, but capability isn't the same as domain knowledge. They don't know what "cozy" means to your specific customers, in your specific catalog, in your specific market. You have to tell them, in a structure they can use reliably.
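"Telling them" can be as simple as rendering the definitions into text that travels with every request. This is a minimal sketch under assumed names: `TERM_DEFINITIONS` and its contents are hypothetical, and the output format is one arbitrary choice among many.

```python
# Hypothetical definitions; lists keep the rendered order stable.
TERM_DEFINITIONS = {
    "cozy": {
        "material": ["boucle", "wool", "wood"],
        "finish": ["matte", "natural"],
    },
}


def render_definitions(definitions):
    """Render term definitions as plain text to place in a prompt or retrieved context."""
    lines = ["Use these definitions exactly:"]
    for term, attributes in definitions.items():
        lines.append(f"{term}:")
        for attribute, values in attributes.items():
            lines.append(f"  {attribute}: {', '.join(values)}")
    return "\n".join(lines)


print(render_definitions(TERM_DEFINITIONS))
```

The same rendered text can be reused whichever model sits behind it, which is part of why the structure outlasts any single model choice.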
This is also why the choice between RAG and fine-tuning often matters less than people expect. Both are ways of injecting domain knowledge into a model. Neither works well if the domain knowledge itself hasn't been structured first. The taxonomy comes before the architecture decision, not after.
This Is Also a Function vs. Feature Question
A taxonomy isn't a feature you can point to in a product demo. It's infrastructure — closer to what we've called the function an AI product actually performs, as opposed to the visible features layered on top. Skipping it doesn't show up immediately. It shows up three months later, in inconsistent results that are hard to debug because there's no structure to debug against.
This is part of why we build AI products in phases rather than all at once — the early phases are exactly where this kind of domain work gets surfaced and prioritized, before it's expensive to fix.
What This Means for You
If an AI feature in your product is giving inconsistent or unpredictable results, the first question shouldn't be "which model should we switch to." It should be: has anyone actually defined, explicitly and structurally, what the system is supposed to understand in this domain?
If the answer is no, that's the work to do first — before any model upgrade, before more prompt tuning, before adding features on top of a foundation that isn't there yet.
It's slower. It's not the part that looks exciting in a pitch deck. But it's the difference between an AI product that's consistently useful and one that's a coin flip dressed up as intelligence.
If you're trying to figure out whether your AI product has a model problem or a domain-structure problem, that's exactly the kind of question worth working through with someone who's done this before. Get Expert Input ($99) — a paid session where we look at your specific case and tell you where the real gap is.
This post is part of the AI Agents Guide — from the basics to the technical depth behind agents that actually work.