When to use this scenario
Embedding-based recommendation computes similarity between a user's interaction history and a catalog of items (articles, products, courses, videos) to surface relevant next-best items. It complements collaborative filtering by handling the cold-start problem: new items with no interaction history can be recommended immediately based on their embedding similarity to items the user has engaged with.
OpenAI text-embedding-3-small encodes both catalog items and user history documents into the same vector space cheaply. A catalog of 500K product descriptions embedded at 200 tokens each consumes 100M tokens — a one-time $2 cost. Incremental embedding of new catalog items is negligible. The runtime query (embed user history + ANN search) costs fractions of a cent per session.
Voyage-3-Large is preferred for catalogs with long, complex item descriptions (technical documentation, academic papers, detailed product specs) where the higher-dimensional 1024-vector space captures more semantic nuance. Cohere Embed v4 offers Matryoshka representations that let you trade retrieval quality for storage cost at inference time — useful for very large catalogs under memory constraints.
Common pitfalls
- Embedding all item fields (title + description + reviews + specs) into a single vector without testing which field combination maximizes retrieval relevance for your specific task
- Not filtering by availability before returning recommendations — recommending out-of-stock products or deprecated content requires real-time metadata filtering in the ANN query, not just embedding similarity
- Treating recommendation as a pure offline problem — user preferences shift; catalog embeddings older than 6 months for fast-moving domains (news, fashion, software) degrade recommendation quality measurably
- Ignoring diversity constraints: an ANN search returns the N most similar items, which are often near-duplicates of each other; apply maximal marginal relevance or category diversification to produce a useful recommendation slate