When to use this scenario
High-volume translation: support tickets, product copy, user-generated content (UGC). Input and output token counts are roughly equal, so cost is essentially token volume × per-token price.
Gemini 2.5 Flash, DeepSeek V3.5, and Qwen 3 Max all do excellent multilingual translation at <$1 per million tokens. Frontier models are wasteful here.
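As a rough illustration of the volume × price relationship, the sketch below estimates spend for a translation workload. The function name, its parameters, and the prices in the usage lines are hypothetical placeholders, not any vendor's actual pricing; the optional `batch_discount` assumes a flat percentage off, as with the batch discount mentioned under the pitfalls.

```python
def translation_cost_usd(
    input_tokens: int,
    price_in_per_m: float,
    price_out_per_m: float,
    output_ratio: float = 1.0,
    batch_discount: float = 0.0,
) -> float:
    """Estimate cost in USD for a translation workload.

    input_tokens    -- total source tokens to translate
    price_in_per_m  -- assumed input price, USD per million tokens
    price_out_per_m -- assumed output price, USD per million tokens
    output_ratio    -- output tokens per input token (about 1.0 for translation)
    batch_discount  -- fraction taken off the total, e.g. 0.5 for 50% off
    """
    if not 0.0 <= batch_discount <= 1.0:
        raise ValueError("batch_discount must be between 0 and 1")
    output_tokens = input_tokens * output_ratio
    cost = (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000
    return cost * (1.0 - batch_discount)


# Placeholder numbers: 500M source tokens, $0.30 in / $0.60 out per million tokens.
realtime = translation_cost_usd(500_000_000, 0.30, 0.60)
batched = translation_cost_usd(500_000_000, 0.30, 0.60, batch_discount=0.5)
print(f"real-time: ${realtime:,.2f}  batched: ${batched:,.2f}")
```

Because output volume tracks input volume, the only levers are the per-token prices and any discount, which is why model choice dominates the bill.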
Common pitfalls
- Using a frontier model "for quality" when a model at half the cost is indistinguishable on common language pairs
- Ignoring batching: translation is an ideal batch API workload, since a 24-hour turnaround is usually acceptable in exchange for 50% off
- Not testing per language pair: quality varies dramatically from one pair to another
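One way to act on the last pitfall is to score a sample of translations per language pair and flag the pairs where the cheaper model falls noticeably behind. The sketch below assumes you already have quality scores from some metric or human rating (higher is better); the function name, model keys, threshold, and sample scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def flag_weak_pairs(scores, baseline_key, candidate_key, max_gap):
    """Return {language_pair: gap} where the candidate trails the baseline.

    scores -- iterable of (language_pair, model_key, score) tuples,
              where a higher score means better quality (assumed metric)
    gap    -- mean baseline score minus mean candidate score for that pair
    """
    by_pair = defaultdict(lambda: defaultdict(list))
    for pair, model, score in scores:
        by_pair[pair][model].append(score)

    weak = {}
    for pair, per_model in by_pair.items():
        # Skip pairs that were not scored for both models.
        if baseline_key not in per_model or candidate_key not in per_model:
            continue
        gap = mean(per_model[baseline_key]) - mean(per_model[candidate_key])
        if gap > max_gap:
            weak[pair] = gap
    return weak


# Made-up scores for illustration only.
sample_scores = [
    ("en-de", "frontier", 0.90), ("en-de", "budget", 0.89),
    ("en-sw", "frontier", 0.85), ("en-sw", "budget", 0.70),
]
weak = flag_weak_pairs(sample_scores, "frontier", "budget", max_gap=0.05)
print(sorted(weak))
```

Pairs that come back flagged are candidates for routing to a stronger model, while everything else can stay on the cheaper one.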