Resume screening

Compare Gemini Flash, Claude Sonnet, GPT-5 for resume-to-JD matching. From $9/mo for 200K resumes. Bias audit and structured scoring inside.

Your usage

Default assumptions

Monthly requests: 200,000

Avg input tokens: 2,000

Avg output tokens: 300

When to use this scenario

Resume screening produces a structured fit score plus a summary of a candidate's qualifications relative to a job description. The goal is not to make hiring decisions but to help recruiters triage 500 applications down to 30 worth reading carefully. Speed and throughput matter: a company posting a senior engineer role may receive 1,000 applications in 48 hours.

Gemini 2.5 Flash handles this volume economically: screening 200,000 resumes/month (avg 2K input tokens, about 400M input tokens in total) with short outputs (structured score + 3-sentence summary, about 300 tokens each) comes to roughly $270/month at the default usage, 75% below the GPT-5 baseline. Claude Sonnet is preferable when the job description is unusually complex or when misclassification rates from the cheaper model prove too high in A/B testing.

The output schema should include a match score (0–100), the top 3 matching qualifications, the top 2 gaps, and a single recommendation string. Never output a binary yes/no: recruiters need to calibrate their own threshold.
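As a minimal sketch of that schema, the model's decoded JSON could be validated before it reaches a recruiter. The field names (`match_score`, `matching_qualifications`, `gaps`, `recommendation`) and the choice to accept fewer than 3 qualifications or 2 gaps are assumptions made for illustration, not a published format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningResult:
    """One screening output. Field names are hypothetical."""
    match_score: int                      # 0-100, never a binary yes/no
    matching_qualifications: list[str]    # at most 3 entries
    gaps: list[str]                       # at most 2 entries
    recommendation: str                   # single free-text string


def parse_result(raw: dict) -> ScreeningResult:
    """Validate a decoded JSON object against the constraints described above."""
    score = raw["match_score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError(f"match_score must be an integer in 0-100, got {score!r}")
    quals = list(raw["matching_qualifications"])
    gaps = list(raw["gaps"])
    if len(quals) > 3:
        raise ValueError("expected at most 3 matching qualifications")
    if len(gaps) > 2:
        raise ValueError("expected at most 2 gaps")
    recommendation = raw["recommendation"]
    if not isinstance(recommendation, str) or not recommendation.strip():
        raise ValueError("recommendation must be a non-empty string")
    return ScreeningResult(score, quals, gaps, recommendation)
```

Rejecting malformed output at this boundary keeps a truncated or off-schema response from being shown as a score.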

Common pitfalls

Not including explicit instructions to ignore candidate name, address, and graduation year — models may inadvertently encode demographic proxies in their scoring
Evaluating model output only against hiring decisions, not against recruiter ratings — the model should optimize for recruiter workload reduction, not final offer rate
Using a single score across all roles — a software engineer role and a sales role have entirely different fit dimensions; use role-specific prompt templates
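Failing to audit output distributions monthly for disparate impact across protected groups; some jurisdictions regulate algorithmic hiring tools directly (New York City's Local Law 144 requires a bias audit of automated employment decision tools, and the EU AI Act treats recruitment systems as high-risk)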
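The audit in the last pitfall can be sketched as a selection-rate comparison. This example swaps in the common four-fifths rule of thumb (a group is flagged when its selection rate falls below 80% of the best group's rate); the source does not prescribe a specific test, and the function names, group labels, scores, and threshold below are all made up for illustration. Group labels are assumed to come from separately collected, voluntary self-identification data, never from the resume text passed to the model.

```python
from collections import defaultdict


def selection_rates(records, threshold):
    """Share of candidates per group whose score meets the threshold.

    records: iterable of (group_label, match_score) pairs.
    """
    totals, passed = defaultdict(int), defaultdict(int)
    for group, score in records:
        totals[group] += 1
        if score >= threshold:
            passed[group] += 1
    return {group: passed[group] / totals[group] for group in totals}


def flag_disparate_impact(rates, min_ratio=0.8):
    """Return groups whose rate is below min_ratio of the best group's rate."""
    best = max(rates.values(), default=0.0)
    if best == 0.0:
        return []  # nobody passed the threshold, so ratios are undefined
    return sorted(g for g, rate in rates.items() if rate / best < min_ratio)


# Made-up scores for two placeholder groups, with a threshold of 70.
records = [("group_a", s) for s in (90, 85, 80, 75, 72, 60, 55, 50, 40, 30)]
records += [("group_b", s) for s in (88, 79, 71, 65, 60, 58, 50, 45, 35, 20)]
rates = selection_rates(records, threshold=70)
print(rates)                         # {'group_a': 0.5, 'group_b': 0.3}
print(flag_disparate_impact(rates))  # ['group_b'], since 0.3 / 0.5 = 0.6 < 0.8
```

A flagged group is a prompt for human review of the prompt template and threshold, not a conclusion in itself; small groups produce noisy rates.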

Recommended routing

Sorted by best value for your usage

PRIMARY

Gemini 2.5 Flash

Google · quality 78 · 320 tok/s

Monthly cost: $270

Vs baseline: −75%

P50 latency: 0.3s

FALLBACK

Claude 4.6 Sonnet

Anthropic · quality 89 · 85 tok/s

Monthly cost: $2.1K

Vs baseline: +91%

P50 latency: 1.1s

DeepSeek V3.5

DeepSeek · quality 81 · 95 tok/s

Monthly cost: $73

Vs baseline: −93%

P50 latency: 1.5s

Baseline = GPT-5 at the same usage = $1.1K/mo.

Routing simulator

Phase 2 preview

Split traffic between Gemini 2.5 Flash (primary) and Claude 4.6 Sonnet (fallback) to see how your monthly bill moves, without writing a line of gateway code.

Primary: Gemini 2.5 Flash
Fallback: Claude 4.6 Sonnet

Split: 70% Gemini, 30% Claude

Blended monthly cost: $819 at the usage assumed above

Vs GPT-5: −26% ($1.1K → $819)
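The blended figure is a traffic-weighted average of the two monthly costs. A small sketch, using the rounded figures from the routing table ($270, $2.1K, and the $1.1K GPT-5 baseline); the function names are illustrative:

```python
def blended_cost(primary_cost, fallback_cost, primary_share):
    """Monthly cost when primary_share of traffic goes to the primary model."""
    if not 0.0 <= primary_share <= 1.0:
        raise ValueError("primary_share must be between 0 and 1")
    return primary_share * primary_cost + (1.0 - primary_share) * fallback_cost


def percent_vs_baseline(cost, baseline):
    """Signed percentage difference relative to the baseline cost."""
    return (cost - baseline) / baseline * 100.0


# 70% Gemini 2.5 Flash at $270/mo, 30% Claude 4.6 Sonnet at $2,100/mo.
cost = blended_cost(270, 2100, 0.70)            # 189 + 630
print(round(cost))                              # 819
print(round(percent_vs_baseline(cost, 1100)))   # -26
```

The same `percent_vs_baseline` helper gives about −75% for Gemini alone and about +91% for Claude alone against the $1.1K baseline.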

Phase 2 turns this routing into a real OpenAI-compatible endpoint — one key, one bill, automatic failover. Drop your email to be notified at launch.

Use this routing via API

Phase 2 preview · gateway not live yet

This endpoint does not exist yet. The gateway is in Phase 2; what you see below is a design preview of the planned interface, not a live API. We will email subscribers when it launches.

Preview the planned API call

$ curl https://api.aipricly.com/v1/chat/completions \
  -H "Authorization: Bearer $AIPC_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "scenario": "resume-screening",
    "messages": [{"role": "user", "content": "..."}]
  }'
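For readers working in Python, the same planned call could be prepared with the standard library. This is only a sketch of the preview above: the endpoint is not live, so the request is built but never sent, and the `build_request` helper and the placeholder key are assumptions for illustration.

```python
import json
import os
import urllib.request


def build_request(content, api_key):
    """Build (but do not send) the planned chat-completions request."""
    payload = {
        "scenario": "resume-screening",
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        "https://api.aipricly.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("...", os.environ.get("AIPC_KEY", "placeholder-key"))
print(req.get_method(), req.full_url)
# Not sent here: the gateway is a Phase 2 preview. Once it is live,
# urllib.request.urlopen(req) would submit the request.
```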
