Prompt Token Cost Calculator

Paste your prompt, set output tokens, and instantly compare API costs across GPT, Claude, Gemini, and Llama — with caching and batch pricing.

Cost Calculator

Example settings: empty prompt (0 input tokens), 500 expected output tokens (≈ 375 words), 500 total tokens. The API Calls per Day value is used for the scale estimate table below.

Per-call cost, cheapest first:

Model	Context	Provider	Per Call	Input Cost	Output Cost
Llama 3.1 8B	128K	Meta	$0.00004	$0.00	$0.00004
GPT-5 mini	400K	OpenAI	$0.001	$0.00	$0.001
Gemini 2.5 Flash	1M	Google	$0.00125	$0.00	$0.00125
Gemini 3 Flash	1M	Google	$0.0015	$0.00	$0.0015
Claude Haiku 3.5	200K	Anthropic	$0.002	$0.00	$0.002
Claude Haiku 4.5	200K	Anthropic	$0.0025	$0.00	$0.0025
Gemini 2.5 Pro	2M	Google	$0.005	$0.00	$0.005
Gemini 3 Pro	1M	Google	$0.006	$0.00	$0.006
GPT-5.4	1M	OpenAI	$0.0075	$0.00	$0.0075
Claude Sonnet 4.6	200K	Anthropic	$0.0075	$0.00	$0.0075
Claude Opus 4.6	200K	Anthropic	$0.0125	$0.00	$0.0125

Production Scale — 1,000 calls/day

Model	Provider	Per Call	Daily Cost	Monthly Cost (×30 days)
Llama 3.1 8B	Meta	$0.00004	$0.04	$1.20
Gemini 2.0 Flash-Lite	Google	$0.00015	$0.15	$4.50
Gemini 2.5 Flash-Lite	Google	$0.0002	$0.20	$6.00
Gemini 2.0 Flash	Google	$0.0002	$0.20	$6.00
Llama 3.3 70B	Meta	$0.0002	$0.20	$6.00
GPT-5 mini	OpenAI	$0.001	$1.00	$30.00
Gemini 2.5 Flash	Google	$0.00125	$1.25	$37.50
Gemini 3 Flash	Google	$0.0015	$1.50	$45.00
Claude Haiku 3.5	Anthropic	$0.002	$2.00	$60.00
Claude Haiku 4.5	Anthropic	$0.0025	$2.50	$75.00
Gemini 2.5 Pro	Google	$0.005	$5.00	$150.00
Gemini 3 Pro	Google	$0.006	$6.00	$180.00
GPT-5.4	OpenAI	$0.0075	$7.50	$225.00
Claude Sonnet 4.6	Anthropic	$0.0075	$7.50	$225.00
Claude Opus 4.6	Anthropic	$0.0125	$12.50	$375.00

How to Use

Paste your prompt

Type or paste your system message, user prompt, or full conversation. Token count updates instantly in your browser — nothing is uploaded.

Set expected output

Enter the average output tokens the model will generate. Default is 500 tokens (≈ 375 words). Adjust to match your typical response length.

Filter, sort & compare

Use provider tabs to focus on one vendor. Toggle "Cheapest first" to rank models by cost. Enable "Batch API" to see 50%-off async pricing. Toggle caching to simulate cached input.

Project production cost

Enter your daily API call volume — the scale table shows projected daily and monthly spend (×30 days) per model so you can budget before committing.
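The projection in the scale table can be sketched in a few lines. This is only an illustration of the arithmetic (per-call cost × calls per day × 30 days); the function name `project_spend` is a hypothetical helper, not part of the calculator itself.

```python
def project_spend(per_call_usd: float, calls_per_day: int,
                  days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) spend in USD for a fixed per-call cost."""
    daily = per_call_usd * calls_per_day
    monthly = daily * days_per_month
    return daily, monthly


# Per-call figure taken from the GPT-5.4 row of the scale table.
daily, monthly = project_spend(0.0075, 1_000)
print(f"${daily:.2f}/day, ${monthly:.2f}/month")
```

The 30-day month is the same simplification the table uses; swap in a different `days_per_month` if your billing period differs.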

How LLM API pricing works

Every major LLM provider bills on a pay-per-token model. You pay separately for input tokens (your prompt, system message, conversation history) and output tokens (the model's generated response). Output tokens are priced higher than input: roughly 4–8× for most models in the reference table below, and under 2× for the Llama models listed. The usual explanation is that each output token is generated sequentially, one forward pass per token, whereas input tokens are processed in parallel in a single pass.

The formula: Total cost = (input tokens × input rate) + (output tokens × output rate). Rates are expressed per million tokens ($/1M), so each token count is divided by 1,000,000 before multiplying. A 1,000-token prompt at $3/1M costs $0.003. That is small per call, but 10,000 calls/day at $0.01 each is $100/day, or $3,000/month over 30 days. Use the scale table above to see how costs compound at your volume.
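A minimal sketch of the formula in Python, assuming rates are given in USD per 1M tokens. The function name `call_cost` and the example rates ($3 input, $15 output per 1M) are illustrative choices, not any provider's API.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost in USD of one API call; both rates are USD per 1M tokens."""
    return (input_tokens * input_rate_per_m
            + output_tokens * output_rate_per_m) / 1_000_000


# A 1,000-token prompt with no output, at $3/1M input.
prompt_only = call_cost(1_000, 0, 3.00, 15.00)

# The same prompt plus 500 output tokens at $15/1M output.
with_output = call_cost(1_000, 500, 3.00, 15.00)

print(f"prompt only: ${prompt_only:.4f}, with output: ${with_output:.4f}")
```

In this example the 500 output tokens cost more than the 1,000 input tokens, which is why the expected output length matters so much to the estimate.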

Looking to just count tokens without cost analysis? Try our AI Token Calculator — it shows words, characters, and a token-to-cost reference table alongside the count.

2026 LLM Pricing Reference

Current pricing per 1 million tokens. Batch pricing is 50% of standard for async workloads. Always verify with your provider before committing to a budget.

Model	Provider	Context	Input / 1M	Output / 1M	Out/In Ratio	Type
GPT-5.4	OpenAI	1M	$2.50	$15.00	6.0×	Standard
GPT-5.4 (Batch)	OpenAI	1M	$1.25	$7.50	6.0×	Batch
GPT-5 mini	OpenAI	400K	$0.250	$2.00	8.0×	Standard
GPT-5 mini (Batch)	OpenAI	400K	$0.125	$1.00	8.0×	Batch
Claude Opus 4.6	Anthropic	200K	$5.00	$25.00	5.0×	Standard
Claude Sonnet 4.6	Anthropic	200K	$3.00	$15.00	5.0×	Standard
Claude Sonnet 4.6 (Batch)	Anthropic	200K	$1.50	$7.50	5.0×	Batch
Claude Haiku 4.5	Anthropic	200K	$1.00	$5.00	5.0×	Standard
Claude Haiku 4.5 (Batch)	Anthropic	200K	$0.500	$2.50	5.0×	Batch
Claude Haiku 3.5	Anthropic	200K	$0.800	$4.00	5.0×	Standard
Gemini 3 Pro	Google	1M	$2.00	$12.00	6.0×	Standard
Gemini 3 Flash	Google	1M	$0.500	$3.00	6.0×	Standard
Gemini 2.5 Pro	Google	2M	$1.25	$10.00	8.0×	Standard
Gemini 2.5 Flash	Google	1M	$0.300	$2.50	8.3×	Standard
Gemini 2.5 Flash-Lite	Google	1M	$0.100	$0.400	4.0×	Standard
Gemini 2.0 Flash	Google	1M	$0.100	$0.400	4.0×	Standard
Gemini 2.0 Flash-Lite	Google	1M	$0.075	$0.300	4.0×	Standard
Llama 3.3 70B	Meta	128K	$0.230	$0.400	1.7×	Standard
Llama 3.1 8B	Meta	128K	$0.050	$0.080	1.6×	Standard
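The Out/In Ratio and Batch columns can be derived from the standard rates. The sketch below assumes a flat 50% batch discount, as stated above; the `Rate` class is a hypothetical helper, and the figures are copied from the table rather than fetched from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens

    @property
    def out_in_ratio(self) -> float:
        """How many times more an output token costs than an input token."""
        return self.output_per_m / self.input_per_m

    def batch(self, discount: float = 0.5) -> "Rate":
        """Rates after applying a flat batch discount (assumed 50%)."""
        keep = 1.0 - discount
        return Rate(self.input_per_m * keep, self.output_per_m * keep)


# Standard rates for models that also have a Batch row in the table.
RATES = {
    "GPT-5.4": Rate(2.50, 15.00),
    "GPT-5 mini": Rate(0.25, 2.00),
    "Claude Haiku 4.5": Rate(1.00, 5.00),
}

for name, rate in RATES.items():
    b = rate.batch()
    print(f"{name}: {rate.out_in_ratio:.1f}x, "
          f"batch ${b.input_per_m:.3f} in / ${b.output_per_m:.2f} out")
```

Because the discount applies equally to input and output, the ratio is the same for the standard and batch rows.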

Cost Optimization Strategies

Practical techniques to reduce API spend without sacrificing output quality.

Strategy	Typical Savings	Notes
Prompt Caching	Up to 90% on input	Cache static system prompts or documents. Supported by OpenAI and Anthropic. Toggle in calculator above.
Batch API	50% overall	OpenAI Batch and Anthropic Batches process requests within 24 hours at 50% discount. Enable in calculator above.
Shorten system prompt	10–40% on input	Every request sends the system prompt. A 500-token reduction × 10K calls/day = 5M fewer input tokens/day.
Limit max_tokens	20–60% on output	Set max_tokens to the minimum your use case needs. Output is typically your most expensive line item.
Route to smaller model	80–95% overall	Use Haiku / Flash-Lite for classification, extraction, and summarization. Reserve Opus / Pro for complex reasoning.
Compress conversation history	30–70% on input	Summarize older turns instead of appending the full chat history to every new request.

Savings percentages are estimates. Actual results depend on your specific prompts, output length, and provider plan. Always measure with your real workload.
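To get a feel for the first and third rows, here is a rough estimate of input savings from caching and from trimming the system prompt. It assumes cached tokens are billed at a 90% discount (the table's upper bound) and ignores cache-write surcharges, minimum cacheable lengths, and cache expiry, all of which vary by provider. The token counts and the $3/1M rate are hypothetical example values.

```python
def input_cost(static_tokens: int, dynamic_tokens: int,
               rate_per_m: float, cache_discount: float = 0.0) -> float:
    """Input cost in USD per call.

    static_tokens  : the part that repeats on every call (e.g. system prompt)
    dynamic_tokens : the part that changes per call
    cache_discount : fraction taken off the rate for the static part
    """
    cached_rate = rate_per_m * (1.0 - cache_discount)
    return (static_tokens * cached_rate
            + dynamic_tokens * rate_per_m) / 1_000_000


RATE = 3.00  # USD per 1M input tokens (example value)

# Hypothetical call: 2,000-token system prompt plus 200 tokens of user input.
uncached = input_cost(2_000, 200, RATE)
cached = input_cost(2_000, 200, RATE, cache_discount=0.90)
saving = 1 - cached / uncached
print(f"caching: ${uncached:.4f} vs ${cached:.4f} per call ({saving:.0%} less)")

# Shortening the system prompt by 500 tokens at 10,000 calls/day.
tokens_saved_per_day = 500 * 10_000
usd_saved_per_day = tokens_saved_per_day * RATE / 1_000_000
print(f"trimming: {tokens_saved_per_day:,} fewer tokens/day, "
      f"${usd_saved_per_day:.2f}/day")
```

The caching saving comes out below 90% here because the dynamic part of the prompt is still billed at the full rate.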
