Estimator methodology
Version 2026.06 · updated 2026-06-12. Every number below is versioned in the open and carries a citation — when a provider publishes a new disclosure, one row changes.
The estimation chain
energy (Wh) = (tokens / 1000) × Wh-per-1k-tokens(model class) × PUE
emissions (gCO₂e) = energy (kWh) × grid intensity of region (gCO₂/kWh)
Results are always shown as a range: per-token energy is uncertain by a factor of 3–5 in either direction, so a single number would be false precision. When you enter queries instead of tokens, we assume 500 output tokens per query (a typical chat answer).
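The chain above can be sketched in a few lines of Python. This is a minimal illustration rather than the estimator's actual code: the function names `estimate` and `estimate_from_queries` are hypothetical, and the range is carried through as a simple (low, high) pair.

```python
# Minimal sketch of the estimation chain. Function names are hypothetical.
TOKENS_PER_QUERY = 500  # assumed output tokens per query, as stated above


def estimate(tokens, wh_per_1k_range, pue, grid_g_per_kwh):
    """Return ((low_wh, high_wh), (low_g, high_g)) for a number of output tokens."""
    # energy (Wh) = (tokens / 1000) x Wh-per-1k-tokens x PUE, for each end of the range
    energy_wh = tuple((tokens / 1000) * wh * pue for wh in wh_per_1k_range)
    # emissions (gCO2e) = energy (kWh) x grid intensity (gCO2/kWh)
    emissions_g = tuple((wh / 1000) * grid_g_per_kwh for wh in energy_wh)
    return energy_wh, emissions_g


def estimate_from_queries(queries, wh_per_1k_range, pue, grid_g_per_kwh):
    """Convert queries to tokens with the 500-token assumption, then estimate."""
    return estimate(queries * TOKENS_PER_QUERY, wh_per_1k_range, pue, grid_g_per_kwh)


# Example: 1,000 queries, Standard class (0.4-0.8 Wh/1k), PUE 1.2, Germany (350 g/kWh)
energy, emissions = estimate_from_queries(1000, (0.4, 0.8), 1.2, 350)
print([round(x, 1) for x in energy], [round(x, 1) for x in emissions])
```

With these inputs the sketch gives roughly 240–480 Wh and 84–168 gCO₂e, which shows how the range on per-token energy carries straight through to the emissions figure.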
1 · Energy per token
Anchored on 2025 provider disclosures — Google's median Gemini text prompt (0.24 Wh / 0.03 gCO₂e, including idle machines, CPU/RAM and datacenter overhead), Epoch AI's ~0.3 Wh per average ChatGPT query, and Mistral's LCA (~1.14 gCO₂e per 400-token answer). You pick a model class, not an exact model:
| Class | Examples | Wh / 1k output tokens |
|---|---|---|
| Light | Haiku, GPT-4o-mini, Gemini Flash | 0.12–0.25 |
| Standard | GPT-4o, Sonnet, Gemini Pro | 0.4–0.8 |
| Frontier / reasoning | Opus, o-series, extended thinking | 1.5–6 |
For frontier/reasoning models, the figures include hidden thinking tokens. The often-quoted “3 Wh per query” figure predates provider disclosures and reflects much less efficient 2023-era serving, so we treat it as obsolete.
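As a rough sanity check, the class ranges can be converted to Wh per query and compared with the disclosed anchors. The sketch below assumes the 500-token default and ignores PUE, and the helper names are hypothetical. The comparison is only indicative, since the disclosed figures cover prompts of unknown length and Google's already includes datacenter overhead.

```python
# Sketch: convert Wh per 1k output tokens to Wh per query and compare with anchors.
# Helper names are hypothetical; 500 tokens per query is the page's stated default.
MODEL_CLASSES = {
    "light": (0.12, 0.25),
    "standard": (0.4, 0.8),
    "frontier": (1.5, 6.0),
}
TOKENS_PER_QUERY = 500


def wh_per_query(model_class):
    """Return the (low, high) energy per query in Wh, before PUE."""
    low, high = MODEL_CLASSES[model_class]
    scale = TOKENS_PER_QUERY / 1000
    return low * scale, high * scale


def brackets(model_class, anchor_wh):
    """True if a disclosed per-query figure falls inside the class range."""
    low, high = wh_per_query(model_class)
    return low <= anchor_wh <= high


for name, anchor in [("Google median Gemini prompt", 0.24), ("Epoch AI ChatGPT query", 0.3)]:
    print(name, brackets("standard", anchor))
```

Under these assumptions the Standard class works out to about 0.2–0.4 Wh per query, which brackets both the 0.24 Wh and the ~0.3 Wh anchors.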
2 · Grid intensity
Your cloud region maps to a physical grid. We use location-based annual intensities (Ember) — the physical reality of the grid the datacenter draws from. Providers usually advertise market-based figures (renewable PPAs and certificates), which can show “0 g” on a coal-heavy grid; we note them but do not use them. A live ENTSO-E layer for EU regions is on the roadmap via our agents service.
| Region | Grid | gCO₂/kWh | Cloud regions |
|---|---|---|---|
| Europe (Stockholm) | Sweden (hydro + nuclear) | 25 | AWS eu-north-1 · Azure Sweden Central |
| Canada (Montréal) | Canada — Québec (hydro) | 30 | AWS ca-central-1 · GCP northamerica-northeast1 |
| Europe (Paris) | France (nuclear) | 55 | AWS eu-west-3 · Azure France Central |
| Europe (London) | United Kingdom | 220 | AWS eu-west-2 · Azure UK South |
| Europe (Amsterdam) | Netherlands | 270 | Azure West Europe · GCP europe-west4 |
| US West (Oregon) | United States — Oregon | 285 | AWS us-west-2 · GCP us-west1 |
| Europe (Ireland) | Ireland | 290 | AWS eu-west-1 · Azure North Europe |
| US Central (Iowa) | United States — Iowa | 340 | GCP us-central1 · Azure Central US |
| Europe (Frankfurt) | Germany | 350 | AWS eu-central-1 · GCP europe-west3 |
| US East (N. Virginia) | United States — Virginia | 370 | AWS us-east-1 · Azure East US · GCP us-east4 |
| Asia (Tokyo) | Japan | 450 | AWS ap-northeast-1 · Azure Japan East |
| Asia (Singapore) | Singapore | 470 | AWS ap-southeast-1 · GCP asia-southeast1 |
| Australia (Sydney) | Australia — NSW | 510 | AWS ap-southeast-2 · Azure Australia East |
| Asia (Mumbai) | India | 630 | AWS ap-south-1 · GCP asia-south1 |
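The region-to-grid mapping amounts to a lookup table. The sketch below copies the AWS and GCP region identifiers and intensities from the table above; the `(provider, region)` key layout and the function name `emissions_g` are assumptions made for illustration, not the estimator's actual data model.

```python
# Sketch: location-based grid intensity lookup, in gCO2/kWh, keyed by (provider, region).
# Values are copied from the table above; the key layout is an assumption.
GRID_G_PER_KWH = {
    ("aws", "eu-north-1"): 25,
    ("aws", "ca-central-1"): 30,
    ("gcp", "northamerica-northeast1"): 30,
    ("aws", "eu-west-3"): 55,
    ("aws", "eu-west-2"): 220,
    ("gcp", "europe-west4"): 270,
    ("aws", "us-west-2"): 285,
    ("gcp", "us-west1"): 285,
    ("aws", "eu-west-1"): 290,
    ("gcp", "us-central1"): 340,
    ("aws", "eu-central-1"): 350,
    ("gcp", "europe-west3"): 350,
    ("aws", "us-east-1"): 370,
    ("gcp", "us-east4"): 370,
    ("aws", "ap-northeast-1"): 450,
    ("aws", "ap-southeast-1"): 470,
    ("gcp", "asia-southeast1"): 470,
    ("aws", "ap-southeast-2"): 510,
    ("aws", "ap-south-1"): 630,
    ("gcp", "asia-south1"): 630,
}


def emissions_g(energy_wh, provider, region):
    """Convert energy in Wh to gCO2e using the region's location-based intensity."""
    key = (provider.lower(), region.lower())
    if key not in GRID_G_PER_KWH:
        raise KeyError(f"no grid intensity listed for {provider} {region}")
    return (energy_wh / 1000) * GRID_G_PER_KWH[key]


# The same 1 kWh of inference in Stockholm versus Mumbai
print(emissions_g(1000, "aws", "eu-north-1"), emissions_g(1000, "aws", "ap-south-1"))
```

The same workload differs by a factor of about 25 between the cleanest and the most carbon-intensive grid in the table, which is why region choice matters so much in the final estimate.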
3 · Vendor factors
The vendor comparison applies a multiplier to the model-class baseline (managed, well-batched serving at hyperscale PUE ~1.1) and maps each vendor to the regions it actually serves from. These are honest estimates, not disclosures — except Google, the only provider that has published a per-prompt figure.
| Vendor | Factor | Why |
|---|---|---|
| OpenAI | ×1–1.2 | Serves from Microsoft Azure, predominantly US — typically US East (Virginia). |
| Anthropic | ×1–1.2 | Serves from AWS and Google Cloud, US-heavy. |
| Google Gemini | ×0.8–1.1 | TPU serving on GCP; the only provider with a published per-prompt figure (0.24 Wh median). |
| Self-hosted (open weights) | ×2–4 | Own GPUs: lower batching and utilization than hyperscale serving, usually higher PUE. |
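One way to apply a vendor factor is to scale the low end of the class range by the low factor and the high end by the high factor. That pairing is an assumption of this sketch (the page does not spell out how the two ranges combine), and the names `VENDOR_FACTORS` and `vendor_range` are hypothetical.

```python
# Sketch: apply a vendor factor range to a model-class baseline range.
# Assumption: low pairs with low and high with high, giving the widest plausible range.
VENDOR_FACTORS = {
    "openai": (1.0, 1.2),
    "anthropic": (1.0, 1.2),
    "google": (0.8, 1.1),
    "self-hosted": (2.0, 4.0),
}


def vendor_range(class_range_wh_per_1k, vendor):
    """Return the (low, high) Wh per 1k output tokens after the vendor multiplier."""
    base_low, base_high = class_range_wh_per_1k
    factor_low, factor_high = VENDOR_FACTORS[vendor]
    return base_low * factor_low, base_high * factor_high


standard = (0.4, 0.8)
for vendor in VENDOR_FACTORS:
    low, high = vendor_range(standard, vendor)
    print(f"{vendor}: {low:.2f}-{high:.2f} Wh per 1k tokens")
```

For the Standard class this widens the self-hosted range to about 0.8–3.2 Wh per 1k tokens, while the Google range stays close to the baseline at about 0.32–0.88.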
4 · Overheads
- PUE (Power Usage Effectiveness) — cooling and facility overhead on top of IT load. Default 1.2; adjustable 1.1–1.6 (hyperscalers report ~1.1, the global average is ~1.55).
- Embodied carbon — chip manufacturing, servers and buildings; optional +30% uplift, a common LCA approximation.
- Training is not amortized into per-query figures — it is a separate, one-off cost that would add false precision per query. We call it out instead of hiding it in the number.
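The overheads listed above can be sketched as a small function. It assumes the embodied-carbon uplift is applied to operational emissions, which is one plausible reading of "+30% uplift"; the function name and its parameters are hypothetical. Training is left out entirely, in line with the last point.

```python
# Sketch: apply PUE and the optional embodied-carbon uplift.
# Assumption: the +30% uplift multiplies operational emissions.
PUE_DEFAULT = 1.2
PUE_MIN, PUE_MAX = 1.1, 1.6
EMBODIED_UPLIFT = 0.30


def with_overheads(it_energy_wh, grid_g_per_kwh, pue=PUE_DEFAULT, include_embodied=False):
    """Return (facility energy in Wh, emissions in gCO2e) for a given IT load."""
    if not PUE_MIN <= pue <= PUE_MAX:
        raise ValueError(f"PUE must be between {PUE_MIN} and {PUE_MAX}, got {pue}")
    facility_wh = it_energy_wh * pue  # cooling and facility overhead on top of IT load
    emissions_g = (facility_wh / 1000) * grid_g_per_kwh
    if include_embodied:
        emissions_g *= 1 + EMBODIED_UPLIFT  # chips, servers and buildings
    return facility_wh, emissions_g


# 1 kWh of IT load on a 350 g/kWh grid, without and with embodied carbon
print(with_overheads(1000, 350))
print(with_overheads(1000, 350, include_embodied=True))
```

With the default PUE of 1.2, 1 kWh of IT load on a 350 g/kWh grid comes to about 420 gCO₂e operational, or about 546 gCO₂e with the embodied uplift switched on.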
Sources
- Google (2025): median Gemini text prompt, 0.24 Wh / 0.03 gCO₂e, incl. idle, CPU/RAM and datacenter overhead
- Epoch AI / Sam Altman (2025): ~0.3 Wh per average ChatGPT query
- Mistral AI (2025): LCA, ~1.14 gCO₂e per 400-token answer
- Ember: Yearly electricity data (grid intensity per country, location-based)
- ENTSO-E Transparency Platform: live EU generation mix (planned live layer)
- Uptime Institute (2024): global average PUE ~1.55
Try it on your own workload — open the AI Footprint Estimator.