Methodology¶
The Vetted Inference per-query environmental accounting methodology, version-pinned to the receipt that cites it.
Source repo (AGPL-3.0) · Methodology paper PDF
The four-tier ladder¶
We compute four estimates per query, one at each of four levels of evidential strength. The receipt declares which tier was returned and the 90% confidence interval at that tier.
| Tier | Method | Inputs | Typical 90% CI | Best for |
|---|---|---|---|---|
| 1 | Proxy | Public benchmarks | ±60% | Unsupported models, sketch estimates |
| 2 | Parametric | BoaviztAPI + ecoinvent + ADEME | ±25–35% | Default; CSRD reporting |
| 3 | Telemetry | NVML + vLLM + live grid | ±10–18% | Enterprise, calibration |
| 4 | Audited | ISAE 3000 limited assurance | tier-3 + opinion | Regulatory disclosure |
Uncertainty propagation¶
Uncertainty propagates through the calculation graph using:
- Pedigree (Weidema/Ciroth) for emission-factor data quality scoring
- Monte Carlo log-normal for variance propagation
- Bayesian hierarchical pooling for cross-model and cross-region calibration
- Conformal prediction for honest 90% interval calibration
- Conformal coverage publishing for the held-out empirical coverage evidence surfaced each release
- Sobol sensitivity for dominant-input identification (quarterly publication)
- Shapley attribution for multi-tenant batch fair-allocation
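The log-normal Monte Carlo step in the list above can be sketched as follows. This is a minimal illustration, assuming each input is described by a median and a geometric standard deviation (GSD), that inputs are independent, and that the model is a plain product of its inputs. The input names and numeric values are hypothetical and do not come from the methodology.

```python
import math
import random
import statistics

def sample_lognormal(rng, median, gsd):
    """Draw one value from a log-normal given its median and geometric SD."""
    return rng.lognormvariate(math.log(median), math.log(gsd))

def propagate(inputs, n_draws=20_000, seed=0):
    """Monte Carlo propagation through a purely multiplicative model.

    `inputs` maps a name to (median, gsd). Returns the 5th percentile,
    the median, and the 95th percentile of the product, i.e. a median
    with a 90% interval.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        value = 1.0
        for median, gsd in inputs.values():
            value *= sample_lognormal(rng, median, gsd)
        draws.append(value)
    cuts = statistics.quantiles(draws, n=20)  # 19 cut points at 5% steps
    return cuts[0], statistics.median(draws), cuts[-1]

# Hypothetical inputs: energy per query (kWh), PUE, grid intensity (gCO2e/kWh).
example_inputs = {
    "energy_kwh": (0.002, 1.20),
    "pue": (1.25, 1.05),
    "grid_g_per_kwh": (60.0, 1.30),
}
low, median, high = propagate(example_inputs)
print(f"median={median:.3f} gCO2e, 90% interval=({low:.3f}, {high:.3f})")
```

For a pure product of independent log-normals, the result is itself log-normal, so the sampled interval can be cross-checked against the closed form. Sampling becomes necessary once the calculation graph includes sums or correlated inputs.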
Three indicators, two boundaries¶
We report three indicators (climate, water, resource depletion) following the Mistral × Carbone 4 × ADEME LCA convention and AFNOR Frugal AI methodology. We report two boundaries (narrow accelerator-only, comprehensive accelerator+host+idle+PUE) following Google Gemini 2025 disclosure conventions.
The default indicator is climate (gCO₂e); the default boundary is comprehensive. Both can be requested via extra_body.vetted.boundary.
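A minimal sketch of building the request field named above. It assumes the boundary values are the strings `"narrow"` and `"comprehensive"` and that the field nests as `vetted.boundary` inside the `extra_body` argument of an OpenAI-compatible client. Neither detail is confirmed by this page, and the helper name is hypothetical.

```python
ALLOWED_BOUNDARIES = {"narrow", "comprehensive"}  # assumed string values

def vetted_extra_body(boundary="comprehensive"):
    """Build an extra_body payload selecting an accounting boundary."""
    if boundary not in ALLOWED_BOUNDARIES:
        raise ValueError(f"unknown boundary: {boundary!r}")
    return {"vetted": {"boundary": boundary}}

payload = vetted_extra_body("narrow")
print(payload)  # {'vetted': {'boundary': 'narrow'}}

# With an OpenAI-compatible Python client, the dict would be passed as:
# client.chat.completions.create(model=..., messages=..., extra_body=payload)
```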
Fresh versus live¶
The methodology makes a strict distinction between fresh and live regional grid evidence.
- Live means the source observation falls within the strict `+/-15 minute` receipt-matching window.
- Fresh means the signal is recent enough to remain operationally useful, but it does not meet that strict live rule.
- Fallback means the receipt had to use a lower-fidelity source class such as prior-week or annual factors.
That distinction matters more than marketing tone. A green operational status page does not mean every region is live, and a useful near-time region is not automatically strict live. Current region-bucket posture is published in the carbon intensity sources page and should be read alongside the receipt provenance fields:
- `grid_intensity_observed_at`
- `grid_intensity_requested_at`
- `grid_intensity_age_minutes`
- `grid_temporal_match`
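The live, fresh, and fallback distinction can be sketched as a small classifier over the observed and requested timestamps. Only the 15-minute live window comes from the text above. The fresh window length, the inclusive boundary, and the rule that a stale observation is treated as fallback are assumptions made for illustration, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

LIVE_WINDOW = timedelta(minutes=15)  # strict window stated in the text
FRESH_WINDOW = timedelta(hours=3)    # assumption: not specified in the text

def classify_grid_evidence(observed_at, requested_at, is_fallback_source=False):
    """Return (label, age_minutes) for one grid-intensity observation."""
    age = abs(requested_at - observed_at)
    age_minutes = age.total_seconds() / 60
    if is_fallback_source:
        # Prior-week or annual factors are fallback regardless of age.
        return "fallback", age_minutes
    if age <= LIVE_WINDOW:
        return "live", age_minutes
    if age <= FRESH_WINDOW:
        return "fresh", age_minutes
    return "fallback", age_minutes  # assumption: stale signals count as fallback

requested = datetime(2026, 3, 17, 12, 0, tzinfo=timezone.utc)
print(classify_grid_evidence(requested - timedelta(minutes=10), requested))
print(classify_grid_evidence(requested - timedelta(minutes=40), requested))
```

The first call falls inside the strict window and is labelled live. The second is 40 minutes old, so under the assumed fresh window it is labelled fresh rather than live.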
Current calibration status¶
Methodology v0.2.0 work is now grounded in real telemetry and legitimate truth-bearing observations, but the evidence is not all of one kind.
What is calibrated now:
- a real calibration dataset assembled from production audit-ledger snapshots plus legitimate truth overlays
- a held-out conformal artifact generated in stratified `cell` mode
- a real hierarchical fit on the current dataset
- a real Sobol sensitivity report on the current calibrated evidence base
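One common reading of a stratified conformal artifact is split conformal prediction calibrated separately per cell. The sketch below illustrates that idea, assuming an absolute-residual nonconformity score and held-out (cell, predicted, observed) triples. Whether the methodology uses this score or this stratification is an assumption, and the cell names are hypothetical.

```python
import math
from collections import defaultdict

def conformal_quantile(scores, alpha=0.10):
    """Finite-sample split-conformal quantile of nonconformity scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(scores)[rank - 1]

def calibrate_by_cell(records, alpha=0.10):
    """records: iterable of (cell, predicted, observed) held-out triples.

    Returns {cell: q}, where predicted +/- q is the interval for that cell.
    """
    scores = defaultdict(list)
    for cell, predicted, observed in records:
        scores[cell].append(abs(observed - predicted))
    return {cell: conformal_quantile(s, alpha) for cell, s in scores.items()}

# Synthetic held-out data: 20 points in one cell, a single point in another.
held_out = [("cell_a", 1.0, 1.0 + 0.01 * i) for i in range(20)]
held_out.append(("cell_b", 2.0, 2.1))
q = calibrate_by_cell(held_out)
print(q)
```

A cell with too few held-out points gets an infinite half-width at the requested coverage level, which is one way a sparsely observed cell can be kept from reporting an interval the data cannot support.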
What is still shadow-only:
- the strongest truth-bearing calibration cells today come from self-hosted or production-like shadow domains
- those cells are valid for methodology calibration and explanatory power
- they are not equivalent to exact hosted provider truth for a specific hosted production cell
- the current reference-cell transferability report does not show strong or moderate transfer support from the calibrated shadow cells into the hosted FR H100 cells
- that means the reference-cell lane is already scientifically useful, but it is not yet a substitute for hosted exact-cell closure
What exact hosted cells remain blocked:
- the current hosted FR H100 cells are still truth-empty at exact-cell level
- the lead blocked cell is `mistral-medium-3|nvidia_h100_sxm|scaleway_par2_fr`
- that cell now has enough hosted receipt corpus to be useful immediately once a qualifying provider artifact appears, but it does not yet have legitimate `provider_published_kwh`
- that same lead cell is still not fully join-ready today because provider-request-id coverage is incomplete in the current hosted validation report
This is deliberate. The methodology pages should distinguish:
- calibrated now
- shadow-only today
- exact hosted-cell blocked
instead of collapsing them into one maturity bucket.
Worked example: 400 tokens, four ways¶
We publish a complete worked example reproducing the same prompt through all four tiers, with intermediate calculations, BoaviztAPI request payloads, NVML samples, and conformal-calibration fits, at github.com/vetted-inference/methodology/examples/2026-03-17-400-tokens. The full write-up is published as a Journal entry on the marketing site.
Headline result for that example:
| Tier | gCO₂e median | 90% CI | Boundary |
|---|---|---|---|
| 1 (proxy) | 1.62 | 0.85–3.10 | comprehensive |
| 2 (parametric) | 1.07 | 0.71–1.58 | comprehensive |
| 3 (telemetry, region-adjusted) | 0.97 | 0.83–1.11 | comprehensive |
The agreement between tiers is by construction: the calibration that produces the conformal interval forces it. What matters is that the methodology surfaces the conditions under which the tiers would not agree.
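The headline table can be checked with a few lines of arithmetic. The sketch below uses only the values from the table and takes "agreement" to mean that each stronger tier's median falls inside every weaker tier's 90% interval. That definition is an assumption made for illustration, not the methodology's own test.

```python
# Headline values copied from the table above: (median, low, high) in gCO2e.
tiers = {
    1: (1.62, 0.85, 3.10),
    2: (1.07, 0.71, 1.58),
    3: (0.97, 0.83, 1.11),
}

def relative_bounds(median, low, high):
    """Interval bounds as signed percentages of the median."""
    return (low / median - 1) * 100, (high / median - 1) * 100

def medians_agree(tiers):
    """True if every tier's median lies inside every weaker tier's interval."""
    return all(
        tiers[weak][1] <= tiers[strong][0] <= tiers[weak][2]
        for weak in tiers
        for strong in tiers
        if strong > weak
    )

for tier, (median, low, high) in tiers.items():
    lo_pct, hi_pct = relative_bounds(median, low, high)
    print(f"tier {tier}: {lo_pct:+.0f}% / {hi_pct:+.0f}%")
print("medians agree:", medians_agree(tiers))
```

The tier 1 and tier 2 intervals come out wider above the median than below, which is the shape a log-normal model produces, while the tier 3 interval is close to symmetric.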
License¶
The methodology code is AGPL-3.0. The methodology document is CC-BY-4.0. Closed-source forks of the methodology code are grounds for Foundation veto-share intervention per Articles § 11(d).
References¶
These pages draw on the following references throughout:
- ISO 14040:2006 / 14044:2006 — LCA principles and framework
- ISO/IEC 21031:2024 — Software Carbon Intensity (SCI)
- ISO/IEC 42001:2023 — AI Management Systems
- ESRS E1 / E3 / E5 (post-Omnibus simplified ESRS, mid-2026)
- ecoinvent v3.10 (Zürich, 2024)
- ADEME Base Empreinte 2024 (Paris)
- JRC NEEFE 2024 (Joint Research Centre)
- IPCC AR6 GWP100 (2021)
- Weidema & Wesnæs (1996); Ciroth et al. (2016) — pedigree matrix
- Lloyd & Ries (2007) — log-normal Monte Carlo for LCA
- Angelopoulos & Bates (2021) — conformal prediction
- Han et al. (ISCA 2025) — Shapley attribution for multi-tenant inference
- Mistral × Carbone 4 × ADEME LCA (2025) — Mistral Large 2 LCA
- Google Gemini Production Disclosure (2025) — comprehensive boundary methodology
Specific page citations are listed at the bottom of each tier page.