Pedigree (Weidema/Ciroth)

What it is

A five-axis quality score attached to every emission factor used in the calculation. The pedigree matrix was originally proposed by Weidema and Wesnæs (1996); Ciroth et al. (2016) later derived empirically based uncertainty factors for it. It scores data quality from 1 (best) to 5 (worst) on each of five axes:

Axis	1 (best)	2	3	4	5 (worst)
Reliability	Verified data, measurement	Verified data, partly assumption	Non-verified data, partly assumption	Qualified estimate	Non-qualified estimate
Completeness	Representative, sufficient sample	Representative, smaller set	Representative, > 50% sites	Representative, < 50% sites	Unknown
Temporal correlation	< 3 years	< 6 years	< 10 years	< 15 years	Unknown / older
Geographic correlation	Area under study	Similar area	Different area	Unknown	Unrelated
Technological correlation	Same technology	Related technology	Different technology, same materials	Different processes, same technology	Unrelated
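As a minimal sketch of how such a score could be represented in code, the record below holds one integer per axis and rejects values outside the 1-to-5 scale. The class name `PedigreeScore` and its field names are illustrative assumptions, not necessarily what the implementation uses.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PedigreeScore:
    """Hypothetical record: one integer per axis, 1 (best) to 5 (worst)."""

    reliability: int
    completeness: int
    temporal: int
    geographic: int
    technological: int

    def __post_init__(self) -> None:
        # Reject anything outside the 1-to-5 scale defined by the matrix.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be an integer in 1..5, got {value!r}")

    def as_list(self) -> list[int]:
        """Scores in the axis order used by the matrix above."""
        return [getattr(self, f.name) for f in fields(self)]


# A factor that scores best on every axis.
best = PedigreeScore(1, 1, 1, 1, 1)
print(best.as_list())
```

Validating at construction time means an out-of-range score fails early instead of silently missing a lookup entry later.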

Why we use it

The pedigree score is a structured, auditable answer to the question "how confident are we in this emission factor for this query?" It feeds two downstream operations:

Monte Carlo prior dispersion: pedigree scores are mapped to log-normal (geometric) standard deviations using Ciroth's lookup table, and these set the dispersion of each factor's prior in the Monte Carlo variance propagation.
Conformal interval width: wider pedigree priors produce wider conformal intervals at calibration time.
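The first of these operations can be illustrated with a short sketch. It assumes the lookup value is used as a geometric standard deviation (GSD), so the underlying normal has `sigma = ln(GSD)`. The function names, the median of 100.0 (arbitrary units), and the sample count are hypothetical. Conformal calibration is not shown.

```python
import math
import random


def sample_factor(median: float, gsd: float, n: int, seed: int = 0) -> list[float]:
    """Draw n log-normal samples with the given median and geometric SD."""
    rng = random.Random(seed)
    mu = math.log(median)   # mean of the underlying normal
    sigma = math.log(gsd)   # SD of the underlying normal (assumed GSD convention)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def interval_95(samples: list[float]) -> tuple[float, float]:
    """Empirical central 95% interval taken from the sorted samples."""
    ordered = sorted(samples)
    lower = ordered[int(0.025 * len(ordered))]
    upper = ordered[int(0.975 * len(ordered)) - 1]
    return lower, upper


# Same median, two different dispersions taken from the ends of the reliability axis.
narrow = sample_factor(100.0, gsd=1.05, n=20_000, seed=1)
wide = sample_factor(100.0, gsd=1.50, n=20_000, seed=1)
print("GSD 1.05:", interval_95(narrow))
print("GSD 1.50:", interval_95(wide))
```

A larger GSD widens the sampled interval around the same median, which is the mechanism by which a worse pedigree score leads to a wider reported range.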

Mapping example

For a Mistral Medium 3 query running on Scaleway PAR-1 with live ENTSO-E grid data:

Emission factor	Reliability	Completeness	Temporal	Geographic	Technological
GPU energy (BoaviztAPI)	2	2	1	1	2
Host CPU+DRAM share	2	3	1	1	2
Datacentre PUE	2	2	2	1	1
Grid intensity (ENTSO-E live)	1	1	1	1	1
Embodied amortisation	3	3	2	2	2

The composite pedigree on the receipt is computed per axis as the median across factors, weighted by each factor's share of the total impact. For tier 2, the typical composite is [2, 2, 1, 1, 2]: verified data partly based on assumptions, a representative but smaller sample, recent data from the area under study, and related technology.
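The impact-weighted median can be sketched as follows. The scores are taken from the mapping table above, but the impact shares (in percent) are hypothetical placeholders. The tie rule used here, the smallest score whose cumulative weight reaches half the total, is one common definition of a weighted median and is an assumption about the implementation.

```python
AXES = ("reliability", "completeness", "temporal", "geographic", "technological")

# (scores per axis, impact share in percent). Shares are hypothetical.
FACTORS = {
    "GPU energy (BoaviztAPI)":       ((2, 2, 1, 1, 2), 50),
    "Host CPU+DRAM share":           ((2, 3, 1, 1, 2), 20),
    "Datacentre PUE":                ((2, 2, 2, 1, 1), 10),
    "Grid intensity (ENTSO-E live)": ((1, 1, 1, 1, 1), 10),
    "Embodied amortisation":         ((3, 3, 2, 2, 2), 10),
}


def weighted_median(pairs: list[tuple[int, int]]) -> int:
    """Smallest score whose cumulative weight reaches half of the total weight."""
    total = sum(weight for _, weight in pairs)
    cumulative = 0
    for score, weight in sorted(pairs):
        cumulative += weight
        if 2 * cumulative >= total:
            return score
    raise ValueError("no factors supplied")


def composite_pedigree(factors: dict) -> list[int]:
    """Weighted median of the factor scores, taken independently on each axis."""
    rows = list(factors.values())
    return [
        weighted_median([(scores[i], weight) for scores, weight in rows])
        for i in range(len(AXES))
    ]


print(composite_pedigree(FACTORS))  # [2, 2, 1, 1, 2] with these hypothetical shares
```

Integer percentages keep the cumulative comparison exact. With fractional shares, a small tolerance on the halfway test would be prudent.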

Pedigree score → log-normal SD lookup

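We use a simplified version of the lookup from Ciroth et al. (2016). The values are multiplicative (geometric) standard deviations, so 1.00 means the axis adds no dispersion: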

Score	Reliability SD	Completeness SD	Temporal SD	Geographic SD	Technological SD
1	1.00	1.00	1.00	1.00	1.00
2	1.05	1.02	1.03	1.01	1.18
3	1.10	1.05	1.10	1.02	1.50
4	1.20	1.10	1.20	1.10	2.00
5	1.50	1.20	1.50	1.50	3.00

The composite log-normal SD is the geometric combination of the per-axis SDs; this becomes the prior for that emission factor in the Monte Carlo simulation.
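The lookup and combination step might look like the sketch below. The table values are copied from the lookup above. The combination rule is an assumption: it reads "geometric combination" as the root-sum-of-squares of the per-axis log values, a rule commonly used with pedigree uncertainty factors, and `pedigree.py` may combine them differently.

```python
import math

# Per-axis values copied from the simplified lookup table above.
GSD_LOOKUP = {
    "reliability":   {1: 1.00, 2: 1.05, 3: 1.10, 4: 1.20, 5: 1.50},
    "completeness":  {1: 1.00, 2: 1.02, 3: 1.05, 4: 1.10, 5: 1.20},
    "temporal":      {1: 1.00, 2: 1.03, 3: 1.10, 4: 1.20, 5: 1.50},
    "geographic":    {1: 1.00, 2: 1.01, 3: 1.02, 4: 1.10, 5: 1.50},
    "technological": {1: 1.00, 2: 1.18, 3: 1.50, 4: 2.00, 5: 3.00},
}


def composite_gsd(scores: dict[str, int]) -> float:
    """Combine per-axis GSDs by adding variances in log space (assumed rule)."""
    log_variance = sum(
        math.log(GSD_LOOKUP[axis][score]) ** 2 for axis, score in scores.items()
    )
    return math.exp(math.sqrt(log_variance))


# The typical tier 2 composite [2, 2, 1, 1, 2] from the mapping example.
tier2 = {
    "reliability": 2,
    "completeness": 2,
    "temporal": 1,
    "geographic": 1,
    "technological": 2,
}
print(round(composite_gsd(tier2), 2))
```

Under this assumed rule the tier 2 composite comes out at roughly 1.19, dominated by the technological axis, and a factor scoring 1 on every axis yields exactly 1.0.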

Auditor expectations

Assurance partners under ISAE 3000 review the pedigree-score worksheet as evidence of methodological rigour. We provide:

The pedigree score for every emission factor used in the period
The justification for each score
The lookup table version used to convert pedigree to log-normal SD
The composite score as it appears on each receipt

Where this is implemented

methodology/uncertainty/pedigree.py

Citations

Weidema, B. P., & Wesnæs, M. S. (1996). Data quality management for life cycle inventories — an example of using data quality indicators. J. Cleaner Production 4(3-4).
Ciroth, A., Muller, S., Weidema, B. P., & Lesage, P. (2016). Empirically based uncertainty factors for the pedigree matrix in ecoinvent. Int. J. Life Cycle Assessment 21(9).