Region selection

Every inference runs in a specific EU region. You can pin a region for compliance reasons, or let us route by current grid carbon intensity.

Available regions

| Region code | Provider | Location | Grid (annual avg, gCO₂/kWh) | Notes |
|---|---|---|---|---|
| scaleway-par-1 | Scaleway | Paris, France | 58 | Default; lowest-latency for Western Europe |
| scaleway-par-2 | Scaleway | Paris, France | 58 | Failover for scaleway-par-1 |
| atnorth-sto-1 | atNorth | Stockholm, Sweden | 12 | Lowest grid intensity available |
| atnorth-isl-1 | atNorth | Iceland | 18 | Geothermal; EEA |
| ovh-gra-1 | OVHcloud | Gravelines, France | 58 | Redundant Western Europe |
| ovh-rbx-1 | OVHcloud | Roubaix, France | 58 | Redundant Western Europe |
| hetzner-fsn-1 | Hetzner | Falkenstein, Germany | 380 | Gateway hosting; not used for inference by default |
| hetzner-hel-1 | Hetzner | Helsinki, Finland | 95 | Gateway hosting (low-carbon path) |

Annual averages from JRC NEEFE 2024. Live values from ENTSO-E or regional ISO/TSO are available at grid sources.

Default routing

By default, we route inference to scaleway-par-1. If that region is degraded or saturated, we fail over to scaleway-par-2, then to ovh-gra-1. The receipt always reflects the region actually used.
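The failover order described here can be sketched as a simple ordered lookup. This is a minimal illustration, not the gateway's implementation: the function name `pick_default_region` and the `healthy` map (region code to "neither degraded nor saturated") are assumptions made for the example.

```python
from typing import Mapping, Optional, Sequence

# Failover order as stated in the text: primary, then two fallbacks.
DEFAULT_CHAIN: Sequence[str] = ("scaleway-par-1", "scaleway-par-2", "ovh-gra-1")


def pick_default_region(
    healthy: Mapping[str, bool],
    chain: Sequence[str] = DEFAULT_CHAIN,
) -> Optional[str]:
    """Return the first usable region in the chain, or None if none is usable.

    `healthy` is a hypothetical status map; a region missing from it is
    treated as unusable (an assumption of this sketch).
    """
    for region in chain:
        if healthy.get(region, False):
            return region
    return None
```

For example, with `{"scaleway-par-1": False, "scaleway-par-2": True}` the sketch returns `scaleway-par-2`, which is the region the receipt would then report.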

Pinning a region

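# `client` is assumed to be an OpenAI-compatible SDK client
# that is already configured for this gateway.
response = client.chat.completions.create(
    model="mistral-medium-3",
    messages=[...],
    extra_body={
        "vetted": {
            "region": "atnorth-sto-1"  # pin inference to this region
        }
    },
)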

If the requested region is unavailable, you receive a 503 service_unavailable error rather than a silent failover. This is by design: region pinning is usually motivated by compliance, and silent failover would defeat the purpose.
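Because a pinned request fails instead of failing over, any fallback has to be explicit on the caller's side. The sketch below shows one way to try a documented list of pinned regions in order. Everything in it is hypothetical: `RegionUnavailable` stands in for however your client surfaces the 503, and `send(region)` stands in for a function that issues the request with that region pinned.

```python
from typing import Any, Callable, Sequence, Tuple


class RegionUnavailable(Exception):
    """Hypothetical stand-in for a 503 service_unavailable response."""


def create_pinned(
    send: Callable[[str], Any],
    regions: Sequence[str],
) -> Tuple[str, Any]:
    """Try each pinned region in the caller's documented order.

    Returns (region, response) for the first region that answers.
    Re-raises the last RegionUnavailable if every region is unavailable.
    """
    if not regions:
        raise ValueError("regions must not be empty")
    last_error: Exception = RegionUnavailable("no region attempted")
    for region in regions:
        try:
            return region, send(region)
        except RegionUnavailable as exc:
            # Record the failure and move on to the next documented region.
            last_error = exc
    raise last_error
```

Keeping the region list in application code means the set of regions a request may touch stays visible and auditable, which is the point of pinning in the first place.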

Carbon-aware routing

response = client.chat.completions.create(
    model="mistral-medium-3",
    messages=[...],
    extra_body={
        "vetted": {
            "routing": "carbon_aware"
        }
    },
)

Carbon-aware routing chooses the region with the lowest current grid intensity among regions where (a) your model is available, (b) the latency budget allows it, and (c) the region is not saturated.
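The selection rule above amounts to a filter followed by a minimum. The sketch below illustrates that logic only; the `RegionState` fields, the use of p95 latency as the budget check, and the function name are assumptions for the example, not the router's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RegionState:
    """Hypothetical snapshot of one region at routing time."""
    code: str
    live_intensity: float   # current grid intensity, gCO2/kWh
    has_model: bool         # (a) the requested model is available
    p95_latency_ms: float   # (b) compared against the latency budget
    saturated: bool         # (c) region has no spare capacity


def pick_carbon_aware(
    regions: Iterable[RegionState],
    latency_budget_ms: float,
) -> Optional[RegionState]:
    """Return the eligible region with the lowest live intensity, or None."""
    eligible = [
        r for r in regions
        if r.has_model
        and r.p95_latency_ms <= latency_budget_ms
        and not r.saturated
    ]
    # min() with default=None handles the case where nothing is eligible.
    return min(eligible, key=lambda r: r.live_intensity, default=None)
```

Note that the lowest-intensity region overall can lose to a higher-intensity one if it is saturated, lacks the model, or exceeds the latency budget.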

If no provider has a fresh enough operational live signal, the router degrades explicitly to the standard EU-preferred path instead of pretending an annual fallback is a live routing signal. The response exposes this in routing_trace.decision_mode and routing_trace.decision_reason. The receipt itself remains conservative and only reports the signal actually used for the environmental calculation.
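The explicit degradation described above can be sketched as a freshness check that yields a trace. The field names `decision_mode` and `decision_reason` come from the text; the freshness threshold, the string values, and the function name are hypothetical and chosen only for illustration.

```python
from typing import Dict, Mapping

# Hypothetical threshold: how old a live signal may be and still count as fresh.
MAX_SIGNAL_AGE_S = 900.0


def decide_routing(
    signal_age_s: Mapping[str, float],
    max_age_s: float = MAX_SIGNAL_AGE_S,
) -> Dict[str, str]:
    """Build a routing trace from the age (in seconds) of each live signal.

    If at least one signal is fresh, carbon-aware routing proceeds.
    Otherwise the router falls back to the standard path and says so,
    instead of treating an annual average as a live signal.
    """
    has_fresh = any(age <= max_age_s for age in signal_age_s.values())
    if has_fresh:
        return {
            "decision_mode": "carbon_aware",          # illustrative value
            "decision_reason": "live_signal_fresh",   # illustrative value
        }
    return {
        "decision_mode": "standard",                  # illustrative value
        "decision_reason": "no_fresh_live_signal",    # illustrative value
    }
```

A caller that requires carbon-aware routing for reporting purposes can check the trace on each response and flag requests that fell back.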

Compliance considerations

  • All regions are within the EU/EEA. There is no third-country routing.
  • For customers under DORA scope (financial services), we recommend pinning to a single primary region with a documented secondary, rather than carbon-aware routing.
  • For customers under the Cyber Resilience Act vulnerability-reporting scope (effective 11 September 2026), the region choice does not affect reporting obligations.

Latency

Median (p50) and 95th-percentile (p95) round-trip latency from a Frankfurt origin to each region. These are rough indicators; actual values depend on your network path:

| Region | p50 | p95 |
|---|---|---|
| scaleway-par-1 | 22 ms | 45 ms |
| ovh-gra-1 | 24 ms | 50 ms |
| atnorth-sto-1 | 32 ms | 70 ms |
| atnorth-isl-1 | 55 ms | 110 ms |

These numbers measure network latency to the gateway only, not end-to-end inference completion time. End-to-end time also depends on the model and the prompt length.