Developer guide

How routing works

phi-cloud is one OpenAI-compatible endpoint in front of many providers. Two headers — X-Region and X-PHI — decide which host serves a request, and the response tells you exactly what happened.

The decision

Every request is resolved in four steps.

You never call a provider directly. You describe the request; the router resolves an ordered list of routes that satisfy your region and data class, then uses the best one.

1 · Pick the region

The X-Region header chooses where the request is processed. The router sends it to that region’s data-resident provider first — never to a host outside the region you named.

2 · Declare the data class

X-PHI: true restricts routing to providers whose data-processing agreement is verified for that jurisdiction. If none exists for the region + modality, the request is refused (403) rather than crossing a border.

3 · Choose a model (or auto)

model: "auto" lets the router pick the cheapest eligible model. Or pin a concrete id, or a provider-agnostic canonical alias that resolves per region.

4 · Failover, in-tier

On an upstream error the router walks an ordered candidate list. PHI requests only ever see PHI-eligible routes, so a failover can never spill to an unverified host — and never leaves the region.

Region residency

Each region has a resident host.

For auto traffic the router prefers the region’s data-resident provider, then any other cloud serving the region. A pinned model still wins if it serves the region.

RegionResident providerPHI
CHInfomaniakEligible
EUScalewayEligible
USTogether AIGeneral
UK · ASIA · ANZ · LATAM · MENA · INDIA · WORLDGoogle (Gemini)General

PHI for the document & voice modalities (OCR, speech-to-text, text-to-speech) is additionally served by Microsoft Azure in CH, EU and US under a HIPAA BAA + regional DPA. See /vendors for the full registry.

Modality coverage

What’s available where.

Text and embeddings run broadly; documents and voice are deliberately Swiss + EU + US only and residency-separate (no cross-border failover). A region with no host for a modality returns 422 no_route — by design.

ModalityRegionsPHI
Chat completionsEvery regionCH, EU
EmbeddingsCH, EU (production)CH, EU
VisionCH, EUCH, EU
OCRCH, EU, USCH, EU, US
Speech-to-textCH, EU, USCH, EU, US
Text-to-speechCH, EU, USCH, EU, US

In the hosted chat these surface as the paperclip (image → vision, PDF → OCR, audio → transcription), the mic (voice input), and the “Listen” button (text-to-speech) — each enabled only when your pinned region supports it.

Canonical aliases

One name, any region.

Instead of a concrete model id you can pass a provider-agnostic alias. Combined with X-Region it resolves to whichever provider hosts that model in your region — so the same code runs everywhere.

gemma-4

CH → Infomaniak 31B · EU → Scaleway 26B

qwen3.5

CH → Infomaniak · EU → Scaleway · US → Together

qwen3-embedding-8b

CH → Infomaniak · EU → Scaleway · (4096-d everywhere)

whisper

CH / EU / US → Azure Speech (STT)

Audit headers

Every response is auditable.

The routing decision is echoed back on each 200, so you can record exactly what served the request and what it cost.

x-phi-routed:   gemma-4-31b/CH/general
x-phi-tier:     non_phi      # or phi
x-phi-attempts: 1            # routes tried before success
x-phi-cost-micro: 4200       # metered cost, micro-USD
x-request-id:   9f1c…        # quote in support

A 403 phi_blocked means you asked for PHI in a region/modality with no verified host — the gateway refuses rather than crossing a border. Pin a region that has one (e.g. CH or EU).

Ready to wire it up?

The API reference has copy-paste cURL, Python and Node for every endpoint, including vision, OCR and audio.