1 · Pick the region
The X-Region header chooses where the request is processed. The router sends it to that region’s data-resident provider first — never to a host outside the region you named.
Developer guide
phi-cloud is one OpenAI-compatible endpoint in front of many providers. Two headers — X-Region and X-PHI — decide which host serves a request, and the response tells you exactly what happened.
The decision
You never call a provider directly. You describe the request; the router resolves an ordered list of routes that satisfy your region and data class, then uses the best one.
The X-Region header chooses where the request is processed. The router sends it to that region’s data-resident provider first — never to a host outside the region you named.
X-PHI: true restricts routing to providers whose data-processing agreement is verified for that jurisdiction. If none exists for the region + modality, the request is refused (403) rather than crossing a border.
model: "auto" lets the router pick the cheapest eligible model. Or pin a concrete id, or a provider-agnostic canonical alias that resolves per region.
On an upstream error the router walks an ordered candidate list. PHI requests only ever see PHI-eligible routes, so a failover can never spill to an unverified host — and never leaves the region.
Region residency
For auto traffic the router prefers the region’s data-resident provider, then any other cloud serving the region. A pinned model still wins if it serves the region.
| Region | Resident provider | Models | PHI |
|---|---|---|---|
| CH | Infomaniak | Swiss-resident. Gemma-4 31B (chat + vision), Qwen3.5, Qwen3-Embedding-8B. | Eligible |
| EU | Scaleway | Paris / Amsterdam. Gemma-4 26B (chat + vision), Qwen3.5/3.6, Devstral, Qwen3-Embedding-8B. | Eligible |
| US | Together AI | US-resident. Qwen3.5 397B. General traffic only — no verified US chat PHI host yet. | General |
| UK · ASIA · ANZ · LATAM · MENA · INDIA · WORLD | Google (Gemini) | Broadest-coverage cloud for regions with no dedicated resident host. Gemini 2.5 Flash / Pro. | General |
PHI for the document & voice modalities (OCR, speech-to-text, text-to-speech) is additionally served by Microsoft Azure in CH, EU and US under a HIPAA BAA + regional DPA. See /vendors for the full registry.
Modality coverage
Text and embeddings run broadly; documents and voice are deliberately Swiss + EU + US only and residency-separate (no cross-border failover). A region with no host for a modality returns 422 no_route — by design.
| Modality | Endpoint | Regions | PHI |
|---|---|---|---|
| Chat completions | /v1/chat/completions | Every region | CH, EU |
| Embeddings | /v1/embeddings | CH, EU (production) | CH, EU |
| Vision | /v1/chat/completions (image_url) | CH, EU | CH, EU |
| OCR | /v1/ocr | CH, EU, US | CH, EU, US |
| Speech-to-text | /v1/audio/transcriptions | CH, EU, US | CH, EU, US |
| Text-to-speech | /v1/audio/speech | CH, EU, US | CH, EU, US |
In the hosted chat these surface as the paperclip (image → vision, PDF → OCR, audio → transcription), the mic (voice input), and the “Listen” button (text-to-speech) — each enabled only when your pinned region supports it.
Canonical aliases
Instead of a concrete model id you can pass a provider-agnostic alias. Combined with X-Region it resolves to whichever provider hosts that model in your region — so the same code runs everywhere.
gemma-4CH → Infomaniak 31B · EU → Scaleway 26B
qwen3.5CH → Infomaniak · EU → Scaleway · US → Together
qwen3-embedding-8bCH → Infomaniak · EU → Scaleway · (4096-d everywhere)
whisperCH / EU / US → Azure Speech (STT)
Audit headers
The routing decision is echoed back on each 200, so you can record exactly what served the request and what it cost.
x-phi-routed: gemma-4-31b/CH/general
x-phi-tier: non_phi # or phi
x-phi-attempts: 1 # routes tried before success
x-phi-cost-micro: 4200 # metered cost, micro-USD
x-request-id: 9f1c… # quote in supportA 403 phi_blocked means you asked for PHI in a region/modality with no verified host — the gateway refuses rather than crossing a border. Pin a region that has one (e.g. CH or EU).
The API reference has copy-paste cURL, Python and Node for every endpoint, including vision, OCR and audio.