Capability · Embeddings

Embeddings in one shared vector space.

A single 4096-dimension embedding model in every region — for general and PHI traffic alike — so your vectors stay comparable across Switzerland, the EU and the US.

POST /v1/embeddings

What you get

Built for regulated workloads

Embeddings resolve to the canonical qwen3-embedding-8b (4096-d) in every region: CH on Infomaniak and EU on Scaleway, for both general and PHI traffic, and US on the Medishift worker for general traffic only. Because the model is the same everywhere, a vector embedded in Zurich is directly comparable to one embedded in Paris or Virginia, so you can build one index that spans jurisdictions.

Portable

Cross-region comparable

The same 4096-d space in CH, EU and US means a query embedded in one region matches documents embedded in another. No re-embedding when you add a jurisdiction.
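As a rough illustration of what "comparable" means in practice, the sketch below ranks documents from several regions against one query with cosine similarity. The helper names (`cosine_similarity`, `rank`) and the tuple layout are hypothetical, and the toy vectors are 3-d stand-ins for real 4096-d embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        # Vectors from different spaces cannot be compared at all.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)

def rank(query_vec, docs):
    """Rank (doc_id, region, vector) tuples by similarity, best first.

    The region is carried along only for reporting; it plays no part in
    scoring, which is the point of a shared space.
    """
    scored = [
        (cosine_similarity(query_vec, vec), doc_id, region)
        for doc_id, region, vec in docs
    ]
    return sorted(scored, reverse=True)

# Toy data: a query embedded in one region, documents embedded in others.
docs = [
    ("note-ch", "CH", [0.9, 0.1, 0.0]),
    ("note-eu", "EU", [0.0, 1.0, 0.0]),
    ("note-us", "US", [0.5, 0.5, 0.0]),
]
results = rank([1.0, 0.0, 0.0], docs)
```

Because every vector comes from the same model, the scores in `results` can be compared directly without any per-region normalization.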

Same-space failover

Dimension-safe routing

Failover is restricted to routes with the same vector dimension as the request. A general-tier EU embed fails over from Scaleway to the worker (both 4096-d Qwen), never to a different-dimension model that would corrupt your index.
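The filter described above can be sketched as a simple predicate over a route table. This is not the gateway's implementation: the `Route` type, the `ROUTES` list and the `failover_candidates` function are hypothetical, and the rule that PHI requests may only use PHI-tier routes is an assumption drawn from the tiers listed on this page. Residency rules are not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    provider: str
    region: str
    dims: int
    tier: str  # "phi" or "general"

# Hypothetical route table, loosely based on the routes named on this page.
ROUTES = [
    Route("scaleway", "EU", 4096, "phi"),
    Route("worker", "WORLD", 4096, "general"),
    Route("infomaniak-bge", "CH", 3584, "phi"),
    Route("infomaniak-minilm", "CH", 384, "phi"),
]

def failover_candidates(primary, routes, phi):
    """Routes that could take over from `primary` without changing vector space."""
    candidates = []
    for route in routes:
        if route == primary:
            continue
        if route.dims != primary.dims:
            continue  # a different dimension would corrupt the caller's index
        if phi and route.tier != "phi":
            continue  # assumption: PHI traffic never lands on a general-tier route
        candidates.append(route)
    return candidates

general_fallbacks = failover_candidates(ROUTES[0], ROUTES, phi=False)
phi_fallbacks = failover_candidates(ROUTES[0], ROUTES, phi=True)
```

In this toy table a general EU request can fall back to the worker, while a PHI request has no fallback, and the 3584-d and 384-d routes are never candidates for a 4096-d request.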

PHI

PHI embeddings, resident

PHI embeddings run on Infomaniak (CH) and Scaleway (EU) under verified DPAs. US PHI embeddings are staged on Fireworks (same 4096-d space), pending BAA confirmation.

Options

Pin a different space

Need a lighter or multilingual model? bge-multilingual-gemma2 (3584-d) and all-MiniLM-L12-v2 (384-d) are available in CH only, and only when you pin them explicitly. Each lives in its own vector space, and the same-space filter keeps any one request to a single dimension.
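If you do pin a second model, its vectors need their own index. A minimal sketch of that separation, assuming a toy in-memory store (the `VectorIndex` class is hypothetical and stands in for whatever vector database you use):

```python
class VectorIndex:
    """Toy in-memory index bound to exactly one model and one dimension."""

    def __init__(self, model, dims):
        self.model = model
        self.dims = dims
        self.items = {}

    def add(self, doc_id, model, vector):
        # Reject anything that was not produced by this index's model and space.
        if model != self.model:
            raise ValueError(f"index holds {self.model} vectors, got {model}")
        if len(vector) != self.dims:
            raise ValueError(f"expected {self.dims} dims, got {len(vector)}")
        self.items[doc_id] = list(vector)

# One index per model; dimensions are taken from the figures on this page.
indexes = {
    "qwen3-embedding-8b": VectorIndex("qwen3-embedding-8b", 4096),
    "all-MiniLM-L12-v2": VectorIndex("all-MiniLM-L12-v2", 384),
}

indexes["all-MiniLM-L12-v2"].add("doc-1", "all-MiniLM-L12-v2", [0.0] * 384)
```

Keying the registry by model name means a pinned 384-d vector can never be written into the 4096-d index by accident.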

Availability & pricing

Where it runs, what it costs

Every route is region-resident and the PHI gate is enforced per call. Prices include the flat +10% gateway margin and mirror the live /v1/pricing rate card.

Region      Provider          Model                             Tier     Price
CH          Infomaniak        qwen3-embedding-8b · 4096-d       PHI      $0.088 / 1M
EU          Scaleway          qwen3-embedding-8b · 4096-d       PHI      $0.121 / 1M
US / WORLD  Medishift worker  qwen3-embedding-8b · 4096-d       General  Free tier
US          Fireworks         qwen3-embedding-8b · 4096-d       Staged   $0.018 / 1M
CH          Infomaniak        bge-multilingual-gemma2 · 3584-d  PHI      $0.055 / 1M
CH          Infomaniak        all-MiniLM-L12-v2 · 384-d         PHI      $0.022 / 1M

Per 1M input tokens, gateway margin included. The canonical qwen3-embedding-8b is the default everywhere; bge / minilm are pin-only CH options in their own dimension.
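For budgeting, the per-token arithmetic is straightforward. The sketch below copies the PHI-tier rates from the table above into a hypothetical lookup (`PRICE_PER_MILLION_USD`) and leaves out the free and staged routes; the live /v1/pricing rate card remains the authoritative source.

```python
from decimal import Decimal

# USD per 1M input tokens, gateway margin included, as listed on this page.
PRICE_PER_MILLION_USD = {
    ("CH", "qwen3-embedding-8b"): Decimal("0.088"),
    ("EU", "qwen3-embedding-8b"): Decimal("0.121"),
    ("CH", "bge-multilingual-gemma2"): Decimal("0.055"),
    ("CH", "all-MiniLM-L12-v2"): Decimal("0.022"),
}

def estimate_cost_usd(region, model, input_tokens):
    """Estimated cost in USD for embedding `input_tokens` tokens."""
    rate = PRICE_PER_MILLION_USD[(region, model)]
    return rate * Decimal(input_tokens) / Decimal(1_000_000)

# Example: embedding a 250M-token corpus in the EU.
eu_corpus_cost = estimate_cost_usd("EU", "qwen3-embedding-8b", 250_000_000)
```

`Decimal` is used instead of floats so that small per-token rates do not pick up rounding noise when summed over many batches.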

Try it

A real call, end to end

Embed a batch in the EU. The response is a standard OpenAI embedding object; the headers confirm the route and space.

curl
curl https://phi-cloud.com/api/v1/embeddings \
  -H "Authorization: Bearer $PHI_API_KEY" \
  -H "X-Region: EU" \
  -H "X-PHI: true" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-embedding-8b",
    "input": ["patient presents with...", "follow-up in two weeks"]
  }'

response
x-phi-routed: scaleway-qwen3-embedding-8b/EU/phi
x-phi-tier: phi
x-phi-usage: verified
# data[].embedding -> 4096 floats
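
The same call can be sketched in Python with only the standard library. The URL, headers and body mirror the curl example; the function names (`build_request`, `extract_vectors`, `embed`) are hypothetical, region values other than "EU" are assumed to follow the same pattern, and the response parsing assumes the standard OpenAI embedding layout (`data[].index`, `data[].embedding`).

```python
import json
import urllib.request

API_URL = "https://phi-cloud.com/api/v1/embeddings"  # taken from the curl example

def build_request(api_key, inputs, region="EU", model="qwen3-embedding-8b"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Region": region,
        "X-PHI": "true",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")

def extract_vectors(payload, expected_dims=4096):
    """Return embeddings in input order, checking each has the expected dimension."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    for vec in vectors:
        if len(vec) != expected_dims:
            raise ValueError(f"expected {expected_dims} dims, got {len(vec)}")
    return vectors

def embed(api_key, inputs, region="EU"):
    """Send the request and return (x-phi-routed header, list of vectors)."""
    request = build_request(api_key, inputs, region=region)
    with urllib.request.urlopen(request, timeout=30) as response:
        routed = response.headers.get("x-phi-routed")
        payload = json.load(response)
    return routed, extract_vectors(payload)
```

Checking the dimension on the client side is a cheap safeguard before writing to an index; logging the `x-phi-routed` value alongside each batch gives you a record of which route served it.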

Residency & vectors

  • CH embeddings stay on Infomaniak via the Swiss /2/ AI surface.
  • EU embeddings run on Scaleway; general EU traffic can fail over to the worker in the same 4096-d space.
  • US and other regions embed on the worker (general only) until Fireworks PHI is enabled.
  • Build one cross-jurisdiction index; the shared space keeps similarity scores meaningful.

FAQ

Common questions

Why one model everywhere: vector comparability. If different regions used different models, their vectors would live in incompatible spaces and you could not search across them. The single 4096-d canonical model keeps one index valid worldwide.

Vectors from different models cannot share an index; they are only comparable within one model and dimension. The gateway enforces this: failover stays in the request's vector dimension. Pin bge or minilm explicitly only if you maintain a separate index for them.

US PHI embeddings are not yet in production. The Fireworks US PHI route is staged on the same 4096-d space and will be enabled once its HIPAA BAA and serverless coverage are confirmed. US embeddings today are general-tier on the worker.

Ready when you are

Put embeddings & search in production — without giving up your data.

Spin up a key in minutes. The residency and PHI posture described above applies unchanged in production.

Free to test · Prepaid credits, no subscription · No data retained