Solution · Retrieval-augmented generation

Search and RAG over protected data.

Embed, retrieve and answer over clinical corpora in one resident 4096-d vector space — so a single index can span Switzerland, the EU and the US.

The problem

Why this is hard

Retrieval-augmented generation over clinical data means embedding documents that are protected health information, then feeding retrieved passages to a model. If embeddings or the answer step leave your jurisdiction, the whole index is a compliance problem. phi-cloud embeds PHI on resident providers in a single 4096-d space — so one index stays valid across borders — and answers on a PHI-gated chat model.

How it works

The pipeline, end to end

  1. 1

    Ingest the corpus

    POST /v1/ocr

    Convert scanned records and PDFs to clean markdown with Azure Document Intelligence, then chunk them in your pipeline.

  2. 2

    Embed in one space

    POST /v1/embeddings

    Embed chunks with the canonical qwen3-embedding-8b (4096-d). The same space in CH, EU and US means one index spans jurisdictions and failover never changes the dimension.

  3. 3

    Retrieve

    Your vector DB

    Store the 4096-d vectors in your own store (pgvector, Qdrant, etc.) and retrieve the top passages for a query — phi-cloud stays stateless.

  4. 4

    Answer, grounded

    POST /v1/chat/completions

    Send the query plus retrieved passages to a resident chat model under X-PHI:true for a grounded answer with citations you control.

Why phi-cloud

What makes it compliant

One vector space

qwen3-embedding-8b at 4096-d everywhere — a query in Zurich matches a document embedded in Paris.

PHI embeddings, resident

Embeddings run on Infomaniak (CH) and Scaleway (EU) under verified DPAs — not a US default.

Dimension-safe failover

Failover is filtered to the request’s vector dimension, so your index never gets a mismatched vector.

OCR ingestion

Bring scanned records into the index with PHI-eligible OCR — no separate vendor.

In code

A representative call

Embed a batch of chunks in Switzerland — 4096-d vectors, same space as every other region.

curl
curl https://phi-cloud.com/api/v1/embeddings \
  -H "Authorization: Bearer $PHI_API_KEY" \
  -H "X-Region: CH" -H "X-PHI: true" \
  -d '{
    "model": "qwen3-embedding-8b",
    "input": ["chunk 1 ...", "chunk 2 ..."]
  }'
response
x-phi-routed: infomaniak-qwen3-embedding-8b/CH/phi
x-phi-tier: phi
# data[].embedding -> 4096 floats, cross-region comparable

Compliance posture

  • Embeddings: CH → Infomaniak, EU → Scaleway, both PHI-eligible and 4096-d.
  • Answers: resident chat model under the PHI gate (CH Infomaniak, EU Scaleway).
  • Your vector store holds the embeddings; phi-cloud persists nothing.
  • One 4096-d space keeps a single index valid across CH, EU and US.

FAQ

Common questions

Yes — that is the design. Because every region embeds with the same qwen3-embedding-8b at 4096-d, vectors are comparable across CH, EU and US. You maintain one index; the gateway keeps failover in the same dimension so it never corrupts.
In your own vector database. phi-cloud is a stateless proxy — it returns the embeddings and stores nothing. You control retention and access on the store.
Yes. The final /v1/chat/completions call carries X-PHI:true and routes to a resident, PHI-eligible model, so retrieved passages are never sent to an unverified provider.

Ready when you are

Build search & rag over phi on a gateway that survives the audit.

Free to test. Prepaid credits when you go live. The residency and PHI posture is the same in production.

Free to test · Prepaid credits, no subscription · No data retained