Sovereign AI + RAG
Private AI on your own infrastructure — financial records, client data, legal documents, government materials. Zero third-party data exposure. CBUAE Sovereign Cloud aligned. UAE PDPL native.
The Problem
Cloud-only AI is a non-starter for regulated industries: banks bound by CBUAE rules, healthcare bound by HIPAA, government bound by data residency mandates, family offices bound by privacy. Generic LLM API calls expose your data, your users' identities, and your queries to a third party in a foreign jurisdiction. The compliance team blocks the deployment, and the AI initiative dies in a meeting room.
The Outcome
An LLM and RAG system you fully own. Open-weight models (Llama, Qwen, Mistral, Falcon-H1 Arabic, DeepSeek) running on your infrastructure or in a sovereign-cloud region. Audit trails. Air-gapped option. Compliance documentation that maps to ISO 42001, PDPL, GDPR, HIPAA, and the EU AI Act.
Packages
Foundations
From AED 150,000
Single-corpus deployment, on-premise inference on Mac Studio M2/M4 or A100. PDPL native, basic eval coverage.
- 1 knowledge corpus (≤500K documents)
- Open-weight model: Llama 3.3 70B / Qwen 2.5 / Falcon-H1 Arabic
- Hardware procurement and setup
- Basic eval harness, weekly QA
- 8 weeks to production
Sovereign
From AED 280,000
Multi-corpus deployment, CBUAE Sovereign Cloud aligned. Governance documentation structured around ISO 42001 expectations. Arabic + English.
- Up to 5 knowledge corpora
- Multi-language: Arabic (Falcon-H1 / Jais 2) + English
- CBUAE Sovereign Financial Cloud architecture pattern
- Governance documentation pack (ISO 42001-shaped)
- Audit logging, RBAC, SSO integration
- 11 weeks to production
Government
From AED 480,000
Air-gapped, zero external network, NVIDIA H100 hardware procurement, full audit trail, 12-month SLA.
- Air-gapped, zero internet egress
- NVIDIA H100 / H200 hardware procurement and rack install
- Multi-tenant or fully isolated deployment
- Full audit trail, exportable to SIEM
- 12-month SLA with on-site support
- 14 weeks to production
In Scope
Architecture
Weeks 1–3
- Hardware sizing based on user count, corpus size, and latency target (see the sizing sketch below)
- Model selection — Llama 3.3, Qwen 2.5, Falcon-H1 Arabic, DeepSeek
- Network and access architecture (on-prem, VPC, air-gapped)
- Compliance mapping against the frameworks you operate under (PDPL, ISO 42001, HIPAA, GDPR, EU AI Act as applicable)
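For a sense of what the sizing exercise looks like, here is a back-of-envelope heuristic. Every constant in it is an illustrative assumption, not a quote; real sizing comes from profiling your workload.

```python
def estimate_vram_gb(
    params_b: float,         # model size in billions of parameters
    bytes_per_param: float,  # ~2.0 for BF16 weights, ~0.55 for 4-bit quantised
    concurrent_users: int,
    kv_gb_per_user: float = 1.5,  # assumed average KV-cache load per active user
    overhead: float = 1.2,        # activations, CUDA context, fragmentation
) -> float:
    """Back-of-envelope VRAM estimate. Real sizing comes from load-testing."""
    weights_gb = params_b * bytes_per_param
    kv_gb = concurrent_users * kv_gb_per_user
    return (weights_gb + kv_gb) * overhead

# Llama 3.3 70B in BF16 with 20 concurrent users:
print(f"{estimate_vram_gb(70, 2.0, 20):.0f} GB")   # ~204 GB: a multi-GPU A100/H100 node
# The same model 4-bit quantised:
print(f"{estimate_vram_gb(70, 0.55, 20):.0f} GB")  # ~82 GB: two 80 GB GPUs, or trim concurrency
```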
Deployment
Weeks 4–9
- Hardware procurement and rack-and-stack (Sovereign / Government tiers)
- Model deployment — vLLM, Ollama, or custom inference server
- RAG ingestion pipeline with chunking, embedding, and hybrid retrieval (see the sketch after this list)
- Authentication, RBAC, and audit logging integration
- API surface for your applications
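A minimal sketch of the pipeline above, assuming a vLLM server exposing an OpenAI-compatible endpoint inside your network. The endpoint URL, model name, embedder, chunk sizes, and fusion weights are illustrative placeholders, not the production configuration.

```python
from openai import OpenAI
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer
import numpy as np

embedder = SentenceTransformer("BAAI/bge-m3")  # multilingual embedder: Arabic + English

def chunk(text: str, size: int = 800, overlap: int = 120) -> list[str]:
    """Fixed-size character chunks with overlap; production uses structure-aware splitting."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

docs = ["First sample document...", "Second sample document..."]  # stand-in corpus
chunks = [c for d in docs for c in chunk(d)]
vecs = embedder.encode(chunks, normalize_embeddings=True)
bm25 = BM25Okapi([c.split() for c in chunks])

def retrieve(query: str, k: int = 5) -> list[str]:
    """Hybrid retrieval: dense cosine scores fused with BM25 lexical scores."""
    dense = vecs @ embedder.encode([query], normalize_embeddings=True)[0]
    sparse = np.asarray(bm25.get_scores(query.split()))
    if sparse.max() > 0:
        sparse = sparse / sparse.max()  # crude normalisation onto ~[0, 1]
    fused = 0.6 * dense + 0.4 * sparse  # illustrative weighting
    return [chunks[i] for i in np.argsort(fused)[::-1][:k]]

# The inference server (e.g. vLLM serving Llama 3.3 70B) exposes an
# OpenAI-compatible endpoint that never leaves your network.
client = OpenAI(base_url="http://inference.internal:8000/v1", api_key="unused")

def answer(query: str) -> str:
    context = "\n\n".join(retrieve(query))
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct",
        messages=[
            {"role": "system", "content": f"Answer only from this context:\n{context}"},
            {"role": "user", "content": query},
        ],
    )
    return resp.choices[0].message.content
```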
Production
Weeks 10–14
- Acceptance testing against a golden set and an adversarial set (see the evaluation sketch after this list)
- Compliance documentation handover (AI inventory, risk register, monitoring plan — ISO 42001-shaped)
- Operations runbook and incident response
- Knowledge transfer to your infrastructure team
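A minimal sketch of the golden-set acceptance gate, reusing the answer() helper from the deployment sketch above. The cases, the scoring method, and the 0.90 threshold are illustrative; a real golden set is hundreds of curated Q/A pairs, and the adversarial set adds injection, jailbreak, and off-corpus probes.

```python
from statistics import mean

GOLDEN = [  # illustrative cases only
    {"q": "What is the retention limit for closed-account records?",
     "must_include": ["retention"]},
    {"q": "Which approvals are needed to export client data?",
     "must_include": ["approval", "export"]},
]

def fact_coverage(text: str, must_include: list[str]) -> float:
    """Fraction of required facts present verbatim; real harnesses add
    LLM-as-judge and semantic scoring on top of this simplest possible gate."""
    return mean(f.lower() in text.lower() for f in must_include)

scores = [fact_coverage(answer(c["q"]), c["must_include"]) for c in GOLDEN]
assert mean(scores) >= 0.90, f"Acceptance gate failed: mean coverage {mean(scores):.2f}"
```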
How We Engage
01
Discovery call — we map data classification, latency requirements, regulatory constraints, and existing infrastructure. 60 minutes.
02
Architecture proposal — fixed-price SoW with named hardware, model choice, deployment topology, and compliance package. Delivered within 7 business days.
03
Hardware-first kickoff — for Sovereign and Government tiers we order hardware on contract signing to compress the timeline. Software work begins in parallel.
Why Codenovai
We're an operator-first agency. The hardware sizing, inference stack, and deployment patterns we recommend are ones we'd deploy on our own infrastructure — not theoretical reference architectures from a slide deck.
FAQ
- Why Falcon-H1 Arabic and Jais 2 instead of GPT-4 or Claude?
- For Arabic-first workloads, Falcon-H1 Arabic and Jais 2 outperform Western frontier models on Modern Standard Arabic (MSA) and major Gulf dialects, and they're available as open weights. They also stay on your infrastructure. For English-only workloads we recommend Llama 3.3 70B or Qwen 2.5: both are strong, both are open weights, both run on the hardware tiers above. We benchmark for your specific corpus before selecting.
- What ongoing costs should I expect after the build?
- Electricity and cooling for Foundations-tier inference typically run AED 4,000–8,000 per month. The Sovereign tier with H100 GPUs runs AED 12,000–25,000 per month depending on utilisation. We offer an optional managed-operations retainer (separate SoW) starting at AED 18,000/month for monitoring, eval drift detection, and quarterly model upgrades.
- Can we swap models later — say, when Llama 4 ships?
- Yes. The deployment architecture is model-agnostic — your application code calls a stable internal API, and the inference server underneath is swappable. We handle major model upgrades on the managed-operations retainer; if you self-manage, the swap is typically a 2–3 day exercise.
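As a sketch of what "stable internal API" means in practice (the wrapper, environment variables, and names here are illustrative): application code depends on a thin client, and the model name lives only in configuration, so a swap is a config change plus a re-run of the eval gate.

```python
import os
from openai import OpenAI

_client = OpenAI(
    base_url=os.environ["INFERENCE_BASE_URL"],  # e.g. http://inference.internal:8000/v1
    api_key=os.environ.get("INFERENCE_API_KEY", "unused"),
)
_MODEL = os.environ["INFERENCE_MODEL"]  # the only place a model name appears

def complete(prompt: str, **kwargs) -> str:
    """The stable surface your applications call; the backend is swappable."""
    resp = _client.chat.completions.create(
        model=_MODEL,
        messages=[{"role": "user", "content": prompt}],
        **kwargs,
    )
    return resp.choices[0].message.content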
- How does this compare to Azure OpenAI or AWS Bedrock?
- Bedrock and Azure OpenAI are managed cloud services — your data still leaves your environment to call the model, even if the cloud provider promises not to retain it. The Sovereign tier here keeps everything inside your infrastructure. For workloads where contractual or regulatory language requires zero third-party data exposure, Bedrock and Azure OpenAI fail the test. Where they pass, our Agentic Pilot offer is usually a better fit.
- Do you support hybrid deployments — sovereign for regulated workloads, cloud for general?
- Yes. A common pattern is Foundations or Sovereign tier for regulated data (client records, financial details), with cloud-routed Claude/GPT for general productivity tasks (drafting, summarisation of public material). We design the routing layer so the same end-user interface uses the appropriate backend transparently.
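A simplified sketch of such a routing layer, assuming both backends speak the OpenAI-compatible API. The keyword check is a stub standing in for a real data-classification lookup; all names are illustrative.

```python
from openai import OpenAI

SOVEREIGN = OpenAI(base_url="http://inference.internal:8000/v1", api_key="unused")
CLOUD = OpenAI()  # standard cloud credentials; general-productivity traffic only

SENSITIVE_MARKERS = ("client", "account", "iban", "diagnosis")  # stub only

def backend_for(request_text: str, data_labels: set[str]) -> OpenAI:
    """Pick a backend per request. Production routing keys off your data
    classification labels, not keyword matching."""
    if "regulated" in data_labels or any(m in request_text.lower() for m in SENSITIVE_MARKERS):
        return SOVEREIGN  # never leaves your infrastructure
    return CLOUD          # public or general material only
```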