AI Platform Sandbox
Provisioning Request

Azure / Azure AI Foundry + Anthropic Enterprise
SGA Dental Partners
Prepared: May 11, 2026
Owner: Scott Guest
Status: For IT & Leadership Review
TL;DR

We need one Azure subscription (sga-ai) plus a parallel PHI-ready subscription (sga-ai-prod), plus an Anthropic Enterprise account with BAA. Once the subscriptions exist and a service principal has Owner, our team provisions everything else by code (Foundry, Postgres, Blob, Search, Document Intelligence, Container Apps) without further IT cycles. Two BAA paths (Microsoft + Anthropic) are what unblock Phase B (clinical photos, patient comms, treatment data) on a realistic timeline.

Part 1

The Azure IT Ask

Please provision the following inside our existing Azure tenant so we can stand up the AI platform via Bicep/CLI without further IT cycles.
1. Two subscriptions in our existing tenant
Both in the same tenant as the existing Azure warehouse and Power BI footprint — cross-tenant kills the integration value.
  • sga-ai — dev/sandbox environment, non-PHI by policy
  • sga-ai-prod — PHI-ready environment, provisioned now so we have a promotion target ready when prototypes graduate (nothing in it until then)
2. RBAC on sga-ai
  • Owner for my account
  • A break-glass admin distinct from my daily account
  • Same on sga-ai-prod, with prod access gated behind an approval workflow if IT prefers (Entra PIM eligible assignments with approval are the usual way to do this, so nobody needs standing Owner on prod)
3. Service principal sp-sga-ai-deploy
  • Owner on sga-ai
  • Contributor + User Access Administrator on sga-ai-prod
  • GitHub OIDC federation to our deploy repo (no client secret); the federated credential subject scoped to one branch or one GitHub environment, never a wildcard over the whole repo
  • Client ID / tenant ID / subscription IDs returned in writing
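The following is an added example of what item 3 looks like in Azure CLI; it is not part of the request itself and is not SGA's deploy code. The GitHub org/repo name (sga-dental/ai-platform), the branch (main) and the GitHub environment name (prod) are placeholders. It creates the app registration, adds one federated credential per trust path so that no client secret ever exists, and assigns exactly the roles listed above. The helper functions that build the OIDC subject and credential body run offline; the az calls only run when SGA_APPLY=1 is set.

```shell
#!/usr/bin/env bash
# Added example (not from the request): bootstrap sp-sga-ai-deploy with
# GitHub OIDC federation and the subscription roles from item 3.
# Assumed placeholders: GITHUB_REPO, the "main" branch, the "prod" environment.
# Usage: az login as someone who can create app registrations and assign roles,
#        then SGA_APPLY=1 ./bootstrap-deploy-sp.sh

APP_NAME="sp-sga-ai-deploy"
GITHUB_REPO="${GITHUB_REPO:-sga-dental/ai-platform}"   # placeholder org/repo
DEV_SUB_NAME="sga-ai"
PROD_SUB_NAME="sga-ai-prod"

# Subject claim GitHub puts in the Actions OIDC token. Scope it to ONE branch
# or ONE environment: a wildcard would let any branch or PR in the repo deploy.
fic_subject() {
  # $1 = org/repo, $2 = branch (or empty), $3 = GitHub environment (or empty)
  if [ -n "${3:-}" ]; then
    printf 'repo:%s:environment:%s' "$1" "$3"
  else
    printf 'repo:%s:ref:refs/heads/%s' "$1" "$2"
  fi
}

# Federated identity credential body for `az ad app federated-credential create`.
fic_json() {
  # $1 = credential name, $2 = subject
  printf '{"name":"%s","issuer":"https://token.actions.githubusercontent.com","subject":"%s","audiences":["api://AzureADTokenExchange"]}' "$1" "$2"
}

sub_id() { az account list --query "[?name=='$1'].id" -o tsv; }

create_deploy_sp() {
  set -e
  app_id=$(az ad app create --display-name "$APP_NAME" --query appId -o tsv)
  az ad sp create --id "$app_id" >/dev/null
  sp_oid=$(az ad sp show --id "$app_id" --query id -o tsv)

  # One credential per trust path: main branch -> dev, "prod" environment -> prod.
  # GitHub environment protection rules supply the approval gate from item 2.
  f_main=$(mktemp); f_prod=$(mktemp)
  fic_json gh-main "$(fic_subject "$GITHUB_REPO" main)" > "$f_main"
  fic_json gh-prod "$(fic_subject "$GITHUB_REPO" "" prod)" > "$f_prod"
  az ad app federated-credential create --id "$app_id" --parameters "$f_main"
  az ad app federated-credential create --id "$app_id" --parameters "$f_prod"
  rm -f "$f_main" "$f_prod"

  dev_sub=$(sub_id "$DEV_SUB_NAME"); prod_sub=$(sub_id "$PROD_SUB_NAME")
  az role assignment create --assignee-object-id "$sp_oid" \
    --assignee-principal-type ServicePrincipal --role Owner \
    --scope "/subscriptions/$dev_sub"
  for role in Contributor "User Access Administrator"; do
    az role assignment create --assignee-object-id "$sp_oid" \
      --assignee-principal-type ServicePrincipal --role "$role" \
      --scope "/subscriptions/$prod_sub"
  done

  # The four values item 3 asks IT to return in writing. None of them is a secret.
  printf 'AZURE_CLIENT_ID=%s\nAZURE_TENANT_ID=%s\nSUB_DEV=%s\nSUB_PROD=%s\n' \
    "$app_id" "$(az account show --query tenantId -o tsv)" "$dev_sub" "$prod_sub"
}

if [ "${SGA_APPLY:-0}" = "1" ]; then
  create_deploy_sp
fi
```

On the GitHub side the workflow needs `permissions: id-token: write` and logs in with the azure/login action using client-id, tenant-id and subscription-id (no secret). A PR from a fork or a feature branch produces a different subject claim and Entra rejects the token exchange, which is the whole point of scoping the subject.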
4. Region: East US 2
For both subscriptions — broadest Foundry model availability. Please confirm tenant policy allows.
5. Quota unlocks on sga-ai
Sized for me + 6 team members doing real agentic work.
Resource                                  Limit
----------------------------------------  ----------------
Claude Sonnet 4.6 (workhorse)             500k TPM
Claude Opus 4.7 (deep reasoning)          200k TPM
Claude Haiku 4.5 (high-volume)            1M TPM
GPT-5 / o-series                          200k TPM
text-embedding-3-large                    500k TPM
Azure AI Search                           Standard SKU
Azure AI Document Intelligence            S0
Azure Container Apps                      20 vCPU / 40 GB
Azure Database for PostgreSQL Flexible    enabled
Storage accounts                          default
File quota tickets the same day the subscription is created. Quota approval is the #1 cause of "we have access and nothing works."
6. Budget
$500/month on sga-ai with alerts at 50/80/100% to my email. Note that an Azure budget is alert-only and does not stop spend by itself; a hard cap needs either a subscription offer with a spending limit or a budget action group that runs automation to shut resources down. Ask for that explicitly if our agreement supports it. (Expect this to need a bump to ~$1,500–2,000/month once active development is underway.) Prod budget set later when something is ready to promote.
7. Policy posture — split by environment
  • sga-ai (dev) RGs (rg-sga-ai-dev-*): public endpoints allowed, HIPAA/HITRUST blueprint not applied. Speed of iteration matters; PHI excluded by policy and team discipline. Written confirmation that no PHI is to be stored in sga-ai.
  • sga-ai-prod RGs (rg-sga-ai-prod-*): full HIPAA/HITRUST blueprint applied from day 1, assigned as an Azure Policy initiative in enforce mode rather than audit-only, so non-compliant resources cannot be created in the first place. Private endpoints required, customer-managed keys on storage, audit logging mandatory.
8. BAA — Microsoft side
Written confirmation that Microsoft's tenant BAA covers Azure OpenAI, Foundry, Postgres Flexible, Blob, Key Vault, Application Insights, AI Search, and Document Intelligence. Needed before any PHI lands in sga-ai-prod.
9. Existing-data access for the service principal
  • Add sp-sga-ai-deploy as a member of the relevant Power BI workspaces so it can execute DAX queries against existing semantic models. This needs the Fabric tenant setting that allows service principals to use Power BI APIs, scoped to a security group that contains the SP; Member is the simplest grant, and Viewer plus Build permission on each semantic model is the least-privilege alternative. This preserves the DAX measure library already built (powerbi-bridge, registered semantic models, provider-type thresholds, days-worked rules): Foundry agents call into those measures rather than redefining them in raw SQL.
  • Optionally: warehouse-side SQL read for cases where we need to ingest transaction-level data into a Foundry vector index for RAG.
Part 2

The Anthropic Procurement Ask (Parallel Track)

Not an Azure IT ticket — a procurement / vendor-management item that should be filed alongside.
1. Initiate Anthropic Enterprise account for SGA
  • Enterprise tier (or Teams + BAA addendum if that's what's currently quoted)
  • Business Associate Agreement (BAA) signed before any PHI work begins
  • Centralized billing + admin console under SGA
  • SSO (SAML or OIDC), federated against the SGA Entra ID tenant
  • Seat count: 7 (me + 6 team members) with room to scale
  • API access in addition to Claude Code seats — both the Anthropic SDK and Claude Code itself
  • Rate-limit tier appropriate to enterprise usage
Why a Second BAA Path Matters

Microsoft's BAA covers Anthropic-models-on-Azure (via Foundry). Anthropic's direct BAA covers Claude-via-native-API and Claude Code. Having both means we're not single-vendor on the legal side, and Claude Code itself runs under BAA — which it can't if we're only on Azure.

Cost Bottom Line

LLM usage dominates everything else. Infrastructure is ~$200–300/mo flat; model spend scales with how hard we push it. Plan ~$100k/year all-in for year one across Azure + Anthropic, climbing to $200–250k/year at mature production scale. Competitive with one full-time engineer for the same throughput.

Tier Estimates

Three Stages, Three Cost Profiles

Tier 1 · Sandbox

$2.0–3.5k
per month, all-in

Months 1–2 · exploration, light load

  • ~$200/mo Azure infra
  • $300–800/mo model spend
  • $1.5–2.5k/mo Anthropic Enterprise (7 seats + light API)

Tier 2 · Active Dev

$6.0–10.5k
per month, all-in

Month 3+ · real workloads, 6 active users

  • ~$300/mo Azure infra
  • $3–6k/mo model spend
  • $2.5–4k/mo Anthropic Enterprise w/ real API

Tier 3 · Production

$15–25k
per month, all-in

Full network usage at maturity

  • Infra stays flat (~$300–500/mo)
  • Model bill becomes 75% of total
  • Scales with adoption, not headcount

Azure infrastructure breakdown (sandbox tier)

Resource                                      Monthly
--------------------------------------------  --------
Postgres Flexible Server (Burstable B2ms)     ~$50
Container Apps (scale-to-zero, light use)     ~$30
Blob Storage (100 GB Hot + transfer)          ~$3
AI Search (Basic)                             ~$75
Document Intelligence (S0, pay-per-page)      ~$20
Key Vault                                     <$1
App Insights + Log Analytics (5 GB)           ~$15
Container Registry (Basic)                    ~$5
Infrastructure subtotal                       ~$200

Note: AI Search is priced at the Basic tier here, while Part 1 asks for the Standard SKU; a Standard search unit costs several times the Basic figure, so reconcile the two before the quota ticket goes in.

Model spend breakdown (active dev tier)

Mix-weighted across ~1.5B tokens/month combined usage.

Model                                         Volume         Monthly
--------------------------------------------  -------------  -------------
Claude Sonnet 4.6 (workhorse, 60%)            ~900M tokens   $2,500–4,000
Claude Opus 4.7 (deep reasoning, 10%)         ~150M tokens   $1,500–2,500
Claude Haiku 4.5 (high-volume cheap, 25%)     ~400M tokens   $400–700
GPT-5 / o-series (fallback, 5%)               ~80M tokens    $300–500
text-embedding-3-large                        ~30M tokens    ~$4
Total model spend                             ~1.5B          $4.7–7.7k
$500/month budget alert in the ask is the ALERT level, not the spending target. It'll trip the moment usage gets real. We'll request a bump within weeks of active use — that's normal and expected.

What we get for the spend

  • Two BAA paths (Microsoft + Anthropic) unlock the Phase B PHI roadmap — clinical photos, patient comms, call recordings, treatment plans.
  • Reduce single-vendor LLM risk. Today everything goes through Anthropic. Foundry adds GPT-5 / o-series / Llama / Phi as fallback and procurement leverage.
  • Foundry agent runtime + eval harness replaces the hand-rolled Python Council runner. Built-in tracing, evaluations, content safety.
  • Azure-native peering with the existing SGA warehouse — enables Foundry agents to query Power BI semantic models directly, preserving the DAX measure library.
  • Centralized observability via App Insights replaces "Pino + Railway logs only."
  • Throughput equivalent to one FTE engineer for the same annual cost — with no PTO, no ramp, no recruiting drag.
Plain English

Every Azure service we're asking for, explained in non-IT terms. What it does, what it replaces in our current stack, and a concrete example of how we'd use it. Skim the headers; dive in where you want detail.

The AI Layer
Azure AI Foundry
Azure's AI workspace — one place to deploy models, build agents, run evaluations, see traces, and connect data. Sits on top of Azure OpenAI + the Anthropic-on-Azure deployments + the agent runtime.
Replaces: hand-rolled Anthropic SDK + Python Council runner + ad-hoc eval scripts. Example: upload brand guidelines → Foundry indexes them → an agent answers "does this Instagram post match the brand?" using Claude, with full traces stored automatically.
Foundry Hub vs. Project
Hub = shared infrastructure (which models are deployed, which storage to use, who has access). Projects = workspaces inside the Hub where individual workloads live (content engine, OM daily, IT-AI-agents, MI6, etc.). Costs and RBAC scope per project.
Familiar comparison: Hub = the org account. Projects = team workspaces inside it.
Model deployments
A "deployment" is an instance of a model under a Foundry / Azure OpenAI resource, with its own deployment name (it appears in the request path) and its own TPM quota. You don't just "use Claude": you create a deployment named e.g. sonnet-46-default with 500k TPM, and your code calls that deployment on the resource's endpoint.
Example: sonnet-46-default (500k TPM, daytime work) vs. sonnet-46-bulk (200k TPM, overnight batch) vs. haiku-45-classifier (1M TPM, action briefings).
Azure AI Search
A managed search engine with keyword and vector (semantic) search. Feed it documents; it indexes them; you query with text or with an embedding and get ranked results. The piece that makes RAG (retrieval-augmented generation) work.
Replaces: hand-rolling Pinecone/Weaviate + separate text search. Example: index 260 practice brand guidelines → "find clinical photos matching Modis brand voice" → top 10 results → Claude writes the recommendation.
Azure AI Document Intelligence
Reads PDFs, scanned docs, forms, invoices. Pulls out structured data: text, tables, key/value pairs, signatures. OCR + structure detection + form parsing in one.
Replaces: hand-rolled Python extractors for dental audit corpus ingestion. Example: upload an insurance fee schedule PDF → structured JSON of every code + fee → loaded into Postgres for the fee negotiation tool.
Data Storage
Azure Database for PostgreSQL — Flexible Server
A managed Postgres database. Microsoft handles backups, patching, HA, scaling. Connect with a normal Postgres connection string.
Replaces: Railway Postgres + the AWS RDS Postgres we settled on but never built. Example: existing Drizzle schemas run unchanged; just repoint the connection string.
Azure Blob Storage
Object storage for binary files (images, videos, PDFs). Files in buckets (called "containers" in Azure, unrelated to Docker). Same model as AWS S3. Tiers: Hot / Cool / Archive for cost.
Replaces: AWS S3 + Railway disk volumes. Example containers: originals/, variants/, previews/ for the DAM; plus backups/, documents/, logs/ for other workloads.
Code & Deployment
Azure Container Apps
Run containerized code without managing Kubernetes. Auto-scales with traffic. Scales to zero when idle so you don't pay for empty boxes. Easier than AKS, more capable than Heroku.
Replaces: AWS ECS/Fargate; the Azure-side option when something needs to run inside the tenant. Railway compute stays where it is for now. Example: the asset registry Fastify API runs here. Idle at 3 AM = zero cost.
Azure Container Registry (ACR)
A private place to store Docker images. CI builds images; Container Apps pulls and runs them.
Familiar comparison: Docker Hub or AWS ECR but inside our tenant.
Azure Static Web Apps
Hosting for static front-end builds (React/Vite output) plus optional serverless endpoints. Native Entra ID auth integration.
Replaces (selectively): Cloudflare Pages for SGA-staff-only dashboards that need SSO. Public reports stay on Cloudflare.
Security & Secrets
Azure Key Vault
Where you store passwords, API keys, connection strings, certificates. Code references secrets by name; Azure resolves them at runtime using the service's managed identity, which is granted read access to secrets only (the Key Vault Secrets User role) on that one vault. Actual values never appear in code, env files, or GitHub.
Replaces: the .env sprawl across Railway / Cloudflare / local. Example: Container App starts → reads DATABASE_URL from Key Vault → connects to Postgres. No password in any config file.
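The wiring behind that example, shown as an added Azure CLI example rather than SGA's own deploy code. Resource names (rg-sga-ai-dev-core, kv-sga-ai-dev, asset-registry, the secret name database-url) are placeholders. It gives the Container App a system-assigned identity, grants that identity secret-read only on the one vault, and stores a reference to the Key Vault secret instead of the value, so the app's config holds a URL, not a password. The helper that builds the reference string runs offline; the az calls only run with SGA_APPLY=1.

```shell
#!/usr/bin/env bash
# Added example (not from the request): resolve DATABASE_URL from Key Vault via
# the Container App's managed identity. Placeholders: RG, KV, APP, SECRET_NAME.
# Assumes the vault uses the RBAC permission model and the secret already exists
# (created once by an operator, e.g. az keyvault secret set --file, not in CI).
RG="rg-sga-ai-dev-core"
KV="kv-sga-ai-dev"
APP="asset-registry"
SECRET_NAME="database-url"

# Secret value syntax Container Apps understands: fetch from Key Vault with the
# named identity instead of copying the value into the app's own secret store.
kv_secret_ref() {
  # $1 = vault name, $2 = secret name, $3 = identity ("system" or a resource ID)
  printf 'keyvaultref:https://%s.vault.azure.net/secrets/%s,identityref:%s' "$1" "$2" "$3"
}

wire_secret() {
  set -e
  # 1. Give the app an identity Azure can authenticate; keep its object id.
  mi_oid=$(az containerapp identity assign -g "$RG" -n "$APP" --system-assigned \
    --query principalId -o tsv)
  # 2. Least privilege: read secrets only, scoped to this single vault.
  kv_id=$(az keyvault show -n "$KV" --query id -o tsv)
  az role assignment create --assignee-object-id "$mi_oid" \
    --assignee-principal-type ServicePrincipal \
    --role "Key Vault Secrets User" --scope "$kv_id"
  # 3. Store a reference (not the value), then expose it as an env var.
  az containerapp secret set -g "$RG" -n "$APP" \
    --secrets "db-url=$(kv_secret_ref "$KV" "$SECRET_NAME" system)"
  az containerapp update -g "$RG" -n "$APP" \
    --set-env-vars "DATABASE_URL=secretref:db-url"
  # Check: the listed secret shows a keyVaultUrl column, never a value.
  az containerapp secret list -g "$RG" -n "$APP" -o table
}

if [ "${SGA_APPLY:-0}" = "1" ]; then
  wire_secret
fi
```

Rotation then becomes a Key Vault operation: set a new secret version and restart the revision; nothing in the repo, the pipeline or the app definition changes. If the vault has a firewall, the Container Apps environment must be allowed through it or the reference fails at revision start.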
Microsoft Entra ID
Microsoft's identity directory. SGA already has one — where every employee logs in to Microsoft 365. We use it for: Azure portal login, the service principal in the ask, and eventually staff SSO into our apps.
Replaces: nothing today — the alternative to AWS Cognito for staff auth.
Observability
Application Insights
Monitoring for running code. Auto-traces, metrics, error tracking, dependency maps. Tells you which requests are slow, where errors happen, how services connect.
Replaces: "Pino + Railway logs only" — today's #1 observability pain. Roughly = Sentry + Datadog APM combined. Example: /api/assets/list is slow → App Insights shows the DB query took 2 seconds → add an index.
Log Analytics Workspace
The data store under App Insights and most Azure telemetry. Where logs and metrics live. Queried with KQL (Kusto Query Language — similar to SQL, different syntax).
Familiar comparison: the "database" App Insights queries.
Email
Azure Communication Services — Email
Transactional email. Verify a domain, call an API, email sends.
Replaces: the AWS SES integration that was settled but never provisioned, plus the current EmailStubProvider. Example: studio request approved → notification sent to the champion.
What We're Explicitly NOT Asking For

Scope discipline. If asked "is that everything?", these are deliberately deferred:

  • AKS (Kubernetes) — Container Apps is enough
  • Synapse / Fabric — long-term data platform, not now
  • Entra External ID — only if we add patient/practice auth
  • Azure Data Explorer / ClickHouse Cloud — only if we outgrow Postgres
  • Event Hubs (Kafka surface) — Kafka is stubbed today
  • Front Door / WAF — comes with prod hardening
  • ExpressRoute, VPN gateways, HSM, Confidential Compute — none needed for sandbox
  • Azure Functions — Container Apps Jobs covers this
  • Logic Apps / Power Automate — not in scope