# 25 — FinOps

We treat cloud spend like any other line item: budgeted, attributed, monitored, and reviewed.

## Budgets

- Monthly cap per environment (`dev`, `staging`, `production`).
- Monthly cap per service (API, dashboard, website, queues, DB, storage, CDN, observability).
- Alerts fire at 70%, 85%, 95%, and 100% of each cap.
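As a rough illustration of the alert ladder above, here is a minimal sketch that works out which thresholds a given month-to-date spend has reached, and which ones are new since the last check. The function names and the idea of comparing against a previous reading are assumptions made for the example, not a description of our actual alerting setup.

```python
from typing import List

# Alert thresholds from the Budgets section, as whole percentages of the cap.
ALERT_THRESHOLDS_PCT = (70, 85, 95, 100)


def crossed_thresholds(spend: float, cap: float) -> List[int]:
    """Return every threshold (in percent) that spend has reached or passed."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    # Compare spend * 100 against pct * cap to avoid dividing first.
    return [pct for pct in ALERT_THRESHOLDS_PCT if spend * 100 >= pct * cap]


def newly_crossed(previous_spend: float, current_spend: float, cap: float) -> List[int]:
    """Return thresholds reached since the previous reading (hypothetical helper).

    Useful if each threshold should alert once rather than on every check.
    """
    already = set(crossed_thresholds(previous_spend, cap))
    return [pct for pct in crossed_thresholds(current_spend, cap) if pct not in already]
```

For example, a `dev` environment with a cap of 1000 that moves from 600 to 860 of spend would newly cross the 70% and 85% thresholds in one step.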

## Attribution

- All cloud resources tagged with `env`, `service`, `owner`, `cost-center`.
- Resources missing any of these tags are flagged automatically and reviewed.
- Cost reports per service published monthly.
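The tagging rule can be sketched as a simple check over a resource inventory. The input shape (a list of dicts with `id` and `tags` keys) is hypothetical; a real inventory export from a cloud provider will look different and would need adapting.

```python
from typing import Dict, Iterable, List

# Required tags from the Attribution section.
REQUIRED_TAGS = ("env", "service", "owner", "cost-center")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return the required tags that are absent or blank."""
    return [key for key in REQUIRED_TAGS if not str(tags.get(key) or "").strip()]


def flag_untagged(resources: Iterable[dict]) -> Dict[str, List[str]]:
    """Map resource id -> list of missing tags, for resources that fail the check.

    Assumes each resource is a dict like {"id": "...", "tags": {...}}
    (an illustrative shape, not a real provider API).
    """
    flagged = {}
    for resource in resources:
        missing = missing_tags(resource.get("tags") or {})
        if missing:
            flagged[resource["id"]] = missing
    return flagged
```

A resource with only `env` and `service` set would be flagged with `["owner", "cost-center"]`, which gives the reviewer the exact gap to close.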

## Cost discipline

- Right-size DB and Redis quarterly based on real metrics.
- Use spot/preemptible for batch jobs.
- Lifecycle rules on object storage (move cold blobs to infrequent-access).
- Cache aggressively at the CDN so static assets are served from the origin once, not on every request.
- Avoid cross-region traffic in hot paths.

## Vendor cost categories

| Category | Example vendors | P1 budget signal |
|---|---|---|
| Compute & containers | AWS EC2/EKS, Cloud Run | tracked |
| Database | RDS / Cloud SQL | tracked |
| Cache & queues | Redis Enterprise / ElastiCache | tracked |
| Object storage & CDN | S3+CloudFront / R2+Cloudflare | tracked |
| Email | SES / Resend / Postmark | small |
| SMS | Twilio / Unifonic | scales with auth |
| Maps | Mapbox / Google | scales with discover |
| Payments | Stripe / Telr / Checkout | scales with GMV |
| AI | OpenAI / Anthropic / Bedrock | scales with planner |
| Observability | Datadog / Grafana stack | flat-ish |
| Error reporting | Sentry | flat-ish |
| Issue tracking, design, comms | Linear, Figma, Slack | flat |

## Showback / chargeback

- Monthly internal report shows cost by service, environment, and feature area.
- Service owners review the report and act on it; a service that stays over budget triggers a review with the CTO.
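The monthly report is essentially the same cost data grouped along different dimensions. A minimal sketch, assuming billing line items arrive as dicts with a `cost` field plus the attribution dimensions; the field names and the `"unattributed"` bucket are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable


def showback(line_items: Iterable[dict], dimension: str) -> Dict[str, float]:
    """Sum cost per value of one dimension, largest total first.

    Items lacking the dimension land in an "unattributed" bucket
    (a hypothetical convention) so gaps in tagging stay visible.
    """
    totals = defaultdict(float)
    for item in line_items:
        totals[item.get(dimension, "unattributed")] += item["cost"]
    # Dicts preserve insertion order, so the result stays sorted by cost.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical line items for illustration only.
items = [
    {"service": "api", "env": "production", "feature": "planner", "cost": 120.0},
    {"service": "api", "env": "staging", "feature": "planner", "cost": 30.0},
    {"service": "cdn", "env": "production", "cost": 50.0},
]
by_service = showback(items, "service")
by_feature = showback(items, "feature")
```

Running the same function over `service`, `env`, and `feature` yields the three views the report needs without maintaining separate pipelines.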

## FinOps practices

- New resources require a cost estimate in the PR description.
- Big-ticket features (AI itinerary generation, video transcoding, etc.) require an explicit unit-economics analysis.
- A monthly "savings PR" identifies removable resources.

## Unit economics watchpoints

- AI cost per generated trip; target ≤ AED 0.50 average.
- Map tiles per session; cache aggressively.
- SMS per signup; rate-limit OTP retries.
- Push per session; deduplicate.
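The first watchpoint can be checked with simple arithmetic: total AI spend for a period divided by trips generated in that period, compared against the AED 0.50 target. The sketch below assumes both inputs are already available in AED and as a count; where those numbers come from is outside the example.

```python
from typing import Optional

# Target from the watchpoints above: average AI cost per generated trip.
TARGET_AED_PER_TRIP = 0.50


def avg_cost_per_trip(total_ai_cost_aed: float, trips_generated: int) -> Optional[float]:
    """Average AI cost per generated trip, or None when no trips were generated."""
    if trips_generated <= 0:
        return None
    return total_ai_cost_aed / trips_generated


def within_target(total_ai_cost_aed: float, trips_generated: int) -> bool:
    """True when the average is at or under the target; vacuously True with no trips."""
    average = avg_cost_per_trip(total_ai_cost_aed, trips_generated)
    return average is None or average <= TARGET_AED_PER_TRIP
```

For instance, AED 450 of AI spend across 1,000 generated trips averages AED 0.45 per trip, which is inside the target, while AED 600 across the same volume is not. The same shape works for the other watchpoints by swapping the numerator and denominator (SMS per signup, tiles per session).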

## Documented assumptions

- We are willing to over-pay slightly for managed services in P1 to keep the team small.
- We will negotiate vendor commitments when usage stabilizes (typically end of P2).
- Engineering time is more expensive than hosting in this phase; we optimize for shipping.
