Sam GCP infra (Terraform)

One-time GCP setup for Sam’s Cloud Run CI/CD. A single hand-edited config file (infra/config.yaml) drives both Terraform and the CI workflow.

Mental model

Everything Sam needs lives in 3 places:

| Where | What | Edited by |
| --- | --- | --- |
| infra/config.yaml | All non-secret deployment config (project, regions, Slack IDs, runtime knobs, secret name map) | You, by hand |
| infra/config.generated.yaml | Terraform-derived values (WIF provider path, SA emails, AR repo URL) | Terraform writes it; you commit it once |
| GCP Secret Manager | The application secrets (Slack tokens, GitHub PAT, Linear API, Exa key, GitHub webhook HMAC) | bash infra/scripts/upload-secrets.sh |

There are no GitHub Actions secrets or variables to set. The workflow reads both YAMLs directly.

What this provisions

What this does not provision (kept manual on purpose):

Persistence model

Sam writes to /data/journal/*.md and /data/sam.lock. Cloud Run is stateless by default — /data would be wiped on every restart. To keep journal state across restarts/redeploys, the workflow mounts the GCS bucket as a Cloud Run volume:

--add-volume=name=sam-data,type=cloud-storage,bucket=<bucket>
--add-volume-mount=volume=sam-data,mount-path=/data
--execution-environment=gen2
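
To show how these flags fit into a full deploy command, here is a minimal sketch. It assumes a helper named build_deploy_args (hypothetical) and placeholder service, image, region, and bucket values. Only the two volume flags and the gen2 flag come from this page; --image, --region, and --no-allow-unauthenticated are standard gcloud run deploy flags, and the real workflow may pass more.

```python
import shlex


def build_deploy_args(service, image, region, bucket,
                      volume="sam-data", mount_path="/data"):
    """Return the argv for a `gcloud run deploy` that mounts the bucket at /data."""
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--region={region}",
        # The service stays private; callers must present an IAM identity.
        "--no-allow-unauthenticated",
        # This page pairs the Cloud Storage volume with the gen2 environment.
        "--execution-environment=gen2",
        f"--add-volume=name={volume},type=cloud-storage,bucket={bucket}",
        f"--add-volume-mount=volume={volume},mount-path={mount_path}",
    ]


if __name__ == "__main__":
    # All four values below are placeholders, not Sam's real settings.
    args = build_deploy_args("sam", "example-image:latest",
                             "europe-west1", "example-bucket")
    print(shlex.join(args))
```

The volume name in --add-volume-mount must match the name given in --add-volume, which is why the sketch derives both from one parameter.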

Notes:

CI security checks

The ci-checks job runs on PRs targeting main, not on pushes to main. The assumption: every commit reaching main got there via a PR that already passed checks. This roughly halves CI minutes and avoids re-running expensive scans (such as trivy) on the same code twice.

Required: branch protection on main. Without it, someone could push directly to main and skip all checks. Set this once in GitHub repo settings:

Settings → Branches → Add rule → main
☑ Require a pull request before merging
☑ Require status checks to pass before merging → select ci-checks
☑ Do not allow bypassing the above settings
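
For operators who prefer to script this rule, the same settings can be written as the JSON body of GitHub's "update branch protection" REST endpoint (PUT /repos/{owner}/{repo}/branches/{branch}/protection). This is a minimal sketch: the helper name branch_protection_payload is hypothetical, and the mapping from each checkbox to a field is one reading of the UI, so verify it against the repo settings after applying.

```python
import json


def branch_protection_payload(required_checks):
    """Build the JSON body for GitHub's "update branch protection" endpoint."""
    return {
        # "Require status checks to pass before merging" -> select ci-checks
        "required_status_checks": {
            "strict": False,
            "contexts": list(required_checks),
        },
        # "Do not allow bypassing the above settings" (rules apply to admins too)
        "enforce_admins": True,
        # "Require a pull request before merging", with no approval count enforced
        "required_pull_request_reviews": {"required_approving_review_count": 0},
        # No push restrictions; the endpoint still expects the key to be present
        "restrictions": None,
    }


if __name__ == "__main__":
    print(json.dumps(branch_protection_payload(["ci-checks"]), indent=2))
```

Saved to a file, the output could be sent with `gh api -X PUT repos/OWNER/REPO/branches/main/protection --input payload.json`, where OWNER/REPO is a placeholder and the caller needs admin rights on the repo.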

What runs:

| Check | Stack | What it catches |
| --- | --- | --- |
| ruff check src/ | Python | Code quality |
| ruff check src/ --select=S | Python | Security antipatterns (bandit subset: subprocess shell=True, eval, pickle, weak crypto, etc.) |
| pip-audit -r src/runtime/requirements.txt | Python | Known CVEs in pinned deps |
| docker build | Docker | Dockerfile + deps resolve cleanly |
| trivy-action@0.28.0 (HIGH/CRITICAL, ignore-unfixed) | Docker | OS/package CVEs in the built container image |

Secret scanning is delegated to GitHub. Since the repo is public, GitHub’s native secret scanning runs on every push automatically, surfaces findings in the Security tab, and burns zero CI minutes. We removed the in-CI gitleaks step in favor of it.

To suppress a specific finding:

First-time bootstrap

# Auth — uses your gcloud ADC
gcloud auth application-default login

# Edit config.yaml first if you need to change defaults (project, region, etc.)
$EDITOR infra/config.yaml

# Apply
cd infra/
terraform init
terraform plan
terraform apply

# Commit the generated config (workflow needs it); the path is relative to infra/
git add config.generated.yaml
git commit -m "infra: capture WIF provider + SA emails from terraform apply"
git push

# Upload your local .env secrets into GCP Secret Manager (one time)
bash scripts/upload-secrets.sh

After that, any push to main triggers a deploy.

Changing config later

GitHub webhooks (SAM-5)

Sam’s Cloud Run service stays --no-allow-unauthenticated — it never accepts public traffic. GitHub can’t present a GCP IAM token, so it can’t call Sam directly. The public edge function github-webhook-proxy is the only door: GitHub → proxy (public, HMAC-signed body) → forwards with an IAM token → Sam’s private /github/webhook → Sam validates the HMAC and acts.
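
The forwarding step matters because Sam validates the HMAC over the exact bytes GitHub signed, so the proxy has to pass the body through untouched and keep the signature header. The sketch below illustrates that idea only. The proxy's real code and language are not shown on this page; build_forward and the header allow-list are hypothetical, and the Bearer ID token is the usual way to call a private Cloud Run service.

```python
# GitHub headers the receiver needs; everything else is dropped.
PASSTHROUGH_HEADERS = (
    "Content-Type",
    "X-GitHub-Event",
    "X-GitHub-Delivery",
    "X-Hub-Signature-256",
)


def build_forward(body: bytes, incoming_headers: dict, id_token: str):
    """Return (headers, body) for the proxy -> Sam request."""
    # HTTP header names are case-insensitive, so match on lowercase.
    lowered = {k.lower(): v for k, v in incoming_headers.items()}
    headers = {
        name: lowered[name.lower()]
        for name in PASSTHROUGH_HEADERS
        if name.lower() in lowered
    }
    # The IAM identity gets the request past Cloud Run; it proves nothing
    # about GitHub, which is why Sam still checks the HMAC itself.
    headers["Authorization"] = f"Bearer {id_token}"
    # The body is returned as-is: re-serializing JSON would change the bytes
    # and break the signature.
    return headers, body
```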

One org-level webhook covers every repo in the org — current and future — so there’s no per-repo setup, ever. It pairs with Sam’s contributor filter (the daemon ignores events on PRs the bot didn’t author), so the org firehose only wakes Sam for repos it actually works in.
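
The contributor filter can be pictured as a single predicate over the webhook payload. This is a minimal sketch, not Sam's actual daemon code: is_bot_authored_pr and the login "sam-bot" are hypothetical, and it only handles payloads that carry a pull_request object with its author at pull_request.user.login, as GitHub's pull_request events do. The real filter may cover more event types.

```python
def is_bot_authored_pr(payload: dict, bot_login: str) -> bool:
    """Return True only for events on a pull request opened by the bot."""
    pr = payload.get("pull_request") or {}
    author = (pr.get("user") or {}).get("login")
    return author is not None and author == bot_login


if __name__ == "__main__":
    own = {"pull_request": {"user": {"login": "sam-bot"}}}
    other = {"pull_request": {"user": {"login": "someone-else"}}}
    print(is_bot_authored_pr(own, "sam-bot"))    # event is handled
    print(is_bot_authored_pr(other, "sam-bot"))  # event is ignored
```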

Setup, once:

# 1. Provision the proxy function, its IAM, AND the HMAC secret. The secret is
#    auto-generated by Terraform (random_password → Secret Manager) — no human
#    picks or types it, and a version exists before Sam's deploy mounts it.
terraform apply

# 2. Deploy Sam so it picks up the secret: push to main, or
#    `gh workflow run ci-deploy.yml` (SAM-19).

# 3. Register the ONE org webhook. Needs YOUR org-admin gh creds — the bot has
#    only write, so it can't self-register. Idempotent. Reads the
#    Terraform-generated secret from Secret Manager and sets it on the hook.
bash scripts/register-webhooks.sh            # org from config.yaml (Dembrane)
# or: bash scripts/register-webhooks.sh SomeOtherOrg

The proxy is a thin forwarder — it does not hold the secret or validate the signature. Sam is the single HMAC validator. Junk traffic is forwarded once and HMAC-rejected by Sam (fast 401, no session). If the secret is unset, Sam’s daemon doesn’t expose /github/webhook at all and the loop is simply off.
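
A minimal sketch of the validation step, using only the standard library. GitHub sends the signature in the X-Hub-Signature-256 header as "sha256=" followed by the hex HMAC-SHA256 of the raw request body. The function names here are hypothetical and Sam's actual handler is not shown on this page; the 200 on success is an assumption, while the 401 on failure follows the behaviour described above.

```python
import hashlib
import hmac
from typing import Optional


def expected_signature(secret: bytes, body: bytes) -> str:
    """Compute the value GitHub puts in X-Hub-Signature-256 for this body."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, header: Optional[str]) -> bool:
    """Return True only if the header matches the HMAC of the raw body."""
    if not secret or not header:
        return False
    expected = expected_signature(secret, body).encode("ascii")
    # compare_digest runs in constant time, so mismatches leak no timing hints.
    return hmac.compare_digest(expected, header.encode("utf-8"))


def webhook_status(secret: bytes, body: bytes, header: Optional[str]) -> int:
    """Map the check to an HTTP status: reject with 401 before any work starts."""
    return 200 if is_valid_signature(secret, body, header) else 401
```

The check has to run on the raw bytes of the request, before any JSON parsing, since a parsed and re-encoded body would no longer match what GitHub signed.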

The secret itself now needs no human handling: Terraform generates it, Sam reads it to validate, and the script reads it to register. The one irreducible manual step is the org-admin-gated registration in step 3, because the bot can’t have admin.

State notes

State is local (terraform.tfstate in this directory, gitignored). For a single-operator setup this is fine; migrate to a GCS backend if more than one person needs to apply.

terraform destroy deletes the service accounts and the Artifact Registry repo. Secret Manager has 30-day soft-delete by default; APIs stay enabled (deliberately: turning them off project-wide breaks anything else using them).