IAM & Service Accounts¶
This document covers how Validibot uses Google Cloud IAM (Identity and Access Management) for secure access to GCP resources.
Resource naming convention
All GCP resource names are derived from GCP_APP_NAME, which is set in
.envs/.production/.google-cloud/.just (defaults to validibot). The
naming pattern is $GCP_APP_NAME-{resource}[-{stage}]. For example, with
the default app name, the dev service account is validibot-cloudrun-dev
and the prod storage bucket is validibot-storage.
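The naming pattern can be sketched as a small shell helper. The function name `gcp_resource_name` is hypothetical and is not one of the project's `just` recipes; it only shows how `$GCP_APP_NAME`, the resource, and the optional stage are joined.

```bash
#!/usr/bin/env bash
# Hypothetical helper illustrating $GCP_APP_NAME-{resource}[-{stage}].
GCP_APP_NAME="${GCP_APP_NAME:-validibot}"  # default app name from this document

# usage: gcp_resource_name <resource> [stage]
gcp_resource_name() {
  local resource="$1" stage="${2:-}"
  if [ -n "$stage" ]; then
    printf '%s-%s-%s\n' "$GCP_APP_NAME" "$resource" "$stage"
  else
    # No stage given: the suffix is omitted, as in the prod bucket example.
    printf '%s-%s\n' "$GCP_APP_NAME" "$resource"
  fi
}

gcp_resource_name cloudrun dev   # dev web/worker service account name
gcp_resource_name storage        # prod storage bucket name
```

Which resources drop the stage suffix is decided by the project's recipes, so the caller passes the stage explicitly here rather than the helper guessing.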
Overview¶
We use two types of service accounts per environment:
- Web/Worker SA ($GCP_APP_NAME-cloudrun-{stage}) - Used by Cloud Run web and worker services. Has broad access to run the Django application.
- Validator SA ($GCP_APP_NAME-validator-{stage}) - Used by validator Cloud Run Jobs (EnergyPlus, FMU). Least-privilege: only storage access and worker callback permission.
This ensures:
- Environment isolation (dev can't access prod data)
- Least privilege (validators can't read secrets or access the database)
- No hardcoded credentials in code
Service Accounts¶
Web/Worker Service Account¶
| Stage | Service Account |
|---|---|
| dev | $GCP_APP_NAME-cloudrun-dev |
| staging | $GCP_APP_NAME-cloudrun-staging |
| prod | $GCP_APP_NAME-cloudrun-prod |
Roles granted:
| Role | Scope | Purpose |
|---|---|---|
| roles/cloudsql.client | Project | Connect to Cloud SQL |
| roles/secretmanager.secretAccessor | Project | Read secrets |
| roles/run.invoker | Project | Invoke Cloud Run services/jobs |
| roles/cloudtasks.enqueuer | Project | Create tasks in queues |
| roles/cloudtasks.viewer | Project | View queue status |
| roles/storage.objectAdmin | Stage bucket | Read/write storage objects |
| roles/cloudkms.viewer | KMS key | View signing key metadata |
| roles/cloudkms.signerVerifier | KMS key | Sign validation credentials |
| roles/iam.serviceAccountTokenCreator | Self | Create OIDC tokens for Cloud Tasks |
| roles/iam.serviceAccountUser | Self | Act as the service account |
| Custom validibot_job_runner | Validator jobs | Trigger jobs with env overrides |
Validator Service Account¶
| Stage | Service Account |
|---|---|
| dev | $GCP_APP_NAME-validator-dev |
| staging | $GCP_APP_NAME-validator-staging |
| prod | $GCP_APP_NAME-validator-prod |
Roles granted:
| Role | Scope | Purpose |
|---|---|---|
| roles/storage.objectAdmin | Stage bucket | Read inputs, write outputs |
| roles/run.invoker | Worker service | POST callbacks with results |
The validator SA deliberately does not have:
- secretmanager.secretAccessor (no access to Django secrets, Stripe keys, etc.)
- cloudsql.client (no database access)
- cloudtasks.enqueuer (no task queue access)
- KMS roles (no credential signing)
This limits the blast radius if a validator container is compromised by a malicious user-provided model (IDF, FMU, etc.).
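The exclusion list above can be turned into a simple guard. This is a minimal sketch, not part of the project's tooling: `check_validator_roles` is a hypothetical function that reads role names on stdin (one per line) and fails if any of the roles the validator SA must not hold is present.

```bash
#!/usr/bin/env bash
# Hypothetical guard: fail if the validator SA holds a role it should not have.
# Reads one role name per line on stdin.
check_validator_roles() {
  local bad
  # Match the roles this document excludes; any cloudkms.* role counts as a KMS role.
  bad="$(grep -E '^roles/(secretmanager\.secretAccessor|cloudsql\.client|cloudtasks\.enqueuer|cloudkms\.)' || true)"
  if [ -n "$bad" ]; then
    echo "unexpected roles on validator SA:" >&2
    echo "$bad" >&2
    return 1
  fi
  echo "ok"
}

# Demo with the two roles the validator SA is expected to have.
printf 'roles/storage.objectAdmin\nroles/run.invoker\n' | check_validator_roles
```

In practice the input could come from a `gcloud ... get-iam-policy` call that uses `--format="value(bindings.role)"`, piped into `check_validator_roles`.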
Setup¶
All service accounts and IAM bindings are created automatically by just gcp init-stage:
just gcp init-stage dev # Creates both SAs + all bindings
just gcp init-stage prod # Same for production
The just gcp validator-deploy command additionally grants:
- validibot_job_runner on the job to the main SA (so web/worker can trigger it)
- roles/run.invoker on the worker service to the validator SA (so the job can POST callbacks)
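For orientation, the callback grant corresponds roughly to a resource-level binding like the one sketched here. The recipe's actual implementation is not shown in this document and may differ; the function names, the placeholder project ID, and the service and region arguments are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: `just gcp validator-deploy` may do this differently.
GCP_APP_NAME="${GCP_APP_NAME:-validibot}"
GCP_PROJECT_ID="${GCP_PROJECT_ID:-my-project}"  # placeholder project ID

# IAM member string for the validator SA of a given stage.
# usage: validator_member <stage>
validator_member() {
  printf 'serviceAccount:%s-validator-%s@%s.iam.gserviceaccount.com\n' \
    "$GCP_APP_NAME" "$1" "$GCP_PROJECT_ID"
}

# Allow the validator SA to invoke one worker service (needed for callbacks).
# usage: grant_callback_permission <stage> <worker-service> <region>
grant_callback_permission() {
  gcloud run services add-iam-policy-binding "$2" \
    --region="$3" \
    --member="$(validator_member "$1")" \
    --role="roles/run.invoker"
}

validator_member dev  # prints the member string without calling gcloud
```

Binding on the single worker service, rather than at the project level, keeps the validator SA from invoking any other Cloud Run service.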
Application Default Credentials (ADC)¶
Our Django application uses Application Default Credentials to authenticate with GCP services. This means:
- No credentials in code - No JSON key files, no access keys
- Automatic detection - Libraries detect the environment and use appropriate credentials
- Environment-specific - Uses local user credentials for development, service account for Cloud Run
How ADC Works¶
| Environment | Credential Source |
|---|---|
| Local dev | gcloud auth application-default login |
| Cloud Run | Attached service account (metadata) |
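The lookup order behind this table can be illustrated with a simplified shell check. This is a sketch, not how the Google client libraries are implemented: they also honor the `GOOGLE_APPLICATION_CREDENTIALS` variable first and detect Cloud Run by querying the metadata server, whereas this hypothetical `adc_source` function uses the `K_SERVICE` and `CLOUD_RUN_JOB` environment variables as a stand-in for that probe.

```bash
#!/usr/bin/env bash
# Simplified illustration of the ADC search order (hypothetical helper).
adc_source() {
  local adc_file="$HOME/.config/gcloud/application_default_credentials.json"
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    # 1. An explicit credentials file always wins.
    echo "key file from GOOGLE_APPLICATION_CREDENTIALS"
  elif [ -f "$adc_file" ]; then
    # 2. File written by `gcloud auth application-default login`.
    echo "local user credentials"
  elif [ -n "${K_SERVICE:-}" ] || [ -n "${CLOUD_RUN_JOB:-}" ]; then
    # 3. Stand-in for the metadata server check on Cloud Run.
    echo "attached service account"
  else
    echo "none found"
  fi
}

adc_source
```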
Local Development Setup¶
To use GCP services locally (optional - local filesystem works for most development), run:
gcloud auth application-default login
This stores credentials at ~/.config/gcloud/application_default_credentials.json.
The django-storages library and other Google Cloud libraries automatically detect and use these credentials.
Security Best Practices¶
Do¶
- Use separate service accounts per environment
- Use dedicated least-privilege SAs for untrusted workloads (validators)
- Grant permissions at the resource level (bucket, service) when possible
- Rely on ADC instead of key files
Don't¶
- Use the same service account for dev and prod
- Grant broad roles to components that don't need them
- Create and download JSON key files unless absolutely necessary
- Store credentials in code or version control
Troubleshooting¶
"Could not automatically determine credentials"¶
ADC isn't configured. Solutions:
- Locally: Run gcloud auth application-default login
- Cloud Run: Check that a service account is attached to the service
"Permission denied" errors¶
The service account doesn't have the required role. Check:
- The service account is attached to the Cloud Run service/job
- The service account has been granted the required role
- The role is bound on the right resource (e.g., the correct bucket)
Verifying Service Account Permissions¶
# List project-level roles for the web/worker SA
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:$GCP_APP_NAME-cloudrun-prod" \
  --format="table(bindings.role)"

# List project-level roles for the validator SA
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:$GCP_APP_NAME-validator-prod" \
  --format="table(bindings.role)"
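The project policy only contains project-level bindings, so roles granted on a bucket, a Cloud Run service, or the service account itself do not appear in that output. The sketch below prints, rather than runs, the resource-level checks so they can be reviewed first. The function names, the placeholder project ID, and the bucket, service, and region values in the demo call are assumptions; substitute your own.

```bash
#!/usr/bin/env bash
# Sketch: print the gcloud commands that show resource-level bindings.
GCP_APP_NAME="${GCP_APP_NAME:-validibot}"
GCP_PROJECT_ID="${GCP_PROJECT_ID:-my-project}"  # placeholder project ID

# Email of a service account. usage: sa_email <cloudrun|validator> <stage>
sa_email() {
  printf '%s-%s-%s@%s.iam.gserviceaccount.com\n' \
    "$GCP_APP_NAME" "$1" "$2" "$GCP_PROJECT_ID"
}

# usage: print_resource_checks <stage> <bucket> <worker-service> <region>
print_resource_checks() {
  local stage="$1" bucket="$2" service="$3" region="$4"
  # Bucket-level bindings (storage.objectAdmin for both SAs).
  echo "gcloud storage buckets get-iam-policy gs://$bucket"
  # Service-level bindings (run.invoker for the validator SA).
  echo "gcloud run services get-iam-policy $service --region=$region"
  # Bindings on the web/worker SA itself (the "Self" scope).
  echo "gcloud iam service-accounts get-iam-policy $(sa_email cloudrun "$stage")"
}

# Demo call with placeholder bucket, service, and region names.
print_resource_checks dev "$GCP_APP_NAME-storage-dev" example-worker us-central1
```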