Important Notes
Keep these points in mind during deployment and maintenance.
Docker Compose
- Project name must be `validibot`: The compose files reference a hardcoded volume name (`validibot_validibot_storage`). If you use a different project name via `-p` or `COMPOSE_PROJECT_NAME`, advanced validator containers won't be able to access shared storage. Either use the default project name or update `VALIDATOR_STORAGE_VOLUME` to match. (Network is disabled by default; if you enable `VALIDATOR_NETWORK`, it must also match.)
- Advanced validator network isolation: By default, advanced validator containers run with no network access (`network_mode='none'`). This is secure because they communicate via the shared storage volume. If validators need external network access (to download files or call APIs), uncomment `VALIDATOR_NETWORK` in the compose files.
- Docker socket access: The worker container mounts `/var/run/docker.sock` to spawn advanced validator containers. This grants root-equivalent access to the host. The entrypoint adjusts group permissions and drops to an unprivileged user, but treat the worker service as privileged.
- Scheduler singleton: Only run one scheduler instance. The Beat scheduler doesn't coordinate across replicas, so multiple instances would create duplicate scheduled tasks. Docker Compose naturally runs one, but if you scale via external orchestration, ensure exactly one replica.
- Health checks: All services expose `/health/` for container health checks. The endpoint verifies Django is running and the database is reachable.
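The project-name constraint above follows from how Docker Compose names volumes: it prefixes each named volume with the project name. The following is a minimal sketch of a pre-deployment check under some assumptions: the volume key `validibot_storage` is inferred from the hardcoded name `validibot_validibot_storage`, the default for `VALIDATOR_STORAGE_VOLUME` is assumed to be that same hardcoded name, and the helper names are hypothetical rather than part of the project. A project name passed with `-p` on the command line is not visible in the environment, so this check does not cover it.

```python
import os

# Assumed volume key, inferred from the hardcoded name "validibot_validibot_storage".
VOLUME_KEY = "validibot_storage"
DEFAULT_PROJECT = "validibot"


def expected_storage_volume(project_name: str = DEFAULT_PROJECT) -> str:
    """Return the volume name Docker Compose creates for a given project name."""
    return f"{project_name}_{VOLUME_KEY}"


def storage_volume_matches(env: dict) -> bool:
    """Check that VALIDATOR_STORAGE_VOLUME agrees with the Compose project name."""
    project = env.get("COMPOSE_PROJECT_NAME", DEFAULT_PROJECT)
    # Assumption: when unset, the setting falls back to the hardcoded volume name.
    configured = env.get("VALIDATOR_STORAGE_VOLUME", expected_storage_volume())
    return configured == expected_storage_volume(project)


if __name__ == "__main__":
    if not storage_volume_matches(dict(os.environ)):
        raise SystemExit("VALIDATOR_STORAGE_VOLUME does not match the Compose project name")
```

With no overrides the check passes. Setting `COMPOSE_PROJECT_NAME=staging` alone fails until `VALIDATOR_STORAGE_VOLUME` is also set to `staging_validibot_storage`.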
Google Cloud Platform
- Service accounts: Cloud Run services use stage-specific service accounts (for example `$GCP_APP_NAME-cloudrun-prod@...` and `$GCP_APP_NAME-cloudrun-dev@...`). Ensure they have the necessary IAM roles (Cloud SQL Client, Secret Manager Accessor, Storage Object User, Run Invoker).
- Secrets: All secrets are stored in Secret Manager and mounted as `/secrets/.env`. Update with `just gcp secrets`, then redeploy.
- Cloud SQL connections: Cloud Run connects to Cloud SQL via the Cloud SQL Auth Proxy (configured via `--add-cloudsql-instances`).
- Worker vs Web: The worker service (`APP_ROLE=worker`) handles validator callbacks and scheduled tasks. It is deployed with `--no-allow-unauthenticated`, so calls must be authenticated with a Google-signed ID token (Cloud Scheduler, Cloud Run Jobs, and Cloud Tasks if/when we use it). In production, set `SITE_URL=https://validibot.com` and `WORKER_URL=<worker *.run.app URL>` so internal callbacks and scheduler traffic don't go to the public domain.
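To illustrate the ID-token requirement for the worker, here is a minimal sketch of how code running on Google Cloud could obtain a Google-signed ID token from the metadata server and attach it to a request. It is not necessarily how the project itself does this. The function names are hypothetical, the audience is assumed to be the value of `WORKER_URL`, and the token fetch only works inside a Google Cloud runtime where the metadata server is reachable.

```python
import urllib.parse
import urllib.request

# Metadata server endpoint that issues ID tokens for the attached service account.
METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def identity_token_url(audience: str) -> str:
    """Build the metadata-server URL for an ID token scoped to `audience`."""
    query = urllib.parse.urlencode({"audience": audience})
    return f"{METADATA_IDENTITY_URL}?{query}"


def fetch_identity_token(audience: str, timeout: float = 5.0) -> str:
    """Fetch an ID token; only works where the metadata server is reachable."""
    request = urllib.request.Request(
        identity_token_url(audience),
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def worker_auth_headers(token: str) -> dict:
    """Headers to send with a request to the authenticated worker service."""
    return {"Authorization": f"Bearer {token}"}
```

A caller would pass the worker's `*.run.app` URL as the audience, for example `worker_auth_headers(fetch_identity_token(worker_url))`, where `worker_url` is read from `WORKER_URL`.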
Common Issues
- Cold starts: Cloud Run scales to zero by default. The first request after an idle period may take 2-3 seconds. Set `--min-instances=1` if this is a problem.
- Migrations: Always run `just gcp migrate` after deploying schema changes.
- Scheduled jobs: If Cloud Scheduler jobs return 404, verify the worker service is deployed and `APP_ROLE=worker` is set.
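The scheduled-job check above can be folded into a small triage helper. This is a sketch only: the function name is hypothetical, the 404 hint restates the note above, and the 401/403 hint is an assumption that follows from the worker being deployed with `--no-allow-unauthenticated`.

```python
def triage_scheduler_status(status: int) -> str:
    """Map an HTTP status from a Cloud Scheduler job run to a first thing to check."""
    if 200 <= status < 300:
        return "ok"
    if status == 404:
        # From the note above: the worker routes are missing or the role is wrong.
        return "verify the worker service is deployed and APP_ROLE=worker is set"
    if status in (401, 403):
        # Assumption: the caller's ID token is missing, invalid, or lacks Run Invoker.
        return "verify the job sends an ID token and its service account has Run Invoker"
    return "inspect the worker service logs"
```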