Post-Deployment Verification (PDV)
After deploying to any environment, run the PDV smoke tests to verify that critical functionality is working correctly.
Quick Start
# After deploying to dev
just gcp deploy-all dev
just verify-deployment dev
# After deploying to production
just gcp deploy-all prod
just verify-deployment prod
Application Health Check
For Docker Compose deployments or to verify application-level configuration, use the built-in health check command:
# Basic health check
python manage.py check_validibot
# Verbose output with details
python manage.py check_validibot --verbose
# Attempt to auto-fix common issues
python manage.py check_validibot --fix
# JSON output for scripting/monitoring
python manage.py check_validibot --json
The check_validibot command verifies:
| Check | What it verifies |
|---|---|
| Database | Connection, PostgreSQL version |
| Migrations | All migrations applied |
| Cache | Redis/cache connectivity |
| Storage | File storage read/write access |
| Site | Django Sites configuration |
| Roles & Permissions | Required roles and permissions exist |
| Validators | System validators configured |
| Background Tasks | Celery broker connectivity, schedules |
| Docker | Docker availability, validator images |
| Email | SMTP server reachability |
| Security | DEBUG mode, SECRET_KEY, ALLOWED_HOSTS, HTTPS settings |
This is similar to GitLab's gitlab:check rake task or Zulip's health check plugins.
When to Run
- After initial setup_validibot to verify everything is configured
- After upgrades to catch configuration drift
- When troubleshooting issues
- As part of monitoring/alerting (use --json output)
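For the monitoring case, a small wrapper can run the command and report which checks failed. The sketch below is only an illustration: this page does not document the JSON schema, so the shape used here (a top-level checks list whose items have name and status keys, with "ok" meaning healthy) is an assumption, and the function names are made up for the example. Inspect the real --json output and adjust the keys before relying on it.

```python
import json
import subprocess
import sys


def failed_checks(report: dict) -> list[str]:
    """Return the names of checks whose status is not "ok".

    ASSUMPTION: the report looks like
    {"checks": [{"name": "Database", "status": "ok"}, ...]}.
    The real schema of check_validibot --json may differ.
    """
    return [
        check.get("name", "<unnamed>")
        for check in report.get("checks", [])
        if check.get("status") != "ok"
    ]


def run_health_check() -> list[str]:
    """Run the management command and return the failing check names."""
    result = subprocess.run(
        [sys.executable, "manage.py", "check_validibot", "--json"],
        capture_output=True,
        text=True,
        check=False,  # the command may exit non-zero; still parse its stdout
    )
    return failed_checks(json.loads(result.stdout))
```

A monitoring job could call run_health_check() from the project root and raise an alert whenever the returned list is non-empty.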
What Gets Tested
The PDV suite verifies:
Web Service
- Homepage is accessible (returns 200)
- Static files are served (robots.txt)
- API documentation is accessible (/api/v1/docs/)
- API endpoints require authentication
Worker Service Security
- IAM protection: Unauthenticated requests are rejected with 403
- Callback endpoint: Cannot be spoofed by external attackers
- Scheduled task endpoints: Protected from external access
- Authenticated requests: Reach Django when properly authenticated
The worker security tests are particularly important because the callback endpoint is how validator jobs report their results. If this endpoint were exposed, attackers could spoof validation results.
Commands
Full Verification
Runs the complete pytest smoke test suite:
just verify-deployment <stage>
Options:
- just verify-deployment dev - Test dev environment
- just verify-deployment staging - Test staging environment
- just verify-deployment prod - Test production
You can pass pytest arguments:
# Run only callback tests
just verify-deployment prod -k "callback"
# Show more detail
just verify-deployment prod -vv
# Stop on first failure
just verify-deployment prod -x
Quick Verification
A faster check that just verifies services are up and IAM is working:
This uses curl to check:
1. Web service returns 200
2. Worker service returns 403 (IAM protected)
Good for a quick sanity check, but doesn't test as thoroughly as the full suite.
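The same two checks can be sketched in Python for anyone who wants to script them. This is an illustration using only the standard library, not the curl-based command itself; fetch_status and quick_verify are names invented for the example.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code for a GET, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the code is still what we want to inspect.
        return error.code


def quick_verify(web_url: str, worker_url: str, get_status=fetch_status) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    web_status = get_status(web_url)
    if web_status != 200:
        problems.append(f"web returned {web_status}, expected 200")
    worker_status = get_status(worker_url)
    if worker_status != 403:
        problems.append(f"worker returned {worker_status}, expected 403")
    return problems
```

Passing get_status as a parameter keeps the comparison logic testable without network access.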
Prerequisites
- Deployed services: Both web and worker services must be deployed to the target stage
- gcloud CLI: Must be installed and in your PATH
- Valid credentials: Must be logged in with gcloud auth login
- Cloud Run Invoker role: Your account needs permission to invoke the worker service (for authenticated tests)
How It Works
All smoke tests run locally on your machine and make HTTP requests to the remote deployed services. No code runs on the server as part of PDV.
Your laptop                                       GCP Cloud Run
─────────────────                                 ─────────────────
just verify-deployment prod
    │
    ├─► gcloud: resolve service URLs
    │
    ├─► pytest tests/smoke/
    │       │
    │       ├─► HTTP requests ──────────────────► Web Service
    │       │
    │       ├─► HTTP requests ──────────────────► Worker Service
    │       │
    │       └─► Verify responses
    │
    └─► Results displayed locally
This approach:
- Tests the real deployed infrastructure (Cloud Run, IAM, load balancers, DNS)
- Requires no additional deployment or management commands on the server
- Is simple to run - just needs gcloud credentials locally
The tests use the SMOKE_TEST_STAGE environment variable to know which stage to test. The just verify-deployment command sets this automatically - you don't need to add it to any secrets or environment files.
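To make the flow concrete, here is a minimal sketch of how fixture helpers could read the stage and resolve a service URL through gcloud. It is not the project's actual conftest.py: the function names are hypothetical, the default region is borrowed from the troubleshooting example on this page, and the caller is assumed to know the Cloud Run service name.

```python
import os
import subprocess

VALID_STAGES = ("dev", "staging", "prod")


def get_stage(environ=os.environ) -> str:
    """Read the target stage, failing loudly when it is missing or unknown."""
    stage = environ.get("SMOKE_TEST_STAGE", "")
    if not stage:
        raise RuntimeError("SMOKE_TEST_STAGE must be set")
    if stage not in VALID_STAGES:
        raise RuntimeError(f"Unknown stage: {stage!r}")
    return stage


def describe_command(service: str, region: str) -> list[str]:
    """Build the gcloud call that prints a Cloud Run service's URL."""
    return [
        "gcloud", "run", "services", "describe", service,
        f"--region={region}",
        "--format=value(status.url)",
    ]


def resolve_service_url(service: str, region: str = "us-west1") -> str:
    """Ask gcloud for the service URL (needs local gcloud credentials)."""
    result = subprocess.run(
        describe_command(service, region),
        capture_output=True,
        text=True,
        check=True,  # a missing service or missing permission raises here
    )
    return result.stdout.strip()
```

Fixtures such as web_url and worker_url could then wrap resolve_service_url with the appropriate service name for the current stage.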
Test Structure
tests/smoke/
├── __init__.py # Module docstring
├── conftest.py # Fixtures (URLs, HTTP sessions)
├── test_web_service.py # Web service health tests
└── test_worker_security.py # Worker IAM/security tests
Adding New Tests
To add a new smoke test:
- Create a new test file in tests/smoke/ or add to an existing file
- Use the provided fixtures:
  - web_url - The deployed web service URL
  - worker_url - The deployed worker service URL
  - http_session - Unauthenticated requests session
  - authenticated_http_session - Session with gcloud identity token
  - stage - The current stage (dev/staging/prod)
Example:
def test_my_new_endpoint(web_url: str, http_session):
    """Verify my new endpoint works."""
    response = http_session.get(f"{web_url}/api/v1/my-endpoint/", timeout=30)
    assert response.status_code == 200
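Troubleshooting
"SMOKE_TEST_STAGE must be set"
Run via just verify-deployment <stage> instead of running pytest directly, or set the environment variable yourself, for example SMOKE_TEST_STAGE=dev pytest tests/smoke/.
"Failed to get service URL"
The service isn't deployed or you don't have permission to describe it. Check both with gcloud run services list or gcloud run services describe for the stage you are testing.
"Authenticated request was rejected by IAM"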
Your gcloud account doesn't have the Cloud Run Invoker role on the worker service:
# Grant yourself invoker access (if you're an admin)
gcloud run services add-iam-policy-binding "${GCP_APP_NAME}-worker-dev" \
  --region=us-west1 \
  --member="user:you@example.com" \
  --role="roles/run.invoker"
Worker returns 200 instead of 403
The worker service may have been deployed with --allow-unauthenticated. This is a security issue: anyone on the internet can reach the worker endpoints. Redeploy the worker with --no-allow-unauthenticated, which the deploy script uses by default.