How to extend the audit log
The audit log (validibot/audit/) is an append-only record of
configuration changes, security events, and admin actions. This
page covers the moves a developer makes day-to-day:
- adding a new action code
- extending the field whitelist for an existing model
- adding audit capture for a new model
- writing an ad-hoc entry from service code
It does not cover the cross-cutting architecture: the four-pillar observability taxonomy, of which the audit log is one pillar.
Mental model
Every audit entry has four moving parts:
- An AuditAction code (think WORKFLOW_UPDATED, LOGIN_FAILED).
- An AuditActor row (who, with IP / user-agent; purgeable for GDPR).
- A target (target_type label + target_id string + human target_repr).
- An optional whitelisted changes diff + free-form metadata.
The write path is always AuditLogService.record(...). Capture
happens via three layers — direct service calls, signal handlers, and
the admin LogEntry bridge — but they all converge on the service.
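To make the four parts concrete, here is a minimal sketch of the data shape using plain dataclasses. The class names, the actor field names, and the before/after layout of changes are assumptions for illustration; the real models live in validibot/audit/ and may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class ActorSketch:
    """Stand-in for an AuditActor row; field names are assumptions."""
    email: Optional[str] = None
    ip_address: Optional[str] = None
    user_agent: Optional[str] = None


@dataclass(frozen=True)
class EntrySketch:
    """Stand-in for one audit entry: action, actor, target, payload."""
    action: str
    actor: ActorSketch
    target_type: str
    target_id: str
    target_repr: str
    changes: dict[str, Any] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)


entry = EntrySketch(
    action="workflow_updated",
    actor=ActorSketch(email="admin@example.com", ip_address="203.0.113.7"),
    target_type="workflows.Workflow",
    target_id="42",
    target_repr="Nightly validation",
    # Assumed diff layout: one before/after pair per whitelisted field.
    changes={"name": {"before": "Nightly", "after": "Nightly validation"}},
)
```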
Add a new action code
Actions are a TextChoices enum in
validibot/audit/constants.py::AuditAction. Adding one is a
one-line edit plus a retention-window entry if the new action needs a
different default than the generic "config change" tier.
# validibot/audit/constants.py
class AuditAction(TextChoices):
    # ... existing actions ...
    WEBHOOK_INVOKED = "webhook_invoked", _("Webhook Invoked")

# Optional: only if the default 2-year cold retention is wrong.
RETENTION_COLD_DAYS: dict[AuditAction, int] = {
    # ... existing ...
    AuditAction.WEBHOOK_INVOKED: 365,  # 1 year cold, like LOGIN_*
}
No migration is required — TextChoices stores the raw string in
the action column. Existing entries keep working.
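The reason no migration is needed can be seen with a small stand-in. The sketch below uses the standard-library enum module rather than Django's TextChoices so it runs without Django; the class name, the lookup helper, and the 730-day figure for the "2-year" default are assumptions.

```python
from enum import Enum


class AuditActionSketch(str, Enum):
    """Stdlib stand-in for the TextChoices enum."""
    WORKFLOW_UPDATED = "workflow_updated"
    WEBHOOK_INVOKED = "webhook_invoked"  # the newly added member


# A value already stored in the action column resolves unchanged.
resolved = AuditActionSketch("workflow_updated")

# The new member is just another raw string, so the column type is untouched.
new_value = AuditActionSketch.WEBHOOK_INVOKED.value

DEFAULT_COLD_DAYS = 730  # assumed encoding of the generic 2-year tier
RETENTION_COLD_DAYS_SKETCH = {AuditActionSketch.WEBHOOK_INVOKED: 365}


def cold_days(action: AuditActionSketch) -> int:
    """Return the per-action override, falling back to the generic tier."""
    return RETENTION_COLD_DAYS_SKETCH.get(action, DEFAULT_COLD_DAYS)
```

Only actions that need a different window appear in the override mapping; everything else falls through to the default.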
Extend the whitelist for an existing model
AUDITABLE_FIELDS in constants.py declares which columns the
audit layer is allowed to snapshot into changes. Anything outside
the list is either silently ignored (the snapshot helper reads only
whitelisted attributes) or recorded as <redacted> (the service's
sanitiser when a caller passes a non-whitelisted field explicitly).
AUDITABLE_FIELDS = {
    "workflows.Workflow": (
        "name",
        "description",
        "is_public",
        "agent_access_enabled",
        "agent_public_discovery",
        "new_field_to_audit",  # ← add here
    ),
    # ...
}
The rule of thumb:
- Add a field if it's config-level and a change is forensically interesting (workflow name, API-key scopes, org role).
- Don't add fields that carry PII beyond what the actor layer already covers (email/name), never add secrets (tokens, passwords, webhook-signing keys), never add customer content (validation payloads, file uploads).
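The two behaviours described above (silently ignored by the snapshot helper, redacted by the sanitiser) can be sketched as follows. The function names and the trimmed whitelist are hypothetical; this is not the project's actual helper code.

```python
from types import SimpleNamespace
from typing import Any

# Trimmed, hypothetical copy of the whitelist for illustration.
AUDITABLE_FIELDS_SKETCH = {
    "workflows.Workflow": ("name", "description", "is_public"),
}
REDACTED = "<redacted>"


def snapshot(label: str, instance: Any) -> dict[str, Any]:
    """Read only whitelisted attributes; other attributes are never touched."""
    fields = AUDITABLE_FIELDS_SKETCH.get(label, ())
    return {name: getattr(instance, name, None) for name in fields}


def sanitise(label: str, changes: dict[str, Any]) -> dict[str, Any]:
    """Keep whitelisted keys; replace the value of any other key with a marker."""
    allowed = set(AUDITABLE_FIELDS_SKETCH.get(label, ()))
    return {
        key: (value if key in allowed else REDACTED)
        for key, value in changes.items()
    }


workflow = SimpleNamespace(
    name="Nightly", description="", is_public=False, signing_key="s3cr3t"
)
snap = snapshot("workflows.Workflow", workflow)
clean = sanitise(
    "workflows.Workflow", {"name": "Nightly", "signing_key": "s3cr3t"}
)
```

In this sketch the secret never appears in snap at all, and in clean its key survives (so a reviewer can see that something was passed) while its value does not.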
Audit a new model
The model-audit registry in validibot/audit/model_audit.py drives
pre_save / post_save / pre_delete dispatch. Registering a model:
# validibot/audit/model_audit.py, register_builtin_model_audits():
from validibot.notifications.models import WebhookEndpoint

model_audit_registry.register(
    WebhookEndpoint,
    create=AuditAction.WEBHOOK_CREATED,
    update=AuditAction.WEBHOOK_UPDATED,
    delete=AuditAction.WEBHOOK_DELETED,
)
Then add an entry to AUDITABLE_FIELDS for the new model's
_meta.label so the diff capture knows which fields to snapshot.
That's it. The generic pre_save / post_save / pre_delete receivers
already attached by AuditConfig.ready() pick up the new
registration. No per-model signal wiring needed.
Registering only some events
A model that should only audit certain lifecycle events can pass
None for the rest. For example, Membership audits updates
only (join/leave is captured via dedicated invite/removal hooks):
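A registration of that shape might look like the following sketch. It uses a small stand-in registry so it runs on its own; the registry class, the placeholder Membership class, and the "membership_updated" action code are all hypothetical. Only the create / update / delete keyword names come from the registration call shown earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistrationSketch:
    create: Optional[str]
    update: Optional[str]
    delete: Optional[str]


class ModelAuditRegistrySketch:
    """Hypothetical stand-in for the model-audit registry."""

    def __init__(self) -> None:
        self._by_model = {}

    def register(self, model, *, create=None, update=None, delete=None) -> None:
        self._by_model[model] = RegistrationSketch(create, update, delete)

    def action_for(self, model, event: str) -> Optional[str]:
        """Return the action code for an event, or None when it is not audited."""
        registration = self._by_model.get(model)
        return getattr(registration, event) if registration else None


class Membership:  # placeholder standing in for the real model
    pass


registry = ModelAuditRegistrySketch()
registry.register(
    Membership,
    create=None,                  # joins are captured by the invite hook
    update="membership_updated",  # hypothetical action code
    delete=None,                  # leaves are captured by the removal hook
)
```

A receiver built on this idea would look up the action for the incoming event and return early when it gets None.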
Write an entry from service code
When a signal doesn't fit (e.g. a business rule triggers the event, not a DB row change), call the service directly:
from validibot.audit.constants import AuditAction
from validibot.audit.context import get_current_actor_spec
from validibot.audit.context import get_current_request_id
from validibot.audit.services import AuditLogService

AuditLogService.record(
    action=AuditAction.USER_ERASURE_REQUESTED,
    actor=get_current_actor_spec(),
    target=target_user,
    metadata={"requested_via": "admin_ui"},
    request_id=get_current_request_id(),
)
get_current_actor_spec() and get_current_request_id() read the
per-request context installed by AuditContextMiddleware. Outside
a request (Celery task, management command) they return empty
defaults, which is the correct outcome — the entry is attributed to
the system rather than a forged user identity.
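One way per-request context with empty defaults can work is sketched below using the standard-library contextvars module. Whether AuditContextMiddleware is built on contextvars is an assumption, and every name in the sketch is hypothetical.

```python
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActorSpecSketch:
    user_id: Optional[int] = None
    ip_address: Optional[str] = None


# Defaults apply whenever nothing has installed a request context.
_actor = ContextVar("audit_actor", default=ActorSpecSketch())
_request_id = ContextVar("audit_request_id", default="")


def get_actor_spec_sketch() -> ActorSpecSketch:
    return _actor.get()


def get_request_id_sketch() -> str:
    return _request_id.get()


def handle_request(user_id: int, ip: str, request_id: str):
    """Mimic middleware: install context, run the 'view', then reset."""
    actor_token = _actor.set(ActorSpecSketch(user_id=user_id, ip_address=ip))
    request_token = _request_id.set(request_id)
    try:
        return get_actor_spec_sketch(), get_request_id_sketch()
    finally:
        _request_id.reset(request_token)
        _actor.reset(actor_token)


inside = handle_request(7, "203.0.113.7", "req-123")
outside = (get_actor_spec_sketch(), get_request_id_sketch())
```

Resetting in the finally block matters: without it, a later task running on the same thread could inherit the previous request's actor.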
Confirm an entry landed
Three ways:
- Django admin: /admin/audit/auditlogentry/. Read-only; searchable by actor email, target id, request id.
- Pro UI: /app/audit/ when the deployment has AUDIT_LOG. Org-scoped list + detail + filters + CSV / JSONL export (see below).
- Shell:
from validibot.audit.models import AuditLogEntry

AuditLogEntry.objects.filter(
    action="workflow_updated",
).order_by("-occurred_at")[:5]
Filters and exports (Pro UI)
The list view at /app/audit/ accepts these GET params, defined
in validibot.audit.forms.AuditLogFilterForm:
| Param | Type | Behaviour |
|---|---|---|
| action | AuditAction value | Exact match; unknown values are a form error. |
| actor | string | icontains match on both actor.email (captured at write time) and the live actor.user.email. |
| target_type | string | Exact match on app.Model label (e.g. workflows.Workflow). |
| date_from / date_to | YYYY-MM-DD | Start-of-day to end-of-day in the server timezone. Reversed ranges produce a form error. |
Filters stack as logical AND. The same querystring drives the
/app/audit/export/?format=<csv|jsonl> endpoint — the "Export CSV"
and "Export JSONL" buttons on the list page carry the current
filters through so "export current view" does what you'd expect.
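The date_from / date_to behaviour in the table can be sketched as a small helper. This is an illustration, not the form's actual clean method; the function name is hypothetical and UTC stands in for the server timezone.

```python
from datetime import date, datetime, time, timezone


def day_bounds(date_from: str, date_to: str, tz=timezone.utc):
    """Expand two YYYY-MM-DD strings into an inclusive datetime range."""
    start_day = date.fromisoformat(date_from)
    end_day = date.fromisoformat(date_to)
    if start_day > end_day:
        # The real form reports this as a validation error.
        raise ValueError("date_from must not be after date_to")
    start = datetime.combine(start_day, time.min, tzinfo=tz)
    end = datetime.combine(end_day, time.max, tzinfo=tz)
    return start, end


start, end = day_bounds("2024-01-01", "2024-01-31")

try:
    day_bounds("2024-02-01", "2024-01-01")
    reversed_rejected = False
except ValueError:
    reversed_rejected = True
```

Using time.max for the upper bound keeps entries written late on the final day inside the range.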
Export format. Both formats emit the same flat row shape: one
record per AuditLogEntry with actor fields denormalised into the
row. CSV uses json.dumps for the two dict-shaped columns
(changes, metadata) so a downstream pandas.read_csv can
round-trip the nested structure. JSONL emits one JSON object per
line, compatible with streaming jq . and BigQuery ingest.
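A minimal sketch of that row shape, using only the standard library. The column names are assumptions; what the sketch demonstrates is that JSON-encoding the two dict-shaped columns lets a CSV reader recover the nested structure.

```python
import csv
import io
import json

# Hypothetical flat rows; real column names may differ.
rows = [
    {
        "id": 1,
        "action": "workflow_updated",
        "actor_email": "admin@example.com",
        "changes": {"name": {"before": "A", "after": "B, C"}},
        "metadata": {"requested_via": "admin_ui"},
    },
]
COLUMNS = ["id", "action", "actor_email", "changes", "metadata"]
DICT_COLUMNS = ("changes", "metadata")


def to_csv(records) -> str:
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        flat = dict(record)
        for column in DICT_COLUMNS:
            # Nested dicts become JSON text inside a single CSV cell.
            flat[column] = json.dumps(record[column], sort_keys=True)
        writer.writerow(flat)
    return buffer.getvalue()


def to_jsonl(records) -> str:
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


csv_text = to_csv(rows)
parsed = next(csv.DictReader(io.StringIO(csv_text)))
round_tripped = json.loads(parsed["changes"])
jsonl_text = to_jsonl(rows)
```

The csv module quotes the cell automatically, so commas and double quotes inside the JSON text do not break the row.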
Streaming. Both formats use StreamingHttpResponse with
queryset.iterator(chunk_size=500) so memory footprint stays
bounded even for multi-year exports.
Rate limit. The export endpoint is capped at 10 requests per
hour per organisation, keyed by org id — the budget is shared
across every admin on the team. Over the limit returns 429 with
Retry-After: 3600. Rationale: make bulk scraping an unattractive
exfiltration channel while leaving normal filtered exports
unaffected.
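The per-organisation budget can be illustrated with a fixed-window counter. The windowing algorithm is an assumption (the real endpoint may use a sliding window or a cache-backed limiter), and the class name is hypothetical; only the 10-per-hour limit, the org-id key, the 429 status, and the Retry-After value come from the description above.

```python
import time

LIMIT = 10
WINDOW_SECONDS = 3600


class OrgExportLimiterSketch:
    """Fixed-window limiter keyed by org id, so all admins share one budget."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._windows = {}  # org_id -> (window_start, request_count)

    def check(self, org_id: int):
        """Return (status_code, headers) for one export request."""
        now = self._clock()
        started, count = self._windows.get(org_id, (now, 0))
        if now - started >= WINDOW_SECONDS:
            started, count = now, 0  # window expired: fresh budget
        if count >= LIMIT:
            self._windows[org_id] = (started, count)
            return 429, {"Retry-After": str(WINDOW_SECONDS)}
        self._windows[org_id] = (started, count + 1)
        return 200, {}


fake_now = [0.0]
limiter = OrgExportLimiterSketch(clock=lambda: fake_now[0])
statuses = [limiter.check(org_id=1)[0] for _ in range(11)]
blocked_headers = limiter.check(org_id=1)[1]
other_org_status = limiter.check(org_id=2)[0]
fake_now[0] = 3600.0
after_window_status = limiter.check(org_id=1)[0]
```

Injecting the clock keeps the sketch testable without sleeping for an hour.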
Retention and archival
The audit table grows monotonically — without a retention policy, a
busy org will eventually push it into the "this query is slow now"
bracket. The enforce_audit_retention management command handles
pruning; a pluggable AuditArchiveBackend (validibot.audit.archive.AuditArchiveBackend)
decides what (if anything) happens to entries before they're deleted.
How it runs
The command is wired into the scheduled-task registry:
| Registry entry | Celery task name | Schedule | API endpoint |
|---|---|---|---|
| enforce-audit-retention | validibot.enforce_audit_retention | 30 2 * * * (daily 02:30) | /api/v1/scheduled/enforce-audit-retention/ |
Running setup_validibot (or just self-hosted bootstrap)
picks up the registry entry automatically via
sync_schedules --backend=celery. No data migration, no manual
PeriodicTask row to create. See
configure-scheduled-tasks.md for the
full registry contract.
On GCP the same registry entry is reconciled with Cloud Scheduler
(sync_schedules --backend=cloud-scheduler), which posts to the
declared API endpoint on the worker service.
Settings matrix
| Setting | Default | What it controls |
|---|---|---|
| AUDIT_HOT_RETENTION_DAYS | 90 | Rows older than this are candidates for archive + delete. |
| AUDIT_RETENTION_ENABLED | True | Kill-switch. When False the scheduled task becomes a logged no-op, which is useful during incident investigation. |
| AUDIT_ARCHIVE_BACKEND | "validibot.audit.archive.NullArchiveBackend" | Dotted path to the backend class. Must satisfy the AuditArchiveBackend protocol. |
| AUDIT_ARCHIVE_FILESYSTEM_BASE_PATH | "" | Only used by the filesystem backend. Must be set when that backend is configured; the backend constructor raises otherwise. |
All four settings live in config/settings/base.py near the
audit block and read from env vars so deployments can override
without forking settings.
What the shipped backends do
| Backend | What it does | Who uses it |
|---|---|---|
| NullArchiveBackend | Returns a verified receipt naming every input id without writing anything. Retention still prunes the table, but the rows are gone for good. | Community deployments that only want "stop the table from growing". The default. |
| FilesystemArchiveBackend | Writes org_<id>/YYYY/MM/DD.jsonl.gz partitions under AUDIT_ARCHIVE_FILESYSTEM_BASE_PATH with a SHA-256 sidecar. Atomic write (tempfile + fsync + rename). | Self-hosted Pro deployments with a persistent volume. Reference implementation of the contract. |
| GCSArchiveBackend (commercial add-on) | Same file format as the filesystem backend, written to a CMEK-encrypted GCS bucket; verification re-reads the object and compares SHA-256. Shipped as part of the hosted offering. | The Validibot Cloud deployment. |
The cloud backend is a layer above the community scaffolding — the retention command doesn't know or care which backend it's driving.
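The "tempfile + fsync + rename" write described for the filesystem backend can be sketched with the standard library. This is not the shipped implementation: the function names and the .sha256 sidecar suffix are assumptions, and the sketch overwrites an existing partition rather than appending to it.

```python
import gzip
import hashlib
import json
import os
import tempfile
from pathlib import Path


def write_partition(base: Path, org_id: int, day: str, rows: list) -> Path:
    """Write org_<id>/YYYY/MM/DD.jsonl.gz atomically, plus a checksum sidecar."""
    year, month, dom = day.split("-")
    target = base / f"org_{org_id}" / year / month / f"{dom}.jsonl.gz"
    target.parent.mkdir(parents=True, exist_ok=True)
    lines = "".join(json.dumps(row, sort_keys=True) + "\n" for row in rows)
    payload = gzip.compress(lines.encode("utf-8"))
    # Temp file in the same directory, so the final rename stays atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    digest = hashlib.sha256(payload).hexdigest()
    target.with_name(target.name + ".sha256").write_text(digest + "\n")
    return target


def verify_partition(target: Path) -> bool:
    """Re-read the written bytes and compare against the sidecar digest."""
    expected = target.with_name(target.name + ".sha256").read_text().strip()
    return hashlib.sha256(target.read_bytes()).hexdigest() == expected


with tempfile.TemporaryDirectory() as tmp:
    path = write_partition(Path(tmp), 7, "2024-03-09", [{"id": 1}, {"id": 2}])
    relative = path.relative_to(tmp).as_posix()
    verified = verify_partition(path)
    raw = gzip.decompress(path.read_bytes())
    restored = [json.loads(line) for line in raw.splitlines()]
```

A reader never sees a half-written partition: the file only appears under its final name after the bytes have been flushed to disk.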
Verified-upload-before-delete invariant
This is the one contract the command enforces and tests cover explicitly:
# Simplified pseudocode of the inner loop.
receipt = backend.archive(chunk)  # may raise → command aborts
if receipt.archived_ids:
    AuditLogEntry.objects.filter(pk__in=receipt.archived_ids).delete()
# Rows not in receipt.archived_ids stay in the DB for the next run.
A backend that fails to archive row 42 simply omits it from
receipt.archived_ids — row 42 survives and gets retried on the
next scheduled run. A backend that raises aborts the run without
deleting anything; the scheduler's retry picks it up later.
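The two failure modes can be simulated without a database. In the sketch below a dict stands in for the audit table and plain functions stand in for backends; all names are hypothetical, and the verified check reflects the "treated as nothing archived" behaviour described in the backend-writing section.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiptSketch:
    archived_ids: list = field(default_factory=list)
    verified: bool = True


def prune_chunk(table: dict, chunk_ids: list, backend) -> int:
    """Archive one chunk, then delete only the ids the receipt names."""
    receipt = backend([table[pk] for pk in chunk_ids])  # may raise: no delete
    if not receipt.verified:
        return 0
    deleted = 0
    for pk in receipt.archived_ids:
        if table.pop(pk, None) is not None:
            deleted += 1
    return deleted


def flaky_backend(rows):
    """Pretend row 42 failed to upload; every other row verified."""
    ids = [row["id"] for row in rows if row["id"] != 42]
    return ReceiptSketch(archived_ids=ids)


def broken_backend(rows):
    raise RuntimeError("bucket unreachable")


table = {pk: {"id": pk} for pk in (41, 42, 43)}
deleted = prune_chunk(table, [41, 42, 43], flaky_backend)

try:
    prune_chunk(table, [42], broken_backend)
except RuntimeError:
    pass  # the scheduler would retry on its next run
survivors = sorted(table)
```

Row 42 survives both runs: once because the receipt omitted it, and once because the backend raised before any delete could happen.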
Writing your own backend
Implement the protocol and point the setting at it. Nothing else to register:
# my_project/audit_archive.py
from collections.abc import Iterable

from validibot.audit.archive import ArchiveReceipt
from validibot.audit.models import AuditLogEntry


class S3ArchiveBackend:
    """Example: write gzipped JSONL to S3 with SHA-256 verification."""

    def __init__(self) -> None:
        # Read settings / env here. The command instantiates with
        # zero args, so any config must come from Django settings or
        # os.environ.
        ...

    def archive(self, entries: Iterable[AuditLogEntry]) -> ArchiveReceipt:
        materialised = list(entries)
        # 1. Serialise + upload.
        # 2. Re-read + verify checksum.
        # 3. Return a receipt naming only the ids that verified.
        return ArchiveReceipt(
            archived_ids=[e.pk for e in materialised],
            location="s3://audit-archive/...",
            verified=True,
        )
Raise on unrecoverable errors; return a partial archived_ids
list for per-row failures. Do not return verified=False —
the command treats that the same as "nothing archived" but it's
a foot-gun because the backend has the only authoritative view of
whether the bytes actually landed.
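How a dotted-path setting can resolve to a zero-argument backend instance is sketched below with importlib and a typing.Protocol. The loader and protocol names are hypothetical, and the retention command may well use Django's own import_string helper instead; the demo module is registered in sys.modules only so the sketch is self-contained.

```python
import sys
import types
from importlib import import_module
from typing import Iterable, Protocol, runtime_checkable


@runtime_checkable
class ArchiveBackendSketch(Protocol):
    def archive(self, entries: Iterable[object]) -> object: ...


def load_backend(dotted_path: str) -> ArchiveBackendSketch:
    """Import the class named by the setting and instantiate it with no args."""
    module_path, _, class_name = dotted_path.rpartition(".")
    backend_class = getattr(import_module(module_path), class_name)
    backend = backend_class()  # zero-arg construction, as the command does
    if not isinstance(backend, ArchiveBackendSketch):
        raise TypeError(f"{dotted_path} does not implement archive()")
    return backend


class NullSketch:
    def archive(self, entries):
        return [getattr(entry, "pk", None) for entry in entries]


# Register a throwaway module so the dotted path is importable here.
demo_module = types.ModuleType("demo_archive")
demo_module.NullSketch = NullSketch
sys.modules["demo_archive"] = demo_module

backend = load_backend("demo_archive.NullSketch")

try:
    load_backend("collections.OrderedDict")  # has no archive() method
    rejected = False
except TypeError:
    rejected = True
```

Checking the protocol at load time turns a misconfigured setting into an immediate error instead of a failure halfway through a retention run.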
CLI flags
The command accepts a few overrides for ad-hoc operator use:
| Flag | Purpose |
|---|---|
| --dry-run | Count eligible rows and report. No backend call, no delete. Output is prefixed with [DRY-RUN] so logs are unambiguous. |
| --retention-days N | Override AUDIT_HOT_RETENTION_DAYS for one invocation. Useful for a one-off cleanup after an incident (--retention-days 30 to free space). |
| --chunk-size N | Override the default 500-row chunks. Lower for a backend with slow writes; higher for a fast local filesystem. |
| --limit N | Stop after processing N rows. Testing only; production runs should let the whole eligible set through. |
Manual invocation (host or Docker):
# Dry-run with the configured backend.
docker compose exec web python manage.py enforce_audit_retention --dry-run
# Ad-hoc narrow window after an incident.
docker compose exec web python manage.py enforce_audit_retention --retention-days 30
Security guardrails: tests you must write
For any new capture point, write at least three tests:
- Positive capture: trigger the event, assert exactly one entry appears with the expected action.
- Sanitisation: pass a plausibly dangerous field in changes (credential, token, secret URL) and assert it lands as <redacted>. The regression guard here prevents OWASP-grade credential leaks if someone later widens AUDITABLE_FIELDS without thinking.
- Actor attribution: verify the entry's actor.user is set (for authenticated events) or None (for system / failed-auth events), never the wrong user.
See validibot/audit/tests/test_signals.py for working examples
that follow this pattern.
Don't do
- Don't mutate an audit entry post-hoc. Rows are append-only outside the Phase-3 erasure-sanitisation workflow (itself audited via AUDIT_ENTRY_SANITISED). If you need to correct something, write a new entry describing the correction.
- Don't use the audit log for product analytics. Use validibot/tracking/ instead: it's cheaper, retention-tuned for hot dashboard queries, and ships in community for every tier.
- Don't skip the field whitelist. If a new event's changes diff doesn't fit any existing whitelist entry, add a whitelist entry for the model. Do not bypass the sanitiser; it is the barrier between "structured audit log" and "ad-hoc secret dump".
See also
- Audit archive (GCS) — operational guide for the archive backend that complements this how-to.