Retries and Idempotency

Any background work that can fail should be designed with retry behavior in mind.

Idempotent work can run more than once without corrupting state or duplicating irreversible side effects.

Where Retries Appear

Retries can appear in:

Design the workflow, not just the transport.

A job handler should be safe when the same payload is delivered more than once.

Common techniques:
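One widely used technique is to deduplicate on an idempotency key. The sketch below is illustrative only: it assumes each payload carries an `idempotency_key` field and uses an in-memory SQLite database as a stand-in for durable storage. The table names and the `handle_credit` function are hypothetical, not part of any particular framework.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    # In-memory database for the sketch; real work would need durable storage.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_jobs (idempotency_key TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE ledger (account TEXT, amount INTEGER)")
    return conn


def handle_credit(conn: sqlite3.Connection, payload: dict) -> bool:
    """Apply a credit at most once per idempotency key.

    Returns True if the credit was applied, False if the payload was a duplicate.
    """
    try:
        # One transaction: the key and the ledger row commit together,
        # or both roll back if anything raises.
        with conn:
            conn.execute(
                "INSERT INTO processed_jobs (idempotency_key) VALUES (?)",
                (payload["idempotency_key"],),
            )
            conn.execute(
                "INSERT INTO ledger (account, amount) VALUES (?, ?)",
                (payload["account"], payload["amount"]),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a repeated delivery; nothing was written.
        return False
```

Recording the key and the state change in the same transaction is the design choice that matters here: a crash between the two cannot leave the job marked as done without its effect, or the effect applied without the marker.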

Do not assume that an error raised by an event subscriber will trigger a durable retry.

If a subscriber must perform retryable work, dispatch a job from the subscriber and let the queue own worker lifecycle and retry behavior.
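The handoff from subscriber to queue might look roughly like the following. This is a minimal in-process sketch, not a real queue client: `Job`, `job_queue`, `on_order_placed`, `run_worker`, and `MAX_ATTEMPTS` are all hypothetical names, and an actual queue would persist jobs and usually add a delay between attempts.

```python
import queue
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    payload: dict
    attempts: int = 0


MAX_ATTEMPTS = 3  # assumed retry budget for the sketch
job_queue = queue.Queue()


def on_order_placed(event: dict) -> None:
    """Event subscriber: does no retryable work itself, only enqueues a job."""
    job_queue.put(Job(name="send_receipt", payload={"order_id": event["order_id"]}))


def run_worker(handlers: dict, dead_letters: list) -> None:
    """Drain the queue, re-enqueueing failed jobs until MAX_ATTEMPTS is reached."""
    while not job_queue.empty():
        job = job_queue.get()
        job.attempts += 1
        try:
            handlers[job.name](job.payload)
        except Exception:
            if job.attempts < MAX_ATTEMPTS:
                job_queue.put(job)  # the queue, not the event bus, owns the retry
            else:
                dead_letters.append(job)  # keep the failure visible for operators
```

The subscriber stays trivial and cannot fail in interesting ways, while attempt counting and the dead-letter decision live in one place. The job has a name (`send_receipt`), so failures can be attributed to a specific unit of work.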

Schedules should tolerate overlap, missed runs, and reruns.

Use stable schedule names and explicit locking or overlap protection when the work must not run concurrently.
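As a rough illustration of overlap protection keyed by a stable schedule name, the sketch below skips a run when the previous run with the same name is still in progress. It assumes a single process; work spread across several processes or hosts would need a shared lock (for example in a database), which is not shown. `run_exclusive` and the schedule name are hypothetical.

```python
import threading

_locks = {}  # schedule name -> lock guarding that schedule
_registry_guard = threading.Lock()


def run_exclusive(schedule_name: str, task) -> bool:
    """Run task unless a run with the same schedule name is already active.

    Returns True if the task ran, False if the run was skipped as an overlap.
    """
    with _registry_guard:
        lock = _locks.setdefault(schedule_name, threading.Lock())

    # Non-blocking acquire: an overlapping tick is skipped rather than queued.
    if not lock.acquire(blocking=False):
        return False
    try:
        task()
        return True
    finally:
        lock.release()
```

Skipping is only one possible policy. Because a skipped tick is effectively a missed run, the task itself still has to tolerate missed runs and reruns, for instance by processing everything due since the last successful run.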

Be explicit when work sends email, charges money, writes files, calls external APIs, or publishes additional events.

Ask:

Common Mistakes

Do not assume retries are safe by default.
Do not use events as the retry system for critical work.
Do not let anonymous callbacks hide operational identity; give jobs, subscribers, and schedules stable names so that failures and retries can be traced.
Do not perform irreversible external side effects before durable state is ready.
Do not ignore shutdown behavior for long-running jobs.
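For the last point, one possible shape for a shutdown-aware long-running job is cooperative cancellation with a checkpoint, so that a rerun resumes instead of starting over. This is a sketch under assumptions: `process_items` and the `checkpoint` dictionary are hypothetical, and a real job would persist the checkpoint durably and have a signal handler or the worker runtime set the stop flag.

```python
import threading


def process_items(items, checkpoint: dict, stop_requested: threading.Event, handle) -> bool:
    """Process items from the saved offset, stopping cleanly between items.

    Returns True if every item was processed, False if the job stopped early.
    """
    index = checkpoint.get("next_index", 0)
    while index < len(items):
        # Check for shutdown only between items, never in the middle of one.
        if stop_requested.is_set():
            return False
        handle(items[index])
        index += 1
        # Record progress after each item; a real job would write this durably.
        checkpoint["next_index"] = index
    return True
```

If the process dies between `handle` and the checkpoint write, the rerun will repeat that one item, so `handle` still needs to be idempotent.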