Changelog for Oban Pro v1.3

This release is entirely dedicated to Smart engine optimizations, from slashing queue transactions to boosting bulk insert performance.

📮 Async Tracking

Rather than synchronously recording updates (acks) in a separate transaction after jobs execute, the Smart engine now bundles acks together to minimize transactions and reduce load on the database.

Async tracking, combined with the other enhancements detailed below, showed the following improvements over the previous Smart engine when executing 1,000 jobs with concurrency set to 20:

  • Transactions reduced by 97% (1,501 to 51)
  • Queries reduced by 94% (3,153 to 203)

That means less Ecto pool contention, fewer transactions, fewer queries, and fewer writes to the oban_producers table! There are similar, albeit less flashy, improvements over the Basic engine as well.

Notes and Implementation Details

  • Acks are stored centrally per queue and flushed with the next transaction using a lock-free mechanism that never drops an operation.

  • Acks are grouped and executed as a single query whenever possible. This is most visible in high throughput queues.

  • Acks are preserved across transactions to guarantee nothing is lost in the event of a rollback or an exception.

  • Acks are flushed on shutdown and when the queue is paused to ensure data remains as consistent as the previous synchronous version.

  • Acking is synchronous in testing mode, when draining jobs, and when explicitly enabled by a flag provided to the queue.

See the Smart engine's async tracking section for more details and instructions on how to selectively opt out of async mode.
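
As a rough illustration of opting a single queue out of async mode, a configuration might look like the sketch below. Only the ack_async: false flag comes from these notes; the :my_app application name, the queue names, and the limits are placeholders, and the engine module name and local_limit option are assumed from the Smart engine's documentation.

```elixir
# config/config.exs (sketch; :my_app, queue names, and limits are placeholders)
import Config

config :my_app, Oban,
  # Assumed module name for the Smart engine
  engine: Oban.Pro.Engines.Smart,
  queues: [
    # Standard queue: acks are tracked asynchronously by default
    default: 10,
    # Hypothetical queue that opts out and records acks synchronously
    payments: [local_limit: 5, ack_async: false]
  ]
```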

v1.3.2 — 2024-01-19

Bug Fixes

  • [Smart] Ensure global queues keep running with ack_async: false.

    Global queues that are marked with ack_async: false must refresh the in-memory producer record between job fetches to keep the queue running. Otherwise, tracked jobs linger in the producer record despite successful acking.

  • [Smart] Prevent a race condition while pausing from stopping global queues.

    Pausing a global queue while there are pending acks could trigger a write-after-read race condition that lost tracking changes. Eventually, leaked changes could prevent the queue from fetching new jobs because it looked like the global limit was met.

  • [Smart] Always split completed ack queries for recorded jobs.

    Jobs with different recorded output could mistakenly be written with a single query if they completed within a few milliseconds of each other. This changes the grouping mechanism to only bundle simple completions, never recorded completions.

v1.3.1 — 2024-01-17

Bug Fixes

  • [Smart] Default to synchronous acking when Oban is in a testing mode.

    Acking should always be synchronous during tests to prevent flickering failures from race conditions. Previously, acking relied on a failed registry lookup to switch to synchronous mode, which wasn't accurate enough.

  • [Smart] Default to synchronous acking for drain_jobs/2 and related test helpers.

    Draining runs jobs synchronously in the test process, but not in testing mode, so the testing-mode default doesn't cover it. Draining now explicitly disables ack_async.

  • [DynamicPartitioner] Only sub-partition by date in non-test environments.

    To prevent testing errors after migration, the completed, cancelled, and discarded states are sub-partitioned by date only in :dev and :prod environments.

    It's possible to enable date partitioning in other production-like environments with the new date_partition? flag.

  • [DynamicPartitioner] Rename existing args and meta indexes to allow index recreation.

    When renaming the existing table to oban_jobs_old, the args and meta indexes weren't renamed. That prevented creating those indexes on the new partitioned table, because Postgres detected that indexes with those names already existed and skipped their creation.

v1.3.0 — 2024-01-16

Enhancements

  • [Smart] Skip extra query to "touch" the producer when acking without global or rate limiting enabled. This change reduces overall producer updates from 1 per job to 2 per minute for standard queues.

  • [Smart] Avoid refetching the local producer's data when fetching new jobs.

    Async acking is centralized through the producer, which guarantees global and rate tracking data is up-to-date before fetching without an additional read.

  • [Smart] Optimize job insertion with fewer iterations.

    Iterating through job changesets as a map/reduce with fewer conversions improves insertion of 1k jobs by 10% while reducing overall memory use by 9%.

  • [Smart] Efficiently count changesets during insert_all.

    Prevent duplicate iterations through changesets to count unique jobs. Iterating through them once to accumulate multiple counts improved insertion by 3% and reduced overall memory use by 2%.

  • [Smart] Acking cancelled jobs is done with a single operation and limited to queues with global limiting.
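
The single-pass counting mentioned for insert_all can be sketched in plain Elixir. This only illustrates the general technique and is not Oban Pro's internal code: the Counts module and the maps standing in for job changesets are hypothetical.

```elixir
defmodule Counts do
  # Hypothetical stand-in: each "changeset" is a plain map with a :unique? flag.

  # Two passes over the list: one for the total, one for the unique jobs.
  def two_pass(changesets) do
    {length(changesets), Enum.count(changesets, & &1.unique?)}
  end

  # One pass: both counts are accumulated together in a tuple.
  def one_pass(changesets) do
    Enum.reduce(changesets, {0, 0}, fn changeset, {total, unique} ->
      {total + 1, if(changeset.unique?, do: unique + 1, else: unique)}
    end)
  end
end

changesets = [%{unique?: true}, %{unique?: false}, %{unique?: true}]

IO.inspect(Counts.two_pass(changesets)) # => {3, 2}
IO.inspect(Counts.one_pass(changesets)) # => {3, 2}
```

Both functions return the same tuple; the second walks the list only once, which matters more as the number of counts grows.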

Bug Fixes

  • [Smart] Always merge acked meta updates dynamically.

    All meta-updating queries are dynamically merged with existing meta. This prevents recorded jobs from clobbering other meta updates made while the job executed.

  • [Smart] Safely extract producer uuid from attempted_by with more than two elements.

  • [DynamicCron] Preserve stored opts such as args, priority, etc., on reboot when no new opts are set.

  • [Relay] Skip attempting relay notifications when the associated Oban pid isn't alive.
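
To see why merging acked meta dynamically matters, consider the following sketch, which uses plain Elixir maps in place of the stored meta column. The keys and values are purely illustrative, and the real merge happens in the database rather than in application code.

```elixir
# Meta as it looked when the job started executing.
meta_at_start = %{"batch_id" => "abc"}

# Meta after another process updated it while the job was running
# (the "progress" key is hypothetical).
meta_in_db = Map.put(meta_at_start, "progress", 50)

# Update produced when the job finishes (illustrative key).
ack_update = %{"recorded" => true}

# Static write: built from the stale snapshot, so "progress" is clobbered.
clobbered = Map.merge(meta_at_start, ack_update)

# Dynamic merge: applied on top of whatever is currently stored.
merged = Map.merge(meta_in_db, ack_update)

IO.inspect(Map.has_key?(clobbered, "progress")) # => false
IO.inspect(Map.has_key?(merged, "progress"))    # => true
```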