You can do some very neat things with this, especially since it’s implemented with Background Workers
> Care is taken that these extra processes do not interfere with other postmaster tasks: only one such process is started on each ServerLoop iteration. This means a large number of them could be waiting to be started up and postmaster is still able to quickly service external connection requests.
cron running on only one server is unreliable and "high availability" cron is complex. So, it's unbelievable how no such solution existed before. It might not be the most elegant way, but having a centralized cron (just like the database is centralized) helps a lot.
> So, it's unbelievable how no such solution existed before
I think a lot of folks solved this problem at the application level using background workers or job processing services.
For example, Celery with Python or Sidekiq with Ruby has had recurring job support for a really long time. Celery supported this since 2010. You could run distributed jobs (recurring or not) this way without using cron directly or requiring a custom DB extension.
IMO it feels much more natural to solve this problem at the app level with a background worker since you can run any task you want, not just DB queries. This also has the added benefit of letting you use your DB library of choice (ORM, etc.) since you can run any code you want within the job.
Although Celery itself is distributed, the cron manager (celery-beat) is not and will not coordinate with other nodes. If you have two nodes running celery-beat, you'll get two jobs dispatched every time a cron triggers. imo defeats the whole purpose, at that point I might as well use crontab.
Not sure it's the best example, having 0.01% of emails being sent twice is probably acceptable for most businesses. If it's a payment processing, it's not acceptable
> If it's a payment processing, it's not acceptable
In the payment processing company I'm working at we deal with that independently. We set some data on Redis with a TTL to signal some request has already been made in the last X minutes, and if the same request arrives again during the same period the cached response is served
That is really cool! No more crontab with the PostgreSQL CLI's execute and username (and potentially password) to do e.g. clean up tasks in the database.
Does anyone know which time config this will use? Is it the system time, UTC, or something else?
What if your 'business hour' was 0130[^] (business logic for some reason, rather than a working hour, presumably; hence quotes) - what do you want to happen when you go in to Summer time, or come out of it? Does everybody want the same thing? Is it even obvious what it probably does such that nobody's going to have a bad night?
[^] in the UK at least, clocks go forward skipping the hour 0100-0200, and repeat it when they go back. If that happens at a different time where you are (this hypothetical feature gets even more complicated! and) then I mean whatever appropriate time for my point to make sense.
Not entirely sure why you're being downvoted, but it's true.
Sometimes you need something to run at 9am daily, in your local time zone. If your local time zone shifts for daylight savings, you can't define it in UTC without changing it twice a year.
ho boy, before I break out "falsehoods programmers believe about time,"[0] nobody wants to manually have to rewrite the UTC time something's supposed to run at when the business hours keep changing relative to UTC for (at the most basic) Daylight Savings reasons.
> Care is taken that these extra processes do not interfere with other postmaster tasks: only one such process is started on each ServerLoop iteration. This means a large number of them could be waiting to be started up and postmaster is still able to quickly service external connection requests.
I have had the occasional need to (ab)use it for webhooks https://supabase.com/blog/2021/03/05/postgres-as-a-cron-serv...