There's also an order of magnitude more events when doing event-based processing.
This seems like a perfectly reasonable starting point and gateway that keeps things organized for when the time comes.
Most things don’t scale that big.
The idea behind a DLQ is that it will retry (with some backoff) eventually, and if it fails enough times, the message stays there. You need monitoring to catch messages that can't escape the DLQ. Ideally, nothing should ever stay in the DLQ, and if something does, it's something that should be fixed.
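A minimal, toy sketch of that retry-then-park behaviour (in-memory Python standing in for whatever broker redelivery mechanism you actually use; the attempt limit and backoff numbers are made up for illustration):

    import time
    from collections import deque

    MAX_ATTEMPTS = 5      # after this many failed attempts the message parks in the DLQ
    BASE_BACKOFF_S = 0.1  # exponential backoff between retries

    queue = deque([{"id": 1, "body": "ok"}, {"id": 2, "body": "always-fails"}])
    dlq = []              # anything landing here should page a human

    def process(msg):
        if msg["body"] == "always-fails":
            raise RuntimeError("handler bug")

    while queue:
        msg = queue.popleft()
        attempt = msg.get("attempt", 0) + 1
        try:
            process(msg)
        except Exception:
            if attempt >= MAX_ATTEMPTS:
                dlq.append(msg)                      # parked until someone fixes the cause
            else:
                time.sleep(BASE_BACKOFF_S * 2 ** (attempt - 1))
                queue.append({**msg, "attempt": attempt})

    print("stuck in DLQ:", dlq)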
Sure, it's unavailability of course, but it's not data loss.
But if your DLQ is overloaded, you probably want to slow down or stop, since sending a large fraction of your traffic to the DLQ is counterproductive. E.g. if you are sending 100% of messages to the DLQ due to a bug, you should stop processing, fix the bug, and then resume from your normal queue.
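One way to make that concrete is a simple guard on the recent DLQ rate: if too large a fraction of a rolling window got DLQ'd, pause consumption instead of continuing to shovel traffic at the DLQ. The window size and threshold below are made-up illustrative numbers:

    from collections import deque

    WINDOW = 200              # how many recent outcomes to consider
    MAX_DLQ_FRACTION = 0.25   # above this, stop and investigate

    recent = deque(maxlen=WINDOW)   # True = DLQ'd, False = processed fine

    def record_outcome(dlqd: bool) -> None:
        recent.append(dlqd)

    def should_pause_consumption() -> bool:
        if len(recent) < WINDOW:
            return False            # not enough signal yet
        return sum(recent) / len(recent) > MAX_DLQ_FRACTION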
If the problem is that the consumers themselves cannot write to the DLQ, then that feels like either Kafka is dying (no more writes allowed) or the consumers have been misconfigured.
Edit: In fact there seems to be a self-inflicted problem being created here: having the DLQ on a different system, whether it be another instance of Kafka, or Postgres, or what have you, is really just creating another point of failure.
There's a balance. Do you want your Kafka cluster provisioned for double your normal event intake rate, just in case the worst-case failure to produce elsewhere sends 100% of events to the DLQ? (That doubles your writes to the shared cluster, which could itself cause failures to produce to the original topic.)
In that sort of system, failing to produce to the original topic is probably what you want to avoid most. As long as your retention period isn't shorter than your time to recover from an incident like that, priority 1 is often "make sure the events are recorded so they can be processed later."
IMO a good architecture here cleanly separates transient failures (don't DLQ; retry with backoff and don't advance the consumer group) from "permanently cannot process" (DLQ only these), unlike in the linked article. That greatly reduces the odds of "everything is being DLQ'd!" causing cascading failures by overloading seldom-stressed parts of the system. It makes it much easier to keep your DLQ in one place, and you can solve some of the visibility problems from the article with a consumer that puts summary info elsewhere or similar. There's still a chance for a bug that results in everything being wrongly rejected, but it makes you potentially much more robust against transient downstream deps having a high blast radius. (One nasty case: if different messages have wildly different sets of downstream deps, do you want some of them blocking all the others? IMO they should then be partitioned so that you can still move forward on the others.)
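A rough sketch of that split, with toy exception classes standing in for "downstream dep hiccuped" vs. "this message can never be processed"; the list index stands in for the consumer offset, which only advances on success or on a deliberate DLQ:

    import time

    class TransientError(Exception): ...   # e.g. downstream timeout; retry in place
    class PermanentError(Exception): ...   # e.g. fails validation; DLQ it

    MAX_BACKOFF_S = 30

    def run(messages, process, send_to_dlq):
        i = 0                               # stands in for the consumer offset
        while i < len(messages):
            msg, backoff = messages[i], 1
            while True:
                try:
                    process(msg)
                    break                   # success: advance
                except PermanentError:
                    send_to_dlq(msg)        # only these ever reach the DLQ
                    break                   # advance past the bad message
                except TransientError:
                    time.sleep(backoff)     # hold position and retry
                    backoff = min(backoff * 2, MAX_BACKOFF_S)
            i += 1                          # commit / advance the offset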
BUT, you are 100% right to point to what I think is the proper solution, and that is to treat the DLQ with some respect, not as a bit bucket where things get dumped because the wind isn't blowing in the right direction.
(The queue probably isn't down if you've just pulled a message off it.)
No need to look down on PG just because it's more approachable and no longer a specialized skill.
Learned something new today. I knew what FOR UPDATE did, but somehow I've never RTFM'd hard enough to know about the SKIP LOCKED directive. That's pretty cool.
Something I've still not mastered is how to prevent lock escalation into table-locks, which could torpedo all of this.
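For reference, a minimal sketch of the claim query that usually goes with it (psycopg2, with a hypothetical jobs(id, status, payload) table and DSN). SKIP LOCKED is what lets concurrent workers each grab a different row without blocking on, or double-claiming, rows another transaction has locked:

    import psycopg2

    CLAIM_SQL = """
        UPDATE jobs
           SET status = 'running'
         WHERE id = (
                SELECT id
                  FROM jobs
                 WHERE status = 'queued'
                 ORDER BY id
                 LIMIT 1
                 FOR UPDATE SKIP LOCKED
               )
        RETURNING id, payload;
    """

    def claim_one(conn):
        with conn.cursor() as cur:
            cur.execute(CLAIM_SQL)
            row = cur.fetchone()     # None means nothing claimable right now
        conn.commit()                # committing releases the row lock
        return row

    if __name__ == "__main__":
        conn = psycopg2.connect("dbname=app")   # hypothetical DSN
        print(claim_one(conn))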
The idea is that if your DLQ has consistently high volume, there is something wrong with your upstream data, or data handling logic, not the architecture.
Furthermore, why have two indexes with the same leading field (status)?
https://web.archive.org/web/20240309030618/https://www.2ndqu...
corresponding HN discussion thread from 2016 https://news.ycombinator.com/item?id=14676859
[†] It seems that all the old 2ndquadrant.com blog post links have been broken since their acquisition by EnterpriseDB.
PostgreSQL FOR UPDATE SKIP LOCKED: The One-Liner Job Queue https://www.dbpro.app/blog/postgresql-skip-locked
It covers the race condition, the atomic claim behaviour, worker crashes, and how priorities and retries are usually layered on top. Very much the same approach described in the old 2ndQuadrant post, but with a modern end-to-end example.
DuckDB is on our radar. In practice each database still needs some engine-specific work to feel good, so a fully generic plugin system is harder than it sounds. We are thinking about how to do this in a scalable way.
I have a simple flow: tasks are on the order of thousands per hour. I just use PostgreSQL. High visibility, easy requeue, durable store. With an appropriate index, it's perfectly fine. An LLM will write the SKIP LOCKED code right the first time. Easy local dev. I always reach for Postgres as the event bus in low-volume systems.
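For what it's worth, a sketch of the kind of table and index that keeps this comfortable at thousands of tasks an hour; the partial index means the "find next queued task" lookup only ever touches unprocessed rows. Table, column names, and DSN are made up for illustration:

    import psycopg2

    DDL = """
    CREATE TABLE IF NOT EXISTS tasks (
        id         bigserial    PRIMARY KEY,
        status     text         NOT NULL DEFAULT 'queued',
        payload    jsonb        NOT NULL,
        created_at timestamptz  NOT NULL DEFAULT now()
    );
    -- Partial index: only queued rows, so it stays tiny even if you keep
    -- completed tasks around for reporting or requeueing.
    CREATE INDEX IF NOT EXISTS tasks_queued_idx
        ON tasks (created_at)
        WHERE status = 'queued';
    """

    conn = psycopg2.connect("dbname=app")   # hypothetical DSN
    with conn, conn.cursor() as cur:
        cur.execute(DDL)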
Using Postgres too long is probably less harmful than adding unnecessary complexity too early
Both are pretty bad.
Having SEPARATE DLQ and event/message broker systems is not (IMO) valid, because it introduces a new point of failure into the architecture.
Many developers overcomplicate systems. In the pursuit of 100% uptime, if you're not extremely careful, you remove more 9s with complexity than you add with redundancy. And although hyperscalers pride themselves on their uptime (Amazon even achieved three nines last year!), in reality most customers of most businesses are fine if your system is down for ten minutes a month. It's not ideal and you should probably fix that, but it's not catastrophic either.
The centralization push creates a situation where, if I have a task that needs three tools to accomplish and one of them goes down, they're all down. So all I can do is go for coffee or an early lunch, because I can't sub another task into this time slot. They're all blocked by The System being down, instead of a system being down.
If CI is borked I can work on docs and catch up on emails. If the network is down or NAS is down and everything is on that NAS, then things are dire.
Only if DC gets nuked.
Many developers overcomplicate systems and throw a database at the problem.
Challenge: Design a fault tolerant event-driven architecture. Only rule, you aren’t allowed to use a database. At all. This is actually an interview question for a top employer. Answer this right and you get a salary that will change your life.
DC!=Washington, DC
It's not usable for high-scale processing, but most applications just need a simple queue with low depth and low complexity. If you're already managing PSQL and don't want to add more management to your stack (and managed services aren't an option), this pattern works just fine. Go back 10-15 years and it was more common, especially in Ruby shops, as teams willing to adopt Kafka/Cassandra/etc. were rarer.
How would you do this instead, and why?
Postgres has a query interface, replication, backup and many other great utilities. And it’s well supported, so it will work for low-demand applications.
Regardless, you're using the wrong data structure with the wrong performance profile, and at the margins you will spend a lot more money and time than necessary running it. And service will suffer.
I work on apps that use such a PG-based queue system, and it provides indispensable features for us that we couldn't achieve easily/cleanly with a normal queue system, such as being able to dynamically adjust the priority/order of tasks being processed and easily query/report on the contents of the queue. We have many other interesting features built into it that are more specific to our needs, which I'm more hesitant to describe in detail here.