All your other points make sense, given this assumption.
I've seen tables where 50%-70% were soft-deleted, and it did affect the performance noticeably.
> Undoing is really easy
Depends on whether undoing even happens, and whether the act of deletion and undeletion require audit records anyway.
In short, there are cases when soft-deletion works well, and is a good approach. In other cases it does not, and is not. Analysis is needed before adopting it.
That said, we've had soft-deletes, and during discussions about keeping them on, one argument was that they're really only a half-assed measure (data lost due to updates rather than deletes isn't really saved).
I think we largely need support for "soft deletes" to be baked into SQL or its dialects directly and treated as something transparent (selecting soft deleted rows = special case, regular selects skip those rows; support for changing regular DELETE statements into doing soft deletes under the hood).
https://news.ycombinator.com/item?id=43781109
https://news.ycombinator.com/item?id=41272903
And then make dynamically sharding data by deleted/not deleted really easy to configure.
You soft deleted a few rows? They get moved to another DB instance, an archive/bin of sorts. Normal queries wouldn't even consider it, only when you explicitly try to select soft deleted rows would it be reached out to.
(In my opinion, replicating this via a `validity tstzrange` column is also often a sane approach in PostgreSQL, although OP's blog post doesn't mention it.)
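A minimal sketch of what I mean, with hypothetical table/column names (the exclusion constraint needs btree_gist):

  CREATE EXTENSION IF NOT EXISTS btree_gist;

  CREATE TABLE price (
      product_id bigint    NOT NULL,
      amount     numeric   NOT NULL,
      validity   tstzrange NOT NULL DEFAULT tstzrange(now(), NULL),
      EXCLUDE USING gist (product_id WITH =, validity WITH &&)  -- no overlapping versions
  );

  -- "current" rows are the ones whose range contains now()
  SELECT * FROM price WHERE validity @> now();

  -- "soft delete" (or supersede) by closing the range instead of DELETEing
  UPDATE price SET validity = tstzrange(lower(validity), now())
   WHERE product_id = 42 AND validity @> now();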
[1]: https://learn.microsoft.com/en-us/sql/relational-databases/t...
This has, at least with current MariaDB versions, the annoying property that you really cannot ever again modify the history without rewriting the whole table, which becomes a major pain in the ass if you ever need schema changes and history items block those.
Maria still has to find some proper balance here between change safety and developer experience.
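For readers who haven't used it, the feature being discussed looks roughly like this (illustrative table; the history bookkeeping is implicit):

  CREATE TABLE account (
      id      INT PRIMARY KEY,
      balance DECIMAL(12,2)
  ) WITH SYSTEM VERSIONING;

  -- normal statements work as usual; old row versions are kept automatically
  UPDATE account SET balance = 100 WHERE id = 1;

  -- time travel over the retained history
  SELECT * FROM account
  FOR SYSTEM_TIME AS OF TIMESTAMP '2024-01-01 00:00:00'
  WHERE id = 1;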
I think web and GUI programmers must stop expecting the database to contain the data already selected and formatted for their nice page.
So a widespread, common and valid practice shouldn't get better support, and should instead rely on awkward hacks like "deleted_at", where sooner or later people or ORMs will forget about those semantics and select the wrong thing? I don't think I agree. I also don't think that it has much to do with how or where you represent the data. Temporal tables already do something similar, just with slightly different semantics.
Making those custom semantics (enabled at per-schema/per-table level) take over what was already there previously: DELETE doing soft-deletes by default and SELECT only selecting the records that aren't soft deleted, for example.
Then making the unintended behavior (for 90% of normal operational cases) require special commands, be it a new keyword like DELETE HARD or SELECT ALL, or query hints (special comments like /*+DELETE_HARD*/).
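Until a dialect grows that, the closest approximation I know of is hiding the real table behind a view and overriding DELETE on it (PostgreSQL, hypothetical names):

  CREATE TABLE orders_raw (
      id         bigserial PRIMARY KEY,
      payload    jsonb NOT NULL,
      deleted_at timestamptz            -- NULL = live row
  );

  -- the name applications actually use; soft-deleted rows are invisible here
  CREATE VIEW orders AS
      SELECT id, payload FROM orders_raw WHERE deleted_at IS NULL;

  CREATE FUNCTION orders_soft_delete() RETURNS trigger AS $$
  BEGIN
      UPDATE orders_raw SET deleted_at = now() WHERE id = OLD.id;
      RETURN OLD;
  END;
  $$ LANGUAGE plpgsql;

  -- a plain DELETE against the view becomes a soft delete
  CREATE TRIGGER orders_delete INSTEAD OF DELETE ON orders
      FOR EACH ROW EXECUTE FUNCTION orders_soft_delete();

The "special case" paths still have to touch orders_raw directly, which is exactly where a real DELETE HARD / SELECT ALL would be nicer.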
Maybe some day I'll find a database that's simple and hackable enough to build it for my own amusement.
At that point you should probably investigate partitioning or data warehousing.
The correct thing to do if your retention policy is causing a performance problem is to sit down and actually decide what the data is truly needed for, and whether you can make some transformations/projections to consolidate only the data you actually use into a different location so you can discard the rest. That's just data warehousing.
Data warehouse doesn't only mean "cube tables". It also just means "a different location for data we rarely need, stored in a way that is only convenient for the old data needs". It doesn't need to be a different RDBMS or even a different database.
I've seen a number of apps that require audit histories work on a basis where they are archived at a particular time, and that's when the deletes occurred and indexes fully rebuilt. This is typically scheduled during the least busy time of the year as it's rather IO intensive.
FWIW, no "indexes fully rebuilt" upon "actual deletion" or anything like that. The regular tables were always just "current" tables. History was kept in archive tables that were always up-to-date via triggers. Essentially, current tables never suffered any performance issues and history was available whenever needed. If history access was needed for extensive querying, read replicas were able to provide this without any cost to the main database but if something required "up to the second" consistency, the historic tables were available on the main database of course with good performance (as you can tell from the timelines, this was pre-SSDs, so multi-path I/O over fibre was what they had at the time I worked with it with automatic hot-spare failover between database hosts - no clouds of any kind in sight). Replication was done through replicating the actual SQL queries modifying the data on each replica (multiple read replicas across the world) vs. replicating the data itself. Much speedier, so that the application itself was able to use read replicas around the globe, without requiring multi-master for consistency. Weekends used to "diff" in order to ensure there were no inconsistencies for whatever reason (as applying the modifying SQL queries to each replica does of course have the potential to have the data go out of sync - theoretically).
Gee, I'm old, lol!
Depending on your use-case, having soft-deletes doesn't mean you can't clean out old deleted data anyway. You may want a process that grabs all data soft-deleted X years ago and just hard-deletes it.
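Something as small as a scheduled job along these lines usually covers it (hypothetical table name and retention window):

  DELETE FROM documents
   WHERE deleted_at IS NOT NULL
     AND deleted_at < now() - interval '2 years';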
> Depends on whether undoing even happens, and whether the act of deletion and undeletion require audit records anyway.
Yes but this is no more complex than the current situation, where you have to always create the audit records.
(Again, a lot is O(log n) right?)
Another great (and older) approach is adding temporal information to your traditional database, which gives immutability without the eventual consistency headaches that normally come with event sourcing. Temporal SQL has its own set of challenges of course, but you get to keep 30+ years of relational DB tooling, which is a boon. Event sourcing is great, but we shouldn't forget about other tools in our toolbelt as well!
Also, it doesn't support non-immutable use cases AFAIK, so if you need both you have to use two database technologies (interfaces?), which can add complexity.
> Also, it doesn't support non-immutable use cases AFAIK
What do you mean? It's append only but you can have CRUD operations on it. You get a view of the db at any point in time if you so wish, but it can support any CRUD use case. What is your concern there?
It will work well if you're read-heavy and the write throughput is not insanely high.
I wouldn't say it's internally more complex than your pg with whatever code you need to make it work for scenarios like soft-delete.
From the DX perspective it's incredibly simple to work on (see Simple Made Easy from Rich Hickey).
(timestamp, accountNumber, value, state)
And then you just
SELECT state FROM Table WHERE accountNumber = ... ORDER BY timestamp DESC LIMIT 1
right?
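(Assuming, I'd guess, a matching composite index so that stays a single index descent rather than a scan; quoting the placeholder identifiers from above:)

  CREATE INDEX account_state_idx
      ON "Table" ("accountNumber", "timestamp" DESC);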
Even with indices, a table with, let's say, a billion rows can be annoying to traverse.
I think one of our problems is getting users to delete stuff they don’t need anymore.
I'm pretty sure it is possible, and it might even yield some performance improvements.
That way you wouldn't have to worry about deleted items impacting performance too much.
But HOT updates are a thing, too.
HOT updates write to the same tuple page and can avoid updating indexes, but it's still a write followed by marking the old tuple for deletion.
I assume they typo'd "partitions" as "positions", and thus the GP comment was the correct reply.
Memory >>>>> Disk in importance.
I think this is likely unnecessary for most use cases and is mostly a RAM saving measure, but could help in some cases.
I usually tell people to stop treating databases like firebase and wax on/wax off records and fields willy nilly. You need to treat the database as the store of your business process. And your business processes demand retention of all requests. You need to keep the request to soft delete a record. You need to keep a request to undelete a record.
Too much crap in the database, you need to create a field saying this record will be archived off by this date. On that date, you move that record off into another table or file that is only accessible to admins. And yes, you need to keep a record of that archival as well. Too much gunk in your request logs? Well then you need to create an archive process for that as well.
These principles are nothing new. They are in line with “Generally Accepted Record Keeping Principles” which are US oriented. Other countries have similar standards.
Views help, but then you're maintaining parallel access patterns. And the moment you need to actually query deleted records (audit, support tickets, undo) you're back to bypassing your own abstractions.
Event sourcing solves this more cleanly but the operational overhead is real - most teams I've seen try it end up with a hybrid where core entities are event-sourced and everything else is just soft deleted with fingers crossed.
The pot is calling the kettle black.
Forget about soft deletes for a hot minute. I can give you another super basic example where, in my experience, SWEs and BI guys both lose the plot: Type 2 slowly-changing dimensions. This is actually heavily related to soft deletes, and much more common as far as access patterns are concerned. Say you want to support data updates without losing information unless specified by a retention policy. For argument's sake, let's say you want to keep track of edits in the user profile. How do you do it?

If you go read up on Stack Overflow, or whatever, you will come across the idea that did more violence to schemas worldwide than anything else in existence: the "audit table." So instead of performing a cheap INSERT on a normalised data structure every time you need to make a change, and perhaps reading up-to-date data from a view, you're now performing a costly UPDATE, and an additional INSERT anyway. Why? Because apparently DISTINCT ON and composite primary keys are black magic (and anathema to ORMs in general).

If you think the BI side is doing any better, you think wrong! To them, DISTINCT ON is oftentimes a mystery no less. One moment, blink, there you go, back in the subquery hell they call home.
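The cheap-INSERT shape I'm talking about, with hypothetical profile columns:

  CREATE TABLE user_profile_versions (
      user_id      bigint      NOT NULL,
      valid_from   timestamptz NOT NULL DEFAULT now(),
      display_name text,
      email        text,
      PRIMARY KEY (user_id, valid_from)   -- composite key, no surrogate needed
  );

  -- every "edit" is just an INSERT of a new version row;
  -- the up-to-date view: one row per user, latest version wins
  CREATE VIEW user_profile AS
      SELECT DISTINCT ON (user_id) *
        FROM user_profile_versions
       ORDER BY user_id, valid_from DESC;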
Databases are beautiful, man.
It's a shame they are not treated with the respect they deserve.
If I started from scratch, I would get rid of UPDATE and DELETE (these would be only very special cases for data privacy), and instead focus on first class views (either batch copy or streaming) and retention policies.
Soft-delete is a common enough ask that it's probably worth putting the best CS/database minds to developing some OOTB feature.
In my experience, usually along the lines of "what was the state of the world?" (valid-time as-of query) instead of "what was the state of the database?" (system-time as-of query).
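In SQL:2011-style terms (hypothetical bitemporal "policy" table; the system-time syntax is what MariaDB/SQL Server offer, Postgres would need explicit period columns for both), the two questions look like:

  -- valid time: what did the business consider true on that date?
  SELECT * FROM policy
   WHERE valid_from <= DATE '2023-06-01'
     AND valid_to   >  DATE '2023-06-01';

  -- system time: what did the database itself contain on that date?
  SELECT * FROM policy
  FOR SYSTEM_TIME AS OF TIMESTAMP '2023-06-01 00:00:00';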
Some more rules to keep it under control:
Partition table has to be append-only. Duh.
Recovering from a delete needs to be done in the application layer. The archive is meant to be a historical record, not an operational data store. Also by the time you need to recover something, the world may have changed. The application can validate that restoring this data still makes sense.
If you need to handle updates, treat them as soft deletes on the source table. The trigger captures the old state (the row before the update) and the operation continues normally. Your application can then reconstruct the timeline by ordering archive records by timestamp.
Needless to say, make sure your trigger fires BEFORE the operation, not AFTER. You want to capture the row state before it's gone. And keep the trigger logic dead simple as any complexity there will bite you during high-traffic periods.
For the partition strategy, I've found monthly partitions work well for most use cases. Yearly if your volume is low, daily if you're in write-heavy territory. The key is making sure your common queries (usually "show me history for entity X" or "what changed between dates Y and Z") align with your partition boundaries.
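A bare-bones version of that setup (PostgreSQL, hypothetical names, monthly range partitions):

  CREATE TABLE bookings_archive (
      archived_at timestamptz NOT NULL DEFAULT now(),
      op          text        NOT NULL,   -- 'UPDATE' or 'DELETE'
      row_data    jsonb       NOT NULL
  ) PARTITION BY RANGE (archived_at);

  CREATE TABLE bookings_archive_2024_06 PARTITION OF bookings_archive
      FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');

  CREATE FUNCTION archive_bookings() RETURNS trigger AS $$
  BEGIN
      INSERT INTO bookings_archive (op, row_data) VALUES (TG_OP, to_jsonb(OLD));
      IF TG_OP = 'UPDATE' THEN
          RETURN NEW;   -- let the update apply its new values
      END IF;
      RETURN OLD;       -- let the delete proceed
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER bookings_archive_trg
      BEFORE UPDATE OR DELETE ON bookings
      FOR EACH ROW EXECUTE FUNCTION archive_bookings();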
Storage is cheap. Never delete data.
* All writes are inserts into append-only tables. ("UserCreatedByEnrollment", "UserDeletedBySupport" instead of INSERT vs UPDATE on a stateful CRUD table)
* Declare views on these tables in the DB that present the data you want to query -- including automatically maintained materialized indices on multiple columns resulting from joins. So your "User" view is an expression involving those event tables (or "UserForApp" and "UserForSupport"), and the DB takes care of maintaining indices on these which are consistent with the insert-only tables.
* Put in archival policies saying to delete / archive events that do not affect the given subset of views. ("Delete everything in UserCreatedByEnrollment that isn't shown through UserForApp or UserForSupport")
I tend to structure my code and DB schemas like this anyway, but lack of smoother DB support means it's currently for people who are especially interested in it.
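Roughly the shape I mean, in plain SQL (hypothetical tables; today you have to maintain the "materialized index" part yourself):

  CREATE TABLE UserCreatedByEnrollment (
      user_id    bigint PRIMARY KEY,
      email      text NOT NULL,
      created_at timestamptz NOT NULL DEFAULT now()
  );

  CREATE TABLE UserDeletedBySupport (
      user_id    bigint PRIMARY KEY,
      deleted_at timestamptz NOT NULL DEFAULT now(),
      reason     text
  );

  -- what the application reads: users created and not since deleted
  CREATE VIEW UserForApp AS
      SELECT c.user_id, c.email, c.created_at
        FROM UserCreatedByEnrollment c
        LEFT JOIN UserDeletedBySupport d USING (user_id)
       WHERE d.user_id IS NULL;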
Some bleeding edge DBs let you do at least some of this efficiently and in a user-friendly way. I.e. they will maintain powerful materialized views and you don't have to write triggers etc. manually. But I long for the day we get more OLTP focus in this area, not just OLAP.
My point is that event sourcing would have been a lot less painful if popular DBs had builtin support for it in the way I describe.
If you go with event sourcing today you end up with having to do a lot of things that the DB could have been able to handle automatically, but there's an abstraction mismatch.
(I've worked with 3-4 different strategies for doing event sourcing in SQL DBs in my career)
There are many legitimate reasons to delete data. The decision to retain data forever should not be taken lightly.
The problems are mostly the same as with soft delete: valid_to is more or less the same as deleted_at, which we probably need anyway to mark a record as deleted rather than simply updated. Furthermore, there are way more records in the db. And what about the primary key? Maybe those extra records go to a history table to keep the current table slim and with a unique primary key that isn't augmented by some artificial extra key. There are a number of possible designs.
We've switched to CDC using Postgres which emits into another (non-replicated) write-optimized table. The replication connection maintains a 'subject' variable to provide audit context for each INSERT/UPDATE/DELETE. So far, CDC has worked very well for us in this manner (Elixir / Postgrex).
I do think soft-deletes have their place in this world, maybe for user-facing "restore deleted" features. I don't think compliance or audit trails are the right place for them however.
But what happens if you need to manually update a record?
The data archive serialized the deleted object using the schema as it was at that point in time.
But fast-forward through some schema changes, and now your system has to migrate the archived objects to the current schema?
Of course, as always, it depends on the system and how the archive is used. That's just my experience. I can imagine that if there are more tools or features built around the archive, the situation might be different.
I think maintaining schema changes and migrations on archived objects can be tricky in its own ways, even kept in the live tables with an 'archived_at' column, especially when objects span multiple tables with relationships. I've worked on migrations where really old archived objects just didn't make sense anymore in the new data model, and figuring out a safe migration became a difficult, error-prone project.
Aside, another idea that I've kicked forward for event driven databases is to just use a database like sqlite and copy/wipe the whole thing as necessary after an event or the work that's related to that database. For example, all validation/chain of custody info for ballot signatures... there's not much point in having it all online or active, or even mixed in with other ballot initiatives and the schema can change with the app as needed for new events. Just copy that file, and you have that archive. Compress the file even and just have it hard archived and backed up if needed.
There should be a preferred way to handle this as these are clearly real issues that the database should help you to deal with.
“Delete” “archive” “hide” are the type of actions a user typically wants, each with their own semantics specific to the product. A flag on the row, a separate table, deleting a row, these are all implementation options that should be led by the product.
Users generally don’t even know what a database record is. There is no reason that engineers should limit their discussions of implementation details to terms a user might use.
> “Delete” “archive” “hide” are the type of actions a user typically wants, each with their own semantics specific to the product.
Users might say they want “delete”, but then also “undo”, and suddenly we’re talking about soft delete semantics.
> A flag on the row, a separate table, deleting a row, these are all implementation options that should be led by the product.
None of these are terms an end user would use.
I've worked for a company where some users managed very personal information on behalf of other users, like, sometimes, very intimate data, and I always fought product on soft deletion.
Users are adults, and when part of their job is being careful with the data _they_ manage and _they_ are legally responsible for, I don't feel like the software owes them anything other than clear information about what is going to happen when they click on "CONFIRM DELETION".
"Archive" is a good pattern for those use cases. It's what have been used for decades for OS "Recycle Bin". Why not call it Delete if you really want to but in this case, bring a user facing "Recycle Bin" interface and be clear than anything x days old will be permanently deleted.
So users have been taught that the term "delete" means "move somewhere out of my sight". If you design a UI and make "delete" mean something completely different from what everyone already understands it to mean, the problem is you, not the user.
There are stories all over the internet involving people who leave stuff in their recycle bin or deleted items and then are shocked when it eventually gets purged due to settings or disk space limits or antivirus activity or whatever.
Storing things you care about in the trash is stupid behavior and I hope most of these people learned their lessons after the one time. But recycle bin behavior is beneficial to a much larger set of people, because accidental deletion is common, especially for bulk actions. “Select all these blurry photos, Delete, Confirm, Oh, no! I accidentally deleted the last picture of my Grandma!”
Recycle bin behavior can also make deletion smoother because it allows a platform to skip the Confirm step since it’s reversible.
Storing something so you can read it in a year, or even after a blackout, is a user requirement, which leads to persistence.
And if this is a user requirement, deleting ("un-storing") is a user requirement too.
"I want to delete something but I also want to recover it" is another requirement.
Of course, you could also have regulatory requirements pointing to hard-deleting or not hard-deleting anything, but this also holds for a lot of other issues (think UX - accessibility can be constrained by regulations, but you also want users to somehow have a general idea of the user experience).
"Undo" may work as shorthand for "whatever the best reversing actions happen to be", but as any system grows it stops being simple.
I read a blog about a technical topic aimed at engineers, not customers.
It's fairly common in some industries to get support requests to recover lost data.
We have soft_deleted as boolean which excludes data from all queries and last_updated which a particular query can use if it needs to.
If over 50% of your data is soft deleted then it's more like historical data for archiving purposes and yes, you need to move it somewhere else. But then maybe you shouldn't use soft delete for it but a separate "archive" procedure?
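Something like a periodic move instead of a flag, e.g. (PostgreSQL, hypothetical names, with tickets_archive sharing the columns of tickets):

  WITH moved AS (
      DELETE FROM tickets
       WHERE closed_at < now() - interval '1 year'
       RETURNING *
  )
  INSERT INTO tickets_archive SELECT * FROM moved;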
One reason is that you might want to know when it was last updated before it was deleted.
Strange. I've only ever heard of legal requirements preventing deletion of things you'd expect could be fully deleted (in case they're needed as evidence at trial or something).
The data subject to the regulation had a high potential for abuse. Automated anti-retention limits the risk and potential damage.
If you leave a comment on a forum, and then delete it, it may be marked as soft-deleted so that it doesn't appear publicly in the thread anymore, but admins can still read what you wrote for moderation/auditing purposes.
On the other hand, if you send a privacy deletion request to the forum, they would be required to actually fully delete or anonymize your data, so even admins can no longer tie comments that you wrote back to you.
Most social media sites probably have to implement both of these processes/systems.
Unless there’s a regulatory requirement (which there currently isn’t in any jurisdiction I’ve heard of), that’s a perfectly acceptable response.
I guess I'm saying the former is usually a functional requirement in the first place, and the latter is a non-functional (compliance) requirement.
You keep 99%, soft delete 1%, use some sort of deleted flag. While I have not tried it, whalesalad's suggestion of a view sounds excellent. You delete 99%, keep 1%, move it!
Also you can have most data being currently unused even without being flagged deleted. Like if I go in to our ticketing system, I can still see my old requests that were closed ages ago.
This works well especially in cases where you don’t want to waste CPU/memory scanning soft deleted records every time you do a lookup.
And avoids situations where app/backend logic forgets to apply the “deleted: false” filter.
However after 15 years I prefer to just back up regularly, have point in time restores and then just delete normally.
The amount of times I have “undeleted” something are few and far between.
> However after 15 years I prefer to just back up regularly, have point in time restores and then just delete normally.
> The amount of times I have “undeleted” something are few and far between.
Similar take from me. Soft deletes sorta make sense if you have a very simple schema, but the biggest problem I have is that a soft delete leads to broken-ness - some other table now has a reference to a record in the target table that is not supposed to be visible. IOW, DB referential integrity is out the window because we can now have references to records that should not exist!
My preferred way (for now, anyway) is to copy the record to a new audit table and nuke it in the target table in a single transaction. If the delete fails we can at least log the fact somewhere that some FK somewhere is preventing a deletion.
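I.e. something like this (hypothetical names; the audit table carries the original columns plus when/who):

  BEGIN;

  INSERT INTO customers_audit (deleted_at, deleted_by, id, name, email)
  SELECT now(), current_user, id, name, email
    FROM customers
   WHERE id = 42;

  DELETE FROM customers WHERE id = 42;   -- an FK violation rolls back the whole thing

  COMMIT;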
With soft deletes, all sorts of logic rules and constraints are broken.
We have an offline-first infrastructure that replicates the state to possibly offline clients. Hard deletes were causing a lot of fun issues with conflicts, where a client could "resurrect" a deleted object. Or deletion might succeed locally but fail later because somebody added a dependent object. There are ways around that, of course, but why bother?
Soft deletes can be handled just like any regular update. Then we just periodically run a garbage collector to hard-delete objects after some time.
Now instead of chasing down different systems and backups, you can simply ensure your archival process runs regularly and you should be good.
Of course, in a system with 1000s of tables, I would not likely do this. But for simpler systems, it's been quite a boon.
But this is such a huge PITA, because you constantly have to mind whether any given object has this setup or not, and what if related objects have different start/end dates? And something like a scheduled raise for next year to $22/hour can get funny if I then try to insert that, just for July, it will be $24/hour (this would take my single record for next year and split it into two, and then you gotta figure out which gets the original ID and which is the new row).
Another alternative to this is a pattern where you store the current state and separately you store mutations. So you have a compensation table and a compensation_mutations table which says how to evolve a specific row in a compensation table and when. The mutations for anything in the future can be deleted but the past ones cannot which lets you reconstruct who did what, when, and why. But this also has drawbacks. One of them is that you can’t query historical data the same way as current data. You also have to somehow apply these mutations (cron job? DB trigger?)
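Sketching that out (hypothetical columns; the "apply" step is whatever cron job or trigger you settle on):

  CREATE TABLE compensation (
      employee_id bigint PRIMARY KEY,
      hourly_rate numeric NOT NULL
  );

  CREATE TABLE compensation_mutations (
      id           bigserial   PRIMARY KEY,
      employee_id  bigint      NOT NULL REFERENCES compensation (employee_id),
      effective_at timestamptz NOT NULL,
      new_rate     numeric     NOT NULL,
      created_by   text        NOT NULL,   -- who
      reason       text                    -- why
  );

  -- the periodic "apply" step: latest due mutation per employee wins
  UPDATE compensation c
     SET hourly_rate = m.new_rate
    FROM (SELECT DISTINCT ON (employee_id) employee_id, new_rate
            FROM compensation_mutations
           WHERE effective_at <= now()
           ORDER BY employee_id, effective_at DESC) m
   WHERE c.employee_id = m.employee_id;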
And of course there are database extensions that allow soft deletes but I have never tried them for vague portability reasons (as if anyone ever moved off Postgres).
And perf problems are only speculative until you actually have them. Premature optimization and all that.
We found that strict CQRS/Decoupling is the only way to scale this. Let the operational DB keep the soft-deletes for audit/integrity (as mentioned by others), but the Search Index must be a clean, ephemeral projection of only what is currently purchasable.
Trying to filter soft-deletes at query time inside the search engine is a recipe for latency spikes.
* It's obvious from the schema: If there's a `deleted_at` column, I know how to query the table correctly (vs thinking rows aren't DELETEd, or knowing where to look in another table)
* One way to do things: Analytics queries, admin pages, it all can look at the same set of data, vs having separate handling for historical data.
* DELETEs are likely fairly rare by volume for many use cases
* I haven't found soft-deleted rows to be a big performance issue. Intuitively this should be true, since queries should be O(log N)
* Undoing is really easy, because all the relationships stay in place, vs data already being moved elsewhere (In practice, I haven't found much need for this kind of undo).
In most cases, I've really enjoyed going even further and making rows fully immutable, using a new row to handle updates. This makes it really easy to reference historical data.
If I was doing the logging approach described in the article, I'd use database triggers that keep a copy of every INSERT/UPDATE/DELETEd row in a duplicate table. This way it all stays in the same database—easy to query and replicate elsewhere.
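One generic way to wire that up (PostgreSQL, hypothetical names; a single jsonb history table shared by all audited tables rather than per-table duplicates):

  CREATE TABLE row_history (
      logged_at  timestamptz NOT NULL DEFAULT now(),
      table_name text        NOT NULL,
      op         text        NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
      row_data   jsonb       NOT NULL
  );

  CREATE FUNCTION log_row_change() RETURNS trigger AS $$
  BEGIN
      IF TG_OP = 'DELETE' THEN
          INSERT INTO row_history (table_name, op, row_data)
          VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
      ELSE
          INSERT INTO row_history (table_name, op, row_data)
          VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
      END IF;
      RETURN NULL;   -- return value is ignored for AFTER triggers
  END;
  $$ LANGUAGE plpgsql;

  -- attach to each table you want mirrored, e.g.:
  CREATE TRIGGER invoices_history
      AFTER INSERT OR UPDATE OR DELETE ON invoices
      FOR EACH ROW EXECUTE FUNCTION log_row_change();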