SQLite as an Application File Format

https://sqlite.org/appfileformat.html

Comments

incanus77Nov 28, 2025, 4:50 PM
I did this for MBTiles, for storing (at the time, raster) map tiles at Mapbox. I was working on the iPad wing of R&D early in the company and we were focusing on offline mapping for the iPad. Problem was, moving lots of tiny map tiles (generally 256px square PNGs) was tedious over USB and network. We had a thing called Maps on a Stick for moving things around by USB, but it just didn’t scale well to the iPad interface & file transfer needs.

Bundled the tiles into SQLite (I was inspired by seeing Dr. Hipp speak at a conference) and voila, things both easy to move and to checksum. Tiles were identified by X & Y offset at a given (Z)oom level, which made for super easy indexing in a relational DB like SQLite. On the iPad, it was then easy to give map bundles an application icon, associated datatype from file extension, metadata in a table, etc. At the time, I was fairly intimidated by the idea of creating a file format, but databases, I knew. And then making some CLI tools for working with the files in any language was trivial after that.
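
For readers who want to see the shape of this, here is a minimal sketch of a tile bundle keyed by zoom, column and row. The table and column names follow the MBTiles layout as commonly documented, but treat them (and the helper names `put_tile` and `get_tile`) as assumptions for illustration, not as the original implementation.

```python
import sqlite3

# In-memory stand-in for a tile bundle file on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metadata (name TEXT, value TEXT);
    CREATE TABLE tiles (
        zoom_level  INTEGER,
        tile_column INTEGER,
        tile_row    INTEGER,
        tile_data   BLOB
    );
    CREATE UNIQUE INDEX tile_index
        ON tiles (zoom_level, tile_column, tile_row);
""")

def put_tile(z, x, y, image_bytes):
    """Store one tile, replacing any tile already at (z, x, y)."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)",
            (z, x, y, image_bytes),
        )

def get_tile(z, x, y):
    """Return the tile bytes at (z, x, y), or None if the tile is absent."""
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, y),
    ).fetchone()
    return row[0] if row else None

with conn:
    conn.execute("INSERT INTO metadata VALUES ('name', 'demo bundle')")
put_tile(3, 4, 2, b"\x89PNG placeholder bytes")
```

The unique index is what makes the X/Y/Z lookup cheap, and the whole bundle stays a single file that can be copied or checksummed as one unit.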

jeffypooNov 28, 2025, 5:30 PM
absolutely adore the mbtiles format! thank you for creating that.
incanus77Nov 29, 2025, 1:07 AM
Thanks. Shortly afterwards, I started collaborating with a few folks and it definitely was a team effort to push it forward (icon design, UTFGrid spec, node-sqlite & server-side adoption...). It's still a very satisfying solve for the problem.
out_of_protocolNov 28, 2025, 10:09 PM
Also, tested .zip vs .tar vs sqlite vs file system. Of this bunch, sqlite was the most compact format with minimal overhead.
lateforworkNov 28, 2025, 4:47 PM
Most applications' file formats are structured as a tree, not as flat tables. If your application's data is flat tables or name-value pairs then SQLite is an obvious choice. But if it is tree structured then it is less obvious. You can still save your tree in JSON format as a blob in a SQLite table but in this case the benefits are fewer. But if in addition to the JSON you have images or other binary data then once again SQLite offers benefits, because each of those binary files can be additional rows in the SQLite table. This is far easier to handle than storing them in ZIP format.
packetlostNov 28, 2025, 5:35 PM
Maybe not as obvious for those without formal education in """database normalization""" but it's pretty trivial to convert from a tree structure to a flat table structure using foreign key relations. Recursive queries aren't even that difficult in SQLite, so self-referential data can be represented cleanly too, if a bit more difficult to write. IME most applications' "tree structures" aren't self-referential and are better formalized as distinct entities with one-to-one relationships (i.e. a subtree gets a table).

There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.
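
A minimal sketch of the self-referential case, assuming a hypothetical `node` table where each row points at its parent. The recursive common table expression walks the tree from any starting row; the table, column and function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL marks the root
        name      TEXT NOT NULL
    );
    INSERT INTO node (id, parent_id, name) VALUES
        (1, NULL, 'document'),
        (2, 1,    'chapter 1'),
        (3, 1,    'chapter 2'),
        (4, 2,    'section 1.1');
""")

def subtree(root_id):
    """Return (depth, name) for root_id and all of its descendants."""
    return conn.execute(
        """
        WITH RECURSIVE walk(id, name, depth) AS (
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT node.id, node.name, walk.depth + 1
            FROM node JOIN walk ON node.parent_id = walk.id
        )
        SELECT depth, name FROM walk ORDER BY depth, id
        """,
        (root_id,),
    ).fetchall()
```

Calling `subtree(2)` returns only chapter 1 and its section, which is the kind of partial read that a single JSON blob makes awkward.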

doglineNov 29, 2025, 3:42 AM
Until just now, I've been trying to figure out why people think that JSON is necessary in the database? Yes, lots of data is hierarchical, and you just normalize it into tables and move on. The fact that some people don't work this way, and would like to put this data as it stands into a JSON tree hadn't occurred to me.

What problem does normalization solve? You don't have to parse and run through a tree every time you're looking for data. You would, however, need to rebuild the tree through self joins or other references in other cases, I suppose. It depends how far you break down your data. I understand that we all see data structures a bit differently, however.

yellowappleNov 29, 2025, 10:04 PM
> There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.

SQLite at least provides functions to make the “querying” part of that straightforward: https://sqlite.org/json1.html
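
As a rough illustration of those functions, the sketch below stores JSON in a TEXT column, queries one path with `json_extract`, and adds an expression index on that path. It assumes the SQLite build behind Python's `sqlite3` module includes the JSON functions; the `doc` table and `titles_by` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO doc (body) VALUES (?)",
    [
        ('{"title": "Draft", "author": {"name": "ann"}, "words": 120}',),
        ('{"title": "Notes", "author": {"name": "bob"}, "words": 45}',),
    ],
)
# Expression index on one JSON path, so lookups by author can use an index
# instead of parsing every stored document.
conn.execute(
    "CREATE INDEX doc_author ON doc (json_extract(body, '$.author.name'))"
)
conn.commit()

def titles_by(author):
    """Return the titles of all documents whose author name matches."""
    rows = conn.execute(
        "SELECT json_extract(body, '$.title') FROM doc "
        "WHERE json_extract(body, '$.author.name') = ? ORDER BY id",
        (author,),
    )
    return [r[0] for r in rows]
```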

lateforworkNov 28, 2025, 11:55 PM
What problem are you trying to solve with this approach? Unless your document is huge and you need the ability to read or update portions of it, it is better to just read and write JSON.
packetlostNov 29, 2025, 1:25 AM
There's a laundry list of benefits that all add up, not like one specific killer feature. Some applications really do have very complex configuration needs, but it's sorta situation dependent on whether embedding a scripting language or a database is the right solution (for really simple cases I'm more likely to reach for TOML).

An incomplete list of benefits of using SQLite:

- Runtime config changes for free

- type safety

- strong migration support

- incorrect configurations can be unrepresentable (or at least enforced with check constraints)

- interactable from text-based interfaces and strong off-the-shelf GUI support
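
To make the check-constraint point concrete, here is a small sketch under assumed names (a `setting` table with `port` and `log_level` keys, neither from the original comment). The database itself refuses out-of-range values, so a bad configuration never reaches disk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE setting (
        key   TEXT PRIMARY KEY,
        value INTEGER NOT NULL,
        CHECK (key <> 'port'      OR value BETWEEN 1 AND 65535),
        CHECK (key <> 'log_level' OR value BETWEEN 0 AND 4)
    )
    """
)

def set_option(key, value):
    """Upsert one setting; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT OR REPLACE INTO setting VALUES (?, ?)", (key, value)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A rejected write leaves the previous value in place, which is also what makes runtime changes reasonably safe.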

dardeaupNov 29, 2025, 4:14 PM
Type safety as a benefit of SQLite? For me type safety is a negative of SQLite. Being able to store a different type than what the column is declared to store is a bug (not a feature). I also find the lack of DATE and DATETIME/TIMESTAMP to be less than ideal.
HelloNurseDec 2, 2025, 9:21 AM
We are talking about an application file format, so "type errors" are about who's right: the application (even better, multiple equally right implementations of a specification) or random hackers altering the file in incorrect ways.

Loose type checks, e.g. NOT NULL columns of "usually" text, are loose only compared to typical SQL table definitions; compared to the leap forward of using abstract tables and changing them with abstract SQL instead of using text or byte buffers and making arbitrary changes, enforcing data types on columns would be only a marginal improvement.

packetlostNov 29, 2025, 4:57 PM
I pretty much always store date/times as Unix epoch integers. Also use STRICT tables and set the PRAGMA to enforce foreign key constraints.
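
A minimal sketch of that setup, with hypothetical `author` and `post` tables. STRICT tables need SQLite 3.37 or newer, so the example checks the library version and falls back to ordinary tables on older builds; foreign key enforcement has to be switched on for each connection.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default, set per connection

# STRICT is only available from SQLite 3.37; use plain tables otherwise.
strict = " STRICT" if sqlite3.sqlite_version_info >= (3, 37, 0) else ""
conn.execute(
    f"CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL){strict}"
)
conn.execute(
    f"""
    CREATE TABLE post (
        id         INTEGER PRIMARY KEY,
        author_id  INTEGER NOT NULL REFERENCES author(id),
        created_at INTEGER NOT NULL  -- Unix epoch seconds
    ){strict}
    """
)

def try_insert(author_id, created_at):
    """Insert a post; return False if SQLite rejects the row."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO post (author_id, created_at) VALUES (?, ?)",
                (author_id, created_at),
            )
        return True
    except sqlite3.Error:
        return False

with conn:
    conn.execute("INSERT INTO author (id, name) VALUES (1, 'ann')")

ok = try_insert(1, int(time.time()))   # valid row
orphan = try_insert(99, 0)             # no such author: foreign key violation
bad_type = try_insert(1, "yesterday")  # rejected only when STRICT is active
```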
lateforworkNov 29, 2025, 1:32 AM
Most frameworks can serialize and deserialize JSON from strongly typed classes. For example, Newtonsoft in .NET. The rest isn't worth the effort for most people. Your scenario may be unusual.
packetlostNov 29, 2025, 2:46 AM
I've certainly had some unusual contents in the past where we had approximately 10,000 configurable properties on the system, but we didn't use SQLite for that. Regardless, you ignored 3 of the 4 (I'll ignore the last one, it applies to JSON too) other points I made. My use cases aren't that weird and I'm not saying reach for SQLite every time, it's one option out of many. Migrations and runtime configuration change alone justify it for me in many cases.
somatNov 28, 2025, 5:40 PM
I am not really classically trained on the subject but I think this is the idea behind relational storage, it is to have better extraction options, you don't have to treat your data as a single document at a time.

Naively, most data looks hierarchical and the instinctive reaction is to make your file format match. But if you think of this as a set of documents stacked on top of each other, and you take the data as a bunch of 90 degree slices down through the stack, now your data is relational. You lose the nice native hierarchical format, but you gain all sorts of interesting analysis and extraction options.

It is too bad relational data types tend to be so poorly represented in our programming languages, generally everything has to be mapped back to a hierarchical type.

robrenaudNov 28, 2025, 5:00 PM
I had some json data that I wanted an annotation interface for. So I asked codex to put it into sqlite and make a little annotation webserver. It worked quickly/easily and without hassle. Sqlite supports queries over json-like objects.

Maybe a very simple document oriented db would have been better?

My biggest gripe is that the sqlite cli is brutally minimal (makes sense given design), but I probably should have been using a nicer cli.

fauigerzigerkNov 28, 2025, 6:39 PM
What do you mean by "json-like objects"?

My issue with SQLite's JSON implementation is that it cannot index arrays. SQLite indexes can only contain a single value per row (except for fulltext indexes but that's not what I want most of the time). SQLite has nothing like GIN indexes in Postgres.
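
One possible workaround, sketched here as an assumption and not as anything SQLite does for you: have the application copy array elements into a side table with `json_each` and index that table. The `doc` and `doc_tag` tables, the `$.tags` path, and both helper names are hypothetical, and the JSON functions are assumed to be compiled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE doc_tag (
        doc_id INTEGER REFERENCES doc(id),
        tag    TEXT
    );
    CREATE INDEX doc_tag_by_tag ON doc_tag (tag, doc_id);
""")

def add_doc(body_json):
    """Insert a document and copy its $.tags array into the side table."""
    with conn:
        doc_id = conn.execute(
            "INSERT INTO doc (body) VALUES (?)", (body_json,)
        ).lastrowid
        conn.execute(
            "INSERT INTO doc_tag (doc_id, tag) "
            "SELECT ?, value FROM json_each(?, '$.tags')",
            (doc_id, body_json),
        )
    return doc_id

def docs_with_tag(tag):
    """Return ids of documents carrying the tag, via the ordinary index."""
    rows = conn.execute(
        "SELECT doc_id FROM doc_tag WHERE tag = ? ORDER BY doc_id", (tag,)
    )
    return [r[0] for r in rows]
```

The cost is that the side table has to be kept in sync on every update and delete, which is exactly the bookkeeping a GIN-style index would otherwise take care of.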

elephantumNov 28, 2025, 4:50 PM
You do know that you can create more than one table in SQLite and have references from one to another? Even recursive references work.
renegat0x0Nov 28, 2025, 2:51 PM
I think I use SQLite like that (to some extent):

- https://github.com/rumca-js/Internet-Places-Database

For UI I use HTML, because it already provides components with Bootstrap, and everybody can use it without installing any software.

All data comes from a single SQLite file that is easy to read, and returns data.

My database is really big, so it takes time to browse it. I wanted to provide a more meaningful way to limit the scope of searching.

scary-sizeNov 28, 2025, 3:49 PM
Actually used it for a desktop blogging app a few years ago. It was great! I could set up a blog skeleton, send the file to a family member. They could focus on writing content and hitting deploy.

https://blog.project-daily.com/pages/file-format_3705.html

stavarottiNov 28, 2025, 12:17 PM
gjvcNov 28, 2025, 2:01 PM
[flagged]
lmshnNov 28, 2025, 2:14 PM
The previous discussion usually has useful and interesting conversations that are nice to revisit.
gjvcNov 28, 2025, 3:41 PM
[flagged]
wat10000Nov 28, 2025, 2:35 PM
It’s a helpful link, not a criticism.
gjvcNov 28, 2025, 3:41 PM
[flagged]
wat10000Nov 28, 2025, 4:08 PM
It makes it easy for me to find additional commentary for things I’m interested in, and I appreciate that. There’s no policing going on. If you find it off-putting, that’s your problem.
gjvcNov 28, 2025, 4:16 PM
as i said and you ignored, all it does is put people off from having fresh conversations.
macintuxNov 28, 2025, 5:52 PM
You do realize that dang himself frequently aggregates related discussions, and thanks people for doing so?

And that the previous discussions of the same URL are readily available at the top of the topic, via the "past" link.

So either HN itself is actively discouraging discussions, which seems unlikely, or your perception of this is askew.

gjvcNov 29, 2025, 6:58 PM
[flagged]
macintuxNov 29, 2025, 8:38 PM
Fair enough, I apologize.
wat10000Nov 28, 2025, 6:11 PM
I directly addressed that in my last sentence.
joelwallisNov 28, 2025, 4:09 PM
SQLite is absolutely amazing as an app format! I couldn't list how many tools are available to read SQLite data, or how easy and friendly they are. Even its CLI does wonders when you're dealing with data with it. SQLite has been around for 20+ years and is one of the most heavily tested pieces of software in the world.

SQLite is very simple, yet very reliable and powerful. Using SQLite as file format might be the best decision an engineer can take when it comes to future-proofing preservation of data.

zkmonNov 29, 2025, 10:17 AM
I have used SQLite file as the application itself. Almost. The tables would store the application features, UIs and logic. A generic kernel would bring up the application from the database.
kianNNov 28, 2025, 4:25 PM
This approach has really helped me out in my work. I do something very similar using DuckDB to slurp output files anytime I write a custom hierarchical model. The single sql queryable file simplified my storage and analytics pipeline. I imagine SQLite would be especially ideal where long term data preservation is critical.
spdegabrielleNov 28, 2025, 5:23 PM
I think the developers had the same idea https://fossil-scm.org/
rtyu1120Nov 28, 2025, 3:15 PM
Bit unrelated rant but I'm still not sure why ZIP has been adopted as an Application File Format rather than anything else. It is a remnant of the DOS era with questionable choices, why would you pick it over anything else?
amiga386Nov 28, 2025, 3:38 PM
- archiver format to stow multiple files in one; your actual files (in your choice of format(s)) go inside

- files can be individually extracted, in any order, from the archive

- thousands of implementations available, in every language and every architecture. no more than 32KiB RAM needed for decompression

- absolutely no possibility of patent challenges

HelloNurseNov 28, 2025, 4:17 PM
Also architecturally suitable for the common case of collecting heterogeneous files in existing and new formats into a single file, as opposed to designing a database schema or a complex container structure from scratch.

Any multi-file archive format would do, but ZIP is very portable and random access.

crazygringoNov 28, 2025, 4:24 PM
If all you need is a bag of named blobs and you just want quick reasonable compression supported across all platforms, why not?

If you don't need any table/relational data and are always happy to rewrite the entire file on every save, ZIP is a perfectly fine choice.

It's easier than e.g. a SQLite file with a bunch of individually gzipped blobs.

tetracaNov 28, 2025, 3:24 PM
Because Windows can view and extract them out of the box without installing any additional applications. If it supported anything better out of the box I'd guess people would use that instead.
lvhNov 28, 2025, 3:43 PM
"The operating system makes it easy to mess with" doesn't seem like a particularly useful property for application file formats.
TeMPOraLNov 28, 2025, 3:50 PM
It was, back when software development was run by hackers and not suits and security people. Easy access was a feature for users, too; back in those days, software was a tool that worked on data, it didn't try to own the data.
dahartNov 28, 2025, 4:51 PM
ZIP isn’t an application format, it’s a container, no? You store files with any format in a .zip, and that’s what applications do - they read files with other formats out of the .zip. What are your goals; what else would you pick, and why? What are the questionable choices you refer to?
amiga386Nov 28, 2025, 5:26 PM
I suspect he means the choices of putting the central directory headers at the end of the file, as well as having local file headers as you read through the file, which allows for ambiguity.

Alternatively, he could mean that, for the purposes of archiving, ZIP is very far behind the state of the art (no solid compression, old algorithms, small windows, file size limits without the ZIP64 extensions, and so on, most of which are not relevant to using ZIP as a container format)

dahartNov 28, 2025, 6:56 PM
Thanks, makes sense. Are the headers even an issue when using ZIP as a container? Are there superior alternatives in practice?

I’ve reached for ZIP for application containers because it’s really easy, not because of design choices that affect me. Typically the compression is a convenient byproduct but not a requirement, and file size limits could be an issue, perhaps, but aren’t something I’ve ever hit when using ZIP for application data. File size limits are something I’ve hit when trying to archive lots of files.

Using ZIP for build pipelines that produce a large number of small files is handy since it’s often faster than direct file I/O, even on SSDs. In the past it was much faster than spinning media, especially DVDs. These days in Python you can unzip to RAM and treat it like a small file system - and for that file size limits aren’t an issue in practice.
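
A short sketch of the unzip-to-RAM idea using only the standard library. The member names are made up for the demo, and the archive is built in memory first so the example is self-contained.

```python
import io
import zipfile

# Build a small archive in memory; in practice the bytes would come from disk
# or from a build step.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("manifest.json", '{"version": 1}')
    zf.writestr("assets/icon.txt", "pretend this is an image")

# Reopen the same bytes and read members in any order, without extracting
# anything to the file system.
archive = zipfile.ZipFile(io.BytesIO(buf.getvalue()))
names = sorted(archive.namelist())
icon = archive.read("assets/icon.txt").decode("utf-8")
```

Because the central directory lists every member, a single entry can be read without touching the others.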

thijsonNov 28, 2025, 4:52 PM
AMD/Xilinx Vivado uses ZIP format to compress design checkpoints. They just give them a .dcp extension though.
mikkupikkuNov 28, 2025, 3:25 PM
It works well enough. What could, for instance, epubs gain by having another base format instead?
gus_massaNov 28, 2025, 3:32 PM
I think most formats use "gzip" instead of "zip".
johannes1234321Nov 28, 2025, 5:22 PM
gzip and tar+gzip aren't good options for application data compared to zip.

zip is used for Java jar files, OpenOffice documents and other cases.

The benefit is that individual files in the archive can be accessed individually. A tgz file is a stream which can (without extra trickery) only be extracted from beginning to end with no seeking to a specific record and no way to easily replace a single file without rewriting everything.

tgz is good enough for distributing packages which are supposed to be extracted at once (a software distribution)

conradludgateNov 28, 2025, 4:11 PM
gzip is not an archive container. You're thinking of .tar.gz which is a "tape archive" format which is compressed using gzip. Zip is by itself both a compression and an archive format, and is what documents like epub or docx use
gus_massaNov 28, 2025, 4:22 PM
You are right, but other documents like .ggb (GeoGebra files) or .mbz (Moodle backups) use the .tar.gz method. I even wrote programs to open them, make a few tweaks and save the new version in another compatible file.
abhashanand1501Nov 28, 2025, 4:44 PM
We are working on using SQLite to transfer configurations from the UAT to the production environment. Since the configurations are already saved in a Postgres table in UAT, moving some configs from UAT to production in an SQLite file is very easy. Since it's a binary format, we are also saved from any inadvertent edits by people doing the production deployment.

Another use case is exporting data from production to UAT for testing some scenarios; it can be easily encoded in an SQLite file.

asklNov 28, 2025, 3:12 PM
Somehow my first thought from the title was using sqlite as a format for applications. So like a replacement for ELF. I think this idea is both fascinating and horrifying.
trwsNov 28, 2025, 3:18 PM
I worked with @fzakaria on developing that idea. It actually worked surprisingly well. The benefits are mostly in the ability to analyze the binary afterward, though, rather than any measurable benefit in load time or anything like that. I don’t have the repo for the musl-based loader handy, but here’s the one for the virtual table plugin for SQLite to read from raw ELF files: https://github.com/fzakaria/sqlelf
yellowappleNov 29, 2025, 10:19 PM
I've been pondering something similar as a modern approach to fat binaries, basically around a table like

    CREATE TABLE functions (name TEXT, arch TEXT, body BLOB);
The advantage would be that binaries could be partially fattened, i.e. every function would have at least one implementation in some cross-platform bytecode (like WASM), and then some functions would get compiled to machine code as necessary, and then the really-performance-dependent functions would have extra rows for different combinations of CPU extensions or compiler optimization levels or whatever — and you could store all of these in the same executable instead of having a bunch of executables for each target.

As a bonus, it'd be possible to embed functions' source code into the executable directly this way, whether for development purposes (kinda like how things are sometimes done in the old-school Smalltalk and Lisp worlds) or for debugging purposes (e.g. when printing stack traces).
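
A hedged sketch of how a loader might pick an implementation from the `functions` table above: prefer a row for the host architecture and fall back to the portable bytecode. The `resolve` helper, the sample rows and the architecture strings are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE functions (name TEXT, arch TEXT, body BLOB)")
conn.executemany(
    "INSERT INTO functions VALUES (?, ?, ?)",
    [
        ("main",   "wasm",   b"wasm-main"),
        ("memcpy", "wasm",   b"wasm-memcpy"),
        ("memcpy", "x86_64", b"native-memcpy"),
    ],
)
conn.commit()

def resolve(name, host_arch, fallback="wasm"):
    """Return (arch, body) for the best available implementation, or None."""
    return conn.execute(
        "SELECT arch, body FROM functions "
        "WHERE name = ? AND arch IN (?, ?) "
        "ORDER BY arch = ? DESC "  # rows matching the host sort first
        "LIMIT 1",
        (name, host_arch, fallback, host_arch),
    ).fetchone()
```

With this data, `memcpy` resolves to the native row on x86_64 while `main` falls back to the bytecode row.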

giancarlostoroNov 28, 2025, 4:07 PM
Forget elf, imagine having a SQLite file that stores elf, exe and DMG binaries. I would not mind working on something like this.
actionfromafarNov 28, 2025, 5:40 PM
Not that at all, but interesting in its own right - https://pypi.org/project/sqlelf/ explore ELF via SQL.
giancarlostoroNov 28, 2025, 10:16 PM
Yeah I'm thinking of like "appimage" but you can use it to run on any platform.
gjvcNov 28, 2025, 3:44 PM
wonder if this would make hot-swap functions easier, if every function had its own section and every section was in the db
kstrauserNov 28, 2025, 4:24 PM
I think we could call it Library Internal Sequel Procedures.
yreadNov 28, 2025, 4:20 PM
Or a replacement for Access
ejstemblerNov 28, 2025, 5:38 PM
The Acorn macOS app uses SQLite in a similar way: https://flyingmeat.com/acorn/docs/technotes/ACTN002.html
itopaloglu83Nov 28, 2025, 5:55 PM
Recently reverse engineered the Money Pro backup format, it's a binary file with SQLite with some additional XML information baked in. It feels like they're purposefully making it harder for users to export their data in a useful format, especially after the changes they made to their financial model.
incanus77Nov 28, 2025, 6:12 PM
Yes! Gus (the developer) also has made & maintained FMDB for many years, a nice Cocoa wrapper for the SQLite bindings.

https://github.com/ccgus/fmdb

jansommerNov 28, 2025, 4:33 PM
Something to consider when using SQLite as a file format is compression (correct me if I'm wrong!). You might end up with a large file unless you consider this, and can't/won't just gz the entire db. Nothing is compressed by default.
crazygringoNov 29, 2025, 3:04 PM
Sure. But if you have reasonably small files just compress the whole file, like MS Office or EPUB files do.

Or if your files are large and composed of lots of blobs, then compress those blobs individually.

Whereas if your files are large and truly database-y made of tabular data like integers and floats and small strings, then compression isn't really very viable. You usually want speed of lookup, which isn't generally compatible with compression.

lateforworkNov 28, 2025, 4:37 PM
It can be compressed, see https://sqlite.org/sqlar.html
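
For a sense of what that looks like in practice, here is a rough sketch that writes and reads entries in the `sqlar` table layout described on that page, using zlib from Python. The helper names are hypothetical, and the rule of storing raw bytes when compression does not help reflects my reading of the format, so verify it against the linked documentation before relying on it.

```python
import sqlite3
import time
import zlib

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sqlar (
        name  TEXT PRIMARY KEY,  -- file name
        mode  INT,               -- access permissions
        mtime INT,               -- last modification time
        sz    INT,               -- original size in bytes
        data  BLOB               -- content, compressed when that is smaller
    )
    """
)

def add_file(name, content, mode=0o100644):
    """Store content under name, compressing it only if that saves space."""
    packed = zlib.compress(content)
    data = packed if len(packed) < len(content) else content
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO sqlar VALUES (?, ?, ?, ?, ?)",
            (name, mode, int(time.time()), len(content), data),
        )

def read_file(name):
    """Return the original bytes for name; raise KeyError if it is missing."""
    row = conn.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sz, data = row
    # A stored blob as long as the original size means it was kept raw.
    return data if len(data) == sz else zlib.decompress(data)

add_file("notes.txt", b"hello " * 100)
add_file("tiny", b"x")
```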
nh2Nov 28, 2025, 5:51 PM
Please do not use second-resolution mtime (it cannot represent the high-accuracy mtime that modern OSs use, so packing and unpacking changes the mtime and causes differences, e.g. in rsync), or build anything new using DEFLATE (it is slow and cannot really be made fast).
mort96Nov 29, 2025, 3:41 PM
This seems completely orthogonal? This is an alternative to zip and tar built on SQLite:

> An "SQLite Archive" is a file container similar to a ZIP archive or Tarball but based on an SQLite database.

Your parent comment said that when you're using SQLite as an application format, the content in the database doesn't get compressed. These two things have nothing to do with each other.

jansommerNov 28, 2025, 4:53 PM
Archive Files is for blobs as far as I understand. All your other data remains uncompressed?
euroderfNov 28, 2025, 5:12 PM
There seems to be no single software solution "out there" for mounting an SQLite DB (or an SQLite archive) as a file system, with or without per-record relative paths.
forgotpwd16Nov 28, 2025, 5:44 PM
There's FUSE-using Sqlitefs & WebDAV-using Wddbfs.
euroderfNov 28, 2025, 7:32 PM
FUSE on Mac seems to be a kernel/permissions mess.
spdegabrielleNov 28, 2025, 5:15 PM
Is there a software solution to mounting any DB as a filesystem?
crazygringoNov 29, 2025, 3:06 PM
Why would you want to do that?
euroderfNov 29, 2025, 4:29 PM
Convenience? Not cluttering up a directory with a transient file tree?
crazygringoNov 29, 2025, 5:39 PM
But why would you want to use SQLite for that?

On a Mac, you'd e.g. use and mount a disk image if you wanted to create a filesystem inside of a file. Windows has virtual hard drives, and you can do that kind of thing on Linux too.

I don't understand why you'd ever want to use a relational database for that. It's a completely different paradigm.

Although I also don't really understand why you're worried about cluttering up a directory. And if it's transient, isn't that what temp dirs are for?

euroderfNov 29, 2025, 6:05 PM
> I don't understand why you'd ever want to use a relational database for that. It's a completely different paradigm.

Well, it might be a relational DB or else a zipfile. Why couldn't I encapsulate a file tree in a single file? Maybe it's tens of thousands of quite small files.

crazygringoNov 29, 2025, 8:34 PM
You can put tens of thousands of files in a single file lots of ways that are expressly designed for that. You don't need SQLite for that.

So why would you want to use SQLite for that is my question? Mounting a database or a table as a filesystem doesn't make much sense to me. There's a very poor fit between the two paradigms. What does a subdirectory mean in a database? What does a foreign key or set of columns mean in a filesystem?

euroderfNov 30, 2025, 12:33 AM
Maybe you misunderstand the scenario. An SQLite DB can have records where each record contains a path. This column can be used to emulate a hierarchical tree-type filesystem. There's a few different ways to represent the path information, and the parent-child connectivity among records.
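
One of those ways, sketched with assumed names: a hypothetical `entry` table whose primary key is the full path, plus a helper that lists the immediate children of a directory. The prefix match below scans the table; it is a simple illustration, not a tuned layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (path TEXT PRIMARY KEY, data BLOB)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [
        ("docs/readme.txt", b"hi"),
        ("docs/img/logo.png", b"\x89PNG"),
        ("docs/img/icon.png", b"\x89PNG"),
        ("src/main.c", b"int main(void){return 0;}"),
    ],
)
conn.commit()

def list_dir(directory):
    """Return the immediate children (files and subdirectories) of directory."""
    prefix = directory.rstrip("/") + "/"
    children = set()
    rows = conn.execute(
        "SELECT path FROM entry WHERE substr(path, 1, ?) = ?",
        (len(prefix), prefix),
    )
    for (path,) in rows:
        head, sep, _ = path[len(prefix):].partition("/")
        children.add(head + sep)  # a trailing slash marks a subdirectory
    return sorted(children)
```

A parent-id column with a recursive query would be another option for the parent-child connectivity mentioned above.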
crazygringoNov 30, 2025, 1:40 AM
Ok. Again, why? Why would you want to use a relational database as a filesystem rather than a file format explicitly designed for that, for mounting?
euroderfNov 30, 2025, 8:47 AM
If you mean (for example) a zipfile, AFAICT there's not a whole lot of difference between them when used in this capacity.
crazygringoNov 30, 2025, 1:15 PM
No, you asked about mounting specifically. And I replied:

> On a Mac, you'd e.g. use and mount a disk image if you wanted to create a filesystem inside of a file. Windows has virtual hard drives, and you can do that kind of thing on Linux too.

So why wouldn't you use one of these if you need mounting? They're literally made for it.

I continue to not understand why you would want to mount a SQLite database instead of using one of these.

psnehanshuNov 28, 2025, 4:17 PM
I see no downside in using sqlite as an application file format.
chiiNov 29, 2025, 4:07 AM
The only "downside" is that the format is an open spec, which allows anyone to modify the contents without going through the specific application. And it's only a downside if you are using the format as an obfuscation to prevent third-party compatibility/reverse engineering, or to lock in customers.
crazygringoNov 29, 2025, 3:09 PM
Yup. You can strip headers from the file, though, and keep them in your application, to keep the file from being easily usable. And/or encrypt it.
nlyNov 29, 2025, 4:03 PM
SQLCipher + a hard-coded or generated key in your app.
seanalltogetherNov 28, 2025, 3:56 PM
I remember someone mentioning the Acorn image editor on Mac uses SQLite files to store image data. It probably makes backwards compatibility much easier to work with.
dchestNov 28, 2025, 4:06 PM
It does, here's a schema from an image I just saved with the latest version. Pretty simple.

  CREATE TABLE image_attributes ( name text, value blob);
  CREATE TABLE layers (id text, parent_id text, sequence integer, uti text, name text, data blob);
  CREATE TABLE layer_attributes ( id text, name text, value blob);
Also, document-based apps that use Apple's Core Data framework (kinda ORM) usually use SQLite files for storage.
setrNov 28, 2025, 5:14 PM
Messages uses it too on Mac; was using it to do some convoluted text search on my history
dchestNov 28, 2025, 9:20 PM
Not as an application file format discussed in the link, though. Lots of software uses it as a database (as intended); it's also a base for Apple's Core Data.
jrochkind1Nov 28, 2025, 2:33 PM
Searched for this topic:

> and is backwards compatible to its inception in 2004 and which promises to continue to be compatible in decades to come.

That is pretty amazing. You could do a lot worse.

dist-epochNov 28, 2025, 2:46 PM
Same as .zip, .xml, .json and many others.

Doesn't mean that whatever the app stores inside will remain backward compatible which is the harder problem to solve.

jrochkind1Nov 29, 2025, 3:37 PM
Right, but none of those are the working use file formats for a relational database.
QuadrupleANov 28, 2025, 3:28 PM
Still helpful!
SoKamilNov 28, 2025, 7:56 PM
I remember when I was a child I used to open WinRAR and try to open random files in games and programs to find some „hidden” assets.
ianberdinNov 28, 2025, 8:59 PM
Sold. Absolutely.

I always wonder when people can sell ideas or products so effectively.

born-jreNov 28, 2025, 5:21 PM
i am taking it to a new extreme > https://github.com/blue-monads/potatoverse
actionfromafarNov 28, 2025, 5:38 PM
Planning on putting a license on it? I habitually ignore repos without a LICENSE file in them.
born-jreNov 28, 2025, 5:50 PM
oh yeah, added now