We use "database-per-tenant" with ~1 mln users.
Database-per-tenant is great for read-heavy apps - most tenants are small and their tables don't have a lot of records, so even very complex joins are very fast.
The main problem is that release times can increase substantially, because you have to migrate lots of individual databases one by one, and sometimes schema/data drift can occur between DBs; then your release stops midway and you have to figure out why the feature works in some tenants and breaks in others... So it's no free launch, the complexity just shifts to a different place.
This one-by-one migration nicely contains problems though if something goes wrong - only one customer is affected at a time (or whatever your batch size was). I’ve done deployments this way pretty much everywhere.
I'm curious, how do you handle being in a partial state? It seems like it could be operationally tricky to have some users migrated, some pending, and some in an error state.
On the other hand, I could imagine doing lazy migrations as users or tenants sign in as a nice way to smooth out the increased effort in migrating, if you can engineer for it.
When you're not per-tenant you still have this problem, and typically resolve it by releasing database migrations separately from the code that uses the changes (ensuring backwards compatibility).
I guess per-tenant you would handle similarly but build some tooling to monitor the state of migrations. It might even be easier in some ways, as you might be more able to take table locks when migrating per-tenant.
We've got a desktop B2B application which we also run with a per-tenant DB. This makes the setup identical between our hosted service and on-prem.
When updating, the schema changes go first, and only when they succeed is the new version allowed to be run by users. We have a small launcher app that the users run which starts the actual application; it handles this.
Our DB changes are always backwards compatible in the sense that the old version should operate correctly on an updated DB.
Incompatible changes like renaming a column we handle by rolling out the new column first, then waiting until the version still using the old column is retired before removing it from the schema, or similar strategies.
This way partial updates are safe, as the old app will just continue to use the partially updated DB as normal.
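A minimal sketch of that expand/contract flow in Rails migration terms (the table and column names are made up for illustration, not this commenter's actual schema):

```ruby
# Phase 1 (expand): add the new column and backfill it while the old one
# stays in place, so the previous app version keeps working unchanged.
class AddCustomerNameToOrders < ActiveRecord::Migration[7.0]
  def up
    add_column :orders, :customer_name, :string
    execute "UPDATE orders SET customer_name = client_name"
  end

  def down
    remove_column :orders, :customer_name
  end
end

# Phase 2 (contract): shipped in a later release, only after every version
# that still reads or writes the old column has been retired.
class RemoveClientNameFromOrders < ActiveRecord::Migration[7.0]
  def change
    remove_column :orders, :client_name, :string
  end
end
```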
We're migrating to a web app, and hope to have something similar. The plan is we'll update the tenant DB, and once that's done we'll change the app routing so that new user sessions for that tenant go to new app instances running the updated version.
That said, we've not got 1 million users... yet.
Good point. Also: nothing makes it impossible to apply migrations to multiple tenants in parallel; this is pretty much only IO. I didn't have to since my tenant numbers were very low, but it's very doable.
Do you then run app instances per user? Or how does your upstream handle some DBs being on different schemas?
At least with Single Point of Success database design you either move or don’t.
App per tenant would be nice but in practice mostly I’ve seen shared app instances with a careful dance of “schema changes can’t break app” and “app can’t use new schema until it is 100% rolled out” which is super annoying.
How is that super annoying? That’s exactly how migrations are supposed to be done, even without “app per tenant”.
Good call out, it definitely reduces the blast radius to do it this way. Poor man’s isolation
I think it's ok if you have a traditional server approach, but in the cloud this is a really great way to make yourself pay a lot for a little.
Most systems have a minimum amount of traffic they'll let you pay for, most object stores are not set up to stream small updates in a transactionally consistent way, and there's a ton of complexity "in the small", as it were.
Depends on the cloud and their model. Azure/Google have MySQL/Postgres options where you get a server (they manage the OS and database software) but you can run as many databases as the hardware will allow.
Some other cloud databases don't charge you per database but simply for usage, so in that case usage = customer revenue and cost should scale as you do.
How do you do metrics across users? Do you have long running or many jobs across tenants to get derived data into one downstream target?
How large are the DBs?
Do you store the DB too, or rebuild it from your records if the client loses their DB?
You still host the database yourself; clients don’t lose their database.
Have you tried applying migrations lazily?
I worked as an Ops person for a company that was database-per-user, and that's what we did. Every database had a table with its current schema version. On login, that schema version would be checked; if it was not up to date, a RabbitMQ message was sent and a database schema updater would update their database. The user would get a screen saying "Updating to latest version" that just checked every 15 seconds to see whether the schema field had updated. Most of the time, it was done in less than 15 seconds.
For more important customers, you could always preemptively run their database schema updates for them so they didn't see the update screen.
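A rough Ruby sketch of that login-time check; `tenant_connection`, `publish_message`, the queue name, and the version constant are hypothetical stand-ins for whatever the real system used:

```ruby
# Compare the tenant DB's recorded schema version with what the current
# release expects; if it's stale, enqueue an update and show the wait screen.
EXPECTED_SCHEMA_VERSION = 42 # hypothetical

def ensure_tenant_schema!(tenant)
  current = tenant_connection(tenant)
              .select_value("SELECT version FROM schema_info LIMIT 1").to_i

  return :up_to_date if current >= EXPECTED_SCHEMA_VERSION

  # A background worker (RabbitMQ in the setup described above) performs the
  # actual migration; the client polls every ~15 s until the version catches up.
  publish_message(queue: "schema_updates", payload: { tenant_id: tenant.id })
  :updating
end
```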
Interesting! I use a budgeting app that has curious loading screens when you first log in. Sometimes it flies by, sometimes it takes a bit. It doesn’t appear to be related to data import, as data isn’t in sync when you log in. I wonder if what I’m seeing is a tenant migration technique like you describe.
(not a worthy comment, but)
I really like the free lunch / free launch pun here, intentional or not.
Why do you use database per tenant?
Why not use Postgres with row level security instead?
Postgres gives the option of schema per tenant which is a nice middle ground.
But read up on scaling limits before going down this path. Last time I checked it capped out at thousands of schemas, nowhere near millions.
Postgres does hard tenant isolation via row level security.
Far better than database per tenant.
> Far better than database per tenant
The "better" isn’t clear here; why is it better?
Database per tenant - barring configuration errors to a connection string or something along those lines - means you won’t ever accidentally leak other customers' data.
With Row level security, as good as it is, there is always a chance of leaking happening, due to bugs, user error or the like
Database per tenant also makes migrations safer, and allows easier versioning of an application. For example, if you run a SaaS you may put your V2 application into permanent maintenance mode and allow existing customers to use it in perpetuity, while upgrading customers and new customers are put on the V3 platform. This is infinitely easier in practice when the database is per tenant.
>Database per tenant also makes migrations safer
Many of our clients eventually want to host our app on-premises, so moving it to an on-premises environment is quite easy with the database-per-tenant approach. Just copy the database as is.
>> With Row level security, as good as it is, there is always a chance of leaking happening, due to bugs, user error or the like
That’s not correct. It’s hard isolation. In effect, RLS automatically applies a where clause, preventing developers from inadvertently accessing the wrong data.
If you don’t set the Postgres session variable the policy checks, then no data is returned.
All the hard isolation without the schema sync pain.
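For anyone who hasn't used it, a sketch of the mechanism in a Rails app: the policy filters rows by a per-session setting, and when the setting is absent the comparison evaluates to NULL, so nothing comes back. The table, policy, and `app.tenant_id` names are illustrative:

```ruby
class EnableTenantRls < ActiveRecord::Migration[7.0]
  def up
    execute <<~SQL
      ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
      CREATE POLICY tenant_isolation ON invoices
        USING (tenant_id = current_setting('app.tenant_id', true)::bigint);
    SQL
  end
end

class ApplicationController < ActionController::Base
  around_action :scope_to_tenant

  def scope_to_tenant
    conn = ActiveRecord::Base.connection
    # set_config with is_local = false applies for the session; quote() guards
    # against injection. current_tenant_id is assumed to come from your auth layer.
    conn.execute("SELECT set_config('app.tenant_id', #{conn.quote(current_tenant_id.to_s)}, false)")
    yield
  ensure
    conn.execute("RESET app.tenant_id")
  end
end
```

One caveat: policies don't apply to the table owner or to superusers unless you also use FORCE ROW LEVEL SECURITY, so the app should connect as a separate role.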
Database per tenant makes write scaling trivial.
I really like SQLite but I wonder if what we are seeing here is a need for conventional OLTP databases to be able to unload parts of indexes from memory.
Because essentially what’s happening with database per user is that we don’t keep anything in memory for users who are inactive, or only active on another instance, for essentially the same schema (or maybe a couple versions of a schema).
This is starting to smell like the JSON situation with Mongo, where Postgres is now twice as fast as Mongo at the thing Mongo is supposed to be an expert in.
I wouldn't say so; with the use cases I saw, tenant databases would almost never exceed available memory. A database could be loaded into memory in its entirety for the duration of the request, provided the disk IO on the server is cooperating (i.e. "does not degrade to floppy drive speed once you are out of IOPS credit").
> Because essentially what’s happening with database per user is that we don’t keep anything in memory for users who are inactive, or only active on another instance, for essentially the same schema (or maybe a couple versions of a schema).
The benefits of prefetching and page-level abstractions are certainly reduced when every tenant is guaranteed to live in a different file.
I would agree that you should face an uphill battle when arguing for a DB-per-customer architecture. There needs to be a special reason to not simply add a customer/tenant id column in each table.
I think one good argument for placing a tenant per database file is that it makes total destruction of their information possible without any impact to other tenants. If you don't require immediate destruction of the tenant's information (i.e., can wait for background cleanup activities), then I would begin to push back.
I would say most people don't need a database per tenant, and that is definitely not the norm. There are specific cases where you would need it, and then you have to negate the drawbacks such as migrations and schema drift.
I think the lack of gems/libraries/patterns is proof of this. Just because you can doesn't mean you should.
Not saying there's no reason to ever do it, but proceed with caution and know for a fact you need DB-per-tenant.
Seems like it’s actually really convenient as independent customer data ends up being nicely isolated. Deleting churned customer data is as trivial as deleting their database.
> I think lack of gems/libraries/patterns is proof of this
This would effectively disqualify any new pattern from emerging. There have been many new patterns that have challenged the consensus that ended up becoming dominant.
Coincidentally I'm working on FeebDB[0], which is similar but for Elixir instead. It can be seen as a replacement for Ecto (which won't work well when you have thousands of databases).
Mostly as a fun experiment, but also from the realization that every place I worked at in the past (small/medium-sized B2B startups) would greatly benefit from such architecture.
Yes, there are massive trade-offs to this approach, and the concerns raised in the comment section are valid. This doesn't mean database-per-tenant is never worth it. There's a sweet spot for it, and if it fits your business/application, I personally would consider it a technical advantage over competitors.
My goal with FeebDB is to eliminate or reduce the common pain points of database-per-tenant, including:
- ensure there is a single writer per database;
- improved connection management across all tenants (e.g. keep at most 1000 DB connections open, similar to an LRU cache; see the sketch below);
- on-demand migration (all shards are migrated on application startup, but if a shard that hasn't migrated yet receives a request, it first performs the migration and then serves the request);
- on-demand backups and replication (e.g. the library knows which shards were updated in the last X minutes, so it can trigger Litestream or similar on demand);
- support for enumeration of databases (performing map/reduce/filter operations across multiple DBs);
- support for clustered deployment with "pinned" tenants (for now I'm assuming the IOPS of a single beefy server should be enough for all use cases, but once that's no longer sufficient you can have "shards of shards").
[0] - https://github.com/renatomassaro/FeebDB/
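Not FeebDB itself (that's Elixir), but a rough Ruby sketch of the bounded, LRU-style connection handling mentioned above, to make the idea concrete:

```ruby
require "sqlite3"

# Keep at most MAX_OPEN SQLite handles; opening a new tenant evicts the least
# recently used one. Ruby hashes keep insertion order, so delete + re-insert
# moves a key to the "most recent" end.
class TenantRegistry
  MAX_OPEN = 1000

  def initialize(dir)
    @dir   = dir
    @open  = {}          # tenant_id => SQLite3::Database, oldest first
    @mutex = Mutex.new
  end

  def with_db(tenant_id)
    @mutex.synchronize { yield checkout(tenant_id) } # serialized for simplicity
  end

  private

  def checkout(tenant_id)
    if (db = @open.delete(tenant_id))
      return @open[tenant_id] = db          # mark as most recently used
    end

    if @open.size >= MAX_OPEN
      _evicted_id, evicted_db = @open.shift # drop the least recently used handle
      evicted_db.close
    end

    @open[tenant_id] = SQLite3::Database.new(File.join(@dir, "#{tenant_id}.sqlite3"))
  end
end

# registry = TenantRegistry.new("db/tenants")
# registry.with_db(42) { |db| db.execute("SELECT count(*) FROM users") }
```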
I think an in-between approach could be:
1. Identify top-N tenants
2. Separate the DB for these tenants
The top-N could be based on mix of IOPS, importance (revenue wise), etc.
The data model should be designed in such a way that the rows pertaining to each tenant can be extracted.
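A minimal sketch of what peeling a tenant out of the shared database could look like, assuming every tenant-owned table carries a `tenant_id` column and the target DB already has the same schema (table names are illustrative):

```ruby
require "sqlite3"

TENANT_TABLES = %w[users projects invoices].freeze # illustrative

# Copy one tenant's rows from the shared DB into its own file, table by table.
# A real extraction would also freeze writes for that tenant and batch rows.
def extract_tenant(shared_db, tenant_db, tenant_id)
  TENANT_TABLES.each do |table|
    rows = shared_db.execute("SELECT * FROM #{table} WHERE tenant_id = ?", [tenant_id])
    next if rows.empty?

    placeholders = (["?"] * rows.first.size).join(", ")
    tenant_db.transaction do
      rows.each { |row| tenant_db.execute("INSERT INTO #{table} VALUES (#{placeholders})", row) }
    end
  end
end
```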
This is how we did it at mailchimp. I think this is ignored or overlooked because this means devs might have to care a bit more about operations or the company has to care more.
That is what most tenanting scale-ups do. "Jumbo"-tenants get relocated either to separate partitions, or to partitions which are more sparsely populated.
> And, of course, the “database per tenant” workflow is just starting
James Edward Gray spoke about doing this at RailsConf in 2012. https://youtu.be/GRfJ9lni4QA?si=BoDI5gQ8GvM1PDhw
Awesome stuff. Forward Email does something similar with an encrypted SQLite DB per mailbox / per user. Great way to differentiate protection per user.
If noshing else, the name is exshellent. Very Sean Connery.
Much appreciated!
--humbly signed-- a fellow admirer of Geert Van Schlänger
I used something similar in the past. Loved it. User wants their data? Bang! Here's the entire database. User deletes their account? `rm username.sql` and you're done. Compliance was never easier.
When data can be this isolated from each other and you don't have any scaling issues within a single tenant it's pretty hard to make a wrong design choice. Almost anything will work.
Wonder if ActiveRecord couldn’t borrow some pattern from CoreData with its managedObjectContext property associated with every object. This lets you work with objects transparently, not having to carry a « db » property everywhere, and since the framework always knows where the object comes from, it can trigger an error if you’re trying to do something unholy by mixing them.
I bet it would not be very hard to write that as an extension for proof of concept. There have been some good flexible operations added for multi-database stuff in activerecord over the past couple versions.
And these are exactly the ones I'm using here ;-)
Could anyone who runs Rails with sqlite in production share some scale numbers like r/w per second, # rows in the db, and vps size? I have used it at trivial scale for SolidQueue and SolidCache, but I can't find experience reports for the primary OLTP db.
What’s the disk behind this strategy? Lots of SQLite DBs is easy until you have lots of servers, no?
Kind of, yes. If you are going to be doing this at scale, you will either need to pin servers to shards of users, or retrieve the DB from some durable storage "just in time", or... use a network mount. Note also, that with small-enough databases you can do the same thing Apple does - fetch your entire tenant from the "big datastore", set a lock on it, do the thing that you want to do - and then write it back into the "big datastore" at the end.
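A rough sketch of that fetch-lock-work-write-back cycle; the `storage` object (with `lock`, `get`, `put`) is a hypothetical interface, not a specific cloud SDK:

```ruby
require "sqlite3"
require "tmpdir"

# Pull the tenant's whole DB down, work on it locally, push it back.
# storage.lock is assumed to provide mutual exclusion across app servers.
def with_tenant_db(storage, tenant_id)
  key        = "tenants/#{tenant_id}.sqlite3"
  local_path = File.join(Dir.tmpdir, "tenant-#{tenant_id}.sqlite3")

  storage.lock(key) do
    File.binwrite(local_path, storage.get(key))   # fetch just in time
    db = SQLite3::Database.new(local_path)
    begin
      yield db                                    # do the actual work
    ensure
      db.close
    end
    storage.put(key, File.binread(local_path))    # write the result back
  end
ensure
  File.delete(local_path) if local_path && File.exist?(local_path)
end
```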
I guess if the tenant isolation is more important than the disk/network IO overhead this would introduce, then it's a viable solution.
It'd probably work better if tenant == user and more specifically tenant == single-user-session.
Wonder how many SQLite databases would be too many. At some point I assume not all databases can be kept open at all times. What sort of overhead would there be serving a tenant not opened up yet? There have to be caches etc. not warmed up, causing lots of disk IO.
At some level, it doesn't make a big difference if you've got a file open or not: once the file's data falls out of the disk cache, you'll have the same kinds of latency to get it back. Sure, you'll avoid a round or two of latency pulling in the filesystem data to get to the file, but probably not a big deal on SSD.
Chances are, the popular databases stay in cache and are quick to open anyway; and the unpopular ones are rarely accessed so delay is ok. But you'd also be able to monitor for disk activity/latency on a system level and add more disks if you need more throughput; possibly disks attached to other machines, if you also need more cpu/ram to go with it. Should be relatively simple to partition the low use databases, because they're low use.
In this scenario you would use short-lived connections, and the overhead would probably be approximately the same as reading a file.
That would be the FD limit, divided by 3 (the DB itself, the shm file and the WAL).
But each SQLite connection (even to the same DB) will also consume 2 FDs for the DB and the WAL.
You'll more easily pool connections to the same DB, of course, but the difference might not be so stark.
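A quick back-of-the-envelope check in Ruby; the divide-by-three assumes WAL mode and one connection per database:

```ruby
# Soft limit on file descriptors for this process, and a rough ceiling on
# simultaneously open SQLite databases (main db + -wal + -shm files each).
soft, hard = Process.getrlimit(:NOFILE)
puts "FD limit: #{soft} soft / #{hard} hard"
puts "roughly #{soft / 3} databases can be open at once" # e.g. 65536 / 3 ≈ 21845
```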
Something like that, yes. A tenant that hasn't been opened yet - well, you create the tenant first, and then proceed "as normal". With ActiveRecord, your `pool:` configuration defines how many database handles you want to keep open at the same time. I set it relatively high but it can be tweaked, I'd say. And there is automatic eviction from the pool, so if you have a few sites which are popular and a lot of sites which are not popular - it should balance out.
There could be merit to "open and close right after" though, for sure.
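For reference, a minimal sketch of that kind of per-tenant connection setup with ActiveRecord; the path scheme and pool size are illustrative, and in newer Rails you'd likely reach for `connects_to`/`connected_to` shards rather than re-establishing on `ActiveRecord::Base`:

```ruby
# `pool:` caps how many handles ActiveRecord keeps open for this configuration;
# idle connections are eventually reaped rather than held forever.
ActiveRecord::Base.establish_connection(
  adapter:  "sqlite3",
  database: "db/tenants/#{tenant_slug}.sqlite3", # tenant_slug: illustrative
  pool:     20
)
```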
Love the name.
That is all.
Thanks!
Database per tenant can get real nasty when schemas get out of sync, then you’re in real pain.
Postgres with row-based access control is a much better solution than database-per-tenant for strong isolation.
That is true, but in my experience this is less of an issue if your migrations allow running against "current - 1". If a migration borks on one of the tenants, your system should still work, and you can grab that tenant and investigate.
This problem also does happen with sharded databases - which is why most co's have a structured approach of deploying schema changes, observing them having applied everywhere, and only then deploying the feature relying on the schema changes.
Yeah, but “schemas out of sync!” is a never-ending development and operations problem with SQLite database-per-tenant.
Postgres row level access control gives all the benefits of strong isolation with none of the pain of schemas getting out of sync.
You do need to keep tabs on whether they are or not, that is true - and you will have a bit of management to do if they are under constant heavy write load. For me, this all works out because I don't have too many tenants - and I have tools to block writes when running migrations.
It's all tradeoffs after all.
I’d heard of using Postgres schemas before but not RLS. Interesting…
https://www.crunchydata.com/blog/row-level-security-for-tena...
I use `site_id` in every model and keep everything in a single database. I think this is how Shopify started as well. You can see that because their product, variant, and collection IDs are incremental across all shops. They might be using a different solution today, but it feels like this is how they started.
What I like about having everything in one DB, until it grows too big, is that I can do cross-user analytics/reporting. I also had it happen, like it was mentioned in the article, that I needed to share some data between users. Having everything in a single database made this a much simpler problem than needing to move data between databases.
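A sketch of how that single-database scoping is commonly wired up in Rails; `Current.site` uses `ActiveSupport::CurrentAttributes`, and the model names are illustrative:

```ruby
# app/models/current.rb
class Current < ActiveSupport::CurrentAttributes
  attribute :site
end

# app/models/concerns/site_scoped.rb
module SiteScoped
  extend ActiveSupport::Concern

  included do
    belongs_to :site
    # Every query is filtered to the current site; if no site is set (e.g.
    # outside a request), this matches only NULL site_id rows, i.e. nothing.
    default_scope { where(site: Current.site) }
    before_validation { self.site ||= Current.site }
  end
end

class Product < ApplicationRecord
  include SiteScoped
end
```

Whether you prefer `default_scope` or explicit per-site associations is a matter of taste; the point is that one `site_id` column gives you tenancy without a second database.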
The "hard problem" with DB tenanting is drawing the line where your "common" apex DB model begins, and where the tenanted data ends.
Came here for the amazing title pun. Like the approach, was a bit sad it's about Ruby but won't judge.
With SQLite, the approach should equally trivially work with any language and framework. Unlike e.g. Postgres, there's no connection overhead, pgbouncer, etc; you can open the DB at the beginning of the request, once the user credentials are available, and close it before / along with sending the response, and it still would feel instant.
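A rough Rack sketch of that request lifecycle; the env key and file layout are made up for illustration:

```ruby
require "sqlite3"

# Open the caller's SQLite DB once credentials are known, expose it to the
# app for the duration of the request, and close it on the way out.
class PerTenantDatabase
  def initialize(app, dir:)
    @app = app
    @dir = dir
  end

  def call(env)
    tenant = env["authenticated.tenant_id"] # assumed to be set by the auth layer
    return @app.call(env) unless tenant

    db = SQLite3::Database.new(File.join(@dir, "#{tenant}.sqlite3"))
    env["tenant.db"] = db
    @app.call(env)
  ensure
    db&.close
  end
end

# config.ru: use PerTenantDatabase, dir: "db/tenants"
```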
Why is it sad it’s about Ruby?