The goal is to find the best intersection of your storage and tenant partitioning needs. Consider how the strategy affects your ability to build, deliver, and deploy versions in a zero-downtime environment. Assess the regulatory, business, and legacy dimensions of a given environment.
SaaS Partitioning Models
Silo
Separate database for each tenant
Addresses concerns on sharing infrastructure with other tenants
Great for migration from existing solution to multi-tenant solution
Bridge
Single database, multiple schemas
Pool
Shared database, single schema
Requires introduction of a partitioning key to scope and control access to tenant data
Fits with continuous delivery and agility goals that are essential to SaaS providers
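To make the Pool model's partitioning key concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the shared database. The table name orders, the column tenant_id, and the helper orders_for_tenant are illustrative assumptions, not part of any AWS API.

```python
import sqlite3

# In-memory database standing in for the shared (pooled) database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (tenant_id TEXT NOT NULL, order_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("tenant-a", 1, 10.0), ("tenant-a", 2, 25.5), ("tenant-b", 1, 99.0)],
)


def orders_for_tenant(conn, tenant_id):
    """Return only the rows that belong to the given tenant."""
    # The partitioning key is always part of the WHERE clause, and it is
    # passed as a bound parameter rather than formatted into the SQL.
    cursor = conn.execute(
        "SELECT order_id, total FROM orders WHERE tenant_id = ? ORDER BY order_id",
        (tenant_id,),
    )
    return cursor.fetchall()
```

Every query path has to carry the tenant filter; a query that omits it would see every tenant's rows.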
Silo Model
Pros
Compliance alignment
No cross-tenant impacts
Tenant-level tuning
Tenant-level availability
Cons
Compromised agility
No centralized management
Deployment complexity
Automating the creation and configuration of a database on a per-tenant basis adds a layer of complexity and a potential point of failure in your SaaS environment.
Cost
Pool Model
Pros
Agility
Cost optimization
Centralized management
Simplified deployment
Cons
Cross-tenant impacts
Compliance challenges
All or nothing availability
Bridge Model
Hybrid model that combines the pros and cons of the Silo and Pool extremes.
Hybrid Silo/Pool Storage
One option is to build a solution that fully supports pooled storage as its foundation, then carve out a separate database for those tenants that demand siloed storage.
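The hybrid approach could be sketched as a small routing component that sends most tenants to the pooled database and a few to their own silo. The class StorageRouter, its methods, and the database names below are hypothetical, and the sketch covers only the lookup, not the data movement a carve-out would involve.

```python
from dataclasses import dataclass, field


@dataclass
class StorageRouter:
    """Maps a tenant to the database that holds its data."""

    pooled_database: str
    # Tenants that demanded isolation, mapped to their dedicated database.
    siloed_databases: dict = field(default_factory=dict)

    def carve_out(self, tenant_id, database):
        """Register a dedicated (siloed) database for a tenant."""
        self.siloed_databases[tenant_id] = database

    def resolve(self, tenant_id):
        """Return (database, needs_tenant_filter) for a tenant."""
        if tenant_id in self.siloed_databases:
            # A silo holds one tenant, so no partition-key filter is needed.
            return self.siloed_databases[tenant_id], False
        # Pooled tenants share a database and must be filtered by tenant id.
        return self.pooled_database, True


router = StorageRouter(pooled_database="shared_db")
router.carve_out("tenant-regulated", "silo_tenant_regulated")
```

Here router.resolve("tenant-a") falls through to the pool, while router.resolve("tenant-regulated") returns the dedicated database.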
Migration and Multitenancy
Minimize invasive changes
Favor data changes that are backward compatible with earlier versions
Silo/Bridge Models
Data can be migrated on tenant-by-tenant basis
Allows careful migration of each SaaS tenant without exposing all tenants to possibility of migration error
Introduces complexity into overall orchestration of your deployment lifecycle
Pool Model
Easier migration process: all tenants are migrated at once
Any migration error would impact all tenants
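The tenant-by-tenant migration described for the Silo/Bridge models might be orchestrated roughly as follows. This is a sketch under stated assumptions: sqlite3 in-memory databases stand in for per-tenant databases, the orders table and note column are invented for illustration, and the function names are hypothetical.

```python
import sqlite3


def migrate(conn):
    """Backward-compatible change: add a nullable column with no default."""
    conn.execute("ALTER TABLE orders ADD COLUMN note TEXT")


def migrate_silos(connections):
    """Apply the migration to each tenant database independently.

    Returns (migrated, failed) lists of tenant ids. A failure in one
    tenant's database does not stop the migration of the others.
    """
    migrated, failed = [], []
    for tenant_id, conn in connections.items():
        try:
            with conn:  # commits on success, rolls back on error
                migrate(conn)
            migrated.append(tenant_id)
        except sqlite3.Error:
            failed.append(tenant_id)
    return migrated, failed


def new_silo(with_orders=True):
    """Create a stand-in tenant database, optionally without the table."""
    conn = sqlite3.connect(":memory:")
    if with_orders:
        conn.execute("CREATE TABLE orders (order_id INTEGER, total REAL)")
    return conn


# tenant-b is deliberately missing the table so its migration fails.
silos = {"tenant-a": new_silo(), "tenant-b": new_silo(with_orders=False)}
result = migrate_silos(silos)
```

The failed list is the extra orchestration burden the Silo/Bridge models carry: each entry needs its own retry or rollback decision. In a Pool model the same migrate call would run once against the shared database, and an error there would affect every tenant.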
Security Considerations
Robust security strategy to ensure that tenant data is effectively protected from unauthorized access
Adopting common security patterns supported by AWS
encrypt data at rest
utilize IAM policies to limit access to tenant data
works great with Silo and Bridge model to limit database access
in Pool model responsibility shifts to authorization models of your application’s services
Research how isolation is achieved in each of the AWS services you use
Management and Monitoring
Building effective metrics and dashboard for aggregating storage trends
With siloed storage, data should be collected from each isolated database and combined into an aggregate model
Tenant-centric Views of Activity
represents the ability to drill down into tenant-centric storage activity
Silo models align more naturally with constructing this view
Pool models will require some tenant-filtering mechanism
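The difference between the two models when building a tenant-centric view can be illustrated with a small sketch. The sample data and function names are made up; real figures would come from whatever metrics source the storage service exposes.

```python
from collections import defaultdict


def usage_from_silos(silo_metrics):
    """Silo: each database already maps to one tenant, so sum per database."""
    return {tenant: sum(samples) for tenant, samples in silo_metrics.items()}


def usage_from_pool(pool_samples):
    """Pool: group the shared samples by the tenant id attached to each one."""
    totals = defaultdict(int)
    for tenant_id, size_bytes in pool_samples:
        totals[tenant_id] += size_bytes
    return dict(totals)


silo_metrics = {"tenant-a": [100, 50], "tenant-b": [400]}
pool_samples = [("tenant-c", 10), ("tenant-d", 70), ("tenant-c", 5)]

# One aggregate view across both storage models.
aggregate = {**usage_from_silos(silo_metrics), **usage_from_pool(pool_samples)}
```

The pool path only works if every sample is tagged with a tenant identifier at collection time, which is the tenant-filtering mechanism mentioned above.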
Policies and Alarms
More moving parts on a tenant-by-tenant basis will affect the complexity and manageability of your storage monitoring strategy
Overall goal of the policies is to set proactive rules to anticipate and react to health events
Tiered Storage Models
It’s not uncommon to find a spectrum of different storage solutions in use across the set of microservices that make up your application
Storage can be used as another way to tier the SaaS solution
Each tier can leverage a separate storage strategy, offering varying levels of performance and SLAs
Developer Experience
Developers typically introduce layers of frameworks that centralize and abstract away horizontal aspects of their applications
Centralize and standardize policies and tenant resolution strategies
Data access layer would inject tenant context into data access requests
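One way a data access layer might inject tenant context is sketched below, using Python's contextvars module to carry the resolved tenant and sqlite3 as a stand-in database. The names current_tenant and TenantScopedOrders, and the orders table, are assumptions made for illustration.

```python
import contextvars
import sqlite3

# Holds the tenant resolved for the current request (hypothetical name).
current_tenant = contextvars.ContextVar("current_tenant")


class TenantScopedOrders:
    """Data access layer that applies the tenant context on every call."""

    def __init__(self, conn):
        self._conn = conn

    def add(self, order_id, total):
        tenant_id = current_tenant.get()  # raises LookupError if unset
        self._conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (tenant_id, order_id, total)
        )

    def list(self):
        tenant_id = current_tenant.get()
        rows = self._conn.execute(
            "SELECT order_id, total FROM orders WHERE tenant_id = ? ORDER BY order_id",
            (tenant_id,),
        )
        return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT, order_id INTEGER, total REAL)")
orders = TenantScopedOrders(conn)

# A request handler would set the tenant once, e.g. from an auth token.
token = current_tenant.set("tenant-a")
orders.add(1, 10.0)
current_tenant.reset(token)

token = current_tenant.set("tenant-b")
orders.add(1, 99.0)
tenant_b_rows = orders.list()
current_tenant.reset(token)
```

Application code never passes a tenant id to the data access layer, so a developer cannot forget the filter; calling the layer with no tenant set fails instead of returning unscoped data.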
Linked Account Silo Model
Need to provision separate Linked Account for each tenant
Entire infrastructure of a tenant is isolated from other tenants
Linked approach relies on Consolidated Billing
More complex provisioning process
Automate creation of each Linked Account and adjust any limits as needed
AWS limits the number of Linked Accounts, so this is not a good strategy for a large number of SaaS tenants
Multitenancy on DynamoDB
Schema-less nature of DDB makes migration easy
Silo Model
No notion of database instance, all tables are created globally in the region
Requires grouping tables belonging to a single tenant, e.g. prefix by tenant identifier
Access to the tables is controlled through IAM policies
Provisioning process should automate generation of tables and IAM policies
Tuning can be done on tenant-by-tenant basis
RCU and WCU, set on table level
Amazon CloudWatch Metrics that are captured on table level
Number of tables can drastically grow in DDB with each microservice introducing new set of tables for each particular tenant
Another approach to be considered is to have a single table for all data per tenant
Simplifies provisioning, management and migration profile of your solution
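A provisioning step for the DynamoDB Silo model might generate prefixed table names and a matching IAM policy document along these lines. The underscore separator, the chosen action list, and the function names are assumptions; the account id is a placeholder, and the sketch only builds the policy document without calling AWS.

```python
import json


def tenant_table_name(tenant_id, table):
    """Group a tenant's tables under a shared prefix."""
    return f"{tenant_id}_{table}"


def tenant_table_policy(tenant_id, region, account_id):
    """Build an IAM policy document limited to one tenant's tables."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:Query",
                    "dynamodb:UpdateItem",
                    "dynamodb:DeleteItem",
                ],
                # The wildcard matches every table carrying the tenant prefix.
                "Resource": (
                    f"arn:aws:dynamodb:{region}:{account_id}:table/{tenant_id}_*"
                ),
            }
        ],
    }


policy = tenant_table_policy("tenant1", "us-east-1", "123456789012")
policy_json = json.dumps(policy, indent=2)
```

With a prefix wildcard, tenant identifiers must be chosen so that one tenant's prefix can never match another tenant's tables (for example, "t1_" would also match tables of a tenant named "t1_2").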
Bridge Model
Relaxing some isolation requirements through eliminating the introduction of any table-level IAM policies
Removing IAM policies could simplify your provisioning scheme
Pool Model
For data evenly distributed across tenants, performance optimization can be achieved by simply relying on the underlying partitioning scheme
For SaaS environments which don’t have uniform multi-tenant data distribution you need to introduce a mechanism to better control the distribution of your data
One way would be to introduce shards per tenant and make the shard the partition key
Gives us control over how much data a shard contains and makes the distribution of data uniform across partitions
Tenants with large data footprint will be given more shards
Introduces level of indirection that has to be addressed in data access layers (tenant-shard resolution)
Introducing a tenant lookup table:
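As a rough illustration of such a lookup table and the tenant-shard resolution it supports, consider the sketch below. The mapping contents, the "#" separator in the key, and the function names are hypothetical; in practice the mapping would likely live in its own table rather than in a Python dict.

```python
import random

# Hypothetical tenant lookup table: tenant id -> number of shards.
# Tenants with a larger data footprint are given more shards.
TENANT_SHARDS = {"tenant-small": 1, "tenant-large": 4}


def shard_keys(tenant_id):
    """All partition key values that may hold data for a tenant."""
    return [f"{tenant_id}#{n}" for n in range(TENANT_SHARDS[tenant_id])]


def key_for_write(tenant_id, rng=random):
    """Pick one of the tenant's shards for a new item."""
    return rng.choice(shard_keys(tenant_id))
```

Writes spread across a tenant's shards, while a read for the whole tenant has to query every key returned by shard_keys and merge the results. That merge step is the indirection the data access layer must absorb.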
Multitenancy on RDS
Silo Model
Creating separate instances for each tenant
Typically satisfies the compliance needs of customers without the overhead of provisioning entirely separate account
Bridge Model
Leverage a single instance for all tenants
Create a separate representation (database or schema) for each tenant
Requires provisioning and runtime resolution for each tenant
Requires adopting policies to limit schema changes
Some RDS containers limit the number of database/schemas that you can create for an instance
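Runtime resolution in the RDS Bridge model could look something like the following sketch, which maps a tenant identifier to a schema name. The tenant_ naming convention, the identifier pattern, and the function names are assumptions. Because SQL identifiers cannot be passed as bound parameters, the tenant id is validated before it is placed in a qualified name.

```python
import re

# Assumed convention: lowercase ids that are safe to embed in an identifier.
_TENANT_ID = re.compile(r"[a-z][a-z0-9_]{0,30}")


def schema_for(tenant_id):
    """Resolve the schema that holds a tenant's tables."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


def qualified(tenant_id, table):
    """Return a schema-qualified table name for use in a query.

    The table argument is expected to be a constant chosen by the
    application, never user input.
    """
    return f"{schema_for(tenant_id)}.{table}"
```

A data access layer would call qualified("acme", "orders") when building its statements, so the same query text can be reused for every tenant schema.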
Pool Model
Tenant data is stored in a single RDS instance
Tenants share common tables
Tenant identifier is used to access each tenant’s data
Single Instance Limits
Storage Amount Limits
MySQL, MariaDB, Oracle, PostgreSQL - 6TB
SQL Server - 4TB
Aurora - 64TB
Consider sharding tenant data and distributing it across multiple instances
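Distributing tenants across instances so that no instance exceeds its storage limit could be approached with a simple first-fit placement, sketched below. The tenant sizes, the capacity value, and the function name are illustrative assumptions; a production scheme would also need to handle growth and rebalancing.

```python
def place_tenants(tenant_sizes_gb, capacity_gb):
    """Assign each tenant to an instance index without exceeding capacity.

    Uses first-fit decreasing: largest tenants are placed first, each on
    the first instance that still has room; a new instance is added when
    none does.
    """
    used = []       # used[i] is the storage consumed on instance i
    placement = {}  # tenant id -> instance index
    ordered = sorted(tenant_sizes_gb.items(), key=lambda kv: (-kv[1], kv[0]))
    for tenant, size in ordered:
        if size > capacity_gb:
            raise ValueError(f"{tenant} does not fit on a single instance")
        for index, consumed in enumerate(used):
            if consumed + size <= capacity_gb:
                break
        else:
            # No existing instance has room, so provision another one.
            used.append(0)
            index = len(used) - 1
        used[index] += size
        placement[tenant] = index
    return placement


sizes = {"a": 4000, "b": 3000, "c": 2000, "d": 1000}
placement = place_tenants(sizes, capacity_gb=6000)
```

The resulting placement map plays the same role as a tenant lookup table: the data access layer consults it to find the instance that holds a tenant's data.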
Multitenancy on Amazon Redshift
Focuses on building high-performance clusters to house large-scale data warehouses
Places some limits on the constructs that you can create for each cluster
60 databases per cluster, 256 schemas per db, etc…
Silo Model
Requires provisioning separate clusters for each tenant
Access can be controlled and restricted using IAM policies and database privileges
Ability to create tuned experience per tenant
Per tenant provisioning process adds extra complexity to your deployment footprint
Bridge Model
Create separate schemas for each tenant
You will run into Redshift's limit of 256 schemas per database
Access to a Redshift cluster grants access to all databases inside it
SaaS application will be responsible for enforcing finer-grained access controls
The isolation profile of this solution is likely unacceptable to customers
Pool Model
All tenants share databases and tables
Overall management, monitoring and agility are improved by using a single Redshift Cluster
Upper limit of 500 concurrent connections can be a bottleneck
The SaaS developer must define an effective strategy for managing connections, e.g. implementing client-based caching
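Client-based caching of the kind mentioned above might be sketched as a small time-limited cache placed in front of whatever function actually runs queries against the cluster. The class QueryCache, its parameters, and the fake query function are hypothetical; nothing here talks to Redshift.

```python
import time


class QueryCache:
    """Caches query results per tenant for a limited time."""

    def __init__(self, run_query, ttl_seconds=60.0, clock=time.monotonic):
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # (tenant_id, sql) -> (expires_at, rows)

    def get(self, tenant_id, sql):
        # The tenant id is part of the key so results never cross tenants.
        key = (tenant_id, sql)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # served locally, no connection used
        rows = self._run_query(tenant_id, sql)
        self._entries[key] = (now + self._ttl, rows)
        return rows


# Stand-ins for the real query function and clock, for demonstration only.
calls = []
fake_now = [0.0]


def fake_run_query(tenant_id, sql):
    calls.append((tenant_id, sql))
    return [("row", tenant_id)]


cache = QueryCache(fake_run_query, ttl_seconds=60.0, clock=lambda: fake_now[0])
```

Repeated reads within the time limit are answered from memory, which reduces the number of connections each client holds open against the shared cluster at the cost of slightly stale results.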
Agility
The storage technology and isolation model directly impact your ability to deploy new features easily. The underlying storage model must accommodate the required changes without requiring downtime. The storage model picked today might not be a good fit for tomorrow.