
RDS Cost: The Three Configurations Nobody Reviews


RDS costs fall into three distinct charge categories: instance compute (the DB instance class), storage (the gp2, gp3, or io1 volume), and Multi-AZ standby capacity. Each category has its own optimization opportunities, and each contains configurations that survive standard cost reviews for structural reasons.

This article covers the three configurations we find most consistently across accounts: Multi-AZ on databases where it is not justified, read replicas with zero query traffic, and storage auto-scaling that has triggered and never scaled back. Across the 50 accounts analyzed in our initial access period, these three categories represent 14% of total AWS savings identified — making them the third-largest optimization category after non-production scheduling and EBS snapshot accumulation.

Configuration 1: Multi-AZ on Non-Critical Databases

AWS RDS Multi-AZ creates a synchronous standby replica in a different Availability Zone and provides automatic failover in approximately 60-120 seconds if the primary fails. For databases on the critical path of user-facing requests, where a 60-120 second outage would cause user-visible service disruption, Multi-AZ is justified even at double the instance price.

For databases where a 60-120 second or even 10-minute recovery time is acceptable, Multi-AZ provides no meaningful availability improvement over a single-AZ instance with automated backups. A database that runs a nightly batch job for internal reporting, a database that stores processed events for offline analytics, a database backing a non-revenue admin interface — for these workloads, the Multi-AZ standby instance is capacity that is never used and serves a purpose (sub-2-minute failover) that the workload does not require.

Identifying non-critical databases requires answering one question per database: what is the maximum acceptable recovery time if this database becomes unavailable? The answer requires talking to the team that owns the workload. Cost Explorer cannot tell you the criticality of a database — only the engineering team that owns it can. This is why RDS Multi-AZ review is typically lower-priority in FinOps programs: it requires a conversation with each team rather than an automated analysis.

The practical approach: identify all Multi-AZ RDS instances, send a structured questionnaire to the owning team (maximum acceptable downtime: <2 min / 2-10 min / 10-60 min / >60 min), and convert instances where the answer is 10 minutes or more to single-AZ, with automated backups enabled and a restore-from-backup procedure documented. Because the standby is billed at the same rate as the primary, disabling Multi-AZ on a db.m5.xlarge at $0.384/hour saves roughly $280/month per instance ($0.384 × 730 hours). Across a large account with 30 databases, identifying that 40% are non-critical generates roughly $3,360/month in savings.
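The arithmetic above can be sketched as a short estimator. This is a back-of-the-envelope model, not a billing tool: it assumes the standard RDS pricing behavior (Multi-AZ billed at 2x the single-AZ instance rate) and AWS's ~730-hour billing month; the function names are illustrative.

```python
# Assumption: Multi-AZ costs exactly 2x the single-AZ instance rate,
# so removing the standby saves one full instance-month.
HOURS_PER_MONTH = 730  # AWS's standard billing-month convention

def multi_az_monthly_savings(single_az_hourly_rate: float) -> float:
    """Monthly savings from converting one Multi-AZ instance to single-AZ."""
    return single_az_hourly_rate * HOURS_PER_MONTH

def fleet_savings(hourly_rate: float, total_instances: int,
                  non_critical_fraction: float) -> float:
    """Fleet-wide estimate if a fraction of Multi-AZ databases can be converted."""
    convertible = int(total_instances * non_critical_fraction)
    return convertible * multi_az_monthly_savings(hourly_rate)

# db.m5.xlarge at $0.384/hour: the standby costs ~$280/month.
print(round(multi_az_monthly_savings(0.384), 2))    # 280.32
# 30 databases, 40% non-critical: ~$3,360/month.
print(round(fleet_savings(0.384, 30, 0.40), 2))     # 3363.84
```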

Configuration 2: Read Replicas With Zero Query Traffic

RDS read replicas are created to offload read queries from the primary instance and to serve as promotion candidates if the primary fails. A read replica that receives no queries serves neither purpose but still runs as a full instance at the same cost as an equivalent standalone instance.

Zero-traffic read replicas occur in two scenarios. First, the application that used the replica's endpoint was modified to use the primary endpoint directly (a common change when query latency is not a concern and the developer wants consistent reads), but the replica was not decommissioned. Second, the replica was created for an expected traffic volume that did not materialize — a feature that was built but not launched, or a migration that was completed but the read replica from the migration period was not removed.

Identifying zero-traffic replicas is straightforward: the DatabaseConnections CloudWatch metric shows the number of active connections per RDS instance. A read replica with fewer than 5 connections averaged over 30 days is receiving negligible query traffic. For replicas with larger connection counts, the ReadIOPS metric provides a secondary check — if ReadIOPS is near-zero despite non-zero connections, the connections are health checks rather than query traffic, and the replica is effectively unused.
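The two-metric check above can be expressed as a small classifier. The inputs are 30-day averages of the DatabaseConnections and ReadIOPS CloudWatch metrics (fetched however you prefer: GetMetricStatistics, a dashboard export, etc.); the thresholds mirror the heuristics in the text and are tuning assumptions, not AWS-defined limits.

```python
# Heuristic thresholds from the review procedure described above.
CONNECTION_THRESHOLD = 5    # avg connections below this => negligible traffic
READ_IOPS_THRESHOLD = 1.0   # near-zero reads despite open connections

def is_effectively_unused(avg_connections: float, avg_read_iops: float) -> bool:
    """Return True if a read replica looks unused over the sample window."""
    if avg_connections < CONNECTION_THRESHOLD:
        return True  # almost nothing connects at all
    # Connections exist but issue no reads: likely health checks, not queries.
    return avg_read_iops < READ_IOPS_THRESHOLD
```

A replica averaging 40 connections but 0.1 ReadIOPS is flagged (health-check traffic), while one averaging 40 connections and 250 ReadIOPS is not.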

Before deleting a read replica, verify that it is not configured as a promotion candidate in the application's failover logic. Some applications implement manual failover by updating a configuration parameter to point at the replica's endpoint. If the replica is listed as a failover option in any configuration management system, confirm the failover procedure before deletion.

Configuration 3: Storage Auto-Scaling That Never Scales Back

RDS storage auto-scaling automatically expands EBS storage capacity when the database is running low. This is a valuable feature — running out of RDS storage space causes an immediate database failure. However, RDS storage auto-scaling only scales up, never down. Once storage has expanded from 500GB to 1TB due to a data load that subsequently reduced, the allocated storage remains at 1TB indefinitely.

The cost impact depends on storage type and size. gp3 storage costs $0.115/GB-month. An instance that expanded from 500GB to 1TB has an extra 500GB of allocated storage costing $57.50/month that is entirely empty. For instances that have auto-scaled multiple times or that handled a large one-time data load, the gap between allocated and used storage can be substantial.

Identifying this gap requires querying the RDS FreeStorageSpace CloudWatch metric against the AllocatedStorage configuration value. If FreeStorageSpace consistently represents more than 50% of AllocatedStorage over a 60-day period, the database is carrying excess allocated capacity that could be reduced. Note that RDS does not natively support storage reduction — you must snapshot the instance, create a new instance from the snapshot with the reduced storage allocation, verify data integrity, update the application endpoint, and decommission the original. This process has a maintenance window component and is more involved than the Multi-AZ change, which is why it is often deferred.

The threshold for justifying the storage reduction effort is approximately $100/month in excess storage cost. Below that threshold, the engineering time cost of the reduction procedure may exceed the monthly savings. Above it, the annualized savings of $1,200 or more justify the 3-4 hours of engineering time required.
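The gap check and the $100/month decision rule can be combined into one sketch. One unit detail worth encoding: AllocatedStorage (from the RDS API or console) is reported in GiB, while the FreeStorageSpace CloudWatch metric is reported in bytes, so convert before comparing. The price and thresholds follow the figures in the text; the function name is illustrative.

```python
GIB = 1024 ** 3                 # FreeStorageSpace is reported in bytes
GP3_PRICE_PER_GB_MONTH = 0.115  # gp3 storage price used in the text
ACTION_THRESHOLD = 100.0        # $/month below which reduction isn't worth it

def storage_gap_report(allocated_gib: int, avg_free_bytes: float):
    """Return (monthly cost of unused capacity, flagged, worth_reducing).

    flagged mirrors the heuristic above: free space consistently exceeding
    50% of the allocation over the sample window.
    """
    free_gib = avg_free_bytes / GIB
    gap_cost = round(free_gib * GP3_PRICE_PER_GB_MONTH, 2)
    flagged = free_gib > allocated_gib * 0.5
    worth_reducing = flagged and gap_cost >= ACTION_THRESHOLD
    return gap_cost, flagged, worth_reducing
```

A 1,024 GiB instance averaging 600 GiB free is flagged, but its $69/month gap falls below the action threshold; a 2,000 GiB instance averaging 1,200 GiB free ($138/month) clears both bars.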

RDS Instance Right-Sizing: The Fourth Configuration

Beyond the three configurations above, RDS instance right-sizing follows the same logic as EC2 right-sizing but with additional complexity. RDS instances have two primary resource dimensions that matter for right-sizing: CPU and memory. Network I/O and disk throughput are important for write-heavy workloads but are less frequently the binding constraint.

The specific challenge for RDS right-sizing is that database performance is more sensitive to under-provisioning than application server performance. An EC2 application server that is undersized will slow down requests proportionally — the performance degradation is gradual and usually recoverable. An undersized RDS instance can experience lock contention and queue buildup that creates cascading failures rather than graceful degradation.

The recommended approach for RDS right-sizing is conservative: use a 60% headroom buffer (rather than the 30% used for stateless EC2 services), require 90 days of utilization data rather than 30, and only propose a downsize of one step in the instance class hierarchy at a time. A db.m5.4xlarge running at 12% average CPU should be recommended for db.m5.2xlarge, not db.m5.xlarge — the two-step jump is a larger risk even if the utilization data supports it.
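The one-step rule can be sketched as follows. The size ladder and the linear "half the vCPUs means roughly double the utilization" projection are simplifying assumptions for illustration; real decisions need the full 90-day metric review described above.

```python
# Hypothetical size ladder for one instance family; extend as needed.
SIZE_LADDER = ["db.m5.large", "db.m5.xlarge", "db.m5.2xlarge",
               "db.m5.4xlarge", "db.m5.8xlarge"]
HEADROOM = 0.60  # keep 60% free capacity on the proposed target

def recommend_downsize(instance_class: str, avg_cpu_pct: float):
    """Return the next size down if projected CPU stays under the headroom
    ceiling, else None. Never jumps more than one step at a time."""
    idx = SIZE_LADDER.index(instance_class)
    if idx == 0:
        return None  # already the smallest class considered
    projected_cpu = avg_cpu_pct * 2   # half the vCPUs => ~2x utilization
    ceiling = (1 - HEADROOM) * 100    # 40% ceiling with a 60% buffer
    return SIZE_LADDER[idx - 1] if projected_cpu <= ceiling else None

# db.m5.4xlarge at 12% avg CPU projects to 24% on a 2xlarge: within the buffer.
print(recommend_downsize("db.m5.4xlarge", 12.0))  # db.m5.2xlarge
```

At 30% average CPU the same instance projects to 60% on the smaller class, which violates the 60% headroom buffer, so no downsize is proposed.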

Prioritizing RDS Optimization Work

Across the three configurations described in this article, the optimal review sequence for most accounts is: first, the storage auto-scaling audit (takes two hours, no conversations required, savings are immediate); second, the read replica zero-traffic audit (takes two hours plus verification, no team conversations required for clearly zero-traffic replicas); third, the Multi-AZ criticality review (requires team conversations, but at roughly $280/month per converted instance the savings justify the effort for any account with more than 10 Multi-AZ instances).

The Multi-AZ review also generates useful organizational data: mapping each RDS database to its owning team and its criticality tier. That mapping is reusable for future cost optimization work, disaster recovery planning, and SLA documentation — making the upfront investment in the review more broadly valuable than the immediate cost savings alone.

As we describe in our account-level analysis in what we learned from our first 50 customer accounts, RDS optimization is most effective when run concurrently with non-production scheduling and the EBS snapshot audit: together, those top three categories represent 61% of the savings identified.

Identify your RDS optimization opportunities in 15 minutes

KernelRun scans RDS instances for Multi-AZ appropriateness, zero-traffic replicas, and storage allocation gaps within the first analysis run. Connect your first account in 4 minutes.

Request a Demo