Every FinOps engineer eventually discovers the same frustration: AWS Compute Optimizer and Cost Explorer both rely primarily on CPU utilization to generate right-sizing recommendations, and CPU utilization alone produces recommendations that engineering teams reject at a high rate. In our experience across 50+ accounts, teams accept Compute Optimizer recommendations roughly 38% of the time. The other 62% are declined, deferred, or ignored because the recommendation misses something the engineer knows about the workload.
The issue is not that CloudWatch CPU metrics are wrong. They are accurate. The issue is that CPU utilization is one dimension of a multi-dimensional decision. When right-sizing tools collapse that decision to a single metric, they generate recommendations that are technically defensible but practically wrong.
The Four Dimensions of a Complete Right-Sizing Signal
A right-sizing decision for an EC2 instance requires at minimum four independent signals: CPU utilization, memory utilization, network throughput, and disk I/O. These four dimensions map to the four primary resource constraints an instance type satisfies. Optimizing for any one while ignoring the others produces a recommendation that may actually increase cost or cause performance degradation.
Consider the pattern we see most frequently: a web application server running on an m5.2xlarge with average CPU utilization of 14% and p95 CPU of 31%. A single-metric tool recommends downsizing to m5.large. The recommendation appears sound. But the instance is handling 2.4 Gbps of sustained network throughput during business hours. Both sizes are rated "up to 10 Gbps" of burst bandwidth, so the headline spec looks identical; the difference is baseline bandwidth, roughly 2.5 Gbps for the m5.2xlarge versus roughly 0.75 Gbps for the m5.large. Under sustained load, the smaller instance will exhaust its burst credits and throttle.
CloudWatch does expose network metrics — NetworkIn and NetworkOut are available at 1-minute resolution when Detailed Monitoring is enabled. The problem is that most right-sizing tools do not incorporate them because the correlation logic is more complex. You need to establish whether the network throughput is bursty (where burst capacity handles it fine on a smaller instance) or sustained (where the higher baseline bandwidth of the larger instance is actually necessary).
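The bursty-versus-sustained distinction can be reduced to a simple rule: if more than a small fraction of per-minute samples exceed the instance's baseline bandwidth, burst credits will not cover the load. The sketch below is a minimal illustration of that rule on hypothetical samples; the `sustain_fraction` threshold and the baseline figures are assumptions you would tune and source from the EC2 instance-type specs, not fixed values from our pipeline.

```python
def classify_throughput(gbps_samples, baseline_gbps, sustain_fraction=0.25):
    """Classify per-minute throughput samples (Gbps) as 'bursty' or
    'sustained' relative to an instance's baseline bandwidth.

    If more than sustain_fraction of samples exceed the baseline, burst
    credits cannot absorb the load and the workload is 'sustained'."""
    over = sum(1 for g in gbps_samples if g > baseline_gbps)
    return "sustained" if over / len(gbps_samples) > sustain_fraction else "bursty"

# Hypothetical business-hours samples: ~2.4 Gbps sustained with small jitter.
web_tier = [2.4 + 0.05 * (i % 3) for i in range(480)]  # 8 hours of 1-min samples

print(classify_throughput(web_tier, baseline_gbps=0.75))  # m5.large baseline  -> sustained
print(classify_throughput(web_tier, baseline_gbps=2.5))   # m5.2xlarge baseline -> bursty
```

The same series classifies differently against the two baselines, which is exactly why a recommendation engine cannot treat "10 Gbps" as the capacity of either instance.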
Memory: The Metric CloudWatch Doesn't Capture by Default
CloudWatch does not collect memory utilization from EC2 instances without the CloudWatch Agent installed and configured to collect it. This is not a minor gap. Memory pressure is the primary reason right-sizing recommendations fail for database-adjacent workloads, caching layers, and JVM-based applications.
In accounts without the CloudWatch Agent deployed on EC2, right-sizing tools are operating blind on memory. An r5.xlarge instance running a Java application with a 28GB heap will show 22% average CPU utilization and look like a prime right-sizing candidate. But downsize it to an r5.large with 16GB of RAM and the heap no longer fits in physical memory at all: the JVM will either fail to start or drive the instance into swap. Even a heap trimmed down to fit leaves almost no headroom, and garbage collection overhead increases substantially. The resulting failure may not appear until load testing, or worse, until a production incident.
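A back-of-the-envelope floor for JVM hosts makes the point concrete. The off-heap fraction and OS reserve below are illustrative assumptions (metaspace, thread stacks, and direct buffers vary by workload), not measured values:

```python
def min_ram_gib_for_jvm(heap_gib, offheap_fraction=0.1, os_reserve_gib=1.0):
    """Rough floor on instance RAM for a JVM workload: the heap, plus an
    assumed off-heap allowance, plus an OS reserve. The 10% off-heap
    fraction is a hypothetical rule of thumb, not a measured constant."""
    return heap_gib * (1 + offheap_fraction) + os_reserve_gib

# A 28 GiB heap needs ~31.8 GiB: it fits the r5.xlarge's 32 GiB,
# and categorically rules out the r5.large's 16 GiB.
print(round(min_ram_gib_for_jvm(28), 1))
```

Any CPU-only tool will never see this constraint, because nothing in the CPU series encodes it.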
The practical solution is to install the CloudWatch Agent with at minimum memory and disk I/O collection enabled before beginning any right-sizing analysis. Without that data, any recommendation for memory-bound instances is speculative. We require 30 days of agent-collected memory data before generating right-sizing proposals for any instance running a database engine, JVM application, or in-memory cache.
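For reference, a minimal agent configuration enabling memory and disk I/O collection at 1-minute resolution might look like the fragment below. Verify the exact schema and measurement names against the amazon-cloudwatch-agent documentation for your agent version before deploying:

```json
{
  "metrics": {
    "namespace": "CWAgent",
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      },
      "diskio": {
        "measurement": ["read_bytes", "write_bytes"],
        "resources": ["*"],
        "metrics_collection_interval": 60
      }
    }
  }
}
```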
How Sampling Interval Distorts the Picture
CloudWatch Basic Monitoring collects EC2 metrics at 5-minute intervals. Detailed Monitoring reduces that to 1-minute intervals. For right-sizing analysis, the difference is significant. A 5-minute average for CPU will smooth over peaks that last 2-3 minutes — peaks that are nonetheless real load events the instance needs to handle.
The standard recommendation to use p95 CPU utilization as the baseline for right-sizing assumes that the p95 represents the real peak demand the instance needs to satisfy. With 5-minute averages, a spike that lasts 3 minutes may appear in one or two 5-minute samples, heavily diluted. The p95 calculated from those samples will understate the actual peak.
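The dilution effect is easy to demonstrate with a synthetic series. The workload below is hypothetical: a day of 1-minute CPU samples that idle at 10% with an 80% spike lasting four minutes at the top of every hour, then the same day as Basic Monitoring would report it, averaged into 5-minute buckets:

```python
from statistics import mean

def p95(samples):
    """Nearest-rank p95 over a list of samples."""
    s = sorted(samples)
    return s[int(0.95 * (len(s) - 1))]

# Hypothetical day of 1-minute CPU samples: idle at 10%, with an 80%
# spike lasting 4 minutes at the top of every hour.
one_min = [80.0 if minute % 60 < 4 else 10.0 for minute in range(24 * 60)]

# The Basic Monitoring view: the same day averaged into 5-minute buckets.
five_min = [mean(one_min[i:i + 5]) for i in range(0, len(one_min), 5)]

print(p95(one_min))   # 80.0 — the spike survives at 1-minute resolution
print(p95(five_min))  # 66.0 — the spike is diluted by the idle minute in its bucket
```

Here the hourly peak is a real demand the instance must serve, yet the 5-minute p95 understates it by 14 points, and that is with the spike conveniently aligned to bucket boundaries. A spike straddling two buckets dilutes even further.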
We enable Detailed Monitoring for all instances before running the analysis engine. The incremental cost is approximately $2.10 per instance per month. For any analysis window longer than 30 days, that cost is recovered immediately by the accuracy improvement in the resulting recommendations.
Day-of-Week Segmentation: Why Weekly Patterns Matter
Workload patterns for most business applications follow a weekly cycle. Monday through Friday sees different utilization than Saturday and Sunday. Batch jobs run on specific days. A single 90-day utilization average flattens this pattern and may produce recommendations that are correct on average but wrong during peak periods.
Correct right-sizing analysis segments utilization by day of week and calculates p95 independently for each segment. For a workload that peaks heavily on Monday morning, the right-sizing baseline should be the Monday p95 — not the average across all seven days. For a batch workload that runs only on Sunday nights, the right-sizing baseline should incorporate the Sunday peak in its headroom calculation.
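The segmentation itself is little more than a group-by on weekday before computing the percentile. The sketch below runs on hypothetical hourly samples, two weeks of flat 20% load with a Monday-morning peak at 85%, to show how the blended p95 hides the segment that should set the baseline:

```python
from collections import defaultdict
from datetime import datetime, timedelta

def p95(samples):
    """Nearest-rank p95 over a list of samples."""
    s = sorted(samples)
    return s[int(0.95 * (len(s) - 1))]

def segment_p95_by_weekday(points):
    """points: iterable of (timestamp, cpu_percent).
    Returns {weekday: p95}, where weekday 0 is Monday."""
    buckets = defaultdict(list)
    for ts, cpu in points:
        buckets[ts.weekday()].append(cpu)
    return {day: p95(vals) for day, vals in buckets.items()}

# Hypothetical 2 weeks of hourly samples: flat 20% load, except a
# Monday-morning peak (08:00-11:00) at 85%.
start = datetime(2024, 1, 1)  # a Monday
points = []
for hour in range(14 * 24):
    ts = start + timedelta(hours=hour)
    peak = ts.weekday() == 0 and 8 <= ts.hour < 11
    points.append((ts, 85.0 if peak else 20.0))

by_day = segment_p95_by_weekday(points)
print(p95([cpu for _, cpu in points]))  # 20.0 — the blended p95 misses the peak entirely
print(by_day[0])                        # 85.0 — the Monday p95, the correct baseline
```

A tool sizing against the blended p95 would recommend an instance that falls over every Monday morning.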
This segmentation is straightforward to implement if you have the raw CloudWatch data, but it requires retaining and processing 90 days of per-instance metrics at 1-minute resolution. That is a non-trivial data pipeline, and it is one reason most hosted right-sizing tools skip the segmentation step.
The Approval Gap: Why Technical Accuracy Is Not Enough
Even a perfectly accurate right-sizing recommendation will be declined if the engineer responsible for the instance does not understand the evidence behind it. Right-sizing is not purely a technical problem — it is a workflow problem. The engineer who provisioned the instance often has context about the workload that the metrics do not capture: an expected traffic spike, a migration in progress, a dependency on instance-level features that would change under a different instance family.
This is why the proposal workflow matters as much as the analysis accuracy. A right-sizing recommendation that shows the engineer the utilization data behind it — the 90-day CPU baseline, the memory profile, the network throughput history — gets accepted at a much higher rate than a recommendation that simply says "downsize to m5.large." In our data, recommendations with full utilization evidence attached have a 71% acceptance rate versus 38% for recommendations without supporting data.
Spot Instances and Right-Sizing: Two Different Problems
A common mistake in FinOps practice is conflating EC2 right-sizing with spot instance migration as a single cost optimization exercise. They address different cost drivers and should be analyzed independently. Right-sizing addresses over-provisioning within a given instance lifecycle. Spot migration addresses the pricing model, not the instance size.
Both are valid. But combining them into a single recommendation — "downsize to m5.large and migrate to spot" — creates compounding risk. If the right-sizing recommendation turns out to be incorrect, spot interruptions add a second, independent source of performance variance that makes the regression harder to diagnose, and rollback now involves both the size and the pricing model. We recommend completing the right-sizing analysis and running the resulting instances at the correct size for at least two weeks before evaluating spot migration candidates.
As we discuss in our article on Reserved Instances vs. Savings Plans, the commitment decision should also come after right-sizing is complete — committing to a reserved instance at the wrong size locks in the over-provisioning cost for one to three years.
What a Complete Right-Sizing Signal Looks Like
A right-sizing analysis that produces actionable, accepted recommendations requires the following inputs at minimum: CPU utilization at 1-minute resolution over 90 days, memory utilization from the CloudWatch Agent over 30 days minimum, NetworkIn/NetworkOut at 1-minute resolution over 90 days, disk read/write bytes for storage-bound workloads, and a clear segmentation of the analysis by day-of-week and business hours versus off-hours.
From those inputs, the right-sizing engine identifies the smallest instance type that satisfies observed p95 demand across all four dimensions, with a configurable headroom buffer applied to each dimension independently. The headroom percentages should be configurable per-team because a critical production database should carry more headroom than a staging environment for a batch processing pipeline.
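The selection step reduces to a constrained search: the cheapest candidate whose capacity clears the p95 demand, inflated by the per-dimension headroom, on every axis. The sketch below shows the shape of that search. The catalog rows are illustrative (the capacity and cost figures are placeholders, not AWS pricing — real values should come from the EC2 instance-type and pricing APIs):

```python
# Hypothetical catalog rows: (name, vcpu, mem_gib, baseline_net_gbps,
# disk_mbps, monthly_cost_usd). Figures are placeholders for illustration.
CATALOG = [
    ("m5.large",   2,  8.0, 0.75,  260,  70),
    ("m5.xlarge",  4, 16.0, 1.25,  530, 140),
    ("m5.2xlarge", 8, 32.0, 2.50, 1060, 280),
]

DIMS = ["vcpu", "mem_gib", "net_gbps", "disk_mbps"]

def pick_smallest(p95_demand, headroom):
    """Return the cheapest instance whose capacity exceeds
    demand * (1 + headroom) on every dimension, or None if nothing fits."""
    for name, *caps, _cost in sorted(CATALOG, key=lambda row: row[-1]):
        if all(cap >= p95_demand[d] * (1 + headroom[d])
               for d, cap in zip(DIMS, caps)):
            return name
    return None  # no candidate fits; flag for manual review

demand   = {"vcpu": 1.1, "mem_gib": 12.0, "net_gbps": 0.4, "disk_mbps": 100}
headroom = {"vcpu": 0.3, "mem_gib": 0.2,  "net_gbps": 0.5, "disk_mbps": 0.3}

print(pick_smallest(demand, headroom))  # m5.xlarge — m5.large fails on memory
```

Note that the example instance fails the memory check on m5.large even though its CPU demand fits comfortably, which is the whole argument of this article in three lines of arithmetic.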
When you build the analysis on all four signal dimensions with correct sampling and segmentation, recommendation acceptance rates improve substantially. The analysis takes longer and requires more data pipeline infrastructure, but the outcome is a set of changes that engineers actually implement — which is the only metric that matters in a cost optimization program.
See KernelRun's multi-dimensional right-sizing analysis
KernelRun collects all four signal dimensions, requires the CloudWatch Agent for memory data, and presents full utilization evidence with each proposal. Connect your first account in 4 minutes.
Request a Demo