At some point, most engineering leaders will sit across from a finance partner who needs a 15 or 20 percent reduction in cloud spend by end of quarter. It’s one of those conversations that’s uncomfortable not because the math is hard but because the decisions behind the math are real — you’re choosing what to protect and what to let get slower, noisier, or more constrained.
Having been through this more than once, I’ve developed some views on which cuts are straightforward and which ones look cheap in the short term but cost you later.
Start with the obvious stuff — and actually start there
Before anything else: environment hygiene. In almost every medium-to-large engineering organization I’ve seen, there are running resources that shouldn’t be running. Development environments spun up for a project that has since ended. Test environments that get used occasionally but are on 24/7. Load generators left running after a performance test. Dashboards that poll external services every five seconds even when nobody’s looking at them. This work is unglamorous and mildly tedious, but it has no downside — you’re not trading anything away, you’re just cleaning up. In my experience, a focused two-week audit of idle and underused resources in a mature AWS environment typically surfaces meaningful savings — often 10% or more — before you’ve touched anything that matters. Do this first.
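The triage itself is simple enough to sketch. Here’s a minimal version, assuming you’ve already exported a utilization inventory (from CloudWatch, Cost Explorer, or a tagging report) — the data shape and thresholds are illustrative assumptions, not a real AWS API:

```python
# Hedged sketch: flag likely-idle resources from an exported utilization
# inventory. The record format and thresholds are assumptions for
# illustration -- tune them to your own environment.

IDLE_CPU_PCT = 3.0          # avg CPU below this over the window -> likely idle
IDLE_NET_BYTES = 1_000_000  # total network I/O below this -> likely idle

def find_idle(inventory):
    """Return resources that look idle over the sampled window."""
    return [
        r for r in inventory
        if r["avg_cpu_pct"] < IDLE_CPU_PCT and r["net_bytes"] < IDLE_NET_BYTES
    ]

# Hypothetical inventory rows for illustration:
inventory = [
    {"id": "i-dev-old-project",  "avg_cpu_pct": 0.4,  "net_bytes": 12_000},
    {"id": "i-prod-api",         "avg_cpu_pct": 41.0, "net_bytes": 9_000_000_000},
    {"id": "i-loadgen-leftover", "avg_cpu_pct": 1.1,  "net_bytes": 80_000},
]

for r in find_idle(inventory):
    print(r["id"])
```

The point isn’t the thresholds — it’s that a crude filter like this, run once, produces a shortlist a human can review in an afternoon.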
Instance rightsizing is the second obvious move, and it’s also underused. Most teams provision compute generously when they’re building something new — they don’t know the traffic patterns, so they go bigger than they need. Over time, those instances become the baseline, and nobody goes back to revisit whether the size is still appropriate. Pulling CloudWatch metrics for the past 30-60 days and finding instances where CPU and memory utilization are consistently low is straightforward analysis. AWS’s Cost Explorer and Compute Optimizer will surface most of this automatically. The tricky part isn’t finding the candidates — it’s getting engineering teams to actually make the changes, which requires trust that the rightsizing was done carefully and a rollback plan if something goes wrong. Both are manageable if you plan for them.
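The core sanity check is just arithmetic: would the workload still fit one size down, with margin? A hedged sketch — the 2x headroom factor and the halved-capacity assumption are illustrative simplifications, not Compute Optimizer’s actual model:

```python
# Illustrative rightsizing triage. Assumes dropping one instance size roughly
# halves capacity (so utilization roughly doubles), and requires a safety
# margin on top of that. These are assumptions, not AWS's recommendation logic.

def downsize_candidate(p95_cpu_pct, p95_mem_pct, headroom=2.0):
    """True if p95 utilization would still fit a one-size-smaller
    instance with `headroom`x margin to spare."""
    # Halving capacity doubles utilization; demand margin after the doubling.
    return (p95_cpu_pct * 2 * headroom <= 100
            and p95_mem_pct * 2 * headroom <= 100)

assert downsize_candidate(p95_cpu_pct=12, p95_mem_pct=18)      # comfortably fits smaller
assert not downsize_candidate(p95_cpu_pct=40, p95_mem_pct=20)  # would run hot
```

Using p95 rather than the average is deliberate — average utilization hides the spikes that make a downsized instance fall over, which is exactly the failure that erodes teams’ trust in the exercise.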
Commitment-based savings: use them, but carefully
Reserved Instances and Savings Plans offer some of the highest return on effort available — up to 72% off On-Demand pricing for workloads that run predictably. If you’re not already using them on your steady-state compute, you’re leaving significant money on the table, and a budget-pressure moment is a reasonable forcing function to fix that.
The caution here is important: don’t commit your way out of a budget problem if your workload is actually going to shrink. If you’re buying a 1-year Savings Plan for compute you’re planning to decommission in six months, you’ve created a new problem. Commitments are for stable baseline workloads that you’re confident will run for the commitment period. Get the rightsizing right first, understand which workloads are actually stable, and then commit.
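The break-even point is worth computing before you sign. A back-of-envelope sketch, assuming you pay for the full commitment period regardless of usage (discount rates here are illustrative):

```python
# Back-of-envelope check before committing: how long must the workload
# actually run for a 12-month commitment to beat On-Demand? Assumes the
# full year is paid whether or not the workload survives.

def commitment_breakeven_months(discount_pct):
    """Months of real usage needed for a 12-month commitment to break even.
    Commitment cost = 12 * (1 - d) * monthly On-Demand rate;
    On-Demand cost for m months = m * monthly rate; equal when m = 12 * (1 - d).
    """
    return 12 * (1 - discount_pct / 100)

# At an illustrative 40% discount you need ~7.2 months of actual usage:
print(round(commitment_breakeven_months(40), 1))
```

This is the six-months-to-decommission trap in one line: at a 40% discount, anything you plan to shut down before month seven is cheaper left on On-Demand.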
The cuts that look easy but cost you later
Monitoring and observability budgets are the first thing I’d protect. When teams cut CloudWatch log retention, reduce metric resolution, or decommission dashboards to save money, they’re buying short-term savings with future incident cost. The next time something goes wrong — and something always goes wrong — the debugging time increases, the mean time to resolution increases, and you’ve traded a predictable cost for an unpredictable one. I’ve seen teams cut $5,000/month in logging costs and then spend three days diagnosing an outage that would have taken three hours with the logs they turned off.
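That trade can be framed in expected-value terms. Every number below is an assumption for illustration — plug in your own incident rate and loaded engineering cost:

```python
# Rough expected-value framing of the logging cut described above.
# All inputs are illustrative assumptions, not measured data.

def net_monthly_savings(log_savings, incidents_per_month,
                        extra_hours_per_incident, loaded_hourly_rate):
    """Savings from cutting logs, minus the expected extra incident cost."""
    expected_incident_cost = (incidents_per_month
                              * extra_hours_per_incident
                              * loaded_hourly_rate)
    return log_savings - expected_incident_cost

# $5,000/month saved on logging, one incident a month, ~70 extra
# engineer-hours per incident (three days of a small response team instead
# of three hours), at a $150/hr loaded rate:
print(net_monthly_savings(5000, 1, 70, 150))
```

The result is negative — the “savings” cost more than they save — and the real number is worse than the arithmetic, because incident cost arrives as unplanned, high-stress time rather than a line item.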
Testing and staging environments are similar. It’s tempting to scale these down or consolidate them aggressively, but the cost of a production incident that a good staging environment would have caught is almost always higher than the savings. Consolidate where the environments are truly redundant; be conservative where they’re actually being used.
Data transfer costs are worth understanding before you try to cut them, because they’re often less controllable than they appear. If your architecture moves large amounts of data between availability zones, regions, or out to the internet, the right answer is often an architectural change rather than a budget cut — and architectural changes have their own cost and timeline. Don’t promise finance a number you can only hit by rearchitecting something in six weeks.
What the conversation with finance actually requires
The most useful thing you can do in a cloud cost conversation with non-engineering stakeholders is separate short-term cuts from medium-term structural changes. Some savings are available immediately with no meaningful risk. Others require time and carry trade-offs. Being clear about which category each item falls into is what makes the conversation productive rather than adversarial. Finance isn’t unreasonable; they just don’t have a map of the trade-off space. Your job is to give them one.
The teams that handle budget pressure best aren’t better at cutting — they’re better at knowing what they have. Cost visibility isn’t a crisis response; it’s the thing that keeps the crisis from arriving.