Introduction
Over the years, I’ve helped numerous businesses streamline their infrastructure, whether by consolidating cloud resources, optimizing application architectures, or simplifying overly complex setups. This particular project stands out because of how significantly it impacted the client’s financials — we were able to cut their AWS spending by more than 50%, largely by moving from managed OpenSearch and RDS to thoughtfully configured EC2 instances.
This wasn’t about squeezing every penny or taking reckless shortcuts. It was about aligning technology with the business stage, finding the right trade-offs, and ensuring future flexibility. In this article, I’ll walk through the real-world challenges, the business conversations, and the architectural choices that led to such substantial savings.
The Problem: AWS Costs Outpacing Revenue
When this client — an ecommerce company doing about $1.5 million in annual revenue — approached me, their AWS bills had grown to more than $4,500 per month. For a company still scaling carefully, this was becoming a serious strain.
They didn’t have glaring inefficiencies in their application. Pages were served quickly, the user experience was solid, and their tech stack was modern enough. Instead, the primary issue was an overreliance on fully managed services that were simply more than they realistically needed.
The Breakdown
A detailed analysis using AWS Cost Explorer and CloudWatch metrics showed roughly where the money was going each month:
- Amazon OpenSearch Service (formerly Amazon Elasticsearch Service): ~$1,900
- Amazon RDS (MySQL Multi-AZ): ~$1,400
- Remaining EC2, S3, Lambda, other resources: ~$1,200
OpenSearch and RDS alone accounted for more than 70% of their monthly bill.
Why So Expensive?
Managed OpenSearch
They had started with a small managed Elasticsearch cluster for logging and some on-site product search. Over time, indexes grew. New teams added more logs, marketing wanted richer search analytics, and no one aggressively cleaned up old data.
The cluster was now running 3 data nodes, 2 master nodes, plus UltraWarm for historical data — with combined costs close to $2,000/month, despite under 30% average CPU usage.
Managed RDS
Similarly, their database began on a modest db.t3.medium Multi-AZ instance. By the time I reviewed it, they were on larger instances and paying for Multi-AZ replication, and storage had ballooned to roughly 400 GB. Automated snapshots, read replicas for staging, and I/O costs all added up.
The First Conversations: Business Goals Over Tech Obsession
When people talk about cutting cloud costs, they often dive straight into tech tricks — reserved instances, savings plans, tweaking storage. I took a different approach.
I spent my first few days on this engagement having long discussions with their CTO, marketing lead, and CFO. I wanted to know:
- What are your biggest priorities over the next 6-12 months?
- Is maximum uptime your #1 concern, or would short maintenance windows be acceptable?
- Do you expect big seasonal traffic spikes?
- How aggressive do you want to be on cost savings vs redundancy?
Their answers painted a clear picture: they needed meaningful savings to reinvest into customer acquisition and retention, but they could tolerate minor increases in operational overhead and slightly longer recovery times — provided there were solid backup and rollback plans.
Building a Clear Financial Case
Before proposing any infrastructure changes, I built three financial scenarios:
Scenario | Estimated monthly cost | Estimated annual savings |
Keep all managed, maybe rightsize nodes | ~$3,800 | ~$8,400 |
Move OpenSearch to EC2, keep RDS managed | ~$2,800 | ~$20,400 |
Move both OpenSearch & RDS to EC2 | ~$2,000 | ~$30,000 |
This gave leadership a straightforward business trade-off. Moving both services off fully managed options could free up $30,000+ annually, a huge deal for their size.
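The annual figures follow directly from the monthly estimates. Here is a minimal sketch of that arithmetic, assuming the roughly $4,500 monthly bill as the baseline. The variable and function names are illustrative and not taken from any client tooling.

```python
# Baseline monthly AWS bill before any changes (approximate, from the cost analysis).
BASELINE_MONTHLY = 4500

# Estimated monthly bill under each scenario (approximate figures from the scenarios).
scenarios = {
    "Keep all managed, rightsize nodes": 3800,
    "Move OpenSearch to EC2, keep RDS managed": 2800,
    "Move both OpenSearch and RDS to EC2": 2000,
}


def annual_savings(baseline_monthly: int, scenario_monthly: int) -> int:
    """Return the yearly savings of a scenario relative to the baseline bill."""
    return (baseline_monthly - scenario_monthly) * 12


savings = {
    name: annual_savings(BASELINE_MONTHLY, cost)
    for name, cost in scenarios.items()
}

for name, amount in savings.items():
    print(f"{name}: ~${amount:,} saved per year")
```

Keeping the baseline as a single constant makes it easy to rerun the comparison when the bill changes.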
Planning the Migration: Technical Considerations Without DIY Details
I avoid giving step-by-step implementation in public articles, both to protect client confidentiality and because every environment’s nuances differ. Instead, here’s what I focused on:
For OpenSearch
- Calculated actual heap memory needs and query throughput to right-size EC2 instances.
- Planned index rotation and more aggressive deletion of stale marketing logs.
- Set up EBS snapshots with lifecycle policies, balancing RTO (recovery time objective) with cost.
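The index rotation and cleanup idea can be sketched as a small retention check. This is an illustration only. It assumes daily indices named with a `-YYYY.MM.DD` suffix, which is a common convention but not confirmed for this client, and the index names and 30-day window are hypothetical. The function only selects candidates; actual deletion would go through the cluster's own API or a lifecycle policy.

```python
from datetime import date, datetime, timedelta


def stale_indices(index_names, today, retention_days):
    """Return names of daily indices whose date suffix is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    stale = []
    for name in index_names:
        # Assumed naming convention: "<prefix>-YYYY.MM.DD".
        _, sep, suffix = name.rpartition("-")
        if not sep:
            continue
        try:
            index_date = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            # Skip indices that do not follow the dated naming convention.
            continue
        if index_date < cutoff:
            stale.append(name)
    return stale


# Hypothetical index names for illustration.
indices = [
    "marketing-logs-2024.01.05",
    "marketing-logs-2024.02.20",
    "marketing-logs-2024.03.01",
    "products",
]
candidates = stale_indices(indices, today=date(2024, 3, 10), retention_days=30)
print(candidates)
```

Undated indices such as the product catalog are deliberately ignored, so a retention job built this way cannot touch them by accident.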
For MySQL
- Moved from multi-AZ RDS to a single EC2-based MySQL instance with regular automated backups and binlog replication to a warm standby in another AZ.
- Designed scheduled test restores to ensure backups weren’t theoretical.
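A scheduled test restore is only useful if something checks the result. Below is one possible shape for that check. It assumes per-table row counts have already been collected from the primary and from the restored copy by some other means. The table names, counts, and tolerance are hypothetical.

```python
def restore_mismatches(primary_counts, restored_counts, max_missing_rows=0):
    """Compare per-table row counts and return a dict of tables that fail the check.

    A table fails if it is absent from the restore, or if the restore is missing
    more rows than max_missing_rows (rows written after the backup was taken).
    """
    problems = {}
    for table, expected in primary_counts.items():
        restored = restored_counts.get(table)
        if restored is None:
            problems[table] = "missing from restore"
        elif expected - restored > max_missing_rows:
            problems[table] = f"{expected - restored} rows behind primary"
    return problems


# Hypothetical counts for illustration.
primary = {"orders": 120_500, "customers": 48_200, "products": 3_150}
restored = {"orders": 120_430, "customers": 48_200}

report = restore_mismatches(primary, restored, max_missing_rows=100)
print(report)
```

Row counts are a coarse signal. A stricter drill might also compare checksums or run a few known queries against the restored copy.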
Security & IAM
- Preserved VPC-level controls and tightened IAM roles, making sure access was still auditable and aligned with least privilege.
Addressing Leadership Concerns
Risk of Downtime
Their CTO was initially very hesitant about leaving RDS’s multi-AZ failover. I laid out what a failure scenario would look like on EC2 — we’d potentially have a brief read-only period while switching to the standby, or if the primary died completely, a short restore window from the latest snapshot.
Given that their customer flows were heavily cached with CloudFront and Redis, and payment processing had idempotent retries, we determined this was a reasonable business risk.
Operational Overhead
Shifting away from fully managed meant someone needed to handle patching, upgrades, and monitoring. That’s where I stayed engaged on a lightweight monthly retainer — handling slow query reviews, OS updates, and JVM patching.
Executing the Transition
We staged everything carefully. For about three weeks, both the managed services and the new EC2 replacements ran in parallel. We used this period to:
- Validate replication integrity and search index consistency.
- Monitor performance under typical weekday loads and planned marketing pushes.
- Run drills that simulated instance failures to test failover documentation.
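During a parallel run like this, one way to spot-check search index consistency is to send the same sample queries to both clusters and compare the top results. The sketch below assumes the result IDs have already been fetched from each cluster. The queries, document IDs, and the 0.9 threshold are hypothetical.

```python
def top_k_overlap(old_ids, new_ids, k=10):
    """Return the fraction of the old cluster's top-k result IDs found in the new cluster's top-k."""
    old_top = old_ids[:k]
    if not old_top:
        return 1.0  # nothing expected, so nothing is missing
    new_top = set(new_ids[:k])
    shared = sum(1 for doc_id in old_top if doc_id in new_top)
    return shared / len(old_top)


# Hypothetical result IDs for two sample queries: (old cluster, new cluster).
samples = {
    "running shoes": (["p1", "p2", "p3", "p4"], ["p1", "p3", "p2", "p4"]),
    "rain jacket": (["p7", "p8", "p9", "p10"], ["p7", "p8", "p11", "p12"]),
}

THRESHOLD = 0.9
scores = {query: top_k_overlap(old, new) for query, (old, new) in samples.items()}
flagged = {query: score for query, score in scores.items() if score < THRESHOLD}
print(flagged)
```

The check ignores ordering within the top-k, so small ranking differences between clusters do not raise false alarms, while missing documents do.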
Only after the leadership team felt confident did we update DNS and application configs to fully cut over.
The Results: Financial and Strategic Wins
When their next AWS invoice arrived, it told the story vividly:
Line item | Before (monthly) | After (monthly) |
Managed OpenSearch | ~$1,900 | ~$110 |
Managed RDS | ~$1,400 | ~$120 |
New combined EC2 (DB + search) | ~$1,000 | ~$1,700 |
Total | ~$4,500 | ~$1,930 |
That was a 57% reduction, putting over $30,000 per year back into their operating budget. They immediately increased their paid social campaigns and launched two additional product lines — things that would have otherwise waited another year.
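The headline numbers can be re-derived from the invoice line items. Here is a quick check, using the approximate post-migration figures and the roughly $4,500 starting bill (the dictionary keys are shortened labels, not AWS billing names):

```python
# Approximate monthly line items after the migration.
after_items = {
    "Managed OpenSearch": 110,
    "Managed RDS": 120,
    "EC2 (DB + search)": 1700,
}
before_total = 4500  # approximate monthly bill before the migration

after_total = sum(after_items.values())
monthly_savings = before_total - after_total
reduction_pct = monthly_savings / before_total * 100
yearly_savings = monthly_savings * 12

print(f"After: ~${after_total:,} per month")
print(f"Reduction: {reduction_pct:.0f}%")
print(f"Annual savings: ~${yearly_savings:,}")
```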
More Than Just Cost Savings
An unexpected benefit was how much more ownership their internal team took on. They became deeply familiar with query patterns, learned to plan index lifecycle more strategically, and set up more granular metrics dashboards. Instead of AWS “just handling everything,” they understood exactly how much adding a new feature or data source might cost.
This also created a foundation for future scaling. With clearer insights, they could confidently consider Aurora or even a return to managed OpenSearch if query volume exploded. But it would be an informed, deliberate move, not just following the default recommendation.
Why This Wasn’t About Being Cheap
I often caution clients against chasing the lowest possible infrastructure bill. This was never about cutting corners. It was about building an architecture appropriate to the current scale and growth trajectory of the business.
If they were processing 100x more orders daily, we’d have likely stuck with Multi-AZ RDS and maybe even used Amazon OpenSearch Serverless. But at their level, the cost of that much managed redundancy simply didn’t justify itself.
Questions Clients Often Ask on Projects Like This
What If Our Traffic Spikes?
We sized EC2 generously with additional CPU and memory headroom. Plus, most of their user experience was cached behind CloudFront and Redis. This meant the database load only increased moderately even during big email campaigns.
What If an EC2 Instance Fails?
We had snapshot-based restores and a warm standby ready. Individual EC2 instances can and do fail, but it happens rarely, and with tested recovery playbooks in place it was a manageable risk.
What If We Outgrow This Setup?
Great. We’d simply revisit the design. Because we documented query growth, storage trends, and CPU profiles, moving back to more managed redundancy (or to something like Aurora Global Database) would be straightforward.
Looking Ahead: Their Next Chapters
Six months after we completed this migration, the company had grown revenue by nearly 40%. The CTO reached out again, this time to explore more advanced BI tooling on top of S3 and Athena — ironically spending some of those saved infrastructure dollars on better analytics.
This is exactly the outcome I aim for: using optimized infrastructure to free up budget for investments that actually grow the business, not just feed the cloud provider.
Final Thoughts: Why Clients Hire Me
This company didn’t bring me on board simply to cut costs. They hired me because I could:
- Map technical decisions to business priorities, showing in clear tables what each scenario meant financially.
- Build a migration plan with well-understood risks and tested fallback options, so leadership could confidently sign off.
- Stay engaged to keep their environment secure, patched, and tuned, reducing long-term anxiety.
That’s what true infrastructure consulting looks like: not racing to the cheapest solution, but designing systems that match your stage — and pivoting wisely as you grow.
Want Help With Your AWS Bill?
If your company is seeing AWS bills that don’t seem to reflect your actual scale, or if you’re curious whether your managed services are still the right fit, I’d be glad to take a look.
You can reach me and see more about how I work at kevinhq.com.
Sometimes it’s just a matter of resizing. Other times, like this case, it means rethinking the architecture entirely. Either way, it’s always about aligning your technology spend to what moves your business forward.
Thanks for reading.