How LLMs Saved 90% Cloud Costs
I couldn't believe it when I read it. Amazing Success Story.
Together with Amnic
More than Generative AI. Using AI agents to accelerate your FinOps Processes.
70% of FinOps teams have their AI tooling set up wrong.
Here's how you can fix it. AI agents can help you manage costs, analyze data, extract spend, identify anomalies and much more, while you focus on higher-value work tied to business outcomes.
Grab a copy of this e-book to see what AI agents can truly help FinOps teams accomplish.
COST OPTIMIZATION
90% Cost Cut with LLM Microservices
A software company found a smart way to cut costs by over 90% while making their system work better. They were using two separate Google services - one to translate text and another to check if the text had bad language or seemed fake.
The team realized they could replace both services with one simple solution using AI. They built a small microservice powered by Google's Gemini AI that could do both jobs at once. Here's what made this work so well:
The AI could translate text and check for problems in a single request instead of two separate ones.
Short pieces of text don't use many AI tokens, which keeps costs very low.
They could ask the AI to do exactly what they needed in whatever format they wanted.
The team first built a quick test version in Python in just one day. After seeing it worked and cost much less, they rebuilt it properly in Scala to match their other systems.
They even looked into running AI models on their own servers but found that would cost more than using Google's AI service.
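To make the "one request instead of two" idea concrete, here is a minimal Python sketch. It is not the team's actual service: the prompt wording, the JSON keys, and the call_model parameter are all assumptions for illustration. call_model stands in for whatever client sends a prompt to the model and returns its text reply, so the example runs with a fake model and no API key.

```python
import json
from typing import Callable

# Hypothetical prompt: asks for translation and both checks in a single reply.
PROMPT_TEMPLATE = """Translate the text below into {target_lang} and check it for problems.
Reply with JSON only, using exactly these keys:
  "translation": string, the translated text
  "offensive": boolean, true if the text contains abusive language
  "suspicious": boolean, true if the text looks fake or spammy

Text:
{text}"""

REQUIRED_KEYS = {"translation", "offensive", "suspicious"}


def build_prompt(text: str, target_lang: str) -> str:
    """Combine the translation and moderation instructions into one prompt."""
    return PROMPT_TEMPLATE.format(target_lang=target_lang, text=text)


def translate_and_check(text: str, target_lang: str,
                        call_model: Callable[[str], str]) -> dict:
    """Send ONE request to the model and parse its structured reply.

    call_model is an assumed stand-in for a real model client.
    """
    raw_reply = call_model(build_prompt(text, target_lang))
    result = json.loads(raw_reply)
    missing = REQUIRED_KEYS - set(result)
    if missing:
        raise ValueError(f"model reply is missing keys: {sorted(missing)}")
    return result


def fake_model(prompt: str) -> str:
    """Canned reply so the sketch runs offline."""
    return json.dumps(
        {"translation": "Hola mundo", "offensive": False, "suspicious": False}
    )


if __name__ == "__main__":
    print(translate_and_check("Hello world", "Spanish", fake_model))
```

Asking for a fixed JSON shape is what lets one call replace two services: the caller reads the translation and both flags from the same reply.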
FINANCE & PROCUREMENT
A Newsletter to Bring Your Finance & Procurement Teams into FinOps
We'd love to see more FinOps content, especially for personas that don't have enough material to digest and learn this practice. I'm talking about Finance & Procurement.
While I try to include everyone in this newsletter, there’s not infinite room in an email.
That’s why we needed a newsletter dedicated to finance.
That’s why FinOps Cash Flow was born. Run by Alexa Abbruscato, the mission is:
I’ll be sharing strategies and insights that bridge FinOps with Procurement and (hopefully) spark conversations surrounding business financial value across both communities. Excited to see this evolve over time.
Share our first edition with your finance and procurement teammates!
FINOPS EVENTS
Mastering AI Economics
Learn how top engineering and FinOps teams are aligning performance with budget by optimizing architecture, tracking true cost per model, and using practical insights to stay ahead of runaway spend.
What we’ll cover:
Understanding unit costs: What they are, why they matter, and how to track them
Cost-efficient architecture: Design patterns and trade-offs that lower compute and storage bills
FinOps for AI: Building transparency and accountability into fast-moving AI teams
Speakers
Vaibhav Sharma, David Gross & Alon Savo
Host: Victor Garcia
September 9th - 6:00 PM CEST / 10AM EST
KUBERNETES
Kubernetes 1.33 Helps You Cut Cloud Transfer Costs
When your pods talk to each other across different availability zones, AWS charges $0.01 per GB for data going out of a zone and another $0.01 per GB for data coming in. That adds up to $0.02 per GB for each cross-zone conversation.
Here's how the math works out. In a typical setup with 30 microservices spread evenly across 3 zones, about 67% of your internal traffic will cross zone boundaries, because two of every three destinations sit in a different zone. If your platform moves 8TB of data monthly, that means about 5.4TB crosses zones. At $0.02 per GB, that comes to roughly $108 per month.
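The arithmetic can be sketched in a few lines of Python. This is a rough estimate under stated assumptions, not a billing calculator: it assumes traffic is spread evenly across zones, uses decimal terabytes (1 TB = 1000 GB), and applies the $0.01 per GB in each direction quoted above. The function names are made up for this example.

```python
GB_PER_TB = 1000             # assumption: decimal terabytes
RATE_PER_GB_EACH_WAY = 0.01  # USD, charged once out of a zone and once into a zone


def cross_zone_fraction(zones: int) -> float:
    """Share of traffic that leaves its zone when destinations are spread evenly."""
    if zones < 1:
        raise ValueError("zones must be at least 1")
    return (zones - 1) / zones


def monthly_cross_zone_cost(total_tb: float, zones: int) -> float:
    """Estimated monthly cross-zone transfer cost in USD."""
    cross_zone_gb = total_tb * GB_PER_TB * cross_zone_fraction(zones)
    return cross_zone_gb * RATE_PER_GB_EACH_WAY * 2  # billed in both directions


if __name__ == "__main__":
    print(f"cross-zone share: {cross_zone_fraction(3):.0%}")
    print(f"monthly cost for 8 TB: ${monthly_cross_zone_cost(8, 3):,.2f}")
```

Plug in your own monthly volume and zone count to get a first-order number before looking at real billing data.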
The good news is that the trafficDistribution field for Services, with its PreferClose option, reached general availability in Kubernetes 1.33. It tells the system to route traffic to endpoints in the same zone whenever possible, only sending requests elsewhere if no local options exist.
Setting it up is simple. You just add one line to your service definition: trafficDistribution: PreferClose.
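For context on where that line lives, here is a small Python sketch that builds a Service manifest with the field set under spec and prints it as JSON. The service name, label, and port are placeholders, not values from the original article.

```python
import json


def service_manifest(name: str, app_label: str, port: int) -> dict:
    """Build a Service manifest that prefers same-zone endpoints."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port}],
            # The one line that matters: keep traffic in-zone when possible.
            "trafficDistribution": "PreferClose",
        },
    }


if __name__ == "__main__":
    # "checkout" and 8080 are hypothetical example values.
    print(json.dumps(service_manifest("checkout", "checkout", 8080), indent=2))
```

kubectl accepts JSON manifests as well as YAML, so the printed output can be saved to a file and applied with kubectl apply -f.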
A real test shows dramatic results. Without the feature, requests from zone A spread evenly across all zones - 34% stayed local while 66% crossed boundaries. With PreferClose enabled, 100% of requests stayed in the same zone.
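The behavior behind those numbers can be modeled with a toy Python sketch. This is a simplified illustration, not how Kubernetes implements routing: the round-robin selection, the endpoint names, and the function names are assumptions. It only captures the rule described above, which is to use same-zone endpoints when any exist and fall back to all endpoints otherwise.

```python
from collections import Counter
from itertools import cycle


def pick_endpoints(client_zone, endpoints, prefer_close):
    """Return candidate endpoints; each endpoint is a (name, zone) tuple."""
    if prefer_close:
        local = [ep for ep in endpoints if ep[1] == client_zone]
        if local:
            return local
    # No preference set, or no same-zone endpoint: use every endpoint.
    return list(endpoints)


def zone_share(client_zone, endpoints, prefer_close, requests=300):
    """Fraction of requests landing in each zone under round-robin (assumed)."""
    candidates = cycle(pick_endpoints(client_zone, endpoints, prefer_close))
    hits = Counter(next(candidates)[1] for _ in range(requests))
    return {zone: count / requests for zone, count in hits.items()}


if __name__ == "__main__":
    # Hypothetical endpoints, one per zone.
    endpoints = [("pod-1", "a"), ("pod-2", "b"), ("pod-3", "c")]
    print("default:    ", zone_share("a", endpoints, prefer_close=False))
    print("PreferClose:", zone_share("a", endpoints, prefer_close=True))
```

With one endpoint per zone, the default case sends about a third of requests to each zone, while the PreferClose case keeps every request in the caller's zone.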
For a platform handling 20TB monthly, this change could reduce cross-zone traffic from 13.4TB to nearly zero, cutting monthly transfer costs from roughly $268 (13,400 GB at $0.02 per GB) to almost nothing.
CLOUD PROVIDERS
AWS Releases MCP Server for Cloud Billing, New Instances and More
AWS
AWS introduced split cost allocation data for Amazon EKS, providing granular visibility into shared EC2 instances with multiple accelerators like NVIDIA or AMD GPUs.
Amazon RDS for Oracle now supports bare metal instances, offering direct hardware access without hypervisors and up to 25% cost reduction.
Amazon CloudWatch gained two key improvements: creating alarms based on multiple metric queries using math expressions, and extending console metric queries to two-week periods.
Amazon Managed Service for Prometheus usage and quotas are now visible in AWS Service Quotas and as CloudWatch metrics, helping teams proactively manage monitoring resources.
Microsoft Azure
Azure Ultra Disk Storage received price reductions across multiple regions including parts of the US and Europe. This directly benefits applications requiring top-tier storage performance like large databases and transaction-heavy systems.
Google Cloud
🫙 🫙
AI
Boosting Cloud-Native AI Efficiency with FinOps Strategies

Cloud for AI is getting expensive and complicated. GPUs are hard to find and cost a lot of money. Companies running AI programs need to be smarter about how they use these resources.
The old ways of saving money on cloud services don't work well for AI. AI programs need special treatment because they use GPUs differently than regular computer programs.
Smart companies are learning new tricks to make their AI programs run better without spending more money.
A financial company on Amazon Web Services cut its training costs by half. A retail company on Microsoft Azure improved its compute utilization by 30%. A media company on Google Cloud made its costs more predictable.
The key is not just using computers at 100% capacity. The real goal is doing more work faster while spending less money and keeping customers happy.
📺️ VIDEO
Understanding FinOps from Visibility to Action
Nilofar Bhurawala reveals how TBM and FinOps align to optimize IT spend, drive business value, and thrive in the cloud + AI era.
Learn how to align technology investments with strategic outcomes, explore the synergies between FinOps and TBM, and understand their role in the era of cloud and AI.
A must-listen episode packed with practical insights for FinOps and TBM leaders and professionals.
🎖️ MENTION OF HONOUR
Tackling FinOps Burnout
Elly Rauch started her career in finance and sourcing before FinOps even had a name. She watched cloud spending grow not just in size, but in how many different teams used it.
When she got the chance to build a FinOps team from scratch, it seemed perfect.
The Scope That Never Stops Growing
FinOps teams usually start with one cloud provider - the one causing the biggest headache. But once you prove FinOps works, everyone wants you to take on more. The definition of FinOps has grown from just public cloud to "Cloud+" which now includes software, private cloud, and basically anything else.
When AI became hot, engineering teams looked to FinOps for help. FinOps teams are supposed to understand hundreds of different technologies, but there's no way to focus on all of them at once.
The Lonely Middle Ground
FinOps sits between engineering and finance departments. This gives you a unique view of the company, which can be your superpower. But being in the middle can also be lonely.
Rauch's team had to forecast cloud spending for hundreds of other teams. When leadership pushed back on the numbers, FinOps had to answer questions for all those teams.
What Burnout Actually Looks Like
She remembers staying late at work not because someone asked her to, but because she felt there was always more to do.
How to Fight Back
After stepping back to support other FinOps professionals, Elly learned some key lessons.
Shared responsibility works better than one team trying to solve everything. Breaking big problems into smaller pieces makes them less overwhelming. Automating boring tasks like data prep frees up time for more important work.
The FinOps community is a valuable resource. Talking to others who face similar challenges helps you realize you're not alone.
FinOps is incredibly hard work, but that's also what makes it meaningful - you're solving problems others think are impossible to solve.
Professional Spotlight
Assaf Flato
FinOps Community Supporter
Assaf has always helped us make FinOps Weekly a better place through his speaking engagements. It’ll be great to have him at the FinOps Weekly Summit 2025!
That’s all for this week. See you next Sunday!
Join FinOps professionals at the FinOps Weekly Summit 2025 and discover how to:
Transform from reactive fire-fighting to strategic leadership
Master AI-powered cost optimization
Build bulletproof unit economics
This is the only major FinOps event left in 2025, featuring battle-tested strategies from companies managing billions in cloud spend.
FinOps for Everyone, baby.
October 23rd, 2025 | 4:00 PM - 8:00 PM CEST
Limited seats available