
Optimizing Cloud GenAI costs

Park Sehun
3 min read · Dec 14, 2024

Snowball Impact

The “snowball impact” in costs when using Generative AI (GenAI) via cloud services refers to how expenses can quickly escalate due to the scalable nature of cloud resources.

Cloud services often have low upfront costs, making it easy for businesses to start using GenAI services with minimal initial investment. However, the same on-demand model means costs can spike unexpectedly, especially if usage isn’t carefully monitored.

To manage these costs, companies often adopt a FinOps strategy, which involves collaboration between IT and finance teams to monitor and control cloud spending.

Simulation #1

Calculator, Pricing based on PTUs

Case: 100K people in the company use the OpenAI API daily, 5 days a week, each making around 30 API calls per day.

Token Usage:

  • 100,000 users * 30 API calls/user/day * 100 tokens/call = 300,000,000 tokens/day
  • 300 million tokens/day * 20 working days/month (5 days * 4 weeks) = 6,000 million tokens/month
  • 6,000 million tokens/month * 12 months = 72,000 million tokens/year
  • Estimated cost = 72,000,000,000 tokens/year * $0.03 per 1,000 tokens = $2,160,000/year

Yearly around $2M.
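The arithmetic above can be reproduced with a short Python sketch, which also makes it easy to change the inputs. The function name `estimate_annual_cost` and its parameter names are hypothetical, chosen only for illustration. The 20 working days per month is an assumption inferred from the 6,000 million tokens/month figure (5 days * 4 weeks), and the flat $0.03 per 1,000 tokens is the rate used in the estimate, not a quoted price list.

```python
def estimate_annual_cost(
    users: int,
    calls_per_user_per_day: int,
    tokens_per_call: int,
    price_per_1k_tokens: float,
    working_days_per_month: int = 20,  # assumption: 5 days/week * 4 weeks
    months_per_year: int = 12,
) -> dict:
    """Rough token and cost estimate using a flat per-1,000-token price."""
    tokens_per_day = users * calls_per_user_per_day * tokens_per_call
    tokens_per_month = tokens_per_day * working_days_per_month
    tokens_per_year = tokens_per_month * months_per_year
    # Cost scales linearly with tokens under a flat per-token rate.
    cost_per_year = tokens_per_year / 1000 * price_per_1k_tokens
    return {
        "tokens_per_day": tokens_per_day,
        "tokens_per_month": tokens_per_month,
        "tokens_per_year": tokens_per_year,
        "cost_per_year": cost_per_year,
    }


# Inputs taken from the case above.
estimate = estimate_annual_cost(
    users=100_000,
    calls_per_user_per_day=30,
    tokens_per_call=100,
    price_per_1k_tokens=0.03,
)
print(f"Tokens/year: {estimate['tokens_per_year']:,}")
print(f"Cost/year:   ${estimate['cost_per_year']:,.0f}")
```

Because every input multiplies straight through, doubling any one of them (users, calls per day, or tokens per call) doubles the yearly cost, which is the snowball effect in miniature.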

This figure may be a significant over- or underestimate, depending on the company's strategy and how heavily users rely on the service each day. However, the more people rely on ChatGPT, the more frequently they will use it for basic tasks (calculation, translation, summarization, etc.). Therefore, it will be twice, x…
