AI adoption stalls as inferencing costs confound cloud users


Broader AI adoption by enterprise customers is being hindered by the difficulty of forecasting inferencing costs, amid fears of being saddled with excessive bills for cloud services.

Or so says market watcher Canalys, which today published stats that show businesses spent $90.9 billion globally on infrastructure and platform-as-a-service with the likes of Microsoft, AWS and Google in calendar Q1, up 21 percent year-on-year, as the march of cloud adoption continues.

Canalys says that growth came from enterprise users migrating more workloads to the cloud and exploring the use of generative AI, which relies heavily on cloud infrastructure.

Yet even as organizations move beyond development and trials to deployment of AI models, a lack of clarity over the recurring costs of inferencing services is becoming a concern.

"Unlike training, which is a one-time investment, inference represents a recurring operational cost, making it a critical constraint on the path to AI commercialization," said Canalys senior director Rachel Brindley.

"As AI transitions from research to large-scale deployment, enterprises are increasingly focused on the cost-efficiency of inference, comparing models, cloud platforms, and hardware architectures such as GPUs versus custom accelerators," she added.

Canalys researcher Yi Zhang said many AI services follow usage-based pricing models that charge on a per-token or per-API-call basis. This makes cost forecasting difficult as use of the services scales up.

"When inference costs are volatile or excessively high, enterprises are forced to restrict usage, reduce model complexity, or limit deployment to high-value scenarios," Zhang said. "As a result, the broader potential of AI remains underutilized."

It's not surprising that businesses are hesitant about committing to wider use of inferencing services when many have already been stung by higher-than-expected cloud bills, either because usage grew faster than anticipated or because they overprovisioned, gauging the required resources being so difficult.

An extreme example is 37signals, developer of project management platform Basecamp, which embarked on a switch to on-premises IT after being hit by an annual cloud bill of more than $3 million.

Gartner warned last year that end-user organizations adopting AI could find that "500 to 1,000 percent errors of AI cost estimates are possible," thanks to vendor price hikes, a failure to keep an eye on costs, or simply inappropriate use of AI.
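As a rough illustration of how misses that large can happen (this is not Gartner's methodology), independent errors multiply: a price increase, an adoption overshoot, and an inefficient deployment can each look modest on its own yet compound well past the original estimate. All of the factors below are assumed for the sake of the example.

```python
# A rough illustration, not Gartner's methodology: independent misses
# multiply, so a few plausible-looking errors compound quickly.
# All figures below are assumed for the sake of the example.

estimate = 100_000          # budgeted annual AI spend, in dollars
price_hike = 1.5            # vendor raises per-token prices by 50%
usage_overshoot = 3.0       # adoption runs 3x what was planned
inefficiency = 2.0          # oversized model or wasteful prompts

actual = estimate * price_hike * usage_overshoot * inefficiency
error_pct = (actual - estimate) / estimate * 100

print(f"estimate: ${estimate:,}")
print(f"actual:   ${actual:,.0f}")        # $900,000
print(f"error:    {error_pct:.0f}% over") # 800% over the estimate
```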

According to Canalys, cloud providers are aiming to improve inferencing efficiency through modernized infrastructure built for AI, and to reduce the cost of AI services.

In October, Canalys chief analyst Alastair Edwards said public clouds may not be the most suitable environment for AI model inferencing.

"The public cloud, as you start to deploy these use cases we're all focused on and start to scale that, if you're doing that in the public cloud, it becomes unsustainable from a cost perspective," he said at the Canalys Forum EMEA in Berlin.

Some companies are instead turning to colocation and specialized hosting providers rather than the big public cloud operators, he added.

The latest Canalys report found the big three players (AWS, Azure, Google Cloud) continue to dominate the IaaS and PaaS market, accounting for 65 percent of customer spending worldwide.

However, Microsoft and Google are slowly gaining ground on AWS, as its growth rate has slowed to "only" 17 percent, down from 19 percent in the final quarter of 2024, while the two rivals have maintained growth rates of more than 30 percent. ®
