Introduction
It’s a rare occurrence when AWS decides to increase its prices.
On Thursday the 9th of May, AWS published a news post announcing a change to the Cognito pricing structure for machine-to-machine (m2m) scenarios. Flows involving user authentication are unaffected. The client credentials grant flow will no longer be free of charge, and two new cost types are introduced: the number of token requests per month and the number of app clients per month.
Note that there is a 12-month grace period for existing Cognito customers; read the news post to learn more about it.
The stated reason for this change is to “better support continued growth and expand capabilities”.
Some will interpret this as AWS wanting to invest more in the service’s future. Others will perhaps see it as a money grab that will turn some away from using the service.
For those of us who stick with Cognito, this change requires us to rethink how we use our m2m tokens, as the previous free-for-all usage could prove to be an expensive mistake.
What is machine-to-machine authentication?
M2M refers to authentication methods used when no users are involved. A machine can also be referred to as an application, service or client. The important part is that it’s a piece of code that needs to authenticate towards a resource server without any user context.
A more concrete example is a warehouse management system (WMS) that automatically orders delivery trucks when certain conditions are met. This system needs to communicate with an external logistics supplier to place the order. This is often done through a REST API on the supplier side, which lets customers integrate with the supplier’s systems and automate order creation. The WMS needs to provide proof of authentication to be able to securely place orders for the right customer. Here’s where m2m authentication comes in.
Using the common OAuth 2.0 client credentials grant flow, the WMS would authenticate towards an identity provider (IDP), for example the logistics supplier’s AWS Cognito solution, using the client ID and client secret it has been given. Scopes are normally supplied as well. The IDP returns a JWT access token with an expiration time, which is used to prove authentication against resource servers.
The WMS can now take the access token and present it to the logistics supplier’s API, which validates the token to ensure its authenticity. Upon successful token validation and authorization, the API processes the order and returns a successful response.
Now that we have had a brief lesson on the fundamentals of m2m authentication, let’s dig into the Cognito changes.
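To make the flow more tangible, here is a minimal Python sketch of what the token request from the WMS could look like. It only builds the request and does not send it. The domain, client ID, client secret and scope are made-up values for illustration, and the function name is my own.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str,
                        scopes: list[str]) -> urllib.request.Request:
    """Build (but do not send) a client credentials token request."""
    # Client ID and secret travel as HTTP Basic credentials.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    # The grant type and the requested scopes go in a form-encoded body.
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": " ".join(scopes),
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical values, for illustration only.
request = build_token_request(
    "https://mydomain.auth.us-east-1.amazoncognito.com/oauth2/token",
    "example-client-id",
    "example-client-secret",
    ["orders/create"],
)
# Sending it would look like: urllib.request.urlopen(request)
# The JSON response carries "access_token", "expires_in" and "token_type".
```

Every call that actually reaches the token endpoint counts as one token request, which is what the new pricing is based on.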
New Cognito pricing
AWS Cognito M2M app client pricing as of 2024-05-10 for the eu-north-1 Stockholm region
AWS Cognito M2M token request as of 2024-05-10 for the eu-north-1 Stockholm region
The examples provided by AWS show how this change could add $2,500 USD a month. Such a workload might be an API application that is responsible for providing data to other consuming microservices.
Doing some quick math
Total # of token requests: 900,000
Total # of app clients: 200
Default access token expiration: 60 minutes
Average requests per app client per month: 900,000 / 200 = 4,500
Average requests per app client per hour: 4,500 / 730 ≈ 6.16
On average, each client requests a new token just over 6 times an hour.
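The same arithmetic as a small Python snippet, in case we want to plug in our own numbers. The inputs are the figures listed above, and 730 is the average number of hours in a month.

```python
total_token_requests = 900_000   # per month
total_app_clients = 200
hours_per_month = 730            # average month length in hours

requests_per_client_per_month = total_token_requests / total_app_clients
requests_per_client_per_hour = requests_per_client_per_month / hours_per_month

print(f"{requests_per_client_per_month:,.0f} requests per client per month")
print(f"{requests_per_client_per_hour:.2f} requests per client per hour")
# 4,500 requests per client per month
# 6.16 requests per client per hour
```

With a 60 minute token lifetime, roughly one request per client per hour would be enough if every token were reused until it expired.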
Steps to reduce cost
It’s important to realize that there are some things that can be done to address this, but certain actions will come with potential tradeoffs that need consideration.
Estimate the request usage
Cognito publishes a CloudWatch metric for successful token refreshes called TokenRefreshSuccesses. This can be used to estimate the request volume and, from that, the total cost. The question we want answered is:
Is the effort required justified to address this issue?
It might take 2 full days and a lot of money to address something that will take years to pay for itself. Always think in terms of return on investment and other business priorities.
This can also be a difficult question to answer in a large organization with many accounts using Cognito, as it might be tricky to gather this information.
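A rough way to put a number on that question is a simple payback calculation. The sketch below is only that, a sketch: the day rate and the monthly savings are hypothetical figures, not taken from AWS pricing.

```python
def payback_months(effort_cost_usd: float, monthly_saving_usd: float) -> float:
    """Months until a one-off effort is repaid by a recurring saving."""
    if monthly_saving_usd <= 0:
        return float("inf")  # the work never pays for itself
    return effort_cost_usd / monthly_saving_usd


# Hypothetical figures: two engineer-days at 800 USD per day.
effort = 2 * 800.0

print(payback_months(effort, 25.0))     # 64.0
print(payback_months(effort, 2000.0))   # 0.8
```

In the first case the work takes more than five years to pay for itself, in the second less than a month, so the same two days can be either a poor or an obvious investment depending on usage.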
Identify Cognito UserPool App Clients with M2M capabilities
In a large AWS Organization, it might be of interest to quickly discover how many app clients have m2m capabilities and are in scope to be billed. Most businesses have AWS Config enabled to discover resources, inspect their configuration and track changes made to them.
The following AWS Config advanced query can be used to discover the total number. It should be run in the Config Aggregator account and region.
SELECT COUNT(*) WHERE resourceType = 'AWS::Cognito::UserPoolClient' AND configuration.AllowedOAuthFlows = 'client_credentials'

And the following to get a list:

SELECT * WHERE resourceType = 'AWS::Cognito::UserPoolClient' AND configuration.AllowedOAuthFlows = 'client_credentials'

These numbers can be used to get a better understanding of how many app clients are in scope for billing.
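If we would rather script this than click through the console, the query can also be run through the AWS Config SelectAggregateResourceConfig API. The following is a minimal sketch, assuming a boto3 Config client is passed in. The aggregator name is hypothetical, and I select a few explicit fields instead of `*` so that the results can be grouped per account.

```python
import json
from collections import Counter

QUERY = (
    "SELECT resourceId, accountId, awsRegion "
    "WHERE resourceType = 'AWS::Cognito::UserPoolClient' "
    "AND configuration.AllowedOAuthFlows = 'client_credentials'"
)


def m2m_clients_per_account(config_client, aggregator_name: str) -> Counter:
    """Count matching app clients per account across the aggregator."""
    counts: Counter = Counter()
    kwargs = {
        "Expression": QUERY,
        "ConfigurationAggregatorName": aggregator_name,
    }
    while True:
        page = config_client.select_aggregate_resource_config(**kwargs)
        # Each result is a JSON document encoded as a string.
        for raw in page.get("Results", []):
            counts[json.loads(raw)["accountId"]] += 1
        token = page.get("NextToken")
        if not token:
            return counts
        kwargs["NextToken"] = token


# Usage, assuming boto3 is installed and an aggregator with this
# (hypothetical) name exists:
#   import boto3
#   counts = m2m_clients_per_account(boto3.client("config"), "org-aggregator")
#   print(sum(counts.values()), dict(counts))
```

Grouping per account makes it easier to see which teams would be most affected by the per app client charge.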
Increase the access token expiration time
Once we can put a figure on the added cost, we can take a look at the options we have for reducing it.
In Cognito, the default access token expiration time is 60 minutes. This means that a busy client would have to renew its token at least 24 times a day, or 720 times a month. This in and of itself is not a lot, just $1.68 USD a month. Increasing the expiration time to the maximum, 1 day, would mean that we’re consuming 24 times less, $0.0703125 USD a month.
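The renewal counts can be reproduced with a few lines of Python. This sketch assumes a 30 day month and a client that only requests a new token when the old one has expired. Multiplying the result by the per-request price for our region gives the monthly cost.

```python
def renewals_per_month(token_lifetime_minutes: int, days: int = 30) -> float:
    """Token requests per month for a client that renews only on expiry."""
    minutes_per_month = days * 24 * 60
    return minutes_per_month / token_lifetime_minutes


hourly = renewals_per_month(60)        # default lifetime of 60 minutes
daily = renewals_per_month(24 * 60)    # maximum lifetime of 1 day

print(hourly, daily, hourly / daily)   # 720.0 30.0 24.0
```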
Increasing the expiration time comes with a few drawbacks though:
- Security - Longer-lived tokens increase risk by widening the window of opportunity for attackers. Organizations might also be subject to compliance requirements that regulate token expiration.
- Revocation - Cognito does not currently support revoking access tokens issued through the client credentials grant, only tokens issued through refresh tokens. This means that we cannot revoke a token once it’s out in the wild. It can of course make a huge difference whether the token is valid for 60 minutes or 24 hours. In other words, if an API customer is issued an access token with a 24-hour expiration period, the access cannot be revoked by any Cognito means; only expiration removes it.
Implement token endpoint caching
Every time we use our client credentials to retrieve an access token, Cognito produces and returns a new one, which now adds cost.
We can work around this behavior by implementing caching on different levels to reduce the number of requests reaching the Cognito token endpoint.
Server side
Implementing server-side caching in front of the Cognito token endpoint should be the first priority after setting a proper token expiration time. This ensures that a client receives its cached token for as long as the cache entry is valid.
AWS has published a nice article on how API Gateway can be used to proxy to and cache responses from the Cognito Token endpoint.
This looks to be done manually, as some caching parameters (mainly the Per-key cache invalidation) are unsupported in CloudFormation.
A start but not all the way there:
Parameters:
  CognitoEndpoint:
    Type: String
    Description: 'The full URI for the Cognito endpoint. Example https://mydomain.auth.us-east-1.amazoncognito.com'
Resources:
  CacheApi:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Name: CognitoCacheApi
  PostMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      HttpMethod: POST
      ResourceId: !GetAtt CacheApi.RootResourceId
      RestApiId: !Ref CacheApi
      AuthorizationType: NONE
      RequestParameters:
        method.request.querystring.scope: false
        method.request.header.Authorization: false
      Integration:
        Type: HTTP_PROXY
        IntegrationHttpMethod: POST
        Uri: !Sub '${CognitoEndpoint}/oauth2/token'
        RequestParameters:
          integration.request.header.Authorization: method.request.header.Authorization
        CacheKeyParameters:
          - method.request.querystring.scope
          - method.request.header.Authorization
  CacheApiDeployment:
    DependsOn: PostMethod
    Type: AWS::ApiGateway::Deployment
    Properties:
      RestApiId: !Ref CacheApi
  ProdStage:
    Type: AWS::ApiGateway::Stage
    Properties:
      StageName: prod
      RestApiId: !Ref CacheApi
      DeploymentId: !Ref CacheApiDeployment
      CacheClusterEnabled: true
      CacheClusterSize: 0.5
      MethodSettings:
        - ResourcePath: '/*'
          HttpMethod: '*'
          CachingEnabled: true
          CacheDataEncrypted: true
          CacheTtlInSeconds: 3600
Outputs:
  InvokeUrl:
    Description: 'Invoke URL'
    Value: !Sub 'https://${CacheApi}.execute-api.${AWS::Region}.amazonaws.com/prod/'

I would suspect Terraform supports this in a better way, as the APIs to perform these actions do exist.
This solution proves valuable in high usage, but there is a minimum fixed cost of ~ $14 USD per month for the 0.5GB caching, added on top of the normal REST API Gateway charges.
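To illustrate what the cache in front of the token endpoint does, here is a small in-memory Python sketch. It is not how API Gateway is implemented, just the same idea: responses are keyed on the Authorization header plus the scope, like the CacheKeyParameters in the template, and reused until a TTL has passed. The class name, the `fetch` callback and the fake upstream are all made up for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TokenEndpointCache:
    """In-memory stand-in for a caching proxy in front of a token endpoint."""

    def __init__(self, fetch: Callable[[str, str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # forwards the request to the real endpoint
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, dict]] = {}

    def get_token(self, authorization: str, scope: str) -> dict:
        # Same cache key as the template: Authorization header plus scope.
        key = (authorization, scope)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # cache hit, upstream is not called
        response = self._fetch(authorization, scope)
        self._entries[key] = (now, response)
        return response


calls = []


def fake_fetch(authorization: str, scope: str) -> dict:
    """Hypothetical upstream that records each billable token request."""
    calls.append((authorization, scope))
    return {"access_token": f"token-{len(calls)}", "expires_in": 3600}


cache = TokenEndpointCache(fake_fetch, ttl_seconds=3000)
first = cache.get_token("Basic abc", "orders/create")
second = cache.get_token("Basic abc", "orders/create")
print(first is second, len(calls))  # True 1
```

In this sketch the TTL is set below the token lifetime, so a cached token is never handed out after it has expired. How much margin is needed depends on how long the clients hold on to a token after receiving it.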
Client side
Depending on our organization and how much control we have over the client side, we could also add client-side caching, both to reduce the request count towards our token endpoint (or cache) and to avoid expensive high-latency requests. This might be needed if the client application is in the US and our Cognito solution is in Asia, with high latency and potentially unstable connections. Caching can be done in many ways that I won’t dig much into, but Redis / ElastiCache or DynamoDB might be a good fit depending on the client.
Another option is to get creative and avoid calling the API for the resource at all, for example by caching the data locally over an extended period of time.
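The simplest form of client-side caching does not need any extra infrastructure: the client keeps the token in memory and only asks for a new one shortly before the old one expires. Below is a minimal sketch of that idea. The class name, the `request_token` callback and the 60 second safety margin are my own assumptions, and a shared store such as Redis or DynamoDB would be needed if several processes should reuse the same token.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Reuses an access token until shortly before it expires."""

    def __init__(self, request_token: Callable[[], dict],
                 safety_margin_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._request_token = request_token   # performs the real token call
        self._margin = safety_margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._valid_until = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._valid_until:
            response = self._request_token()
            self._token = response["access_token"]
            # Refresh a little early so in-flight calls never carry a stale token.
            self._valid_until = now + response["expires_in"] - self._margin
        return self._token
```

The client would call `get()` before every API request and pass the returned token along, so the token endpoint is only contacted once per token lifetime instead of once per request.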
Summary
To summarize, the Cognito pricing changes can have a significant cost impact for customers who use m2m capabilities extensively. A service provider hosting APIs in AWS for resource consumers can easily reach thousands of dollars in cost from a few hundred clients with regular usage.
But by doing some quick information gathering and cost projections, we can quickly figure out if any action is needed, and that