Posted 2022-01-15 · AWS · a minute read (about 209 words)

On the Abnormal Spike in AWS ElastiCache CPU Utilization Around Noon on 2022-01-15

Around 11:40 on 2022-01-15, several AWS ElastiCache nodes (ap-northeast-1) reported CPU utilization values far above 100%.

After investigating, we confirmed that users were not affected.

Summary of the situation during the CPU utilization spike

- Nothing was reported on the Service Health Dashboard.
- The cache hit rate temporarily dropped from roughly 92% to 78% (-14 percentage points).
- No 5xx errors occurred in the parts of the application that use Redis.
- The worker jobs that use Redis were also unaffected.
- Engine CPU Utilization (the CPU utilization of the Redis engine thread) stayed low, so Redis processing itself does not appear to have been affected.
- CPU Utilization (the CPU utilization of the entire host, including processes other than Redis) surged, so it appears that AWS made some update on the host.
  - Could this be related?
    https://aws.amazon.com/jp/about-aws/whats-new/2022/01/amazon-elasticache-streaming-storing-redis-engine-logs/
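
The comparison between the two CPU metrics in the summary can be sketched as a small check. This is only an illustration: the `Sample` class, the `classify` function, the 90% threshold, and the sample values are all hypothetical, and none of them come from the actual incident data or from an AWS API. It assumes both metrics are percentages and that a host-wide value above 100 is out of range.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One datapoint per node; the field names are hypothetical."""
    node: str
    cpu_utilization: float         # host-wide CPUUtilization, in percent
    engine_cpu_utilization: float  # EngineCPUUtilization, in percent


def classify(sample: Sample, high: float = 90.0) -> str:
    """Return a rough label for where the CPU load appears to come from."""
    if sample.engine_cpu_utilization >= high:
        # The Redis engine thread itself is busy.
        return "engine-busy"
    if sample.cpu_utilization > 100.0:
        # A host-wide percentage above 100 is outside the expected range.
        return "out-of-range"
    if sample.cpu_utilization >= high:
        # The host is busy, but the Redis engine thread is not.
        return "host-busy"
    return "normal"


# Made-up values for illustration only.
samples = [
    Sample("node-a", cpu_utilization=250.0, engine_cpu_utilization=4.0),
    Sample("node-b", cpu_utilization=12.0, engine_cpu_utilization=3.0),
]
for s in samples:
    print(s.node, classify(s))
```

An `out-of-range` label combined with a low engine value corresponds to the pattern described above, where the host metric spiked while Redis itself was not busy.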

Reference: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.Redis.html
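
For the hit rate figure in the summary, a minimal sketch of the arithmetic follows. It assumes the hit rate is computed as hits divided by hits plus misses (for example from the `CacheHits` and `CacheMisses` metrics), and the helper names `hit_rate` and `describe_drop` are hypothetical. It also shows why "-14%" is better read as 14 percentage points: the relative drop from 92% is somewhat larger.

```python
from typing import Optional, Tuple


def hit_rate(hits: int, misses: int) -> Optional[float]:
    """Hit rate in percent, or None when there were no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total * 100.0


def describe_drop(before: float, after: float) -> Tuple[float, float]:
    """Return (drop in percentage points, relative drop in percent)."""
    point_drop = before - after
    relative_drop = point_drop / before * 100.0
    return point_drop, relative_drop


# The 92 and 78 figures are the approximate values quoted in the summary.
points, relative = describe_drop(92.0, 78.0)
print(f"{points:.0f} percentage points, {relative:.1f}% relative")
```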

TODO: I will contact support and add the findings here.

Result of contacting AWS Support

It turned out to be a bug in the metrics. (Phew.)

kenzo0107
