AWS ECS Troubleshooting
While working with ECS, I ran into a number of pitfalls, so I’ve summarized them here.
“started 1 task” is logged repeatedly, but the container never starts
    $ ecs-cli compose service up ...
During deployment, ecs-cli compose service up keeps trying to launch the task, but the launch never succeeds.
One possible cause is an error in the processing that runs when the container starts.
- Check the container logs, focusing on the entries around the time the container failed to start.
- For example, there may be a typo or syntax error in the Nginx configuration file or the Rails code.
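As a rough illustration of narrowing the logs down to the failure window, here is a minimal Python sketch. It assumes the log lines have already been fetched and parsed into (timestamp, message) pairs; the function name lines_near and the sample messages are made up for this example and are not real ECS or Docker output.

```python
from datetime import datetime, timedelta

def lines_near(log_lines, failed_at, window_seconds=60):
    """Keep (timestamp, message) pairs within window_seconds of failed_at."""
    window = timedelta(seconds=window_seconds)
    return [(ts, msg) for ts, msg in log_lines if abs(ts - failed_at) <= window]

# Hypothetical, hand-written sample data (not real container logs).
failed_at = datetime(2018, 3, 9, 15, 45, 24)
sample = [
    (datetime(2018, 3, 9, 15, 30, 0), "earlier, unrelated message"),
    (datetime(2018, 3, 9, 15, 45, 20), "startup script began"),
    (datetime(2018, 3, 9, 15, 45, 23), "startup script exited with an error"),
]

for ts, msg in lines_near(sample, failed_at):
    print(ts.isoformat(), msg)
```

Only the two entries close to the failure time are printed, which is usually where a configuration typo or syntax error shows up.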
already using a port required by your task
    service hogehoge was unable to place a task because no container instance met all of its requirements.
The port mapping had been configured as follows.
    "portMappings": [
The running task already holds host port 80 on the container instance, so a new task that asks for the same host port cannot be placed there, which results in this error.
Configuring it as follows allowed me to avoid the problem.
    "portMappings": [
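The snippets above are cut off, so the following Python sketch is only an assumption about the kind of change involved: it treats the fix as switching from a fixed host port to dynamic host port mapping (hostPort 0 in bridge network mode). The helper host_port_conflicts and the sample mappings are hypothetical and only illustrate why two tasks with the same static host port cannot share an instance.

```python
def host_port_conflicts(running_mappings, new_mappings):
    """Return the static host ports that both tasks would claim."""
    def static_ports(mappings):
        # hostPort 0 (or a missing hostPort) means "let ECS pick a dynamic
        # port", so it can never collide with another task on the instance.
        return {m["hostPort"] for m in mappings if m.get("hostPort", 0) != 0}
    return static_ports(running_mappings) & static_ports(new_mappings)

# Assumed shapes of the two configurations (not copied from the original).
static = [{"containerPort": 80, "hostPort": 80}]
dynamic = [{"containerPort": 80, "hostPort": 0}]

print(host_port_conflicts(static, static))    # {80}
print(host_port_conflicts(dynamic, dynamic))  # set()
```

With dynamic mapping, the old and new tasks can run side by side on one instance during a deployment, typically with a load balancer routing to whichever host port was assigned.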
insufficient memory available
    INFO[0031] (service hogehoge) was unable to place a task because no container instance met all of its requirements. The closest matching (container-instance a1b2c3d4-e5f6-g7h8-j9k0-l1m2n3o4p5q6) has insufficient memory available. For more information, see the Troubleshooting section of the Amazon ECS Developer Guide. timestamp=2018-03-09 15:45:24 +0000 UTC
If a memory shortage like the one above is reported while updating a task (ecs-cli compose service up),
you need to free up or add memory, for example by moving to a larger instance type or by stopping other tasks.
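To get a feel for how much memory is missing, the placement check can be sketched in Python as below. The function memory_shortfall_mib, the instance names, and the numbers are all hypothetical; in practice the remaining memory per container instance would come from the ECS console or API.

```python
def memory_shortfall_mib(remaining_by_instance, task_memory_mib):
    """Return 0 if some instance can hold the task, else the MiB still
    missing on the instance with the most free memory."""
    best = max(remaining_by_instance.values(), default=0)
    return max(0, task_memory_mib - best)

# Hypothetical remaining memory (MiB) per container instance.
remaining = {"instance-a": 300, "instance-b": 120}

print(memory_shortfall_mib(remaining, 512))  # 212
print(memory_shortfall_mib(remaining, 256))  # 0
```

Keep in mind that during an update the old task may still be running, so the instance may need room for both the old and the new task at the same time.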
no space left on device
The image cannot be pulled because the disk on the container instance is full.
Check disk usage with the df -hT command.
Free up space by forcibly removing unused containers, images, and volumes.
    docker system prune -af --volumes
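If you would rather check disk usage from a script than read df output by eye, a minimal Python sketch using only the standard library could look like this. The 90% threshold and the function names are arbitrary assumptions, and the result can differ slightly from the Use% column of df, which accounts for reserved blocks.

```python
import shutil

def usage_percent(path="/"):
    """Return the used share of the filesystem containing path, in percent."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total

def is_nearly_full(path="/", threshold=90.0):
    """True when usage is at or above threshold percent (assumed cutoff)."""
    return usage_percent(path) >= threshold

print(f"{usage_percent('/'):.1f}% used, nearly full: {is_nearly_full('/')}")
```

On a container instance you would point this at the filesystem that holds the Docker data directory, which is often /var/lib/docker but depends on the AMI.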
msg="Couldn't run containers" reason="RESOURCE:CPU"
    msg="Couldn't run containers" reason="RESOURCE:CPU"
The container instance does not have enough CPU (vCPU) available for the amount the task requests.
You need to free up or add CPU resources, for example by moving to a larger instance type or by stopping other tasks.
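ECS counts CPU in units, where 1 vCPU equals 1024 CPU units, so the arithmetic behind this error can be sketched in Python as follows. The function names and the example numbers are hypothetical.

```python
CPU_UNITS_PER_VCPU = 1024  # ECS convention: 1 vCPU = 1024 CPU units

def remaining_cpu_units(instance_vcpus, running_task_cpu_units):
    """CPU units left on an instance after the running tasks' reservations."""
    return instance_vcpus * CPU_UNITS_PER_VCPU - sum(running_task_cpu_units)

def can_place(instance_vcpus, running_task_cpu_units, new_task_cpu_units):
    """True when the new task's CPU reservation still fits on the instance."""
    remaining = remaining_cpu_units(instance_vcpus, running_task_cpu_units)
    return remaining >= new_task_cpu_units

# Hypothetical 2 vCPU instance already running two tasks.
print(remaining_cpu_units(2, [1024, 512]))  # 512
print(can_place(2, [1024, 512], 1024))      # False
print(can_place(2, [1024, 512], 512))       # True
```

As with memory, an update may need room for the old and the new task at once, so a task that fits on an empty instance can still fail to place during a deployment.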
Fargate - Port Mapping Error
    level=error msg="Create task definition failed" error="ClientException: When networkMode=awsvpc, the host ports and container ports in port mappings must match.\n\tstatus code: 400, request id: a1b2c3d4-e5f6-g7h8-j9k0-l1m2n3o4p5q6"
With the Fargate launch type, a configuration like the following is rejected.
    ports:
This one works.
    ports:
Fargate tasks use networkMode=awsvpc, so the host port and the container port in each mapping must be the same.
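A quick pre-flight check for this rule can be sketched in Python. It assumes the compose file lists ports in the short "HOST:CONTAINER" form (optionally with a /tcp or /udp suffix); the function mismatched_ports and the sample entries are hypothetical, and entries without a colon are simply skipped.

```python
def mismatched_ports(ports):
    """Return "HOST:CONTAINER" entries whose two port numbers differ."""
    bad = []
    for entry in ports:
        host, sep, container = entry.partition(":")
        container = container.split("/")[0]  # drop a "/tcp" or "/udp" suffix
        if sep and host != container:
            bad.append(entry)
    return bad

# Hypothetical compose entries, not taken from the original files.
print(mismatched_ports(["80:80", "8080:80", "0:443/tcp"]))  # ['8080:80', '0:443/tcp']
```

Any entry this reports would need its host port changed to match the container port before the task definition is accepted with awsvpc.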
Fargate - volumes_from cannot be used
volumes_from cannot be used with Fargate. As the error below says, Fargate does not accept a host path (host.sourcePath) on a volume.
    level=error msg="Create task definition failed" error="ClientException: host.sourcePath should not be set for volumes in Fargate.\n\tstatus code: 400, request id: a1b2c3d4-e5f6-g7h8-j9k0-l1m2n3o4p5q6"
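To find which volumes trigger this error, you could scan the task definition before registering it. The Python sketch below assumes the task definition is available as a dict in the usual JSON shape (a "volumes" list whose items may carry "host": {"sourcePath": ...}); the function name and the sample data are hypothetical.

```python
def volumes_with_host_path(task_definition):
    """Return the names of volumes that set host.sourcePath."""
    return [
        volume["name"]
        for volume in task_definition.get("volumes", [])
        if (volume.get("host") or {}).get("sourcePath")
    ]

# Hypothetical task definition fragment.
task_def = {
    "volumes": [
        {"name": "app-data", "host": {"sourcePath": "/var/app"}},
        {"name": "scratch", "host": {}},
    ]
}

print(volumes_with_host_path(task_def))  # ['app-data']
```

Volumes listed here would need their host path removed for the definition to be accepted on Fargate.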
The specified IAM Role has not been granted the proper permissions
Grant the appropriate permissions to the IAM Role.
    level=info msg="(service hogehoge) failed to launch a task with (error ECS was unable to assume the role 'arn:aws:iam::123456789012:role/ecsTask
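One thing worth checking for an "unable to assume the role" error, besides the attached permissions, is the role's trust policy: ECS tasks can only assume a role that trusts the ecs-tasks.amazonaws.com service. That this was the cause here is an assumption on my part. The Python sketch below inspects a trust policy that has already been loaded as a dict; the function name is hypothetical.

```python
def ecs_tasks_can_assume(trust_policy):
    """True if the trust policy lets ecs-tasks.amazonaws.com assume the role."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid JSON
        statements = [statements]
    for stmt in statements:
        principal = stmt.get("Principal", {})
        if stmt.get("Effect") != "Allow" or not isinstance(principal, dict):
            continue
        actions = stmt.get("Action", [])
        services = principal.get("Service", [])
        actions = [actions] if isinstance(actions, str) else actions
        services = [services] if isinstance(services, str) else services
        if "sts:AssumeRole" in actions and "ecs-tasks.amazonaws.com" in services:
            return True
    return False

# Example trust policy document (illustrative).
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ecs-tasks.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

print(ecs_tasks_can_assume(trust_policy))  # True
```

If this returns False for the role named in the error, adding the ECS tasks service as a trusted principal is a likely place to start.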
An error saying the image cannot be pulled is also usually caused by missing permissions.
    CannotPullContainerError: API error (500): Get https://123456789012.dkr.ecr.ap-northeast-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection"
For reference, these are the permissions of the IAM Role on an ECS setup that is currently running. They may change over time, so treat them only as a reference and check the latest information as needed.
    {
That’s all.
I’ll keep adding to this whenever something new comes up.
