(3.0.0-3.1.3) build-image creates invalid images when using aws-cdk.aws-imagebuilder 1.153.0
ParallelCluster depends on the aws-cdk Python library, which is installed automatically when you install the aws-parallelcluster package from PyPI.
Version 1.153.0 of aws-cdk changed the internal behaviour of the aws-cdk.aws-imagebuilder library. As a result, the build image process, executed by the pcluster build-image command, creates invalid images.
To check whether you are affected by this issue, look up the installed version of the aws-cdk.aws-imagebuilder library in the environment where the ParallelCluster CLI is installed, using the following command:
$ pip freeze | grep imagebuilder
aws-cdk.aws-imagebuilder==1.153.0
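The same check can be scripted. The following is a minimal sketch, assuming Python 3.8 or later and that it runs in the same environment as the ParallelCluster CLI; the function names `installed_version` and `is_affected` are hypothetical helpers, not part of ParallelCluster.

```python
from importlib import metadata

PACKAGE = "aws-cdk.aws-imagebuilder"
AFFECTED_VERSION = "1.153.0"  # the release named on this page


def installed_version(package=PACKAGE):
    """Return the installed version of the package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_affected(version):
    """Return True only when the version string is the affected release."""
    return version is not None and version.strip() == AFFECTED_VERSION


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print(f"{PACKAGE} is not installed in this environment")
    elif is_affected(found):
        print(f"{PACKAGE}=={found}: affected")
    else:
        print(f"{PACKAGE}=={found}: not affected")
```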
When using version 1.153.0 of the library, the pcluster build-image execution doesn’t return any error and the AMI is created, but the image won’t be available in the list of ParallelCluster images because internal information is missing (e.g. tags for ParallelCluster version, image id, etc).
This means you won’t see the image in the output of the pcluster describe-image and pcluster list-images commands, and you cannot delete it with the pcluster delete-image command.
$ pcluster build-image -c image-config.yaml -i test
{
  "image": {
    "imageId": "test",
    "imageBuildStatus": "BUILD_IN_PROGRESS",
    "cloudformationStackStatus": "CREATE_IN_PROGRESS",
    "cloudformationStackArn": "arn:aws:cloudformation:eu-west-1:xxx:stack/test/e5d0c250-c232-11ec-bc7a-0661d2293f0d",
    "region": "eu-west-1",
    "version": "3.1.3"
  }
}
# After about 1h the CFN stack will be deleted but the image will not be available
$ pcluster describe-image -i test
{
  "message": "No image or stack associated with ParallelCluster image id: test."
}
Note: If you created any AMIs using this command with the affected version of the aws-cdk.aws-imagebuilder library, you need to delete them manually from the AWS console, because the pcluster delete-image command won’t be able to find and delete them.
All versions of ParallelCluster >= 3.0.0 are affected when using aws-cdk.aws-imagebuilder==1.153.0.
All versions of the aws-cdk.aws-imagebuilder library < 1.153.0 or >= 1.153.1 work as expected. You can fix the issue by either downgrading or upgrading all the aws-cdk Python libraries. The following steps upgrade them:
- Double check that the installed version of the CDK image builder library matches the erroneous version (i.e. 1.153.0):
$ pip freeze | grep imagebuilder
aws-cdk.aws-imagebuilder==1.153.0
- Create a new requirements.txt file with the following content:
aws-cdk.core>=1.153.1
aws-cdk.aws-batch>=1.153.1
aws-cdk.aws-cloudwatch>=1.153.1
aws-cdk.aws-codebuild>=1.153.1
aws-cdk.aws-dynamodb>=1.153.1
aws-cdk.aws-ec2>=1.153.1
aws-cdk.aws-efs>=1.153.1
aws-cdk.aws-events>=1.153.1
aws-cdk.aws-fsx>=1.153.1
aws-cdk.aws-imagebuilder>=1.153.1
aws-cdk.aws-iam>=1.153.1
aws-cdk.aws-lambda>=1.153.1
aws-cdk.aws-logs>=1.153.1
aws-cdk.aws-route53>=1.153.1
aws-cdk.aws-ssm>=1.153.1
aws-cdk.aws-sqs>=1.153.1
aws-cdk.aws-cloudformation>=1.153.1
- Install the aws-cdk library with the new requirements file:
$ pip install -r requirements.txt
- Verify that the new version of the library is correctly installed:
$ pip freeze | grep imagebuilder
aws-cdk.aws-imagebuilder==1.153.1
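The requirements file from the steps above can also be generated rather than typed by hand. This is a minimal sketch: the package names are copied from the list above, while `build_requirements` and its `specifier` parameter are hypothetical names introduced for illustration. Passing a specifier such as `<1.153.0` would produce the downgrade variant instead.

```python
# Package names copied from the requirements list on this page.
CDK_PACKAGES = [
    "aws-cdk.core",
    "aws-cdk.aws-batch",
    "aws-cdk.aws-cloudwatch",
    "aws-cdk.aws-codebuild",
    "aws-cdk.aws-dynamodb",
    "aws-cdk.aws-ec2",
    "aws-cdk.aws-efs",
    "aws-cdk.aws-events",
    "aws-cdk.aws-fsx",
    "aws-cdk.aws-imagebuilder",
    "aws-cdk.aws-iam",
    "aws-cdk.aws-lambda",
    "aws-cdk.aws-logs",
    "aws-cdk.aws-route53",
    "aws-cdk.aws-ssm",
    "aws-cdk.aws-sqs",
    "aws-cdk.aws-cloudformation",
]


def build_requirements(specifier=">=1.153.1"):
    """Return requirements.txt content that applies one specifier to every package."""
    return "\n".join(f"{name}{specifier}" for name in CDK_PACKAGES) + "\n"


if __name__ == "__main__":
    # Print the content; redirect it to requirements.txt if desired.
    print(build_requirements(), end="")
```

Using a single specifier for every package keeps all aws-cdk libraries on the same side of the affected release, which is what both the upgrade and the downgrade path require.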
Now you can use the pcluster build-image command, and the created image will be available as expected:
$ pcluster build-image -c image-config.yaml -i test-working
# After about 1h the image will be created and can be described successfully
$ pcluster describe-image -i test-working
{
  "imageConfiguration": {
    "url": "..."
  },
  "imageId": "test-working",
  "creationTime": "2022-04-22T18:52:26.000Z",
  "imageBuildStatus": "BUILD_COMPLETE",
  "region": "eu-west-1",
  ...
The CDK library is used by the build-image command to generate the CloudFormation template with all the resources required for the build process.
When using aws-cdk.aws-imagebuilder==1.153.0, the AmiDistributionConfiguration field created by the library in the generated template is an empty dictionary. It should instead contain important information, such as the tags that the ParallelCluster CLI commands use to manage the created image. The broken template looks like this:
...
  DistributionConfiguration:
    DependsOn:
      - DeleteStackFunctionExecutionRole
    Properties:
      Distributions:
        - AmiDistributionConfiguration: {}
          Region:
            Ref: AWS::Region
...
To check whether the template generated by the build-image process is problematic, retrieve it with the AWS CLI. If the AmiDistributionConfiguration is an empty dict, the build process will create broken images.
$ pcluster build-image -c image-config.yaml -i test
$ aws cloudformation get-template --stack-name test --output text | grep AmiDistributionConfiguration -A 3
- AmiDistributionConfiguration: {}
  Region:
    Ref: AWS::Region
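The empty-dictionary check can be automated on the template text. The sketch below is an illustration only: it assumes the template has already been retrieved as plain text (for example, saved from the AWS CLI command above), and `has_empty_ami_distribution` is a hypothetical helper name. It uses a plain text match rather than a YAML parser so that it needs nothing beyond the standard library.

```python
import re

# Matches an AmiDistributionConfiguration whose value is an empty mapping,
# in either YAML ("Key: {}") or JSON ('"Key": {}') form.
EMPTY_AMI_DISTRIBUTION = re.compile(r'AmiDistributionConfiguration"?\s*:\s*\{\s*\}')


def has_empty_ami_distribution(template_text):
    """Return True when the template text contains an empty AmiDistributionConfiguration."""
    return EMPTY_AMI_DISTRIBUTION.search(template_text) is not None


if __name__ == "__main__":
    # Snippets taken from the broken and working templates shown on this page.
    broken = "- AmiDistributionConfiguration: {}\n  Region:\n    Ref: AWS::Region\n"
    working = "- AmiDistributionConfiguration:\n    AmiTags:\n      parallelcluster:version: 3.1.3\n"
    for label, text in (("broken", broken), ("working", working)):
        verdict = "will create broken images" if has_empty_ami_distribution(text) else "looks fine"
        print(f"{label}: {verdict}")
```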
In a working version, the AmiDistributionConfiguration contains a list of tags and the name of the image:
$ aws cloudformation get-template --stack-name test-working --output text | grep AmiDistributionConfiguration -A 18
- AmiDistributionConfiguration:
    AmiTags:
      parallelcluster:build_config: s3://parallelcluster-xxx-v1-do-not-delete/parallelcluster/3.1.3/images/test-working-0iwk3tixcvrltuop/configs/image-config.yaml
      parallelcluster:build_log:
        Fn::Join:
          - ''
          - - 'arn:'
            - Ref: AWS::Partition
            - ':logs:eu-west-1:'
            - Ref: AWS::AccountId
            - :log-group:/aws/imagebuilder/ParallelClusterImage-test-working
      parallelcluster:image_id: test-working
      parallelcluster:image_name: test-alinux2
      parallelcluster:s3_bucket: parallelcluster-xxx-v1-do-not-delete
      parallelcluster:s3_image_dir: parallelcluster/3.1.3/images/test-working-0iwk3tixcvrltuop
      parallelcluster:version: 3.1.3
    Name: test-alinux2 {{ imagebuilder:buildDate }}
  Region:
    Ref: AWS::Region
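As a complementary check, the presence of the ParallelCluster tags can be verified in the template text. This is a rough sketch under one loud assumption: the set of expected tag keys is simply the set visible in the working template above, not an official list. `missing_tags` is a hypothetical helper name.

```python
import re

# Assumption: these are the tag keys shown in the working template on this page.
EXPECTED_TAGS = {
    "parallelcluster:build_config",
    "parallelcluster:build_log",
    "parallelcluster:image_id",
    "parallelcluster:image_name",
    "parallelcluster:s3_bucket",
    "parallelcluster:s3_image_dir",
    "parallelcluster:version",
}

# A tag key is a "parallelcluster:<name>:" entry at the start of a line.
TAG_KEY = re.compile(r"^\s*(parallelcluster:\w+):", re.MULTILINE)


def missing_tags(template_text):
    """Return the expected tag keys that do not appear in the template text."""
    found = set(TAG_KEY.findall(template_text))
    return EXPECTED_TAGS - found


if __name__ == "__main__":
    sample = "AmiTags:\n  parallelcluster:image_id: test-working\n  parallelcluster:version: 3.1.3\n"
    print(sorted(missing_tags(sample)))
```

An empty result suggests the template carries the tags the CLI relies on; a template produced with the affected library version would report every key as missing.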