While testing Jenkins X I hit an issue that puzzled me. I use Kaniko to build Docker images and push them into Google Container Registry (GCR), but the push to GCR was failing with:
INFO[0000] Taking snapshot of files...
error pushing image: failed to push to destination gcr.io/myprojectid/croc-hunter:1: DENIED: Token exchange failed for project 'myprojectid'. Caller does not have permission 'storage.buckets.get'. To configure permissions, follow instructions at: https://cloud.google.com/container-registry/docs/access-control
During installation, Jenkins X creates a GCP service account named after the cluster (in my case jx-rocks), called jxkaniko-jx-rocks, with the following roles:
roles/storage.admin
roles/storage.objectAdmin
roles/storage.objectCreator
More roles are added if you install Jenkins X with Vault enabled.
A key is created for the service account and added to Kubernetes as secrets/kaniko-secret, containing the service account key JSON. That secret is later mounted into the pods running Kaniko, as described in the Kaniko instructions.
After checking and rechecking, the service account and roles all looked correct in the GCP console, but the Kaniko build was still failing. I found a Stack Overflow post claiming that the permissions are cached if a previous service account existed with the same name (what?), so I tried a new service account with the same permissions and a different name, and that worked. Weird. So I created a script that replaces the service account with a new one and updates the Kubernetes secret.
ACCOUNT=jxkaniko-jx-rocks
PROJECT_ID=myprojectid

# delete the existing service account and policy bindings
gcloud -q iam service-accounts delete ${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.admin
gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectAdmin
gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectCreator

# create a new one
gcloud -q iam service-accounts create ${ACCOUNT} --display-name ${ACCOUNT}
gcloud -q projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.admin
gcloud -q projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectAdmin
gcloud -q projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectCreator

# create a key for the service account and update the secret in Kubernetes
gcloud -q iam service-accounts keys create kaniko-secret --iam-account=${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
kubectl create secret generic kaniko-secret --from-file=kaniko-secret
And that also worked. I still have no idea why it was failing, but at least I'll now remember how to manually clean up and recreate the service account.
Note that Container Registry only recognizes permissions set on the Cloud Storage bucket; it ignores permissions set on individual objects within the bucket.
Thanks for this! I’ve also got a gist that has a bit more detail on the issue and the fix.
GCP cached permissions issue
GCP has an issue that surfaces when service accounts are recreated with the same name without the old policy bindings being removed first. It is confusing because the GUI and the CLI will show that the permissions are there, and will even let you re-add them, but anything that actually requires those permissions will fail. For example, if you try to push an image it may say that you don't have storage.buckets.get even though everything shows that you have roles/storage.admin.

Reproducing the issue
Set the values to match your environment
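The gist's variable block isn't reproduced here; a minimal sketch, reusing the placeholder project, account, and image names from earlier in the post (adjust all of them to your environment):

```shell
# Placeholder values -- adjust to match your environment
ACCOUNT=jxkaniko-jx-rocks
PROJECT_ID=myprojectid
IMAGE_NAME=gcr.io/${PROJECT_ID}/croc-hunter:1
SA_EMAIL=${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
echo "$SA_EMAIL"   # jxkaniko-jx-rocks@myprojectid.iam.gserviceaccount.com
```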
Create a service account with the required permissions and generate a key for it. Let's create a script called create.sh that does exactly that. When we run the script we will see the expected permissions.
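The gist's create.sh isn't embedded here; a sketch of what such a script could look like, assuming the account name, roles, and gcloud flags from the replacement script earlier in the post (key.json is my placeholder for the key file, not a name from the gist):

```shell
# Write a create.sh sketch to disk (not the gist's verbatim script)
cat > create.sh <<'EOF'
#!/bin/bash
set -e
ACCOUNT=jxkaniko-jx-rocks
PROJECT_ID=myprojectid

# Create the service account and grant the storage roles
gcloud -q iam service-accounts create ${ACCOUNT} --display-name ${ACCOUNT}
for ROLE in roles/storage.admin roles/storage.objectAdmin roles/storage.objectCreator; do
  gcloud -q projects add-iam-policy-binding ${PROJECT_ID} \
    --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \
    --role ${ROLE}
done

# Generate a key to use with docker login
gcloud -q iam service-accounts keys create key.json \
  --iam-account=${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
EOF
chmod +x create.sh
```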
Now let’s just check that an image can be pushed
Cool, it worked. OK, now let's delete the account and re-run create.sh.

That's odd: now when we re-run docker login -u _json_key -p "$(cat key.json)" https://gcr.io && docker push $IMAGE_NAME it fails with the permission error again. But the GUI and the CLI both show that we have the right permissions!
Fixing the issue
The issue is caused by the old permission bindings hanging around. In fact, even if we re-create the service account and don't add any permissions, the old permissions still show in the GUI and the CLI. And if we try to add a permission again, it appears to succeed but is never actually applied.
To fix it, we have to make sure we delete the service account and remove its policy bindings before we recreate it.
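That cleanup can be sketched as a small script, mirroring the replacement script from earlier in the post (cleanup.sh is my name for it; removing the bindings before deleting the account is my ordering, chosen so no stale bindings survive the recreation):

```shell
# Write a cleanup.sh sketch to disk (my naming, not the gist's)
cat > cleanup.sh <<'EOF'
#!/bin/bash
set -e
ACCOUNT=jxkaniko-jx-rocks
PROJECT_ID=myprojectid

# Remove the policy bindings first, then delete the account itself
for ROLE in roles/storage.admin roles/storage.objectAdmin roles/storage.objectCreator; do
  gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} \
    --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \
    --role ${ROLE}
done
gcloud -q iam service-accounts delete ${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
EOF
chmod +x cleanup.sh
```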
After the above has run, we can re-run create.sh, wait about 10 seconds, and then try to log in and push again (i.e. docker login -u _json_key -p "$(cat key.json)" https://gcr.io && docker push $IMAGE_NAME). It will succeed. If not, wait a bit longer and push again; it will work.