Deploying Kubernetes Apps into Alibaba Cloud Container Service

Alibaba Cloud has a managed Kubernetes service called Alibaba Cloud Container Service. As with other Kubernetes distributions, it has some quirks. I have documented the issues I found when trying to run Jenkins X there.

Alibaba Cloud has several options to run Kubernetes:

  • Dedicated Kubernetes: You must create three Master nodes and one or more Worker nodes for the cluster
  • Managed Kubernetes: You only need to create Worker nodes; Alibaba Cloud Container Service for Kubernetes creates and manages the Master nodes for the cluster
  • Multi-AZ Kubernetes: Master and Worker nodes can be spread across multiple availability zones for higher availability
  • Serverless Kubernetes (beta): You are charged for the resources used by container instances, measured by resource usage duration (in seconds)

You can run in multiple regions across the globe; however, to run in the mainland China regions you need a Chinese ID or business ID. When running there you also have to deal with The Great Firewall of China, which currently blocks some Google services, such as Google Container Registry, where some Docker images are hosted. DockerHub and Google Cloud Storage are not blocked.
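
If an image you need lives in gcr.io, one workaround is to mirror it from a machine outside the firewall to a registry that is reachable from mainland China, such as DockerHub. A minimal sketch; the myuser/kaniko-mirror repository name is hypothetical:

# pull the image while outside the firewall
docker pull gcr.io/kaniko-project/executor:latest
# retag it for a registry that is not blocked (hypothetical repository)
docker tag gcr.io/kaniko-project/executor:latest myuser/kaniko-mirror:latest
# push it to DockerHub
docker push myuser/kaniko-mirror:latest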

Creating a Kubernetes Cluster

Alibaba requires several things in order to create a Kubernetes cluster, so it is easier to do it through the web UI the first time.

The following services need to be activated: Container Service, Resource Orchestration Service (ROS), RAM, and Auto Scaling service; the Container Service roles also need to be created.

If we want to use the command line we can install the aliyun CLI. I have added below all the steps needed in case you want to use it.

brew install aliyun-cli
aliyun configure
REGION=ap-southeast-1

The clusters need to be created in a VPC, so that needs to be created with VSwitches for each zone to be used.

aliyun vpc CreateVpc \
    --VpcName jx \
    --Description "Jenkins X" \
    --RegionId ${REGION} \
    --CidrBlock 172.16.0.0/12

{
    "ResourceGroupId": "rg-acfmv2nomuaaaaa",
    "RequestId": "2E795E99-AD73-4EA7-8BF5-F6F391000000",
    "RouteTableId": "vtb-t4nesimu804j33p4aaaaa",
    "VRouterId": "vrt-t4n2w07mdra52kakaaaaa",
    "VpcId": "vpc-t4nszyte14vie746aaaaa"
}

VPC=vpc-t4nszyte14vie746aaaaa

aliyun vpc CreateVSwitch \
    --VSwitchName jx \
    --VpcId ${VPC} \
    --RegionId ${REGION} \
    --ZoneId ${REGION}a \
    --Description "Jenkins X" \
    --CidrBlock 172.16.0.0/24

{
    "RequestId": "89D9AB1F-B4AB-4B4B-8CAA-F68F84417502",
    "VSwitchId": "vsw-t4n7uxycbwgtg14maaaaa"
}

VSWITCH=vsw-t4n7uxycbwgtg14maaaaa
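
For a multi-AZ cluster, the VSwitch creation is repeated for each zone to be used, with a non-overlapping CIDR block per VSwitch. A sketch for a second zone, assuming zone b exists in the chosen region:

aliyun vpc CreateVSwitch \
    --VSwitchName jx-b \
    --VpcId ${VPC} \
    --RegionId ${REGION} \
    --ZoneId ${REGION}b \
    --Description "Jenkins X zone b" \
    --CidrBlock 172.16.1.0/24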

Next, a keypair (or password) is needed for the cluster instances.

aliyun ecs ImportKeyPair \
    --KeyPairName jx \
    --RegionId ${REGION} \
    --PublicKeyBody "$(cat ~/.ssh/id_rsa.pub)"

The last step is to create the cluster using the just-created VPC, VSwitch and keypair. It's important to select the option Expose API Server with EIP (public_slb in the API JSON) to be able to connect to the API from the internet.

cat << EOF > cluster.json
{
    "name": "jx-rocks",
    "cluster_type": "ManagedKubernetes",
    "disable_rollback": true,
    "timeout_mins": 60,
    "region_id": "${REGION}",
    "zoneid": "${REGION}a",
    "snat_entry": true,
    "cloud_monitor_flags": false,
    "public_slb": true,
    "worker_instance_type": "ecs.c4.xlarge",
    "num_of_nodes": 3,
    "worker_system_disk_category": "cloud_efficiency",
    "worker_system_disk_size": 120,
    "worker_instance_charge_type": "PostPaid",
    "vpcid": "${VPC}",
    "vswitchid": "${VSWITCH}",
    "container_cidr": "172.20.0.0/16",
    "service_cidr": "172.21.0.0/20",
    "key_pair": "jx"
}
EOF

aliyun cs POST /clusters \
    --header "Content-Type=application/json" \
    --body "$(cat cluster.json)"

{
    "cluster_id": "cb643152f97ae4e44980f6199f298f223",
    "request_id": "0C1E16F8-6A9E-4726-AF6E-A8F37CDDC50C",
    "task_id": "T-5cd93cf5b8ff804bb40000e1",
    "instanceId": "cb643152f97ae4e44980f6199f298f223"
}

CLUSTER=cb643152f97ae4e44980f6199f298f223
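
Cluster creation takes several minutes. We can poll until it is ready; a sketch, assuming the cs API returns the cluster details at /clusters/${CLUSTER} with a state field that becomes running:

# wait for the cluster to reach the running state (field name assumed)
while [ "$(aliyun cs GET /clusters/${CLUSTER} | jq -r .state)" != "running" ]; do
    echo "waiting for cluster..."
    sleep 30
done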

We can now download the kubectl configuration with:

aliyun cs GET /k8s/${CLUSTER}/user_config | jq -r .config > ~/.kube/config-alibaba
export KUBECONFIG=$KUBECONFIG:~/.kube/config-alibaba
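
A quick sanity check that the credentials work and the three worker nodes have joined the cluster:

# the API server endpoint should be the EIP exposed by public_slb
kubectl cluster-info
# the three worker nodes should show as Ready
kubectl get nodes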

Another detail to take care of before installing applications that use PersistentVolumeClaims is configuring a default storage class. There are several volume options, which can be listed with kubectl get storageclass:

NAME                          PROVISIONER     AGE
alicloud-disk-available       alicloud/disk   44h
alicloud-disk-common          alicloud/disk   44h
alicloud-disk-efficiency      alicloud/disk   44h
alicloud-disk-ssd             alicloud/disk   44h

Each of them maps to a type of cloud disk:

  • alicloud-disk-common: basic cloud disk (minimum size 5GiB). Only available in some zones (us-west-1a, cn-beijing-b,…)
  • alicloud-disk-efficiency: high-efficiency cloud disk, ultra disk (minimum size 20GiB).
  • alicloud-disk-ssd: SSD disk (minimum size 20GiB).
  • alicloud-disk-available: a highly available option that first attempts to create a high-efficiency cloud disk. If the zone's high-efficiency disk resources are sold out, it tries to create an SSD disk; if SSDs are sold out too, it tries to create a common cloud disk.

To set SSDs as the default:

kubectl patch storageclass alicloud-disk-ssd \
    -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class":"true"}}}'

NOTE: Alibaba cloud disks must be at least 5GiB (basic) or 20GiB (SSD and ultra), so we will need to configure any service that is deployed with PVCs to request at least that size, or the PersistentVolume provisioning will fail.
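
For example, a minimal PVC that satisfies the 20GiB SSD minimum (the claim name data-example is hypothetical):

cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-example
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: alicloud-disk-ssd
  resources:
    requests:
      storage: 20Gi
EOF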

You can continue reading about installing Jenkins X on Alibaba Cloud as an example.

Jenkins China Tour 2019

Jenkins China Tour. Hello! For the next two weeks I'll be in China participating in the Continuous Delivery Summit, KubeCon and several Jenkins-related events, and giving some talks:

Progressive Delivery: Continuous Delivery the Right Way

Progressive Delivery makes it easier to adopt Continuous Delivery by deploying new versions to a subset of users, evaluating their correctness and performance before rolling them out to all users, and rolling back if key metrics are not met. Canary deployment is one of the techniques of Progressive Delivery, used in companies like Facebook to roll out new versions gradually. But good news: you don't need to be Facebook to take advantage of it!

We will demo how to create a fully automated Progressive Delivery pipeline with Canary deployments and rollbacks in Kubernetes using Jenkins X and Flagger, a project that uses Istio and Prometheus to automate Canary rollouts and rollbacks.


Jenkins X: Next Generation Cloud Native CI/CD

Jenkins X is an open source CI/CD platform for Kubernetes based on Jenkins. It runs on Kubernetes and transparently uses on-demand containers to run build agents and jobs, isolating job execution. It enables CI/CD-as-code using automated deployments of commits and pull requests with Skaffold, Helm and other popular open source tools.

Jenkins X integrates Tekton, a new project created at Google and part of the Continuous Delivery Foundation, for serverless CI/CD pipelines. Jobs run in Kubernetes pods using containers that scale up as needed, and down when idle, so you don't pay if no jobs are running.

Google Container Registry Service Account Permissions

While testing Jenkins X I hit an issue that puzzled me. I use Kaniko to build Docker images and push them into Google Container Registry, but the push to GCR was failing with:

INFO[0000] Taking snapshot of files...
error pushing image: failed to push to destination gcr.io/myprojectid/croc-hunter:1: DENIED: Token exchange failed for project 'myprojectid'. Caller does not have permission 'storage.buckets.get'. To configure permissions, follow instructions at: https://cloud.google.com/container-registry/docs/access-control

During installation, Jenkins X creates a GCP service account named after the cluster (in my case jx-rocks), called jxkaniko-jx-rocks, with the roles:

  • roles/storage.admin
  • roles/storage.objectAdmin
  • roles/storage.objectCreator

More roles are added if you install Jenkins X with Vault enabled.
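
To double-check which roles are actually bound to the service account, we can filter the project IAM policy; a sketch using gcloud's documented filtering flags:

# list the roles bound to the jxkaniko service account
gcloud projects get-iam-policy myprojectid \
    --flatten="bindings[].members" \
    --format="table(bindings.role)" \
    --filter="bindings.members:serviceAccount:jxkaniko-jx-rocks@myprojectid.iam.gserviceaccount.com"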

A key is created for the service account and added to Kubernetes as secrets/kaniko-secret, containing the service account key JSON, which is later mounted in the pods running Kaniko as described in the Kaniko instructions.
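
Roughly, the mount looks like the following sketch, based on the Kaniko documentation; the actual pod spec Jenkins X generates differs, and the build arguments here are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  restartPolicy: Never
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:latest
    args: ["--dockerfile=Dockerfile",
           "--context=dir:///workspace",
           "--destination=gcr.io/myprojectid/croc-hunter:1"]
    env:
    # Kaniko reads the GCP credentials from this file to push to GCR
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /secret/kaniko-secret
    volumeMounts:
    - name: kaniko-secret
      mountPath: /secret
  volumes:
  - name: kaniko-secret
    secret:
      secretName: kaniko-secret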

After looking at the service account and roles again and again, they all seemed correct in the GCP console, but the Kaniko build was still failing. I found a Stack Overflow post claiming that the permissions were cached if you previously had a service account with the same name (what?), so I tried a new service account with the same permissions and a different name, and that worked. Weird. So I created a script to replace the service account with a new one and update the Kubernetes secret.

ACCOUNT=jxkaniko-jx-rocks
PROJECT_ID=myprojectid

# delete the existing service account and policy binding
gcloud -q iam service-accounts delete ${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.admin
gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectAdmin
gcloud -q projects remove-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectCreator

# create a new one
gcloud -q iam service-accounts create ${ACCOUNT} --display-name ${ACCOUNT}
gcloud -q projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.admin
gcloud -q projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectAdmin
gcloud -q projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --role roles/storage.objectCreator

# create a key for the service account and update the secret in Kubernetes
gcloud -q iam service-accounts keys create kaniko-secret --iam-account=${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
# the secret already exists, so delete it before recreating it with the new key
kubectl delete secret kaniko-secret
kubectl create secret generic kaniko-secret --from-file=kaniko-secret

And that also worked, so I have no idea why it was failing, but at least now I'll remember how to manually clean up and recreate the service account.