Progressive Delivery in Kubernetes: Blue-Green and Canary Deployments

kubernetesProgressive Delivery is the next step after Continuous Delivery, where new versions are deployed to a subset of users and are evaluated in terms of correctness and performance before rolling them to the totality of the users and rolled back if not matching some key metrics.

There are some interesting projects that make this easier in Kubernetes, and I’m going to talk about three of them that I took for a spin with a Jenkins X example project: Shipper, Istio and Flagger.


Shipper is a project from extending Kubernetes to add sophisticated rollout strategies and multi-cluster orchestration (docs). It supports deployments from one to multiple clusters, and allows multi-region deployments.

Shipper is installed with a cli shipperctl, that pushes the configuration of the different clusters to manage. Note this issue with GKE contexts.

Shipper uses Helm packages for deployment but they are not installed with Helm, they won’t show in helm list. Also, deployments must be version apps/v1 or shipper will not edit the deployment to add the right labels and replica count.

Rollouts with Shipper are all about transitioning from an old Release, the incumbent, to a new Release, the contender. This is achieved by creating a new Application object that defines the n stages that the deployment goes through. For example for a 3 step process:

  1. Staging: Deploy the new version to one pod, with no traffic
  2. 50/50: Deploy the new version to 50% of the pods and 50% of the traffic
  3. Full on: Deploy the new version to all the pods and all the traffic
     - name: staging
         contender: 1
         incumbent: 100
         contender: 0
         incumbent: 100
     - name: 50/50
         contender: 50
         incumbent: 50
         contender: 50
         incumbent: 50
     - name: full on
         contender: 100
         incumbent: 0
         contender: 100
         incumbent: 0

If a step in the release does not send traffic to the pods they can be accessed with kubectl port-forward, ie. kubectl port-forward mypod 8080:8080, which is useful for testing before users can see the new version.

Shipper supports the concept of multiple clusters, but treats all clusters the same way, only using regions and filter by capabilities (set in the cluster object), so there’s no option to have dev, staging, prod clusters with just one Application object. But we could have two application objects

  • myapp-staging deploys to region “staging
  • myapp deploys to other regions

In GKE you can easily configure a multi cluster ingress that will expose the service running in multiple clusters and serve from the cluster closest to your location.


The main limitations in Shipper:

  • Chart restrictions: The Chart must have exactly one Deployment object. The name of the Deployment should be templated with {{.Release.Name}}. The Deployment object should have apiVersion: apps/v1.
  • Pod-based traffic shifting: there is no way to have fine grained traffic routing, ie. send 1% of the traffic to the new version, it is based on the number of pods running.
  • New Pods don’t get traffic if Shipper is not working


Istio is not a deployment tool but a service mesh. However it is interesting as it has become very popular and allows traffic management, for example sending a percentage of the traffic to a different service and other advanced networking.

In GKE it can be installed by just checking the box to enable Istio in the cluster configuration. In other clusters it can be installed manually or with Helm.

With Istio we can create a Gateway that processes all external traffic through the Ingress Gateway and create VirtualServices that manage the routing to our services. In order to do that just find the ingress gateway ip address and configure a wildcard DNS for it. Then create the Gateway that will route all external traffic through the Ingress Gateway

kind: Gateway
 name: public-gateway
 namespace: istio-system
   istio: ingressgateway
 - port:
     number: 80
     name: http
     protocol: HTTP
   - "*"

Istio does not manage the app lifecycle just the networking. We can create a Virtual Service to send 1% of the traffic to the service deployed in a pull request or in the master branch, for all requests coming to the Ingress Gateway.

kind: VirtualService
 name: croc-hunter-jenkinsx
 namespace: jx-production
 - public-gateway.istio-system.svc.cluster.local
 - mesh
 - route:
   - destination:
       host: croc-hunter-jenkinsx.jx-production.svc.cluster.local
         number: 80
     weight: 99
   - destination:
       host: croc-hunter-jenkinsx.jx-staging.svc.cluster.local
         number: 80
     weight: 1


Flagger is a project sponsored by WeaveWorks using Istio to automate canarying and rollbacks using metrics from Prometheus. It goes beyond what Istio provides to automate progressive rollouts and rollbacks based on metrics.

Flagger requires Istio installed with Prometheus, Servicegraph and configuration of some systems, plus the installation of the Flagger controller itself. It also offers a Grafana dashboard to monitor the deployment progress.


The deployment rollout is defined by a Canary object that will generate primary and canary Deployment objects. When the Deployment is edited, for instance to use a new image version, the Flagger controller will shift the loads from 0% to 50% with 10% increases every minute, then it will shift to the new deployment or rollback if metrics such as response errors and request duration fail.


This table summarizes the strengths and weaknesses of both Shipper and Flagger in terms of a few Progressive Delivery features.

Shipper Flagger
Traffic routing Bare k8s balancing as % of pods Advanced traffic routing with Istio (% of requests)
Deployment progress UI No Grafana Dashboard
Deployments supported Helm charts with strong limitations Any deployment
Multi cluster deployment Yes No
Canary or blue/green in different namespace (ie. jx-staging and jx-production) No No, but the VirtualService could be manually edited to do it
Canary or blue/green in different cluster Yes, but with a hack, using a new Application and link to a new “region” Maybe with Istio multicluster ?
Automated rollout No, operator must manually go through the steps Yes, 10% traffic increase every minute, configurable
Automated rollback No, operator must detect error and manually go through the steps Yes, based on Prometheus metrics
Requirements None Istio, Prometheus
Alerts Slack

To sum up, I see Shipper’s value on multi-cluster management and simplicity, not requiring anything other than Kubernetes, but it comes with some serious limitations.

Flagger really goes the extra mile automating the rollout and rollback, and fine grain control over traffic, at a higher complexity cost with all the extra services needed (Istio, Prometheus).

Find the example code for Shipper, Istio and Flagger.

Jenkins Kubernetes Plugin: 2018 in Review

kubernetesLast year has been quite prolific for the Jenkins Kubernetes Plugin, with 55 releases and lots of external contributions!

In 2019 there will be a push for Serverless Jenkins and with that a shift to make agents work better in a Kubernetes environment, with no persistent jnlp connections. You can watch my Jenkins X and Serverless Jenkins demo at Kubecon.

Main changes in the Kubernetes plugin in 2018:

  • Allow creating Pod templates from yaml. This allows setting all possible fields in Kubernetes API using yaml
  • Add yamlFile option for Declarative agent to read yaml definition from a different file
  • Support multiple containers in declarative pipeline
  • Support passing kubeconfig file as credentials using secretFile credentials
  • Show pod logs and events in the Jenkins node page
  • Add optional usage restriction for a Kubernetes cloud using folder properties
  • Add Pod Retention policies to keep pods around on failure
  • Validate label and container names with regex
  • Add option to apply caps only on alive pods
  • Split credentials classes into new plugin kubernetes-credentials

Full Changelog

2018-12-31 kubernetes-1.14.2
2018-12-24 kubernetes-1.14.1
2018-12-19 kubernetes-1.14.0
2018-12-19 kubernetes-1.13.9
2018-12-13 kubernetes-1.13.8
2018-11-30 kubernetes-1.13.7
2018-11-23 kubernetes-1.13.6
2018-10-31 kubernetes-1.13.5
2018-10-30 kubernetes-1.13.4
2018-10-30 kubernetes-1.13.3
2018-10-24 kubernetes-1.13.2
2018-10-23 kubernetes-1.13.1
2018-10-19 kubernetes-1.13.0
2018-10-17 kubernetes-1.12.9
2018-10-17 kubernetes-1.12.8
2018-10-11 kubernetes-1.12.7
2018-09-07 kubernetes-1.12.6
2018-09-07 kubernetes-1.12.5
2018-08-28 kubernetes-1.12.4
2018-08-09 kubernetes-1.12.3
2018-08-07 kubernetes-1.12.2
2018-08-06 kubernetes-1.12.1
2018-07-31 kubernetes-1.12.0
2018-07-31 kubernetes-1.11.0
2018-07-23 kubernetes-1.10.2
2018-07-16 kubernetes-1.10.1
2018-07-11 kubernetes-1.10.0
2018-07-11 kubernetes-1.9.3
2018-06-26 kubernetes-1.9.2
2018-06-26 kubernetes-1.9.1
2018-06-26 kubernetes-1.9.0
2018-06-22 kubernetes-1.8.4
2018-06-22 kubernetes-1.8.3
2018-06-19 kubernetes-1.8.2
2018-06-13 kubernetes-1.8.1
2018-06-13 kubernetes-1.8.0
2018-05-30 kubernetes-1.7.1
2018-05-30 kubernetes-1.7.0
2018-05-29 kubernetes-1.6.4
2018-05-25 kubernetes-1.6.3
2018-05-23 kubernetes-1.6.2
2018-05-22 kubernetes-1.6.1
2018-04-25 kubernetes-1.6.0
2018-04-16 kubernetes-1.5.2
2018-04-09 kubernetes-1.5.1
2018-04-01 kubernetes-1.5
2018-03-28 kubernetes-1.4.1
2018-03-21 kubernetes-1.4
2018-03-16 kubernetes-1.3.3
2018-03-07 kubernetes-1.3.2
2018-02-21 kubernetes-1.3.1
2018-02-21 kubernetes-1.3
2018-02-16 kubernetes-1.2.1
2018-02-02 kubernetes-1.2
2018-01-29 kubernetes-1.1.4
2018-01-10 kubernetes-1.1.3

KubeCon: Jenkins X: Continuous Delivery for Kubernetes


The video of my talk at KubeCon 2018 Seattle

Jenkins X is a new open source CI/CD platform for Kubernetes based on Jenkins.
Jenkins X runs on Kubernetes and transparently uses on demand containers to run build agents and jobs, and isolate job execution. It enables CI/CD-as-code using Jenkins Pipelines and automated deployments of commits and pull requests using Skaffold, Helm and other popular tools. We will demo how to use Jenkins X on any Kubernetes cluster for fully automated CI and CD using a GitOps approach.

JavaZone: Using Kubernetes for Continuous Integration and Continuous Delivery

javazone@2x-luftThe video of my talk at JavaZone 2018

Building and testing is a great use case for containers, both due to the dynamic and isolation aspects. We will share our experience running Jenkins at scale using Kubernetes

Jenkins is an example of an application that can take advantage of Kubernetes technology to run Continuous Integration and Continuous Delivery workloads. Jenkins and Kubernetes can be integrated to transparently use on demand containers to run build agents and jobs, and isolate job execution. It also supports CI/CD-as-code using Jenkins Pipelines and automated deployments to Kubernetes clusters. The presentation and demos will allow a better understanding of how to use Jenkins on Kubernetes for container based, totally dynamic, large scale CI and CD.

Installing kube2iam in AWS Kubernetes EKS Cluster


This is a follow up to Installing kube2iam in AWS Kubernetes Kops Cluster.

kube2iam allows a Kubernetes cluster in AWS to use different IAM roles for each pod, and prevents pods from accessing EC2 instance IAM roles.


Edit the node IAM role (ie. ​EKS-attractive-party-000-D-NodeInstanceRole-XXX) to allow nodes to assume different roles, changing the account id 123456789012 to yours or using "Resource": "*"
    "Version": "2012-10-17",
    "Statement": [
            "Effect": "Allow",
            "Action": [
            "Resource": [

Install kube2iam using the helm chart

helm install stable/kube2iam --name my-release \
  --namespace kube-system \

Note the eni+ host interface name.

A curl to the metadata server from a new pod should return kube2iam

$ curl

Role configuration

Create the roles that the pods can assume. They must start with k8s- (see the wildcard we set in the Resource above) and contain a trust relationship to the node pool role.

For instance, to allow access to the S3 bucket mybucket from a pod, create a role k8s-s3.

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "s3bucketActions",
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::mybucket",

Then edit the trust relationship of the role to allow the node role (the role used by your nodes Auto Scaling Goup) to assume this role.

  "Version": "2012-10-17",
  "Statement": [
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": ""
      "Action": "sts:AssumeRole"
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/"
      "Action": "sts:AssumeRole"

Test it by launching a pod with the right annotations

apiVersion: v1
kind: Pod
  name: aws-cli
    name: aws-cli
  annotations: k8s-s3
  - image: fstab/aws-cli
      - "/home/aws/aws/env/bin/aws"
      - "s3"
      - "ls"
      - "some-bucket"
    name: aws-cli

Securing namespaces

kube2iam supports namespace restrictions so users can still launch pods but are limited to a predefined set of IAM roles that can assume.

apiVersion: v1
kind: Namespace
  annotations: |
  name: default

Google Cloud Next Recap

google-next-logoSeveral interesting announcements from last week Google Next conference.

Knative, a new OSS project built by Google, Red Hat, IBM,… to build, deploy, and manage modern serverless workloads on Kubernetes. Built upon Istio, with 1.0 coming soon and managed Istio on GCP. It includes a build primitive to manage source to kubernetes flows, that can be used independently. Maybe it is the new standard to define sources and builds in Kubernetes. Read more from Mark Chmarny.

GKE on premise, a Google-configured version of Kubernetes with multi-cluster management, running on top of VMware’s vSphere.

Another Kubernetes related mention was the gVisor pod sandbox, with experimental support for Kubernetes, to allow running sandboxed containers in a Kubernetes cluster. Very interesting for multi-tenant clusters and docker image builds.

Cloud Functions are now Generally Available, and more serverless features are launched:

Serverless containers allow you to run container-based workloads in a fully managed environment and still only pay for what you use. Sign up for an early preview of serverless containers on Cloud Functions to run your own containerized functions on GCP with all the benefits of serverless.

A new GKE serverless add-on lets you run serverless workloads on Kubernetes Engine with a one-step deploy. You can go from source to containers instantaneously, auto-scale your stateless container-based workloads, and even scale down to zero.

Cloud Build, a fully-managed CI/CD platform that lets you build and test applications in the cloud. With an interesting approach where all the pipeline steps are containers themselves so it is reasonably easy to extend. It integrates with GitHub for repos with a Dockerfile (let’s see if it lasts long after Microsoft acquisition).

Other interesting announcements include:

  • Edge TPU, a tiny ASIC chip designed to run TensorFlow Lite ML models at the edge.
  • Shielded VMs – untampered virtual machines

  • Titan Security Key, a FIDO security key with firmware developed by Google. Google security was giving away at the conference both NFC and bluetooth keys, a good replacement for the yubikeys specially for mobile devices.

Sending Kubernetes Logs to CloudWatch Logs using Fluentd

fluentd-logofluentd can send all the Kubernetes or EKS logs to CloudWatch Logs to have a centralized and unified view of all the logs from the cluster, both from the nodes and from each container stdout.


To send all nodes and container logs to CloudWatch, create a CloudWatch log group named kubernetes.

aws logs create-log-group --log-group-name kubernetes

Then install fluentd-cloudwatch helm chart. This will send logs from node, containers, etcd,… to CloudWatch as defined in the default fluentd chart config.

helm install --name fluentd incubator/fluentd-cloudwatch \
  --set awsRegion=us-east-1,rbac.create=true

Each node needs to have permissions to write to CloudWatch Logs, so either add the permission using IAM instance profiles or pass the awsRole if your are using kube2iam.

helm install --name fluentd incubator/fluentd-cloudwatch \
  --set awsRole=arn:aws:iam::123456789012:role/k8s-logs,awsRegion=us-east-1,rbac.create=true,extraVars[0]="{ name: FLUENT_UID, value: '0' }"

The k8s-logs role policy is configured as

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "logs",
            "Effect": "Allow",
            "Action": [
            "Resource": [

Now you can go to CloudWatch and find your logs.