Deployments using Traffic Splitting

Overview

You can use traffic splitting for most deployment scenarios, including canary and blue-green deployments. The ability to control traffic flow to different versions of an application makes it easy to roll out a new application version with minimal effort and minimal interruption to production traffic.

Before You Begin

  1. Install kubectl.

  2. Set up a Kubernetes cluster with NGINX Service Mesh deployed. This guide assumes automatic injection is enabled either cluster wide or for the default namespace. If automatic injection is disabled, all of the necessary resources must be injected manually. Refer to Automatic Proxy Injection and Manual Proxy Injection.

    Note:
    This tutorial assumes traffic can be sent to injected Pods and Services without mTLS sessions. For the purposes of this tutorial, NGINX Service Mesh must be deployed with --mtls-mode set to permissive or off.

  3. Download all the example files:

Objectives

Follow the steps in this guide to learn how to use traffic splitting for various deployment strategies.

Deploy the Production Version of the Target App

  1. Let’s begin by deploying the “production” v1.0 target app, the load balancer service, and the ingress gateway.

    Tip:
    For simplicity, this guide uses a simple NGINX reverse proxy for the ingress gateway. For production usage and for more advanced ingress control, we recommend using the NGINX Ingress Controller for Kubernetes. Refer to Deploy NGINX Ingress Controller with NGINX Service Mesh to learn more.

    Command:

    kubectl apply -f target-svc.yaml -f target-v1.0.yaml -f gateway.yaml
    

    Expectation: All pods and services deploy successfully.

    Use kubectl to verify that the pods and services deployed successfully.

    Example:

    $ kubectl get pods
    NAME                           READY   STATUS    RESTARTS   AGE
    gateway-58c6c76dd-4mmht        2/2     Running   0          2m
    target-v1-0-6f69fc48f6-mzcf2   2/2     Running   0          2m
    
    $ kubectl get svc
    NAME          TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
    gateway-svc   LoadBalancer   10.0.0.2        1.2.3.4         80:30975/TCP   2m
    target-svc    ClusterIP      10.0.0.3        <none>          80/TCP         2m
    target-v1-0   ClusterIP      10.0.0.4        <none>          80/TCP         2m
    

    To better understand what is going on, let’s take a quick look at each of the resources we deployed:

    • gateway: simple NGINX reverse proxy that forwards traffic to the target app. Besides providing a single point of ingress to the cluster, using the gateway lets us use the nginx-meshctl top command to check traffic metrics between it and the backend services it is sending traffic to.
    • target-svc: the root service that connects to all the different versions of the target app.
    • target: for our example, we will deploy three different versions of the target app. The target app is a basic NGINX server that returns its version. Each version has its own service tagged with its version number; these are the services that the root target-svc sends requests to.
  2. Once the pods and services are ready, generate traffic to target-svc. Use a different bash window for this step so you can watch the traffic change as you are doing the deployments.

    Commands:

    • Get the external IP for gateway-svc:

      kubectl get svc gateway-svc
      
    • Start a loop that sends a request to that IP once per second for 5 minutes. Rerun as needed:

      for i in $(seq 1 300); do curl <external IP>; sleep 1; done
      

    Expectation: Requests will start coming in to target-svc. At this point, you should see responses only from target v1.0.

  3. Back in your original bash window, use the mesh CLI to check traffic metrics.

    Command: nginx-meshctl top
    Expectation: The target-v1-0 deployment will show a 100% incoming success rate, and the gateway deployment will show a 100% outgoing success rate. The top command only shows traffic from the last 30 seconds; it provides a quick look at your services for immediate debugging and for spotting any anomalies that need further investigation. For more detailed and accurate traffic monitoring, we recommend using Grafana. Refer to traffic metrics for details.

    Example:

    $ nginx-meshctl top
    Deployment   Incoming Success  Outgoing Success  NumRequests
    gateway                        100.00%           10
    target-v1-0  100.00%                             10
    
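For reference, the root service that the gateway sends traffic to is defined in the downloaded target-svc.yaml. A minimal sketch of what that manifest might look like is below; the app selector label is an assumption for illustration, and the actual example files are authoritative:

```yaml
# Hypothetical sketch of target-svc.yaml: the root Service that the
# TrafficSplit applied later will divide among versioned backends.
apiVersion: v1
kind: Service
metadata:
  name: target-svc
spec:
  selector:
    app: target   # assumed label shared by Pods of every target version
  ports:
  - port: 80
    targetPort: 80
```

Because target-svc selects Pods from every version of the target app, the mesh can redirect its traffic to the versioned services (target-v1-0, and later target-v2-0 and target-v2-1) without clients ever changing the address they call.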

Deploy a New Version of the Target App using a Canary Deployment

Traffic splits support a variety of deployment strategies. Whether you use a blue-green deployment, a canary deployment, or a hybrid of different strategies, traffic splits make the process straightforward.

For this version of the target app, let’s try using a canary deployment strategy.

  1. Apply the traffic split so that once a new version is deployed, it will not receive any traffic until we are ready. Ideally we would apply this at the same time as the first target version, target-svc, and gateway. To make it easier to see what is happening though, we are applying it in this separate step.

    Command:

    kubectl apply -f trafficsplit.yaml
    

    Expectation: The traffic split is applied successfully. Use kubectl get ts to see the current traffic splits.

    Use kubectl describe ts <traffic split name> to see details about a specific traffic split. Currently the traffic split is configured to send 100% of traffic to target v1.0.

    apiVersion: split.smi-spec.io/v1alpha2
    kind: TrafficSplit
    metadata:
      name: target-ts
    spec:
      service: target-svc
      backends:
      - service: target-v1-0
        weight: 100
    
  2. Now let’s deploy target v2.0. To show a scenario where an upgrade is failing, this version of target is configured to return a 500 error status code instead of a successful 200.

    Command:

    kubectl apply -f target-v2.0-failing.yaml
    

    Expectation: Target v2.0 will deploy to the cluster successfully. You should see the new target-v2-0 pod and svc in the kubectl get pods/kubectl get svc output. Since we deployed the traffic split, if you look at the other bash window where traffic is being generated, you should still see responses only from target v1.0. If you check nginx-meshctl top, you should see the same deployments as before; no traffic has been sent to or received from target v2.0 yet.

  3. For this deployment we’ll send 10% of traffic to target v2.0 while 90% is still going to target v1.0. Open trafficsplit.yaml in the editor of your choice and add a new backend for target-v2-0 with a weight of 10. Change the weight of target-v1-0 to 90.

    apiVersion: split.smi-spec.io/v1alpha2
    kind: TrafficSplit
    metadata:
      name: target-ts
    spec:
      service: target-svc
      backends:
      - service: target-v1-0
        weight: 90
      - service: target-v2-0
        weight: 10
    
  4. After updating trafficsplit.yaml, save and apply it.

    Command:

    kubectl apply -f trafficsplit.yaml
    

    Expectation: After applying the updated traffic split, you should start seeing responses from target v2.0 in the other bash window where traffic is being generated. Because of the weights we set in the previous step, about 1 out of every 10 requests will be sent to v2.0. Keep in mind that the split is weighted, not deterministic, so the ratio will not be exactly 1 in 10, but it will be close.

  5. Check the traffic metrics now that v2.0 is available.

    Command:

    nginx-meshctl top
    

    Expectation:

    • target-v1-0 deployment will still show 100% incoming success rate
    • target-v2-0 deployment will show 0% incoming success rate
    • gateway deployment will show the appropriate percentage of successful outgoing requests

    Example:

    $ nginx-meshctl top
    Deployment   Incoming Success  Outgoing Success  NumRequests
    gateway                        90.00%            10
    target-v1-0  100.00%                             9
    target-v2-0  0.00%                               1
    
  6. It looks like v2.0 doesn’t work! We can see this because the incoming success rate for target-v2-0 is 0%. Thankfully, with traffic splitting, it is easy to redirect all traffic back to v1.0 without a complicated rollback. Simply update trafficsplit.yaml to send 100% of traffic to v1.0 and 0% of traffic to v2.0, then re-apply it.

    You can either explicitly set the weight of target-v2-0 to 0 or remove the target-v2-0 backend completely. The result will be the same.

    At this point you can either delete v2.0 from the cluster (kubectl delete -f target-v2.0-failing.yaml) or leave it as-is.
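As a side note, the weighted selection behind the canary split above can be illustrated with a small, self-contained Python sketch. This is purely illustrative of how 90/10 weighting behaves statistically; the mesh itself performs the real load balancing:

```python
import random

def pick_backend(backends):
    """Choose a backend service by weight, mimicking a weighted traffic split."""
    services = [b["service"] for b in backends]
    weights = [b["weight"] for b in backends]
    return random.choices(services, weights=weights, k=1)[0]

# Weights matching the canary trafficsplit.yaml above.
backends = [
    {"service": "target-v1-0", "weight": 90},
    {"service": "target-v2-0", "weight": 10},
]

# Over many requests the split converges on the configured 90/10 ratio,
# but any small window (such as 10 requests) can deviate from exactly 9:1.
counts = {"target-v1-0": 0, "target-v2-0": 0}
for _ in range(10_000):
    counts[pick_backend(backends)] += 1
print(counts)  # roughly 9000 for target-v1-0 and 1000 for target-v2-0
```

This is why the nginx-meshctl top output in a 30-second window shows approximately, rather than exactly, the configured proportions.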

Deploy a New Version of the Target App using a Blue-Green Deployment

For this version of the target app, let’s use a blue-green deployment.

  1. Deploy v2.1 of target, which fixes the issue causing the failing requests that we saw in v2.0.

    Command:

    kubectl apply -f target-v2.1-successful.yaml
    

    Expectation: Target v2.1 will deploy successfully. You should see the new target-v2-1 pod and svc in the kubectl get pods/kubectl get svc output. Just as with target-v2-0 though, we have the traffic split configured to send all traffic to target-v1-0 until we are ready to do the actual deployment and make target-v2-1 available for traffic.

  2. Since we are doing a blue-green deployment, we will configure the traffic split to send all traffic to target v2.1. Open trafficsplit.yaml in the editor of your choice and add a new backend for target-v2-1 with a weight of 100. Change the weight of target-v1-0 to 0. You could also delete the target-v1-0 backend completely, but with this type of deployment it’s easier to set the weight to 0 in case you need to roll back quickly.

    apiVersion: split.smi-spec.io/v1alpha2
    kind: TrafficSplit
    metadata:
      name: target-ts
    spec:
      service: target-svc
      backends:
      - service: target-v1-0
        weight: 0
      - service: target-v2-1
        weight: 100
    
  3. After updating trafficsplit.yaml, save and apply it.

    Command:

    kubectl apply -f trafficsplit.yaml
    

    Expectation: After applying the updated traffic split, you should start seeing responses from target v2.1 in the other bash window where traffic is being generated. Because of the weights set in the previous step, all traffic should now go to v2.1.

  4. Check the traffic metrics now that v2.1 is available.

    Command:

    nginx-meshctl top
    

    Expectation:

    • target-v1-0 deployment will not show up, although it takes a little while for previous requests to age out of the 30-second metrics window. If you still see target-v1-0, try again in 30 seconds or so.
    • target-v2-1 deployment will show 100% incoming success rate
    • gateway deployment will show 100% outgoing success rate

    Example:

    $ nginx-meshctl top
    Deployment   Incoming Success  Outgoing Success  NumRequests
    gateway                        100.00%           10
    target-v2-1  100.00%                             10
    
  5. Since target v2.1 is working as expected, we can delete v1.0 from the cluster. If v2.1 had started failing, we could have quickly rolled back to v1.0 just as we did earlier.
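Had a rollback been needed, the split could have been flipped back without touching the deployments. A sketch of trafficsplit.yaml reverting all traffic to v1.0, mirroring the backends used above:

```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: target-ts
spec:
  service: target-svc
  backends:
  - service: target-v1-0
    weight: 100   # all traffic returns to the known-good version
  - service: target-v2-1
    weight: 0     # keep the backend listed for a quick re-cutover
```

Keeping the zero-weight backend in place is the reason step 2 recommended setting target-v1-0 to weight 0 rather than deleting it: reverting is then a one-line change and a single kubectl apply.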

Summary

These are just a couple of examples of how you can use traffic splits in a deployment. Whether you want to roll out gradually in 5% increments, send 5% of traffic to each of two staging backends while 90% goes to production, or use any other combination of splits, traffic splits offer a convenient way to handle almost any deployment strategy you need.
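For instance, a three-way split of the kind described above could look like the following sketch; the staging service names here are assumptions for illustration:

```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: target-ts
spec:
  service: target-svc
  backends:
  - service: target-v1-0       # production
    weight: 90
  - service: target-staging-a  # assumed staging service name
    weight: 5
  - service: target-staging-b  # assumed staging service name
    weight: 5
```

Because weights are relative rather than percentages, 90/5/5 and 18/1/1 express the same split; using values that sum to 100 simply makes the intent easier to read.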

Resources