Linkerd Traffic Split for A/B Testing of AI Models on Kubernetes
To implement A/B testing of AI models on Kubernetes with Linkerd, you will need to run multiple services, each serving a different version of your model, and then use Linkerd's traffic-splitting feature to direct a portion of the traffic to each version. Kubernetes does not provide a TrafficSplit resource natively; Linkerd implements traffic splitting through the SMI (Service Mesh Interface) TrafficSplit custom resource, which we can define and apply to the cluster using Pulumi's Kubernetes provider. Below, I'll outline a program that sets up a traffic split for A/B testing between two versions of an AI model service running on Kubernetes, with the assumption that you have Linkerd installed in your Kubernetes cluster.
Here is what we will do:
- Define two Kubernetes Deployment resources, one for each version of your AI model serving application.
- Create Kubernetes Service resources for both versions for in-cluster service discovery.
- Set up a Linkerd TrafficSplit custom resource, which defines how traffic should be divided between the two services.
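Linkerd's TrafficSplit support depends on the SMI TrafficSplit CRD being present in the cluster; depending on your Linkerd version, that CRD may ship separately as the linkerd-smi extension rather than with the core control plane. If it is missing, one option is to install the extension with Pulumi's Helm support. The sketch below is an assumption-heavy starting point: the chart name, repository URL, and namespace should be verified against the Linkerd SMI extension documentation.

import pulumi_kubernetes as k8s

# Install the Linkerd SMI extension, which provides the TrafficSplit CRD.
# NOTE: chart name, repository URL, and namespace are assumptions; confirm
# them against the Linkerd SMI extension documentation before use.
linkerd_smi = k8s.helm.v3.Release(
    "linkerd-smi",
    chart="linkerd-smi",
    repository_opts=k8s.helm.v3.RepositoryOptsArgs(
        repo="https://linkerd.github.io/linkerd-smi",
    ),
    namespace="linkerd-smi",
    create_namespace=True,
)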
Let's begin with the program:
import pulumi
import pulumi_kubernetes as k8s
from pulumi_kubernetes.apiextensions import CustomResource

# The context for the Kubernetes provider may need to be set explicitly
# if not using the default context in ~/.kube/config:
# k8s.Provider("my_provider", kubeconfig="<kubeconfig_content>")

# Version 1 of the AI model Deployment and its Service
ai_model_v1 = k8s.apps.v1.Deployment(
    "ai-model-v1",
    spec={
        "selector": {"matchLabels": {"app": "ai-model", "version": "v1"}},
        "replicas": 1,
        "template": {
            "metadata": {"labels": {"app": "ai-model", "version": "v1"}},
            "spec": {
                "containers": [{
                    "name": "model",
                    "image": "my_model_v1:latest",
                    # Define other container properties such as ports, env, volumeMounts, etc.
                }],
            },
        },
    },
)

ai_model_service_v1 = k8s.core.v1.Service(
    "ai-model-service-v1",
    # An explicit name keeps the Service reachable under the exact name the
    # TrafficSplit backends reference (Pulumi would otherwise append a suffix).
    metadata={"name": "ai-model-service-v1"},
    spec={
        "selector": {"app": "ai-model", "version": "v1"},
        "ports": [{"port": 80, "targetPort": 8080}],  # Replace with your container's port
    },
)

# Version 2 of the AI model Deployment and its Service
ai_model_v2 = k8s.apps.v1.Deployment(
    "ai-model-v2",
    spec={
        "selector": {"matchLabels": {"app": "ai-model", "version": "v2"}},
        "replicas": 1,
        "template": {
            "metadata": {"labels": {"app": "ai-model", "version": "v2"}},
            "spec": {
                "containers": [{
                    "name": "model",
                    "image": "my_model_v2:latest",
                    # Define other container properties such as ports, env, volumeMounts, etc.
                }],
            },
        },
    },
)

ai_model_service_v2 = k8s.core.v1.Service(
    "ai-model-service-v2",
    metadata={"name": "ai-model-service-v2"},
    spec={
        "selector": {"app": "ai-model", "version": "v2"},
        "ports": [{"port": 80, "targetPort": 8080}],  # Replace with your container's port
    },
)

# Traffic split configuration between the two services
traffic_split = CustomResource(
    "ai-model-traffic-split",
    api_version="split.smi-spec.io/v1alpha1",
    kind="TrafficSplit",
    metadata={"name": "ai-model-traffic-split"},
    spec={
        # The apex service that clients call; see the note below the program.
        "service": "ai-model-service",
        "backends": [
            {
                "service": "ai-model-service-v1",
                "weight": "500m",  # 50% of traffic to version 1
            },
            {
                "service": "ai-model-service-v2",
                "weight": "500m",  # 50% of traffic to version 2
            },
        ],
    },
)
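Note that the TrafficSplit's service field names an apex (root) service, ai-model-service, which is what clients actually call, and the program above does not create it. Below is a minimal sketch of such a Service, assuming both Deployments share the app: ai-model label and expose the same container port (8080) used above; append it to the program.

import pulumi_kubernetes as k8s  # already imported in the program above

# Apex Service referenced by the TrafficSplit's "service" field. Clients send
# requests here, and Linkerd redirects them to the per-version backends
# according to the configured weights.
ai_model_apex_service = k8s.core.v1.Service(
    "ai-model-service",
    metadata={"name": "ai-model-service"},
    spec={
        # Select on the shared "app" label only, so the Service remains valid
        # regardless of which versions are currently deployed.
        "selector": {"app": "ai-model"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
)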
In this program, we defined two Deployments, each serving a different version of the AI model (my_model_v1 and my_model_v2). These Deployments are accompanied by Kubernetes Service resources, which are essential for internal traffic routing and service discovery.

Subsequently, a TrafficSplit custom resource is created that specifies how inbound traffic should be split between these services. In this case, traffic is distributed evenly between the v1 and v2 services of the AI model.

Please adjust the container images, resource names, and other metadata to match your actual deployment scenario. The program assumes that you have Linkerd and the SMI (Service Mesh Interface) CRDs installed on your Kubernetes cluster, since Linkerd's traffic-splitting functionality relies on SMI's TrafficSplit custom resource definition.

Use the program above as a template to deploy and test your AI models in a real-world setting using Kubernetes and Linkerd's traffic management capabilities.
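For A/B testing you will usually want to adjust the split over time, for example sending a small share of traffic to the new version first and increasing it as results come in. One way to do that without editing code is to drive the weights from Pulumi stack configuration. The sketch below assumes the same service names as the program above and uses a hypothetical v2Percent configuration key; it would replace the TrafficSplit definition shown earlier.

import pulumi
from pulumi_kubernetes.apiextensions import CustomResource

config = pulumi.Config()
# Hypothetical config key: percentage of traffic for v2 (defaults to 50).
# Change it with, e.g.: pulumi config set v2Percent 10
v2_percent = config.get_int("v2Percent")
if v2_percent is None:
    v2_percent = 50
v1_percent = 100 - v2_percent

traffic_split = CustomResource(
    "ai-model-traffic-split",
    api_version="split.smi-spec.io/v1alpha1",
    kind="TrafficSplit",
    metadata={"name": "ai-model-traffic-split"},
    spec={
        "service": "ai-model-service",
        "backends": [
            # SMI v1alpha1 weights are relative quantities; with a total of
            # 1000m, each 10m corresponds to 1% of the traffic.
            {"service": "ai-model-service-v1", "weight": f"{v1_percent * 10}m"},
            {"service": "ai-model-service-v2", "weight": f"{v2_percent * 10}m"},
        ],
    },
)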