Multi-cluster service discovery with Cilium & Istio

In the Kubernetes world, the most common reason for building a multi-cluster mesh is resilience: namespaces and services are kept consistent across clusters so that the application is available in both clusters at the same time, and if one cluster fails the other keeps serving. Recently, however, people have been exploring a different use case: calling two different services in two different clusters using only the Kubernetes CNI and service discovery. This is indeed possible with Cilium and Istio. In the following, I will describe in detail how to create a mesh of Kubernetes clusters by linking them together and enabling pod-to-pod connectivity and service proxying across all clusters.

Stack:


Host: Bare metal

OS: Talos

Pod-to-pod connectivity: Cilium Cluster Mesh v1.12

Cross-cluster service discovery: Istio (DNS proxying)

Kubernetes clusters:

DEV

~ $k get node
NAME                  STATUS   ROLES                  AGE     VERSION
master-1              Ready    control-plane,master   13m     v1.24.2
master-2              Ready    control-plane,master   7m15s   v1.24.2
master-3              Ready    control-plane,master   13m     v1.24.2
talos-192-168-1-133   Ready    <none>                 10m     v1.24.2
talos-192-168-1-134   Ready    <none>                 11m     v1.24.2
talos-192-168-1-135   Ready    <none>                 12m     v1.24.2
talos-192-168-1-136   Ready    <none>                 6m14s   v1.24.2
talos-192-168-1-137   Ready    <none>                 8m38s   v1.24.2

QA

~ $k get node
NAME                  STATUS   ROLES                  AGE   VERSION
master-1              Ready    control-plane,master   18m   v1.24.2
master-2              Ready    control-plane,master   18m   v1.24.2
master-3              Ready    control-plane,master   18m   v1.24.2
talos-192-168-1-143   Ready    <none>                 15m   v1.24.2
talos-192-168-1-144   Ready    <none>                 11m   v1.24.2
talos-192-168-1-145   Ready    <none>                 13m   v1.24.2
talos-192-168-1-146   Ready    <none>                 14m   v1.24.2
talos-192-168-1-147   Ready    <none>                 16m   v1.24.2

Prerequisites


  • All clusters must be configured with the same datapath mode. A Cilium install may default to Encapsulation or Native-Routing mode depending on the specific cloud environment.
  • PodCIDR ranges in all clusters and on all nodes must be unique and non-overlapping. Furthermore, Cilium in each cluster must be configured with a native routing CIDR that covers all the PodCIDR ranges across all connected clusters (see the example Helm values after this list).
  • Nodes in all clusters must have IP connectivity to each other. This requirement is typically met by establishing peering or VPN tunnels between the networks of the nodes of each cluster.
  • The network between clusters must allow inter-cluster communication.
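As an illustration of the first two requirements, here is a minimal sketch of Cilium Helm values for a bare-metal, native-routing setup like this one (option names as in the Cilium 1.12 chart; the native routing CIDR is a placeholder matching this lab's masquerading range, adjust everything to your own address plan and actual install values):

# Sketch only: same datapath mode in every cluster, plus a native routing CIDR
# wide enough to cover the PodCIDR ranges of all connected clusters.
helm upgrade --install cilium cilium/cilium --version 1.12.0 \
  --namespace cilium \
  --set tunnel=disabled \
  --set autoDirectNodeRoutes=true \
  --set ipv4NativeRoutingCIDR=10.0.0.0/9 \
  --set ipam.mode=kubernetes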

Install the Cilium CLI


Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features such as Cluster Mesh.

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/master/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
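Before moving on, a quick sanity check that the CLI works and can reach the cluster in your current kubectl context (in this setup Cilium runs in the cilium namespace):

# print the cilium-cli version and the Cilium version it detects in the cluster
cilium version
# inspect the Cilium installation and wait until it reports ready
cilium status -n cilium --wait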

MetalLB


To enable the Cluster Mesh connection between clusters, you can expose the Cluster Mesh API server with either a NodePort or a LoadBalancer service. I have installed MetalLB with a Layer 2 configuration in each cluster:

~ $k get po -n metallb-system
NAME                          READY   STATUS    RESTARTS   AGE
controller-5689597974-gtwcd   1/1     Running   0          32m
speaker-6988j                 1/1     Running   0          25m
speaker-84pnm                 1/1     Running   0          30m
speaker-cb5q5                 1/1     Running   0          27m
speaker-dlvdv                 1/1     Running   0          29m
speaker-drj6h                 1/1     Running   0          30m
speaker-h5kk7                 1/1     Running   0          31m
speaker-v9s2h                 1/1     Running   0          27m
speaker-vklwd                 1/1     Running   0          24m
---
~ $k get cm config -o yaml -n metallb-system
apiVersion: v1
data:
  config: |-
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.210-192.168.1.220
kind: ConfigMap
metadata:
  creationTimestamp: "2022-08-10T13:56:06Z"
  name: config
  namespace: metallb-system
  resourceVersion: "426"
  uid: 523bbadf-e483-423b-9619-7715a2c46454

Whenever Cluster Mesh is enabled, a LoadBalancer service is created for the Cluster Mesh API server in each cluster.

Cilium Cluster Mesh


For the rest of this tutorial, we will assume that you intend to connect two clusters together, with their kubectl configuration context names stored in the environment variables $CLUSTER1 and $CLUSTER2 ($CTX_CLUSTER1 and $CTX_CLUSTER2 are used later for the Istio part). These are the same context names you typically pass to kubectl --context.

export CTX_CLUSTER1=admin@dev
export CTX_CLUSTER2=admin@qa
export CLUSTER1=admin@dev
export CLUSTER2=admin@qa

Each cluster must be assigned a unique human-readable name as well as a numeric cluster ID (1-255). It is best to assign both attributes at Cilium installation time; you can see an example in the previous post https://www.bnovickovs.me/talos-kubernetes-clusters-deployment-on-proxmox-cilium-cni/ (Cilium Configuration).

---
cluster:
  name: dev
  id: 1
---
cluster:
  name: qa
  id: 2
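If you deploy Cilium with Helm rather than with the cilium CLI, these attributes map to the cluster.name and cluster.id chart values; a minimal sketch (chart option names per Cilium 1.12, contexts as exported above):

# pass the rest of your configuration (-f values.yaml) together with these flags
helm upgrade --install cilium cilium/cilium --version 1.12.0 -n cilium \
  --kube-context $CLUSTER1 --set cluster.name=dev --set cluster.id=1
helm upgrade --install cilium cilium/cilium --version 1.12.0 -n cilium \
  --kube-context $CLUSTER2 --set cluster.name=qa --set cluster.id=2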

Running cilium clustermesh enable in both clusters:

~ $cilium clustermesh enable --context $CLUSTER1 -n cilium --service-type LoadBalancer
🔑 Found CA in secret cilium-ca
🔑 Generating certificates for ClusterMesh...
✨ Deploying clustermesh-apiserver from quay.io/cilium/clustermesh-apiserver:v1.12.0...
✅ ClusterMesh enabled!
~ $cilium clustermesh enable --context $CLUSTER2 -n cilium --service-type LoadBalancer
🔑 Found CA in secret cilium-ca
🔑 Generating certificates for ClusterMesh...
✨ Deploying clustermesh-apiserver from quay.io/cilium/clustermesh-apiserver:v1.12.0...
✅ ClusterMesh enabled!
---
~ $cilium clustermesh status --context $CLUSTER1 --wait -n cilium
✅ Cluster access information is available:
  - 192.168.1.221:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
⌛ [dev] Waiting for deployment clustermesh-apiserver to become ready...
🔌 Cluster Connections:
🔀 Global services: [ min:0 / avg:0.0 / max:0 ]
~ $cilium clustermesh status --context $CLUSTER2 --wait -n cilium
✅ Cluster access information is available:
  - 192.168.1.210:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
⌛ [qa] Waiting for deployment clustermesh-apiserver to become ready...
🔌 Cluster Connections:
🔀 Global services: [ min:0 / avg:0.0 / max:0 ]
---
~ $k get svc -n cilium --context $CLUSTER1
NAME                    TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)          AGE
cilium-agent            ClusterIP      None          <none>          9964/TCP         50m
clustermesh-apiserver   LoadBalancer   10.22.22.44   192.168.1.221   2379:31605/TCP   6m14s
hubble-metrics          ClusterIP      None          <none>          9965/TCP         50m
hubble-peer             ClusterIP      10.22.94.37   <none>          443/TCP          50m
~ $k get svc -n cilium --context $CLUSTER2
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)          AGE
cilium-agent            ClusterIP      None            <none>          9964/TCP         49m
clustermesh-apiserver   LoadBalancer   10.24.121.199   192.168.1.210   2379:30250/TCP   6m5s
hubble-metrics          ClusterIP      None            <none>          9965/TCP         49m
hubble-peer             ClusterIP      10.24.172.145   <none>          443/TCP          49m

### Check connection
~ $curl -svk telnet://192.168.1.221:2379
*   Trying 192.168.1.221:2379...
* TCP_NODELAY set
* Connected to 192.168.1.221 (192.168.1.221) port 2379 (#0)

~ $curl -svk telnet://192.168.1.210:2379
*   Trying 192.168.1.210:2379...
* TCP_NODELAY set
* Connected to 192.168.1.210 (192.168.1.210) port 2379 (#0)

Connect Clusters


Finally, connect the clusters. This step only needs to be done in one direction. The connection will automatically be established in both directions:

~ $cilium clustermesh connect --context $CLUSTER1 --destination-context $CLUSTER2 -n cilium
✨ Extracting access information of cluster qa...
🔑 Extracting secrets from cluster qa...
ℹī¸  Found ClusterMesh service IPs: [192.168.1.210]
✨ Extracting access information of cluster dev...
🔑 Extracting secrets from cluster dev...
ℹī¸  Found ClusterMesh service IPs: [192.168.1.221]
✨ Connecting cluster dev -> qa...
🔑 Secret cilium-clustermesh does not exist yet, creating it...
🔑 Patching existing secret cilium-clustermesh...
✨ Patching DaemonSet with IP aliases cilium-clustermesh...
✨ Connecting cluster qa -> dev...
🔑 Secret cilium-clustermesh does not exist yet, creating it...
🔑 Patching existing secret cilium-clustermesh...
✨ Patching DaemonSet with IP aliases cilium-clustermesh...
✅ Connected cluster dev and qa!
---
Checking status again:

~ $cilium clustermesh status --context $CLUSTER1 --wait -n cilium
✅ Cluster access information is available:
  - 192.168.1.221:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
⌛ [dev] Waiting for deployment clustermesh-apiserver to become ready...
⌛ Waiting (11s) for clusters to be connected: unable to determine status of cilium pod "cilium-knb7s": unable to determine cilium status: command terminated with exit code 1
✅ All 8 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
🔌 Cluster Connections:
- qa: 8/8 configured, 8/8 connected
🔀 Global services: [ min:6 / avg:6.0 / max:6 ]
~ $cilium clustermesh status --context $CLUSTER2 --wait -n cilium
✅ Cluster access information is available:
  - 192.168.1.210:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
⌛ [qa] Waiting for deployment clustermesh-apiserver to become ready...
✅ All 8 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
🔌 Cluster Connections:
- dev: 8/8 configured, 8/8 connected
🔀 Global services: [ min:6 / avg:6.0 / max:6 ]
---
Checking status from inside a Cilium agent pod:

root@talos-192-168-1-134:/home/cilium# cilium status
KVStore:                 Ok   Disabled
Kubernetes:              Ok   1.24 (v1.24.2) [linux/amd64]
Kubernetes APIs:         ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    Strict    [eth0 192.168.1.134 (Direct Routing)]
Host firewall:           Enabled   [eth0]
CNI Chaining:            none
Cilium:                  Ok   1.12.0 (v1.12.0-9447cd1)
NodeMonitor:             Listening for events on 5 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok
IPAM:                    IPv4: 2/254 allocated from 10.21.3.0/24,
ClusterMesh:             1/1 clusters ready, 6 global-services
BandwidthManager:        EDT with BPF [CUBIC] [eth0]
Host Routing:            BPF
Masquerading:            BPF (ip-masq-agent)   [eth0]   10.0.0.0/9 [IPv4: Enabled, IPv6: Disabled]
Controller Status:       27/27 healthy
Proxy Status:            OK, ip 10.21.3.164, 0 redirects active on ports 10000-20000
Global Identity Range:   min 256, max 65535
Hubble:                  Ok   Current/Max Flows: 412/4095 (10.06%), Flows/s: 1.81   Metrics: Ok
Encryption:              Disabled
Cluster health:          16/16 reachable   (2022-08-10T14:51:45Z)

Test Pod Connectivity Between Clusters


We have successfully connected the clusters. Now I am going to validate connectivity by running the connectivity test in multi-cluster mode:

~ $kubectl label ns cilium-test  pod-security.kubernetes.io/enforce=privileged --context $CLUSTER1
namespace/cilium-test labeled
~ $kubectl label ns cilium-test  pod-security.kubernetes.io/enforce=privileged --context $CLUSTER2
namespace/cilium-test labeled
---

~ $cilium connectivity test --context $CLUSTER1 --multi-cluster $CLUSTER2
ℹī¸  Monitor aggregation detected, will skip some flow validation steps
✨ [dev] Deploying echo-same-node service...
✨ [dev] Deploying echo-other-node service...
✨ [dev] Deploying DNS test server configmap...
✨ [qa] Deploying DNS test server configmap...
✨ [dev] Deploying same-node deployment...
......
......
📋 Test Report
✅ 0/11 tests failed (36/160 actions), 0 tests skipped, 0 scenarios skipped:

We can already test service discovery with just Cilium Cluster Mesh deployed; however, it relies on sameness: the same namespace and Service must exist in both clusters:

Deployments are created following https://docs.cilium.io/en/latest/installation/clustermesh/services/#deploying-a-simple-example-service
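The key detail in that example is that the rebel-base Service carries a Cilium annotation marking it as global, which tells Cilium to load-balance requests across the endpoints of both clusters. A sketch of the Service, using the annotation name from the Cilium 1.12 docs and the port/selector of the upstream example (the same Service, in the same namespace, must exist in both clusters):

kubectl apply --context $CLUSTER1 -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: rebel-base
  annotations:
    io.cilium/global-service: "true"   # share this service's endpoints across the cluster mesh
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    name: rebel-base
EOF
# repeat with --context $CLUSTER2 so that the global service exists in both clusters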

~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
~ $kubectl exec -ti deployment/x-wing -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}

We can see replies from pods in both clusters.

Istio Multi-Primary multicluster model

Source: https://istio.io/latest/docs/setup/install/multicluster/multi-primary/

We will install the Istio control plane on both cluster1 and cluster2, making each of them a primary cluster. Both clusters are on network1, so pods in the two clusters can communicate directly; this connectivity has already been established through the Cilium Cluster Mesh.

Configure Trust


A multicluster service mesh deployment requires establishing trust between all clusters in the mesh. Here this is done by using a common root to generate intermediate certificates for each cluster.

This part demonstrates how to generate and plug in the certificates and keys for the Istio CA. These steps can be repeated to provision certificates and keys for the Istio CAs running in each cluster.

git clone https://github.com/istio/istio.git
cd istio
git checkout tags/1.15.0-beta.0 # we need to use a pre-release of Istio because of this commit: https://github.com/istio/istio/commit/31fce3d75686d0f14504c758b83f5bc239cf35bd
istio $cat pkg/security/security.go | grep Talos
                "/etc/ssl/certs/ca-certificates",                    // Talos Linux
---

mkdir -p certs
pushd certs
make -f ../tools/certs/Makefile.selfsigned.mk root-ca
make -f ../tools/certs/Makefile.selfsigned.mk cluster1-cacerts

certs $ls
cluster1  root-ca.conf  root-cert.csr  root-cert.pem  root-cert.srl  root-key.pem
---

kubectl create namespace istio-system --context $CLUSTER1
kubectl create namespace istio-system --context $CLUSTER2

kubectl label ns istio-system  pod-security.kubernetes.io/enforce=privileged --context $CLUSTER1
kubectl label ns istio-system  pod-security.kubernetes.io/enforce=privileged --context $CLUSTER2

# In each cluster, create a secret named cacerts from the input files ca-cert.pem, ca-key.pem, root-cert.pem and cert-chain.pem in the cluster1 folder

kubectl create secret generic cacerts -n istio-system       --from-file=cluster1/ca-cert.pem       --from-file=cluster1/ca-key.pem       --from-file=cluster1/root-cert.pem       --from-file=cluster1/cert-chain.pem --context $CLUSTER1
secret/cacerts created

kubectl create secret generic cacerts -n istio-system       --from-file=cluster1/ca-cert.pem       --from-file=cluster1/ca-key.pem       --from-file=cluster1/root-cert.pem       --from-file=cluster1/cert-chain.pem --context $CLUSTER2
secret/cacerts created
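Note that the commands above plug the single cluster1 intermediate CA into both clusters. If you prefer a distinct intermediate per cluster, the same Makefile target pattern can be repeated and the qa secret created from the resulting cluster2 folder instead; a sketch:

# still inside istio/certs
make -f ../tools/certs/Makefile.selfsigned.mk cluster2-cacerts
kubectl create secret generic cacerts -n istio-system --context $CLUSTER2 \
  --from-file=cluster2/ca-cert.pem \
  --from-file=cluster2/ca-key.pem \
  --from-file=cluster2/root-cert.pem \
  --from-file=cluster2/cert-chain.pem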

Istio deployment


In this setup, each Istio control plane monitors the API Servers in both clusters to identify endpoints. Service workloads communicate directly between pods across cluster boundaries.

To enable this, we will include the Istio CNI plugin and Istio DNS proxying in the IstioOperator manifest for each cluster. The Istio CNI plugin replaces the istio-init container and provides the same networking functionality, but without requiring application pods to run a privileged init container (the NET_ADMIN and NET_RAW capabilities).

When DNS proxying is enabled, all DNS requests from an application are redirected to the sidecar, which maintains a local map of domain names to IP addresses. If the request can be handled by the sidecar, it immediately provides a response to the application, without a roundtrip to the upstream DNS server. Combined with endpoint discovery, this means a workload in cluster1 can resolve and call a service that exists only in cluster2 (as demonstrated below with helloworld-v2.sample2), and vice versa.

istio $cat cluster1.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  hub: gcr.io/istio-release
  tag: 1.15.0-beta.0-distroless # we need to use Pre-release of Istio due to this https://github.com/istio/istio/commit/31fce3d75686d0f14504c758b83f5bc239cf35bd commit
  values:
    global:
      meshID: mesh1
      multiCluster:
        clusterName: cluster1
      network: network1
  meshConfig:
    defaultConfig:
      proxyMetadata:
        # Enable basic DNS proxying
        ISTIO_META_DNS_CAPTURE: "true"
        # Enable automatic address allocation, optional
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"
    accessLogFile: /dev/stdout
  components:
    cni:
      enabled: true
---
istio $cat cluster2.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  hub: gcr.io/istio-release
  tag: 1.15.0-beta.0-distroless
  values:
    global:
      meshID: mesh1
      multiCluster:
        clusterName: cluster2
      network: network1
  meshConfig:
    defaultConfig:
      proxyMetadata:
        # Enable basic DNS proxying
        ISTIO_META_DNS_CAPTURE: "true"
        # Enable automatic address allocation, optional
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"
    accessLogFile: /dev/stdout
  components:
    cni:
      enabled: true

Apply the configuration to dev and qa clusters:

istio $istioctl install --context="${CTX_CLUSTER1}" -f cluster1.yaml
This will install the Istio 1.15.0 default profile with ["Istio core" "Istiod" "CNI" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ CNI installed
✔ Installation complete
Making this installation the default for injection and validation.
---
istio $istioctl install --context="${CTX_CLUSTER2}" -f cluster2.yaml
This will install the Istio 1.15.0 default profile with ["Istio core" "Istiod" "CNI" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ CNI installed
✔ Installation complete
Making this installation the default for injection and validation.
---
~ $kubectl get po -n istio-system
NAME                                   READY   STATUS    RESTARTS   AGE
istio-cni-node-9j4l7                   1/1     Running   0          4m14s
istio-cni-node-9qzpx                   1/1     Running   0          4m14s
istio-cni-node-hljb5                   1/1     Running   0          4m14s
istio-cni-node-jf6nt                   1/1     Running   0          4m14s
istio-cni-node-lnwrb                   1/1     Running   0          4m14s
istio-cni-node-lxw7n                   1/1     Running   0          4m14s
istio-cni-node-n2zn2                   1/1     Running   0          4m14s
istio-cni-node-qw5z7                   1/1     Running   0          4m14s
istio-ingressgateway-749954b6f-88kpb   1/1     Running   0          4m56s
istiod-55f5c545bc-pkpqk                1/1     Running   0          5m9s

Enable Endpoint Discovery


Install a remote secret in qa that provides access to the dev API server, and vice versa.

istioctl x create-remote-secret     --context="${CTX_CLUSTER1}"     --name=cluster1 |     kubectl apply -f - --context="${CTX_CLUSTER2}"
secret/istio-remote-secret-cluster1 created

istioctl x create-remote-secret     --context="${CTX_CLUSTER2}"     --name=cluster2 |     kubectl apply -f - --context="${CTX_CLUSTER1}"
secret/istio-remote-secret-cluster2 created

istio $k logs istiod-66bd78f4dd-cqvrk -n istio-system | grep remote
2022-08-11T09:50:32.766816Z     info    klog    Config not found: /var/run/secrets/remote/config
2022-08-11T09:50:33.350483Z     info    Generating istiod-signed cert for [istiod.istio-system.svc istiod-remote.istio-system.svc istio-pilot.istio-system.svc]:
2022-08-11T09:50:33.406240Z     info    Starting multicluster remote secrets controller
2022-08-11T09:50:33.423471Z     info    multicluster remote secrets controller cache synced in 17.236681ms
2022-08-11T09:54:48.677497Z     info    processing secret event for secret istio-system/istio-remote-secret-cluster2
2022-08-11T09:54:48.677608Z     info    Adding cluster cluster2 from secret istio-system/istio-remote-secret-cluster2
2022-08-11T09:54:48.778037Z     info    ads     Push debounce stable[23] 1 for config Secret/istio-system/istio-remote-secret-cluster2: 100.581875ms since last change, 100.5817ms since last push, full=false
2022-08-11T09:54:48.896624Z     info    Number of remote clusters: 1
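As an additional check, each cluster should now hold the remote secret for the other one (names as created above):

kubectl get secret istio-remote-secret-cluster2 -n istio-system --context="${CTX_CLUSTER1}"
kubectl get secret istio-remote-secret-cluster1 -n istio-system --context="${CTX_CLUSTER2}"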

Verify Installation


The HelloWorld and Sleep samples can be found here: https://github.com/istio/istio/tree/master/samples

kubectl create --context="${CTX_CLUSTER1}" namespace sample
kubectl create --context="${CTX_CLUSTER2}" namespace sample
kubectl label --context="${CTX_CLUSTER1}" namespace sample     istio-injection=enabled
kubectl label --context="${CTX_CLUSTER2}" namespace sample     istio-injection=enabled
kubectl label ns sample  pod-security.kubernetes.io/enforce=privileged --context $CLUSTER2
kubectl label ns sample  pod-security.kubernetes.io/enforce=privileged --context $CLUSTER1
kubectl apply --context="${CTX_CLUSTER1}"     -f samples/helloworld/helloworld.yaml     -l service=helloworld -n sample
kubectl apply --context="${CTX_CLUSTER2}"     -f samples/helloworld/helloworld.yaml     -l service=helloworld -n sample
kubectl apply --context="${CTX_CLUSTER1}"     -f samples/helloworld/helloworld.yaml     -l version=v1 -n sample
kubectl apply --context="${CTX_CLUSTER2}"     -f samples/helloworld/helloworld.yaml     -l version=v2 -n sample
kubectl apply --context="${CTX_CLUSTER1}"     -f samples/sleep/sleep.yaml -n sample
kubectl apply --context="${CTX_CLUSTER2}"     -f samples/sleep/sleep.yaml -n sample

To verify that cross-cluster load balancing works as expected, we call the HelloWorld service several times using the Sleep pod. To ensure load balancing is working properly, we call the HelloWorld service from all clusters in the deployment. Repeat the request several times and verify that the HelloWorld version toggles between v1 and v2:

istio $kubectl exec --context="${CTX_CLUSTER1}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER1}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v2, instance: helloworld-v2-79bf565586-9hxgh
istio $kubectl exec --context="${CTX_CLUSTER1}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER1}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-dj4c6
istio $kubectl exec --context="${CTX_CLUSTER1}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER1}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v2, instance: helloworld-v2-79bf565586-9hxgh
istio $kubectl exec --context="${CTX_CLUSTER1}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER1}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-dj4c6
---
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-dj4c6
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v2, instance: helloworld-v2-79bf565586-9hxgh
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-dj4c6
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-dj4c6
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-dj4c6
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld.sample:5000/hello
Hello version: v2, instance: helloworld-v2-79bf565586-9hxgh

Istio multi-cluster load balancing is successfully deployed. Next, to check whether DNS proxying works properly, we need to avoid sameness between the clusters: we will create two different helloworld versions in two different namespaces, one per cluster, as sketched below.
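One way to set this up from the same sample manifests (a sketch; the per-version Service is illustrative, since the upstream sample only ships a shared helloworld Service):

# cluster1: namespace sample1 with helloworld v1 and a sleep client
kubectl create namespace sample1 --context="${CTX_CLUSTER1}"
kubectl label namespace sample1 istio-injection=enabled --context="${CTX_CLUSTER1}"
kubectl label namespace sample1 pod-security.kubernetes.io/enforce=privileged --context="${CTX_CLUSTER1}"
kubectl apply -n sample1 --context="${CTX_CLUSTER1}" -f samples/helloworld/helloworld.yaml -l version=v1
kubectl apply -n sample1 --context="${CTX_CLUSTER1}" -f samples/sleep/sleep.yaml
kubectl apply -n sample1 --context="${CTX_CLUSTER1}" -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: helloworld-v1
  labels:
    app: helloworld
spec:
  ports:
  - port: 5000
    name: http
  selector:
    app: helloworld
    version: v1
EOF
# cluster2: mirror the same steps in namespace sample2, applying version=v2 and a helloworld-v2 Service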

istio $k get po -n sample1 --context="${CTX_CLUSTER1}"
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v1-77cb56d4b4-vk5vz   2/2     Running   0          77s
sleep-69cfb4968f-gcntp           2/2     Running   0          24s
istio $k get svc -n sample1 --context="${CTX_CLUSTER1}"
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
helloworld-v1   ClusterIP   10.22.132.0     <none>        5000/TCP   78s
sleep           ClusterIP   10.22.254.182   <none>        80/TCP     26s
---

istio $k get po -n sample2 --context="${CTX_CLUSTER2}"
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v2-79bf565586-r96qs   2/2     Running   0          63s
sleep-69cfb4968f-qdlf7           2/2     Running   0          39s
istio $k get svc -n sample2 --context="${CTX_CLUSTER2}"
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
helloworld-v2   ClusterIP   10.24.196.111   <none>        5000/TCP   77s
sleep           ClusterIP   10.24.248.8     <none>        80/TCP     45s
---

# Calling helloworld-v2 service from cluster1 to cluster2
istio $kubectl exec --context="${CTX_CLUSTER1}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER1}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld-v2.sample2:5000/hello
Hello version: v2, instance: helloworld-v2-79bf565586-r96qs
istio $kubectl exec --context="${CTX_CLUSTER1}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER1}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld-v2.sample2:5000/hello
Hello version: v2, instance: helloworld-v2-79bf565586-r96qs
---

# Calling helloworld-v1 service from cluster2 to cluster1
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld-v1.sample1:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-vk5vz
istio $kubectl exec --context="${CTX_CLUSTER2}" -n sample -c sleep     "$(kubectl get pod --context="${CTX_CLUSTER2}" -n sample -l app=sleep -o jsonpath='{.items[0].metadata.name}')"     -- curl -sS helloworld-v1.sample1:5000/hello
Hello version: v1, instance: helloworld-v1-77cb56d4b4-vk5vz

Congratulations! DNS proxying and the multi-cluster mesh are successfully deployed and verified on both clusters.