Networking across clusters

SkyShift builds a network mesh between clusters so that jobs and services can communicate with each other. This applies in two scenarios:

1. A service’s replicas are scheduled on multiple clusters due to resource constraints, while the frontend serves requests and balances the load from one of the clusters.

2. Jobs deployed across clusters need to communicate to fulfil a larger application goal.

This network mesh is constructed on demand using ClusterLink. ClusterLink simplifies connectivity between application services located in different domains, networks, and cloud infrastructures. It deploys a set of unprivileged gateways that serve connections to and from services according to policies defined in the management plane. To applications running in a local cluster, the ClusterLink gateways represent the remotely deployed services, acting as L4 proxies. On connection establishment, the control plane components in the source and target ClusterLink gateways validate the connection and establish a secure mTLS channel based on the specified policies.
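The policy check that a gateway performs before admitting a connection can be pictured with a small sketch. This is only an illustration of the idea and not ClusterLink's actual API or policy schema: the `AccessPolicy` record, the `is_connection_allowed` function, the cluster and service names, and the "deny wins, no match means reject" semantics are all assumptions made for this example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy record; not ClusterLink's real schema."""
    source_cluster: str  # glob pattern matched against the calling cluster
    target_service: str  # glob pattern matched against the requested service
    allow: bool          # True to admit the connection, False to reject it


def is_connection_allowed(policies, source_cluster, target_service):
    """Decide whether a cross-cluster connection may proceed.

    Assumed semantics for this sketch: a matching deny rule always wins,
    and a connection with no matching allow rule is rejected.
    """
    matching = [
        p for p in policies
        if fnmatchcase(source_cluster, p.source_cluster)
        and fnmatchcase(target_service, p.target_service)
    ]
    if any(not p.allow for p in matching):
        return False
    return any(p.allow for p in matching)


# Illustrative policies: replicas are reachable from any "cluster-*" peer,
# except one peer that is denied access to every service.
policies = [
    AccessPolicy("cluster-*", "inference-replica", allow=True),
    AccessPolicy("cluster-untrusted", "*", allow=False),
]
```

With these illustrative policies, a call such as `is_connection_allowed(policies, "cluster-a", "inference-replica")` is admitted, while a request from `cluster-untrusted`, or a request for a service that no policy mentions, is rejected. In the real system this decision is followed by the mTLS handshake between the two gateways, which the sketch does not model.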

Future Enhancements

Currently, SkyShift supports ClusterLink deployment and operation on Kubernetes clusters only. Looking ahead, SkyShift aims to support inter-cluster communication for workloads deployed on SLURM and Ray clusters as well.