Create and Connect a Remote GKE Cluster to Data Lakehouse
Introduction
Ilum empowers you to manage a powerful multi-cluster setup from a single, central control plane. While Ilum automates the deployment and configuration of its core components, setting up the underlying infrastructure requires precise coordination.
This guide provides a comprehensive walkthrough for setting up a multi-cluster architecture on Google Kubernetes Engine (GKE). You will learn how to:
- Provision a central control plane (Master Cluster).
- Set up a dedicated execution environment (Remote Cluster).
- Establish secure communication between them using client certificates and ingress rules.
By the end, you will have launched your first Ilum job on the remote cluster. GKE is used as the example environment, but the same flow applies to any Kubernetes distribution.
Prerequisites
Before starting the tutorial, make sure you have:
- Access to a Google Cloud project with billing enabled.
- `kubectl` installed and configured on your machine (a version compatible with your GKE clusters).
- Helm installed (v3 or later).
- The Google Cloud CLI (`gcloud`) installed and initialized (you can run `gcloud auth login` and `gcloud config list` without errors).
- The `gke-gcloud-auth-plugin` installed and available in your `PATH` so that `kubectl` can authenticate to GKE clusters.
- Permissions in the target Google Cloud project to:
  - create and manage GKE clusters (e.g., Kubernetes Engine Cluster Admin or equivalent),
  - create and use Cloud Storage buckets if you plan to use GCS for data.
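Before moving on, you can sanity-check these prerequisites from your terminal. This is a quick sketch; exact version output varies by installation:

```shell
# Verify the Google Cloud CLI is installed and authenticated
gcloud auth list
gcloud config list

# Verify kubectl and Helm (Helm must be v3 or later)
kubectl version --client
helm version

# Verify the GKE auth plugin is on your PATH
gke-gcloud-auth-plugin --version
```

If any of these commands fails, install or initialize the missing tool before continuing.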
What you'll accomplish in this guide:
| Step | Task | Purpose |
|---|---|---|
| 1 | Create two GKE clusters | Set up master (control plane) and remote (job execution) |
| 2 | Install Ilum on master | Deploy Ilum's core components |
| 3 | Set up authentication | Create secure credentials for remote cluster access |
| 4 | Register remote cluster | Add cluster to Ilum's management interface |
| 5 | Configure networking | Enable communication between clusters |
| 6 | Run your first job | Verify the multi-cluster setup works |
Step 1. Provision Master and Remote GKE Clusters
The foundation of a multi-cluster setup consists of two distinct entities:
- Master Cluster: Hosts the Ilum control plane (UI, API, scheduler).
- Remote Cluster: Provides a dedicated environment where Ilum executes the Spark jobs dispatched from the master.
Create a project
- Open Google Cloud Console.
- Click the Project selector in the top-left corner.
- Click New Project.
- Enter a project name and (if applicable) select an Organization/Folder.
- Click Create.
- Select the newly created project in the project selector.
Enable Google Kubernetes Engine API
- In the Console search bar, type Kubernetes Engine.
- Open Kubernetes Engine.
- Click Enable to enable the Google Kubernetes Engine API for the selected project.
Switch to the chosen project in gcloud
- In the Console, open the project selector and copy the Project ID.
- In your terminal, set this project as active:

  ```shell
  gcloud config set project PROJECT_ID
  ```
- (Optional) If you often create clusters in a specific region, set a default region:

  ```shell
  gcloud config set compute/region europe-central2
  ```

  This saves you from passing `--region` / `--zone` on every command.
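To confirm the configuration took effect, you can print the active values back:

```shell
# Show the currently active project and default region
gcloud config get-value project
gcloud config get-value compute/region
```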
Create a cluster
Create the master cluster first:

```shell
gcloud container clusters create master-cluster \
  --machine-type=n1-standard-8 \
  --num-nodes=1
```
Create the remote cluster with a different name:

```shell
gcloud container clusters create remote-cluster \
  --machine-type=n1-standard-4 \
  --num-nodes=1
```
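Once both clusters are provisioned, fetch their credentials so `kubectl` can reach each of them. The location below is an example; use whatever region or zone your clusters were actually created in, and note that GKE context names follow the pattern `gke_PROJECT-ID_LOCATION_CLUSTER-NAME`:

```shell
# Add kubeconfig entries for both clusters
gcloud container clusters get-credentials master-cluster --region europe-central2
gcloud container clusters get-credentials remote-cluster --region europe-central2

# List the resulting contexts and switch to the master cluster
kubectl config get-contexts
kubectl config use-context gke_PROJECT_ID_europe-central2_master-cluster
```

Keeping both contexts in one kubeconfig makes it easy to switch between the master and remote clusters throughout the rest of this guide.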
Resource Requirements & Architecture:
Why two clusters? The master cluster runs Ilum's control plane (UI, API, scheduler). The remote cluster executes your Spark jobs. This separation allows independent scaling and multi-cluster management from one interface.
Sizing:
- Master cluster: This example uses `n1-standard-8` (8 vCPU, 30 GB RAM) for testing only. Minimum recommended: 12 vCPUs and 48 GB RAM (e.g., `n1-standard-16`, or a custom machine type with 12 vCPUs). Production environments with many users need significantly more.
- Remote cluster: This example uses `n1-standard-4` (4 vCPU, 15 GB RAM) for testing only. Production workloads require larger machines (e.g., `n1-standard-16` or larger) and multiple nodes, depending on your Spark job requirements.
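For production-style sizing, node autoscaling lets the remote cluster grow with Spark demand. The sketch below is illustrative only; the machine type and node counts are example values, not a recommendation:

```shell
# Example: a remote cluster that autoscales between 1 and 10 nodes
gcloud container clusters create remote-cluster \
  --machine-type=n1-standard-16 \
  --num-nodes=3 \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=10
```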
Step 2. Install Ilum Control Plane on Master Cluster
Once your clusters are running, the next step is to deploy the Ilum platform on the master cluster.
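As a preview of this step, the installation typically follows the standard Helm flow. The repository URL and chart name below follow Ilum's public documentation at the time of writing; verify them against the current Ilum docs before running:

```shell
# Make sure kubectl points at the master cluster before installing
kubectl config use-context gke_PROJECT_ID_europe-central2_master-cluster

# Add the Ilum Helm repository and install the chart
helm repo add ilum https://charts.ilum.cloud
helm repo update
helm install ilum ilum/ilum --namespace ilum --create-namespace
```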