CI/CD for Ilum Interactive Services with GitHub Actions
This guide demonstrates how to automate the deployment of Ilum Interactive Services using GitHub Actions with self-hosted runners running directly on your Kubernetes cluster. By combining the Ilum REST API with a CI/CD pipeline, you can deploy and update services automatically on every push or merge.
- Self-Hosted Runners: Deploy GitHub Actions Runner Controller (ARC) on Kubernetes so workflows execute inside your cluster with direct access to Ilum.
- Service Code in Git: Store your Ilum service implementation (`service.py`) in a GitHub repository.
- Automated Deployment: A GitHub Actions workflow creates/updates Ilum service groups via the REST API on every push to `main`.
- Update via Pull Requests: Modify service logic in a feature branch, open a PR, merge, and the service is automatically redeployed with the new code.
Prerequisites
- A running Ilum instance on Kubernetes (with all pods healthy)
- `kubectl` and `helm` configured for your cluster
- A GitHub account with permissions to create repositories and personal access tokens
- `cert-manager` installed on the cluster
Step 1: Install cert-manager
Actions Runner Controller requires cert-manager for TLS certificate management. Install it if not already present:
```shell
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml
```
Wait for all cert-manager pods to be ready:
```shell
kubectl wait --for=condition=Ready pods --all -n cert-manager --timeout=120s
```
Step 2: Create a GitHub Personal Access Token
The self-hosted runner needs a GitHub token to register itself with your repository.
- Navigate to GitHub → Settings → Developer Settings → Personal access tokens → Tokens (classic).
- Click Generate new token (classic).
- Set a descriptive name (e.g.,
ServiceManagementToken). - Select the
reposcope (full control of private repositories). - Click Generate token and copy the value immediately — you won't see it again.
Step 3: Install Actions Runner Controller (ARC)
3.1 Add the Helm Repository
```shell
helm repo add actions-runner-controller \
  https://actions-runner-controller.github.io/actions-runner-controller
```
3.2 Create the Namespace and Secret
```shell
# Create namespace for runners
kubectl create ns arc-runners

# Store the GitHub token as a Kubernetes secret
kubectl create secret generic github-token \
  --namespace arc-runners \
  --from-literal=github_token=<YOUR_GITHUB_TOKEN>
```
3.3 Install ARC via Helm
```shell
helm install arc \
  actions-runner-controller/actions-runner-controller \
  --namespace arc-runners \
  --set authSecret.create=false \
  --set authSecret.name=github-token
```
Step 4: Create a Runner Deployment
Create a `runner-deployment.yaml` file that tells ARC to spawn a self-hosted runner for your repository:
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: ilum-demo-runner
  namespace: arc-runners
spec:
  replicas: 1
  template:
    spec:
      repository: <your-org>/IlumServiceManagment
      labels:
        - self-hosted
        - linux
```
Apply it:
```shell
kubectl apply -f runner-deployment.yaml
```
The runner pod will register itself with GitHub and appear in your repository's Settings → Actions → Runners as an available self-hosted runner.
Step 5: Create the Service Repository
Create a new GitHub repository (e.g., IlumServiceManagment) with the following structure:
```
IlumServiceManagment/
├── service.py              # Ilum service implementation
└── .github/
    └── workflows/
        └── deploy.yml      # GitHub Actions workflow
```
5.1 Service Implementation (service.py)
This file contains the Ilum Interactive Service code. The service extends `IlumJob` and implements the `run` method:
```python
from ilum.api import IlumJob
from pyspark.sql.functions import col, sum as spark_sum


class SparkInteractiveExample(IlumJob):
    def run(self, spark, config) -> str:
        table_name = config.get('table')
        database_name = config.get('database')
        report_lines = []

        if not table_name:
            raise ValueError("Config must provide a 'table' key")

        if database_name:
            spark.catalog.setCurrentDatabase(database_name)
            report_lines.append(f"Using database: {database_name}")

        if table_name not in [t.name for t in spark.catalog.listTables()]:
            raise ValueError(f"Table '{table_name}' not found in catalog")

        df = spark.table(table_name)
        report_lines.append(f"=== Details for table: {table_name} ===")

        # Basic dimensions
        total_rows = df.count()
        report_lines.append(f"Total rows: {total_rows}")
        total_columns = len(df.columns)
        report_lines.append(f"Total columns: {total_columns}")

        # Cardinality of each column
        report_lines.append("Distinct values per column:")
        for c in df.columns:
            distinct_count = df.select(c).distinct().count()
            report_lines.append(f"  {c}: {distinct_count}")

        # Schema
        report_lines.append("Schema:")
        for f in df.schema.fields:
            report_lines.append(f"  {f.name}: {f.dataType}")

        # Sample rows
        report_lines.append("Sample data (first 5 rows):")
        for row in df.take(5):
            report_lines.append(str(row.asDict()))

        # Null counts for all columns, computed in a single pass
        report_lines.append("Null counts per column:")
        null_counts_df = df.select(
            [spark_sum(col(c).isNull().cast("int")).alias(c) for c in df.columns]
        )
        null_counts = null_counts_df.collect()[0].asDict()
        for c, v in null_counts.items():
            report_lines.append(f"  {c}: {v}")

        return "\n".join(report_lines)
```
5.2 GitHub Actions Workflow (.github/workflows/deploy.yml)
This workflow runs on every push to main and uses the Ilum REST API to deploy the service:
```yaml
name: Deploy ILUM Group

on:
  push:
    branches: [main]

env:
  ILUM_DEMO_SERVICE: ILUM_DEMO_SERVICE

jobs:
  deploy:
    runs-on: self-hosted
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Check if group ILUM_DEMO_SERVICE exists
        id: check_group
        run: |
          echo "Checking if group ILUM_DEMO_SERVICE exists..."
          RESPONSE=$(curl -s -X GET \
            -w "\nHTTP_STATUS:%{http_code}" \
            "http://ilum-core.ilum.svc.cluster.local:9888/api/v1/group?name=$ILUM_DEMO_SERVICE")
          HTTP_STATUS=$(echo "$RESPONSE" | grep HTTP_STATUS | cut -d':' -f2)
          BODY=$(echo "$RESPONSE" | sed '/HTTP_STATUS/d')
          GROUP_ID=$(echo "$BODY" | jq -r '.content[0].groupId // empty')
          if [ -n "$GROUP_ID" ] && [ "$GROUP_ID" != "null" ]; then
            echo "Group ILUM_DEMO_SERVICE found with ID: $GROUP_ID"
            echo "GROUP_ID=$GROUP_ID" >> $GITHUB_OUTPUT
            echo "GROUP_EXISTS=true" >> $GITHUB_OUTPUT
          else
            echo "Group ILUM_DEMO_SERVICE does not exist."
            echo "GROUP_EXISTS=false" >> $GITHUB_OUTPUT
          fi

      - name: Delete existing group
        if: steps.check_group.outputs.GROUP_EXISTS == 'true'
        run: |
          GROUP_ID="${{ steps.check_group.outputs.GROUP_ID }}"
          echo "Stopping and deleting group ILUM_DEMO_SERVICE (ID: $GROUP_ID)..."
          curl -s -X POST \
            http://ilum-core.ilum.svc.cluster.local:9888/api/v1/group/$GROUP_ID/stop
          RESPONSE=$(curl -s -X DELETE \
            -w "\nHTTP_STATUS:%{http_code}" \
            http://ilum-core.ilum.svc.cluster.local:9888/api/v1/group/$GROUP_ID)
          HTTP_STATUS=$(echo "$RESPONSE" | grep HTTP_STATUS | cut -d':' -f2)
          if [ "$HTTP_STATUS" -ne 200 ]; then
            echo "Error: Failed to delete group (Status: $HTTP_STATUS)"
            exit 1
          fi
          echo "Group ILUM_DEMO_SERVICE deleted successfully."

      - name: Create new group
        run: |
          echo "Creating group ILUM_DEMO_SERVICE with service.py..."
          RESPONSE=$(curl -s -X POST \
            -F "name=ILUM_DEMO_SERVICE" \
            -F "[email protected]" \
            -F "clusterName=default" \
            -F "language=PYTHON" \
            -w "\nHTTP_STATUS:%{http_code}" \
            http://ilum-core.ilum.svc.cluster.local:9888/api/v1/group)
          HTTP_STATUS=$(echo "$RESPONSE" | grep HTTP_STATUS | cut -d':' -f2)
          BODY=$(echo "$RESPONSE" | sed '/HTTP_STATUS/d')
          echo "HTTP Status: $HTTP_STATUS"
          echo "Response Body: $BODY"
          if [ "$HTTP_STATUS" -ne 200 ]; then
            echo "Error: Failed to create group (Status: $HTTP_STATUS)"
            exit 1
          fi
          GROUP_ID=$(echo "$BODY" | jq -r '.groupId // empty')
          echo "Group ILUM_DEMO_SERVICE created successfully with ID: $GROUP_ID"
```
Commit both files to the repository.
Since the runner pod runs inside the Kubernetes cluster, it can directly access the Ilum Core API via the internal service DNS (ilum-core.ilum.svc.cluster.local:9888). The workflow:
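The status/body handling that each workflow step performs with `grep`, `cut`, `sed`, and `jq` can be sketched in Python for local experimentation. The helper names below are illustrative, not part of Ilum or the workflow:

```python
import json


def split_curl_response(raw: str):
    """Split output captured with curl -w "\\nHTTP_STATUS:%{http_code}"
    into (status_code, body), mirroring the grep/cut and sed logic
    used in the workflow steps above."""
    status = None
    body_lines = []
    for line in raw.splitlines():
        if line.startswith("HTTP_STATUS:"):
            status = int(line.split(":", 1)[1])
        else:
            body_lines.append(line)
    return status, "\n".join(body_lines)


def extract_group_id(body: str):
    """Equivalent of jq -r '.content[0].groupId // empty' on the
    GET /api/v1/group?name=... response body."""
    try:
        content = json.loads(body).get("content") or []
        return content[0].get("groupId")
    except (ValueError, AttributeError, IndexError):
        # Malformed JSON or an empty result list means "no group"
        return None
```

Running `split_curl_response` on a captured response and feeding the body to `extract_group_id` reproduces the GROUP_EXISTS decision made in the first workflow step.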
- Checks if a service group named `ILUM_DEMO_SERVICE` already exists.
- Deletes the existing group (stops it first, then removes it).
- Creates a new group by uploading `service.py` via the REST API.
Step 6: Verify the Deployment
Once the workflow completes successfully (visible in the Actions tab on GitHub), navigate to the Ilum UI:
- Go to Services in the sidebar.
- You should see the `ILUM_DEMO_SERVICE` service group with status Active.
- Click on it and go to the Execute Job tab.
- Set the Class to `service.SparkInteractiveExample`.
- Provide parameters as JSON:
```json
{
  "database": "ilum_example_product_sales",
  "table": "products"
}
```
- Click Execute. The result will display table details including row count, schema, sample data, and null counts.
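The same execution can also be scripted against the REST API instead of the UI. The endpoint path and payload field names below (`/job/execute`, `jobClass`, `jobConfig`) are assumptions based on the group API used in the workflow; verify them against the API reference of your Ilum version:

```python
import json
from urllib import request

# Internal service DNS, as used by the workflow
ILUM_API = "http://ilum-core.ilum.svc.cluster.local:9888/api/v1"


def build_execute_payload(job_class: str, params: dict) -> bytes:
    # Field names here are assumptions; check your Ilum API reference.
    return json.dumps({"jobClass": job_class, "jobConfig": params}).encode()


def execute_job(group_id: str, job_class: str, params: dict) -> str:
    # Hypothetical endpoint; confirm the exact path in the Ilum docs.
    req = request.Request(
        f"{ILUM_API}/group/{group_id}/job/execute",
        data=build_execute_payload(job_class, params),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.read().decode()
```

Calling `execute_job(group_id, "service.SparkInteractiveExample", {"database": "ilum_example_product_sales", "table": "products"})` from inside the cluster would mirror the UI steps above.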
Step 7: Update the Service via Pull Request
The key benefit of this CI/CD approach is that updating your service is as simple as modifying code and merging a pull request.
Example: Adding Table Listing Feature
- Create a new branch (e.g., `feature/list-tables`) from `main`.
- Edit `service.py` to add functionality, for example listing other tables in the database:
```python
# Appended at the end of run(), replacing the final return statement

# List other tables in the current database
current_db = spark.catalog.currentDatabase()
report_lines.append(f"Tables in database '{current_db}':")
tables = spark.catalog.listTables(current_db)
other_tables = [t.name for t in tables if t.name != table_name]
if other_tables:
    for tname in other_tables:
        report_lines.append(f"  {tname}")
else:
    report_lines.append("  (no other tables)")

return "\n".join(report_lines)
```
- Commit the change and open a Pull Request from `feature/list-tables` into `main`.
- Merge the PR.
- The GitHub Actions workflow triggers automatically, stops the old service, and deploys the updated version.
- Execute the job again in Ilum — the result now includes the new "Tables in database" section.
Troubleshooting
Runner pod not registering with GitHub
- Verify the GitHub token secret is correct: `kubectl get secret github-token -n arc-runners -o yaml`
- Check the runner pod logs: `kubectl logs -n arc-runners -l app=ilum-demo-runner`
- Ensure the `repository` field in `runner-deployment.yaml` matches your GitHub repository exactly (including organization/user).
Workflow fails with connection refused to ilum-core
The self-hosted runner must be in the same Kubernetes cluster as Ilum. Verify:
- The Ilum namespace and service name are correct. Use `kubectl get svc -n ilum` to find the correct service DNS.
- The runner pod has network access to the Ilum namespace. Check for NetworkPolicies that may block cross-namespace traffic.
Service group creation returns HTTP 400
- Ensure the `service.py` file contains a valid class extending `IlumJob`.
- Check that `clusterName` matches an existing Ilum cluster (usually `default`).
- Verify the file is being uploaded correctly with `-F "[email protected]"`.
Job execution returns "Table not found in catalog"
Make sure the database and table parameters match existing data in your Ilum environment. You can check available databases and tables via the SQL module or a Jupyter notebook.
Frequently Asked Questions (FAQ)
Why use self-hosted runners instead of GitHub-hosted runners?
GitHub-hosted runners run in GitHub's cloud and cannot access your Kubernetes cluster's internal services. Self-hosted runners (via ARC) run as pods inside your cluster, giving them direct network access to Ilum's internal API (ilum-core.ilum.svc.cluster.local).
Can I use this approach with GitLab CI or other CI/CD systems?
Yes. The core concept — calling the Ilum REST API from a CI/CD pipeline — works with any CI/CD system. You would need to use a GitLab Runner on Kubernetes (or similar) and adapt the workflow syntax to your CI/CD platform.
How do I scale the number of runners?
Increase the `replicas` field in `runner-deployment.yaml`. ARC also supports `HorizontalRunnerAutoscaler` for auto-scaling based on workflow queue depth.
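A minimal autoscaler sketch for the runner deployment from Step 4 is shown below. The thresholds and scale factors are illustrative defaults, and the exact metric options vary by ARC version; consult the ARC documentation before applying:

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: ilum-demo-runner-autoscaler
  namespace: arc-runners
spec:
  scaleTargetRef:
    name: ilum-demo-runner   # the RunnerDeployment from Step 4
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: "0.75"    # scale up when 75% of runners are busy
      scaleDownThreshold: "0.25"
      scaleUpFactor: "2"
      scaleDownFactor: "0.5"
```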