
Production

For production environments, it's recommended to deploy all dependencies in separate namespaces.

Prerequisites

The table below provides necessary prerequisites and related instructions.

| Prerequisite | Instruction |
| --- | --- |
| MongoDB | Refer to https://bitnami.com/stack/mongodb/helm |
| Kafka | Refer to https://bitnami.com/stack/kafka/helm |
| Object storage | Refer to https://min.io/docs/minio/kubernetes/upstream/operations/installation.html |

ilum-core

helm install ilum-core \
  --create-namespace -n ilum \
  --set mongo.instances=<mongo uri> \
  --set kafka.address=<kafka broker address> \
  --set s3a.host=<s3 host> \
  --set s3a.port=<s3 port> \
  ilum/ilum-core
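
As an illustration, the placeholders in the ilum-core command could be filled in as sketched below. The service addresses are hypothetical examples, not defaults shipped by Ilum, and the exact format your MongoDB and Kafka deployments expect may differ. The HELM variable and the install_ilum_core function are helper names introduced here so the command can be previewed before it is run.

```shell
# Hypothetical endpoints; replace them with the addresses of your own services.
MONGO_URI="mongodb.mongo.svc.cluster.local:27017"
KAFKA_ADDRESS="kafka.kafka.svc.cluster.local:9092"
S3_HOST="minio.minio.svc.cluster.local"
S3_PORT="9000"

# Runs the ilum-core installation with the values above.
# HELM may be overridden (for example HELM=echo) to print the command instead of running it.
install_ilum_core() {
  "${HELM:-helm}" install ilum-core \
    --create-namespace -n ilum \
    --set "mongo.instances=${MONGO_URI}" \
    --set "kafka.address=${KAFKA_ADDRESS}" \
    --set "s3a.host=${S3_HOST}" \
    --set "s3a.port=${S3_PORT}" \
    ilum/ilum-core
}
```

Setting HELM=echo and calling install_ilum_core prints the assembled command for review; calling it without the override performs the installation.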

ilum-ui

helm install ilum-ui --create-namespace -n ilum ilum/ilum-ui

MongoDB

Ilum uses MongoDB as its storage layer: all data that must survive a restart is kept in MongoDB. Ilum automatically creates the necessary databases and collections at startup.

Apache Kafka

Apache Kafka serves as Ilum's communication layer, facilitating interaction between Ilum-Core and Spark jobs, as well as between different Ilum-Core instances when scaled. It is critical to ensure Apache Kafka brokers are accessible by both Ilum-Core and Spark jobs, especially when Spark jobs are launched on a different Kubernetes cluster.
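
A quick way to check the accessibility requirement is to attempt a TCP connection to a broker from each place where Ilum-Core and Spark jobs run, for example from a shell inside a pod on each cluster. The sketch below assumes bash and the coreutils timeout command are available there; check_broker is a helper name introduced for this example. An open port only shows that the broker can be reached at the network level. It does not prove that the addresses Kafka advertises to clients are resolvable from that location.

```shell
# check_broker HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash must be installed.
check_broker() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "reachable: $1:$2"
  else
    echo "unreachable: $1:$2"
    return 1
  fi
}
```

For example, check_broker kafka.kafka.svc.cluster.local 9092 tests a hypothetical in-cluster broker address; substitute the value you pass as kafka.address.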

Ilum communicates over several Kafka topics, all of which it creates at startup, so users don't need to manage these topics manually.

MinIO

Ilum uses MinIO as the storage layer for Spark application components. All files (including jars, configurations, data files) needed for the operation of Spark components (driver, executors) are stored and made available for download via MinIO.

MinIO implements the S3 interface, which also enables it to store input/output data.

Security keys

This application uses JSON Web Tokens (JWT) for authentication purposes. By default, the application employs an RSA key pair, which is randomly generated at runtime, to sign these tokens.

In its standard configuration, the application creates a fresh RSA key pair each time it starts. This approach simplifies local development and testing by automatically handling the key generation process. However, it must be emphasized that this approach is not suitable for a production environment.

The primary issue with using randomly generated keys in a production environment is the lack of persistence. Each time the application restarts, it generates a new RSA key pair, invalidating all previously issued tokens. This could lead to an abrupt and unanticipated logout for all users, disrupting user experience and potentially leading to data loss.

Generate private key

For a production environment, a stable and secure key pair should be manually generated and used consistently. This ensures that tokens remain valid across multiple application restarts, thus providing a consistent user experience.

You can generate an RSA key pair manually using tools like OpenSSL. A common command to generate a 2048-bit RSA private key is as follows:

openssl genpkey -algorithm RSA \
-pkeyopt rsa_keygen_bits:2048 \
-pkeyopt rsa_keygen_pubexp:65537 | \
openssl pkcs8 -topk8 -nocrypt -outform pem > private-key.p8

The contents of the private key should look like the following:

-----BEGIN PRIVATE KEY-----
MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQCsRnE83rm6BJya
nTyzVqX0SG+D4zBjkyWsOmGG+CoDdgQ6Z8AaocmnjP1SbRykQsQSMf6SeW+fdpH+
ccmzuHe7pZIa2o2Mg8xbk/UszJDaPztwoQbUt/2gHi/rZP8cIVkquzhnN/yxrMls
...
-----END PRIVATE KEY-----

To use the private key as the value of the security.jwt.privateKey setting, remove the header (-----BEGIN PRIVATE KEY-----) and footer (-----END PRIVATE KEY-----) lines from the key.

Generate public key

To generate the corresponding public key, use:

openssl pkey -pubout -inform pem -outform pem -in private-key.p8 -out public-key.spki

The contents of the public key should look like the following:

-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArEZxPN65ugScmp08s1al
9Ehvg+MwY5MlrDphhvgqA3YEOmfAGqHJp4z9Um0cpELEEjH+knlvn3aR/nHJs7h3
u6WSGtqNjIPMW5P1LMyQ2j87cKEG1Lf9oB4v62T/HCFZKrs4Zzf8sazJbMN3E/mJ
...
-----END PUBLIC KEY-----

To use the public key as the value of the security.jwt.publicKey setting, remove the header (-----BEGIN PUBLIC KEY-----) and footer (-----END PUBLIC KEY-----) lines from the key.
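
Removing the header and footer by hand is error-prone, so the step can be scripted. The sketch below introduces a hypothetical helper, pem_body, that drops the BEGIN/END lines of a PEM file. It also joins the remaining base64 lines into one line, which is an assumption made here so the value fits in a single --set argument. Confirm the format your Ilum version expects.

```shell
# pem_body FILE
# Prints the base64 body of a PEM file without its BEGIN/END lines.
# Joining the body into a single line is an assumption, not a documented requirement.
pem_body() {
  grep -v -e '^-----' "$1" | tr -d '\r\n'
}
```

With the files generated above, pem_body private-key.p8 and pem_body public-key.spki print the values for security.jwt.privateKey and security.jwt.publicKey. If these settings map directly to chart values of the same name (an assumption), they could be passed as --set security.jwt.privateKey="$(pem_body private-key.p8)". Keep in mind that a key passed on the command line may end up in shell history.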

Starter Kits

Ilum provides several starter kits to facilitate the integration and deployment of various services.

ilum-livy-proxy

To include the Ilum-Livy-Proxy service in your Ilum installation, add the --set ilum-livy-proxy.enabled=true flag to your installation command.

Jupyter

Note that Jupyter Notebook is not bundled in the Ilum package by default. To run this service, add --set ilum-jupyter.enabled=true to your installation command.

To access the Jupyter UI, configure an ingress or use port forwarding:

kubectl port-forward svc/ilum-jupyter 8888:8888

Apache Zeppelin

Note that Zeppelin is not bundled in the Ilum package by default. To run this service, add --set ilum-zeppelin.enabled=true to your installation command.

To access the Zeppelin UI, configure an ingress or use port forwarding:

kubectl port-forward svc/ilum-zeppelin 8080:8080

Apache Airflow

Note that Airflow is not bundled in the Ilum package by default. To run this service, add --set airflow.enabled=true to your installation command.

To access the Airflow UI, configure an ingress or use port forwarding:

kubectl port-forward svc/ilum-webserver 8080:8080

Marquez

Note that Marquez is not bundled in the Ilum package by default. To run this service, add --set global.lineage.enabled=true to your installation command.

To access the Marquez UI, configure an ingress or use port forwarding:

kubectl port-forward svc/ilum-marquez-web 9444:9444

PostgreSQL

Note that PostgreSQL is not bundled in the Ilum package by default. To run this service, add --set postgresql.enabled=true to your installation command.

Kube Prometheus Stack

Note that Kube Prometheus Stack is not bundled in the Ilum package by default. To run this service, add --set kube-prometheus-stack.enabled=true to your installation command.

To access the Prometheus UI, configure an ingress or use port forwarding:

kubectl port-forward svc/prometheus-operated 9090:9090

To access the Grafana UI, configure an ingress or use port forwarding:

kubectl port-forward svc/ilum-grafana 8080:80
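
When several starter kits are enabled at once, the flags can be generated rather than typed out one by one. The helper below, enable_flags, is a hypothetical name introduced for this sketch. It prints one --set KEY.enabled=true flag per line for each key it is given, using the keys listed in the sections above.

```shell
# enable_flags KEY...
# Prints one "--set KEY.enabled=true" flag per line for each given chart key.
enable_flags() {
  for key in "$@"; do
    printf '%s\n' "--set ${key}.enabled=true"
  done
}
```

Left unquoted in a command, $(enable_flags ilum-jupyter airflow global.lineage) expands to the corresponding flags, which can then be appended to your installation command.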