Upgrade Notes

NOTE TEMPLATE

1. Change

Feature

Feature description

Values deleted - chart name

| Name | Reason |
| --- | --- |
| `helm.value` | Helm value deletion reason |

Values added - chart name

Values section description

| Name | Description | Value |
| --- | --- | --- |
| `helm.value` | Helm value description | `default value` |

⚠️⚠️⚠️ Warnings

NEXT RELEASE

RELEASE 6.2.0-RC2

1. Minio status probe addition

Feature

Added a status probe in ilum-core that checks whether the MinIO storage is ready.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `minio.statusProbe.enabled` | minio status probe enabled flag | `true` |
| `minio.statusProbe.image` | minio status probe image | `curlimages/curl:8.5.0` |
| `minio.statusProbe.baseUrl` | minio base url | `"http://ilum-minio:9000"` |
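
If your MinIO endpoint differs from the default, the probe can be redirected in a values override (a minimal sketch; the endpoint below is a placeholder):

```yaml
# ilum-core values override; baseUrl is a placeholder endpoint
minio:
  statusProbe:
    enabled: true
    image: curlimages/curl:8.5.0
    baseUrl: "http://my-minio.storage.svc:9000"
```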

2. Kyuubi configuration in ilum-core

Feature

Added Kyuubi configuration to the ilum-core helm chart. Kyuubi allows the user to execute SQL queries on many different data sources from the Ilum UI.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `kyuubi.enabled` | Kyuubi enabled flag | `true` |
| `kyuubi.url` | URL of Kyuubi's REST service | `http://ilum-sql-rest:10099` |

⚠️⚠️⚠️ Warnings

In order to properly manage SQL engines, Kyuubi's spark configuration must be passed to ilum-core. This is done by setting it in global.kyuubi.sparkConfig, which lets the user write a single configuration that is passed to both Kyuubi and ilum-core.
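
A minimal sketch of such a shared configuration (the spark settings shown are illustrative assumptions, not chart defaults):

```yaml
# ilum-aio values override; spark settings are examples only
global:
  kyuubi:
    sparkConfig:
      spark.executor.instances: "2"
      spark.executor.memory: "2g"
```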

3. MongoDb uri configuration in ilum-core

Feature

Changed the way the MongoDB URI is passed to ilum-core. It is now passed as a single string, which lets the user provide more granular configuration, such as authSource.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `mongo.uri` | MongoDB connection string | `mongodb://mongo:27017/ilum-default?replicaSet=rs0` |

Values deleted - ilum-core

| Name | Reason |
| --- | --- |
| `mongo.instances` | Unnecessary after the change |
| `mongo.replicaSetName` | Unnecessary after the change |

⚠️⚠️⚠️ Warnings

If mongo.uri is set incorrectly, the application will not work properly. Make sure to provide the correct connection string.

Previously the format was: mongodb://{ mongo.instances }/ilum-{ release_namespace }?replicaSet={ mongo.replicaSetName }. By default in the ilum-aio chart these values were:

  • mongo.instances - ilum-mongodb-0.ilum-mongodb-headless:27017,ilum-mongodb-1.ilum-mongodb-headless:27017
  • mongo.replicaSetName - rs0
  • release_namespace - default
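
Putting those defaults together, the equivalent new value for an upgraded ilum-aio installation would be:

```yaml
# ilum-core values override reproducing the previous default connection string
mongo:
  uri: "mongodb://ilum-mongodb-0.ilum-mongodb-headless:27017,ilum-mongodb-1.ilum-mongodb-headless:27017/ilum-default?replicaSet=rs0"
```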

4. Autopausing configuration in ilum-core

Feature

Added autopausing in ilum-core, which periodically checks whether any groups have been idle for the specified time and pauses them. Autopausing has to be explicitly turned on for each group for this to take place.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `job.autoPause.enabled` | Feature flag to enable auto pausing | `true` |
| `job.autoPause.period` | Interval in seconds between group idleness checks | `180` |
| `job.autoPause.idleTime` | Time in seconds a group needs to be idle to be auto paused | `3600` |
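
For example, to pause groups more aggressively (a sketch; pick intervals that match your workloads):

```yaml
# ilum-core values override
job:
  autoPause:
    enabled: true
    period: 300     # check for idle groups every 5 minutes
    idleTime: 1800  # pause groups idle for 30 minutes
```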

5. Graphite exporter in ilum-aio chart

Feature

Added a Graphite exporter to the ilum AIO chart and Graphite configuration to the ilum-core chart. The Graphite exporter is a Prometheus exporter for metrics exported in the Graphite plaintext protocol.

Values added - graphite-exporter

Newly added whole chart, check its values on the chart's page

6. Graphite configuration in ilum-core

Feature

Added Graphite configuration to the ilum-core helm chart. Graphite allows Spark jobs to send their metrics to a Graphite sink, which is then scraped by Prometheus.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `job.graphite.enabled` | Graphite enabled flag | `false` |
| `job.graphite.host` | Graphite host | `ilum-graphite-graphite-tcp` |
| `job.graphite.port` | Graphite port | `9109` |
| `job.graphite.period` | Interval between sending job metrics | `10` |
| `job.graphite.units` | Time unit | `seconds` |
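
Turning the sink on then only requires flipping the flag (a sketch; the other values repeat the defaults above):

```yaml
# ilum-core values override enabling the Graphite metrics sink
job:
  graphite:
    enabled: true
    host: ilum-graphite-graphite-tcp
    port: 9109
    period: 10
    units: seconds
```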

RELEASE 6.1.7

RELEASE 6.1.6

RELEASE 6.1.5

RELEASE 6.2.0-RC1

RELEASE 6.1.4

RELEASE 6.1.4-RC2

RELEASE 6.1.4-RC1

1. Jupyter default sparkmagic configuration change

Feature

Changed the method of passing default spark configs to the jupyter notebook; the configuration is now passed as a JSON string.

Values added - ilum-jupyter

sparkmagic configuration parameters

| Name | Description | Value |
| --- | --- | --- |
| `sparkmagic.config.sessionConfigs.conf` | sparkmagic session spark configuration | `'{ "pyRequirements": "pandas", "spark.jars.packages": "io.delta:delta-core_2.12:2.4.0", "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension", "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog"}'` |
| `sparkmagic.config.sessionConfigsDefaults.conf` | sparkmagic session defaults spark configuration | `'{ "pyRequirements": "pandas", "spark.jars.packages": "io.delta:delta-core_2.12:2.4.0", "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension", "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog"}'` |
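
In a values file the JSON string can be kept readable with a folded block scalar (a sketch; any YAML form that yields a single JSON string works):

```yaml
# ilum-jupyter values override; conf must resolve to one JSON string
sparkmagic:
  config:
    sessionConfigs:
      conf: >-
        { "pyRequirements": "pandas",
        "spark.jars.packages": "io.delta:delta-core_2.12:2.4.0",
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog" }
```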

2. Kyuubi in ilum-aio chart

Feature

Kyuubi in ilum AIO chart. Kyuubi is a distributed multi-tenant gateway providing SQL query services for data warehouses and lakehouses. It provides both JDBC and ODBC interfaces, and a REST API for clients to interact with.

Values added - ilum-kyuubi

Newly added whole chart, check its values on the chart's page

RELEASE 6.1.3

1. Jupyter configuration and persistent storage

Feature

Added extended configuration of the jupyter notebook helm chart through helm values. Moreover, persistent storage was added to the jupyter pod. All data saved in the work directory will now survive jupyter restarts and updates.

Values added - ilum-jupyter

pvc parameters

| Name | Description | Value |
| --- | --- | --- |
| `pvc.annotations` | persistent volume claim annotations | `{}` |
| `pvc.selector` | persistent volume claim selector | `{}` |
| `pvc.accessModes` | persistent volume claim accessModes | `ReadWriteOnce` |
| `pvc.storage` | persistent volume claim storage requests | `4Gi` |
| `pvc.storageClassName` | persistent volume claim storageClassName | `""` |
sparkmagic configuration parameters

| Name | Description | Value |
| --- | --- | --- |
| `sparkmagic.config.kernelPythonCredentials.username` | sparkmagic python kernel username | `""` |
| `sparkmagic.config.kernelPythonCredentials.password` | sparkmagic python kernel password | `""` |
| `sparkmagic.config.kernelPythonCredentials.auth` | sparkmagic python kernel auth mode | `"None"` |
| `sparkmagic.config.kernelScalaCredentials.username` | sparkmagic scala kernel username | `""` |
| `sparkmagic.config.kernelScalaCredentials.password` | sparkmagic scala kernel password | `""` |
| `sparkmagic.config.kernelScalaCredentials.auth` | sparkmagic scala kernel auth mode | `"None"` |
| `sparkmagic.config.kernelRCredentials.username` | sparkmagic r kernel username | `""` |
| `sparkmagic.config.kernelRCredentials.password` | sparkmagic r kernel password | `""` |
| `sparkmagic.config.waitForIdleTimeoutSeconds` | sparkmagic timeout waiting for idle state | `15` |
| `sparkmagic.config.livySessionStartupTimeoutSeconds` | sparkmagic timeout waiting for the session to start | `300` |
| `sparkmagic.config.ignoreSslErrors` | sparkmagic ignore ssl errors flag | `false` |
| `sparkmagic.config.sessionConfigs.conf` | sparkmagic session spark configuration | `[pyRequirements: pandas, spark.jars.packages: io.delta:delta-core_2.12:2.4.0, spark.sql.extensions: io.delta.sql.DeltaSparkSessionExtension, spark.sql.catalog.spark_catalog: org.apache.spark.sql.delta.catalog.DeltaCatalog]` |
| `sparkmagic.config.sessionConfigs.driverMemory` | sparkmagic session driver memory | `1000M` |
| `sparkmagic.config.sessionConfigs.executorCores` | sparkmagic session executor cores | `2` |
| `sparkmagic.config.sessionConfigsDefaults.conf` | sparkmagic session defaults spark configuration | `[pyRequirements: pandas, spark.jars.packages: io.delta:delta-core_2.12:2.4.0, spark.sql.extensions: io.delta.sql.DeltaSparkSessionExtension, spark.sql.catalog.spark_catalog: org.apache.spark.sql.delta.catalog.DeltaCatalog]` |
| `sparkmagic.config.sessionConfigsDefaults.driverMemory` | sparkmagic session defaults driver memory | `1000M` |
| `sparkmagic.config.sessionConfigsDefaults.executorCores` | sparkmagic session defaults executor cores | `2` |
| `sparkmagic.config.useAutoViz` | sparkmagic use auto viz flag | `true` |
| `sparkmagic.config.coerceDataframe` | sparkmagic coerce dataframe flag | `true` |
| `sparkmagic.config.maxResultsSql` | sparkmagic max sql results | `2500` |
| `sparkmagic.config.pysparkDataframeEncoding` | sparkmagic pyspark dataframe encoding | `utf-8` |
| `sparkmagic.config.heartbeatRefreshSeconds` | sparkmagic heartbeat refresh seconds | `30` |
| `sparkmagic.config.livyServerHeartbeatTimeoutSeconds` | sparkmagic livy server heartbeat timeout seconds | `0` |
| `sparkmagic.config.heartbeatRetrySeconds` | sparkmagic heartbeat retry seconds | `10` |
| `sparkmagic.config.serverExtensionDefaultKernelName` | sparkmagic server extension default kernel name | `pysparkkernel` |
| `sparkmagic.config.retryPolicy` | sparkmagic retry policy | `configurable` |
| `sparkmagic.config.retrySecondsToSleepList` | sparkmagic retry seconds to sleep list | `[0.2, 0.5, 1, 3, 5]` |
| `sparkmagic.config.configurableRetryPolicyMaxRetries` | sparkmagic retry policy max retries | `8` |
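
For example, to enlarge the notebook workspace and pin it to a specific storage class (a sketch; the class name is an assumption about your cluster):

```yaml
# ilum-jupyter values override; storageClassName is a placeholder
pvc:
  accessModes: ReadWriteOnce
  storage: 10Gi
  storageClassName: standard
```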

RELEASE 6.1.2

RELEASE 6.1.2-RC2

1. Hive metastore in ilum-aio chart

Feature

Added the Hive metastore to the ilum AIO chart. HMS is a central repository of metadata for Hive tables and partitions in a relational database, and provides clients (including Hive, Impala and Spark) access to this information using the metastore service API. With the hive metastore enabled in the ilum AIO helm stack, spark jobs run by ilum can be configured to access it automatically.

Values added - ilum-hive-metastore

Newly added whole chart, check its values on the chart's page

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `hiveMetastore.enabled` | flag for passing hive metastore config to ilum spark jobs | `false` |
| `hiveMetastore.address` | hive metastore address | `thrift://ilum-hive-metastore:9083` |
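
Enabling it for spark jobs is then a one-flag override (a sketch using the default address):

```yaml
# ilum-core values override
hiveMetastore:
  enabled: true
  address: thrift://ilum-hive-metastore:9083
```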

2. Postgres extensions added

Feature

A few of the ilum AIO subcharts use postgresql. To make their deployment easier to manage, we have added a postgres extensions resource that creates postgresql databases for ilum subcharts.

Values added - ilum-aio

postgresql extensions parameters

| Name | Description | Value |
| --- | --- | --- |
| `postgresExtensions.enabled` | postgres extensions enabled flag | `true` |
| `postgresExtensions.image` | image to run extensions in | `bitnami/postgresql:16` |
| `postgresExtensions.pullPolicy` | image pull policy | `IfNotPresent` |
| `postgresExtensions.imagePullSecrets` | image pull secrets | `[]` |
| `postgresExtensions.host` | postgresql database host | `ilum-postgresql-0.ilum-postgresql-hl` |
| `postgresExtensions.port` | postgresql database port | `5432` |
| `postgresExtensions.databasesToCreate` | comma separated list of databases to create | `marquez,airflow,metastore` |
| `postgresExtensions.auth.username` | postgresql account username | `ilum` |
| `postgresExtensions.auth.password` | postgresql account password | `CHANGEMEPLEASE` |
| `postgresExtensions.nodeSelector` | postgresql extensions pods node selector | `{}` |
| `postgresExtensions.tolerations` | postgresql extensions pods tolerations | `[]` |
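
A typical override changes only the credentials and the database list (a sketch; the password is a placeholder you must replace):

```yaml
# ilum-aio values override; never ship the default password
postgresExtensions:
  enabled: true
  databasesToCreate: marquez,airflow,metastore
  auth:
    username: ilum
    password: a-strong-password  # placeholder
```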

3. Loki and promtail in ilum-aio chart

Feature

Added Loki and promtail to the ilum AIO chart. Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. Promtail is an agent which ships the contents of local logs to a Grafana Loki instance. Ilum now uses loki to aggregate logs from spark job pods so that cluster resources can be cleaned up after jobs are done. Loki and promtail are preconfigured to scrape logs only from spark pods run by ilum, in order to fetch job logs after they finish.

Values added - ilum-core

log aggregation config

| Name | Description | Value |
| --- | --- | --- |
| `global.logAggregation.enabled` | ilum log aggregation flag; if enabled, Ilum will fetch logs of finished kubernetes spark pods from loki | `false` |
| `global.logAggregation.loki.url` | loki gateway address to access logs | `http://ilum-loki-gateway` |

Values added - ilum-aio

log aggregation - loki config

| Name | Description | Value |
| --- | --- | --- |
| `loki.nameOverride` | subchart name override | `ilum-loki` |
| `loki.monitoring.selfMonitoring.enabled` | self monitoring enabled flag | `false` |
| `loki.monitoring.selfMonitoring.grafanaAgent.installOperator` | self monitoring grafana agent operator install flag | `false` |
| `loki.monitoring.selfMonitoring.lokiCanary.enabled` | self monitoring canary enabled flag | `false` |
| `loki.test.enabled` | tests enabled flag | `false` |
| `loki.loki.auth_enabled` | authentication enabled flag | `false` |
| `loki.loki.storage.bucketNames.chunks` | storage chunks bucket | `ilum-files` |
| `loki.loki.storage.bucketNames.ruler` | storage ruler bucket | `ilum-files` |
| `loki.loki.storage.bucketNames.admin` | storage admin bucket | `ilum-files` |
| `loki.loki.storage.type` | storage type | `s3` |
| `loki.loki.s3.endpoint` | s3 storage endpoint | `http://ilum-minio:9000` |
| `loki.loki.s3.region` | s3 storage region | `us-east-1` |
| `loki.loki.s3.secretAccessKey` | s3 storage secret access key | `minioadmin` |
| `loki.loki.s3.accessKeyId` | s3 storage access key id | `minioadmin` |
| `loki.loki.s3.s3ForcePathStyle` | s3 storage path style access flag | `true` |
| `loki.loki.s3.insecure` | s3 storage insecure flag | `true` |
| `loki.loki.compactor.retention_enabled` | logs retention enabled flag | `true` |
| `loki.loki.compactor.deletion_mode` | deletion mode | `filter-and-delete` |
| `loki.loki.compactor.shared_store` | shared store | `s3` |
| `loki.loki.limits_config.allow_deletes` | allow logs deletion flag | `true` |

log aggregation - promtail config

| Name | Description | Value |
| --- | --- | --- |
| `promtail.config.clients[0].url` | first client url | `http://ilum-loki-write:3100/loki/api/v1/push` |
| `promtail.snippets.pipelineStages[0].match.selector` | selector of the pipeline stage that drops non-ilum logs | `{ilum_logAggregation!="true"}` |
| `promtail.snippets.pipelineStages[0].match.action` | action of the pipeline stage that drops non-ilum logs | `drop` |
| `promtail.snippets.pipelineStages[0].match.drop_counter_reason` | drop_counter_reason of the pipeline stage that drops non-ilum logs | `non_ilum_log` |
| `promtail.snippets.extraRelabelConfigs[0].action` | action of the relabel config that keeps ilum pod labels | `labelmap` |
| `promtail.snippets.extraRelabelConfigs[0].regex` | regex of the relabel config that keeps ilum pod labels | `__meta_kubernetes_pod_label_ilum(.*)` |
| `promtail.snippets.extraRelabelConfigs[0].replacement` | replacement of the relabel config that keeps ilum pod labels | `ilum${1}` |
| `promtail.snippets.extraRelabelConfigs[1].action` | action of the relabel config that keeps spark pod labels | `labelmap` |
| `promtail.snippets.extraRelabelConfigs[1].regex` | regex of the relabel config that keeps spark pod labels | `__meta_kubernetes_pod_label_spark(.*)` |
| `promtail.snippets.extraRelabelConfigs[1].replacement` | replacement of the relabel config that keeps spark pod labels | `spark${1}` |
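
With loki and promtail deployed, enabling the feature itself is a single global flag (a sketch against the defaults above):

```yaml
# ilum-aio values override
global:
  logAggregation:
    enabled: true
    loki:
      url: http://ilum-loki-gateway
```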

RELEASE 6.1.2-RC1

RELEASE 6.1.1

1. Added health checks for ilum interactive jobs

Feature

To prevent unexpected crashes of ilum groups, we added healthchecks to make sure they work as they should.

Values added - ilum-core

ilum-job parameters

| Name | Description | Value |
| --- | --- | --- |
| `job.healthcheck.enabled` | spark interactive jobs healthcheck enabled flag | `true` |
| `job.healthcheck.interval` | spark interactive jobs healthcheck interval in seconds | `300` |
| `job.healthcheck.tolerance` | spark interactive jobs healthcheck response time tolerance in seconds | `120` |
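
To check more often while tolerating slower responses, override the defaults (a sketch; tune to your jobs):

```yaml
# ilum-core values override
job:
  healthcheck:
    enabled: true
    interval: 120   # seconds between checks
    tolerance: 180  # seconds a job may take to respond
```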

2. Parameterized replica scale for ilum scalable services

Feature

The configuration of the number of replicas for ilum scalable services was extracted to helm values.

Values added - ilum-core

ilum-core common parameters

| Name | Description | Value |
| --- | --- | --- |
| `replicaCount` | number of ilum-core replicas | `1` |

Values added - ilum-ui

ilum-ui common parameters

| Name | Description | Value |
| --- | --- | --- |
| `replicaCount` | number of ilum-ui replicas | `1` |
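
When scaling out, the counts can be set per subchart (a sketch via the ilum-aio umbrella chart; the subchart keys are assumptions):

```yaml
# ilum-aio values override; subchart keys assumed to match the chart names
ilum-core:
  replicaCount: 2
ilum-ui:
  replicaCount: 2
```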

RELEASE 6.1.0

RELEASE 6.1.0-RC4

RELEASE 6.1.0-RC3

RELEASE 6.1.0-RC2

1. Deleted unneeded parameters from ilum cluster wasbs storage

Feature

WASBS storage containers no longer need a sas token provided in helm values, as it turned out to be unnecessary.

Values deleted - ilum-core

wasbs storage parameters

| Name | Reason |
| --- | --- |
| `kubernetes.wasbs.sparkContainer.name` | Moved to `kubernetes.wasbs.sparkContainer` value |
| `kubernetes.wasbs.sparkContainer.sasToken` | Turned out to be unnecessary |
| `kubernetes.wasbs.dataContainer.name` | Moved to `kubernetes.wasbs.dataContainer` value |
| `kubernetes.wasbs.dataContainer.sasToken` | Turned out to be unnecessary |

Values added - ilum-core

wasbs storage parameters

| Name | Description | Value |
| --- | --- | --- |
| `kubernetes.wasbs.sparkContainer` | default kubernetes cluster WASBS storage container name to store spark resources | `ilum-files` |
| `kubernetes.wasbs.dataContainer` | default kubernetes cluster WASBS storage container name to store ilum tables | `ilum-tables` |
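
A migration from the old layout therefore collapses each container block to a plain string:

```yaml
# ilum-core values override after the upgrade; sasToken entries are simply removed
kubernetes:
  wasbs:
    sparkContainer: ilum-files
    dataContainer: ilum-tables
```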

2. Added init containers to check service availability

Feature

To make Ilum deployments start more gracefully, Ilum pods now have init containers that wait for the availability of the services they depend on.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `mongo.statusProbe.enabled` | mongo status probe enabled flag | `true` |
| `mongo.statusProbe.image` | image of the init container that waits for mongodb to be available | `mongo:7.0.5` |
| `kafka.statusProbe.enabled` | kafka status probe enabled flag | `true` |
| `kafka.statusProbe.image` | image of the init container that waits for kafka to be available | `bitnami/kafka:3.4.1` |
| `historyServer.statusProbe.enabled` | history server's ilum-core status probe enabled flag | `true` |
| `historyServer.statusProbe.image` | image of the history server init container that waits for ilum-core to be available | `curlimages/curl:8.5.0` |

Values added - ilum-livy-proxy

| Name | Description | Value |
| --- | --- | --- |
| `statusProbe.enabled` | ilum-core status probe enabled flag | `true` |
| `statusProbe.image` | image of the init container that waits for ilum-core to be available | `curlimages/curl:8.5.0` |

Values added - ilum-ui

| Name | Description | Value |
| --- | --- | --- |
| `statusProbe.enabled` | ilum-core status probe enabled flag | `true` |
| `statusProbe.image` | image of the init container that waits for ilum-core to be available | `curlimages/curl:8.5.0` |
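
The probes can be tuned or disabled per dependency (a sketch for ilum-core; disabling is useful when readiness is guaranteed externally):

```yaml
# ilum-core values override
mongo:
  statusProbe:
    enabled: true
    image: mongo:7.0.5
kafka:
  statusProbe:
    enabled: false  # skip the kafka wait, e.g. with an external kafka
```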

3. Parameterized kafka producers in ilum-core chart

Feature

In kafka communication mode, ilum interactive jobs respond to interactive job instances using kafka producers. With the newly added helm values, the ilum-job kafka consumer can be adapted to match user needs.

Values added - ilum-core

kafka parameters

| Name | Description | Value |
| --- | --- | --- |
| `kafka.maxPollRecords` | kafka `max.poll.records` parameter for the ilum jobs kafka consumer; determines how many requests the ilum-job kafka consumer fetches with each poll | `500` |
| `kafka.maxPollInterval` | kafka `max.poll.interval.ms` parameter for the ilum jobs kafka consumer; determines the maximum delay between invocations of poll, which in the ilum-job context is the time limit for processing the requests fetched in a poll | `60000` |
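
For slow request processing, fetch fewer records per poll and allow a longer poll interval (a sketch; values depend on your workloads):

```yaml
# ilum-core values override
kafka:
  maxPollRecords: 100
  maxPollInterval: 300000  # milliseconds
```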

RELEASE 6.1.0-RC1

1. Added support for service annotations

Feature

Service annotations of the Ilum helm charts may now be configured through helm values.

Values added - ilum-core

service parameters

| Name | Description | Value |
| --- | --- | --- |
| `service.annotations` | service annotations | `{}` |
| `grpc.service.annotations` | grpc service annotations | `{}` |
| `historyServer.service.annotations` | history server service annotations | `{}` |

Values added - ilum-jupyter

service parameters

| Name | Description | Value |
| --- | --- | --- |
| `service.annotations` | service annotations | `{}` |

Values added - ilum-livy-proxy

service parameters

| Name | Description | Value |
| --- | --- | --- |
| `service.annotations` | service annotations | `{}` |

Values added - ilum-ui

service parameters

| Name | Description | Value |
| --- | --- | --- |
| `service.annotations` | service annotations | `{}` |

Values added - ilum-zeppelin

service parameters

| Name | Description | Value |
| --- | --- | --- |
| `service.annotations` | service annotations | `{}` |
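
A common use is steering a cloud load balancer (a sketch; the AWS annotation shown is only an illustrative example, not a chart default):

```yaml
# ilum-ui values override; annotation shown is an example
service:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
```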

2. Pulled out security oauth2 parameters to global values

Feature

Ilum security oauth2 configuration is now set through global values.

Values added - ilum-aio

security parameters

| Name | Description | Value |
| --- | --- | --- |
| `global.security.oauth2.clientId` | oauth2 client ID | `""` |
| `global.security.oauth2.issuerUri` | oauth2 URI that can either be an OpenID Connect discovery endpoint or an OAuth 2.0 Authorization Server Metadata endpoint defined by RFC 8414 | `""` |
| `global.security.oauth2.audiences` | oauth2 audiences | `""` |
| `global.security.oauth2.clientSecret` | oauth2 client secret | `""` |

Values deleted - ilum-core

security parameters

| Name | Reason |
| --- | --- |
| `security.oauth2.clientId` | oauth2 security parameters are now configured through global values |
| `security.oauth2.issuerUri` | oauth2 security parameters are now configured through global values |
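
A typical override against an OpenID Connect provider (a sketch; all values are placeholders for your identity provider):

```yaml
# ilum-aio values override; placeholder identity provider settings
global:
  security:
    oauth2:
      clientId: ilum
      clientSecret: change-me
      issuerUri: https://auth.example.com/realms/ilum
```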

3. Runtime environment variables for frontend

Feature

Frontend environment variables may now be configured through ilum-ui helm values.

Values added - ilum-ui

runtime variables

| Name | Description | Value |
| --- | --- | --- |
| `runtimeVars.defaultConfigMap.enabled` | default config map for frontend runtime environment variables | `true` |
| `runtimeVars.debug` | debug logging flag | `false` |
| `runtimeVars.backenUrl` | ilum-core backend url | `http://ilum-core:9888` |
| `runtimeVars.historyServerUrl` | url of history server ui | `http://ilum-history-server:9666` |
| `runtimeVars.jupyterUrl` | url of jupyter ui | `http://ilum-jupyter:8888` |
| `runtimeVars.airflowUrl` | url of airflow ui | `http://ilum-webserver:8080` |
| `runtimeVars.minioUrl` | url of minio ui | `http://ilum-minio:9001` |
| `runtimeVars.mlflowUrl` | url of mlflow ui | `http://mlflow:5000` |
| `runtimeVars.historyServerPath` | ilum-ui proxy path to history server ui | `/external/history-server/` |
| `runtimeVars.jupyterPath` | ilum-ui proxy path to jupyter ui | `/external/jupyter/lab/tree/work/IlumIntro.ipynb` |
| `runtimeVars.airflowPath` | ilum-ui proxy path to airflow ui | `/external/airflow/` |
| `runtimeVars.dataPath` | ilum-ui proxy path to minio ui | `/external/minio/` |
| `runtimeVars.mlflowPath` | ilum-ui proxy path to mlflow ui | `/external/mlflow/` |

Values deleted - ilum-ui

| Name | Reason |
| --- | --- |
| `debug` | moved to runtimeVars section |
| `backenUrl` | moved to runtimeVars section |
| `historyServerUrl` | moved to runtimeVars section |
| `jupyterUrl` | moved to runtimeVars section |
| `airflowUrl` | moved to runtimeVars section |
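
For example, pointing the frontend at external service URLs (a sketch; note the chart key really is spelled backenUrl):

```yaml
# ilum-ui values override
runtimeVars:
  debug: true
  backenUrl: http://ilum-core:9888
  jupyterUrl: http://ilum-jupyter:8888
```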

4. Kube-prometheus-stack in ilum-aio chart

Feature

Added the kube-prometheus-stack to the ilum AIO chart, preconfigured to work automatically with the ilum deployment in order to collect metrics from ilum pods and from spark jobs run by ilum. Ilum provides prometheus service monitors to automatically scrape metrics from spark driver pods run by ilum and from ilum backend services. Additionally, the ilum-aio chart provides built-in grafana dashboards that can be found in the Ilum folder.

Values added - ilum-aio

kube-prometheus-stack variables - for extended configuration check the kube-prometheus-stack helm chart

| Name | Description | Value |
| --- | --- | --- |
| `kube-prometheus-stack.enabled` | kube-prometheus-stack enabled flag | `false` |
| `kube-prometheus-stack.releaseLabel` | flag to watch resources only from the ilum-aio release | `true` |
| `kube-prometheus-stack.kubeStateMetrics.enabled` | component scraping kube state metrics enabled flag | `false` |
| `kube-prometheus-stack.nodeExporter.enabled` | node exporter daemon set deployment flag | `false` |
| `kube-prometheus-stack.alertmanager.enabled` | alert manager flag | `false` |
| `kube-prometheus-stack.grafana.sidecar.dashboards.folderAnnotation` | if specified, the sidecar will look for an annotation with this name to create the folder and put the graph there | `grafana_folder` |
| `kube-prometheus-stack.grafana.sidecar.dashboards.provider.foldersFromFilesStructure` | allow Grafana to replicate the dashboard structure from the filesystem | `true` |

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `job.prometheus.enabled` | prometheus enabled flag; if true, spark jobs run by Ilum will share metrics in prometheus format | `true` |
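
Enabling the stack, optionally with node metrics, then looks like this (a sketch; see the kube-prometheus-stack chart for the full configuration surface):

```yaml
# ilum-aio values override
kube-prometheus-stack:
  enabled: true
  nodeExporter:
    enabled: true  # also collect node-level metrics
```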

5. Marquez OpenLineage in ilum-aio chart

Feature

Added Marquez OpenLineage to the ilum AIO chart. Marquez enables consuming, storing, and visualizing OpenLineage metadata from across an organization, serving use cases including data governance, data quality monitoring, and performance analytics. With marquez enabled in the ilum AIO helm stack, spark jobs run by Ilum will share lineage information with the marquez backend. The marquez web interface visualizes the data lineage information collected from spark jobs and is accessible through the ilum UI as an iframe.

Values added - ilum-aio

| Name | Description | Value |
| --- | --- | --- |
| `global.lineage.enabled` | marquez enabled flag | `false` |

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| `job.openLineage.transport.type` | marquez communication type | `http` |
| `job.openLineage.transport.serverUrl` | marquez backend url, including the namespace name under which events from ilum's spark jobs should be stored | `http://ilum-marquez:9555/api/v1/namespaces/ilum` |

Values added - ilum-marquez

Newly added whole chart, check its values on the chart's page

Values added - ilum-ui

| Name | Description | Value |
| --- | --- | --- |
| `runtimeVars.lineageUrl` | url of the marquez openlineage UI iframe | `http://ilum-marquez-web:9444` |
| `runtimeVars.lineagePath` | ilum-ui proxy path to the marquez openlineage UI | `/external/lineage/` |
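
Lineage collection is then switched on globally (a sketch using the default marquez endpoint):

```yaml
# ilum-aio values override
global:
  lineage:
    enabled: true
```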

RELEASE 6.0.3

1. Parameterized kafka producers max.request.size parameter in ilum-core chart

Feature

In kafka communication mode, ilum interactive jobs respond to interactive job instances using kafka producers. With the newly added helm value, the kafka producer's max.request.size parameter can be adapted to match the size of the responses.

Values added - ilum-core

kafka parameters

| Name | Description | Value |
| --- | --- | --- |
| `kafka.requestSize` | kafka `max.request.size` parameter for ilum jobs kafka producers | `20000000` |
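
For example, to allow responses of up to roughly 50 MB (a sketch; size it to your largest expected response):

```yaml
# ilum-core values override
kafka:
  requestSize: 50000000  # bytes
```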

RELEASE 6.0.2

1. Support for hdfs, gcs and azure blob storage in ilum-core chart

Feature

The Ilum cluster no longer has to be attached to s3 storage; from now on the default cluster can be configured to use hdfs, gcs or azure blob storage as well. This is achieved using the newly added values in the ilum-core helm chart.

Values deleted - ilum-core

| Name | Reason |
| --- | --- |
| `kubernetes.s3.bucket` | From now on two separate buckets must be set with the new values: `kubernetes.s3.sparkBucket`, `kubernetes.s3.dataBucket` |

Values added - ilum-core

kubernetes storage parameters

| Name | Description | Value |
| --- | --- | --- |
| `kubernetes.upgradeClusterOnStartup` | flag to upgrade the default kubernetes cluster on startup from the values in the config map | `false` |
| `kubernetes.storage.type` | default kubernetes cluster storage type; available options: s3, gcs, wasbs, hdfs | `s3` |

s3 kubernetes storage parameters

| Name | Description | Value |
| --- | --- | --- |
| `kubernetes.s3.host` | default kubernetes cluster S3 storage host to store spark resources | `s3` |
| `kubernetes.s3.port` | default kubernetes cluster S3 storage port to store spark resources | `7000` |
| `kubernetes.s3.sparkBucket` | default kubernetes cluster S3 storage bucket to store spark resources | `ilum-files` |
| `kubernetes.s3.dataBucket` | default kubernetes cluster S3 storage bucket to store ilum tables | `ilum-tables` |
| `kubernetes.s3.accessKey` | default kubernetes cluster S3 storage access key to store spark resources | `""` |
| `kubernetes.s3.secretKey` | default kubernetes cluster S3 storage secret key to store spark resources | `""` |

gcs kubernetes storage parameters

| Name | Description | Value |
| --- | --- | --- |
| `kubernetes.gcs.clientEmail` | default kubernetes cluster GCS storage client email | `""` |
| `kubernetes.gcs.sparkBucket` | default kubernetes cluster GCS storage bucket to store spark resources | `"ilum-files"` |
| `kubernetes.gcs.dataBucket` | default kubernetes cluster GCS storage bucket to store ilum tables | `"ilum-tables"` |
| `kubernetes.gcs.privateKey` | default kubernetes cluster GCS storage private key to store spark resources | `""` |
| `kubernetes.gcs.privateKeyId` | default kubernetes cluster GCS storage private key id to store spark resources | `""` |

wasbs kubernetes storage parameters

| Name | Description | Value |
| --- | --- | --- |
| `kubernetes.wasbs.accountName` | default kubernetes cluster WASBS storage account name | `""` |
| `kubernetes.wasbs.accessKey` | default kubernetes cluster WASBS storage access key to store spark resources | `""` |
| `kubernetes.wasbs.sparkContainer.name` | default kubernetes cluster WASBS storage container name to store spark resources | `"ilum-files"` |
| `kubernetes.wasbs.sparkContainer.sasToken` | default kubernetes cluster WASBS storage container sas token to store spark resources | `""` |
| `kubernetes.wasbs.dataContainer.name` | default kubernetes cluster WASBS storage container name to store ilum tables | `"ilum-tables"` |
| `kubernetes.wasbs.dataContainer.sasToken` | default kubernetes cluster WASBS storage container sas token to store ilum tables | `""` |

hdfs kubernetes storage parameters

| Name | Description | Value |
| --- | --- | --- |
| `kubernetes.hdfs.hadoopUsername` | default kubernetes cluster HDFS storage hadoop username | `""` |
| `kubernetes.hdfs.config` | default kubernetes cluster HDFS storage dict of config files, with the name as key and base64 encoded content as value | `""` |
| `kubernetes.hdfs.sparkCatalog` | default kubernetes cluster HDFS storage catalog to store spark resources | `"ilum-files"` |
| `kubernetes.hdfs.dataCatalog` | default kubernetes cluster HDFS storage catalog to store ilum tables | `"ilum-tables"` |
| `kubernetes.hdfs.keyTab` | default kubernetes cluster HDFS storage keytab file base64 encoded content | `""` |
| `kubernetes.hdfs.principal` | default kubernetes cluster HDFS storage principal name | `""` |
| `kubernetes.hdfs.krb5` | default kubernetes cluster HDFS storage krb5 file base64 encoded content | `""` |
| `kubernetes.hdfs.trustStore` | default kubernetes cluster HDFS storage trustStore file base64 encoded content | `""` |
| `kubernetes.hdfs.logDirectory` | default kubernetes cluster HDFS storage absolute directory path to store the eventLog for the history server | `""` |

Important! Make sure S3/GCS buckets or WASBS containers are already created and reachable!
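
As an illustration, switching the default cluster to GCS could look like this (a sketch; all credentials are placeholders):

```yaml
# ilum-core values override; credentials are placeholders
kubernetes:
  storage:
    type: gcs
  gcs:
    clientEmail: ilum@my-project.iam.gserviceaccount.com
    sparkBucket: ilum-files
    dataBucket: ilum-tables
    privateKeyId: "0123456789abcdef"  # placeholder
    privateKey: ""                    # placeholder, PEM content
```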

2. Added spark history server to ilum-core helm chart

Feature

The spark history server can now be deployed along with ilum-core. The history server config is passed to every spark job ilum runs, and the history server UI can now be accessed through the ilum UI. If enabled, it will use the default kubernetes cluster storage configured with the kubernetes.[STORAGE_TYPE].[PARAMETER] values as eventLog storage.

Values added - ilum-core

history server parameters

| Name | Description | Value |
| --- | --- | --- |
| `historyServer.enabled` | spark history server flag | `true` |
| `historyServer.image` | spark history server image | `ilum/spark-launcher:spark-3.5.1` |
| `historyServer.address` | spark history server address | `http://ilum-history-server:9666` |
| `historyServer.pullPolicy` | spark history server image pull policy | `IfNotPresent` |
| `historyServer.imagePullSecrets` | spark history server image pull secrets | `[]` |
| `historyServer.parameters` | spark history server custom spark parameters | `[]` |
| `historyServer.resources` | spark history server pod resources | `limits: { memory: "500Mi" }, requests: { memory: "300Mi" }` |
| `historyServer.service.type` | spark history server service type | `ClusterIP` |
| `historyServer.service.port` | spark history server service port | `9666` |
| `historyServer.service.nodePort` | spark history server service nodePort | `""` |
| `historyServer.service.clusterIP` | spark history server service clusterIP | `""` |
| `historyServer.service.loadBalancerIP` | spark history server service loadBalancerIP | `""` |
| `historyServer.ingress.enabled` | spark history server ingress flag | `false` |
| `historyServer.ingress.version` | spark history server ingress version | `"v1"` |
| `historyServer.ingress.className` | spark history server ingress className | `""` |
| `historyServer.ingress.host` | spark history server ingress host | `"host"` |
| `historyServer.ingress.path` | spark history server ingress path | `"/(.*)"` |
| `historyServer.ingress.pathType` | spark history server ingress pathType | `Prefix` |
| `historyServer.ingress.annotations` | spark history server ingress annotations | `nginx.ingress.kubernetes.io/rewrite-target: /$1`, `nginx.ingress.kubernetes.io/proxy-body-size: "600m"`, `nginx.org/client-max-body-size: "600m"` |

Warnings

  1. Make sure the HDFS logDirectory (helm value kubernetes.hdfs.logDirectory) is the absolute path of the configured sparkCatalog with the /ilum/logs suffix! E.g. for kubernetes.hdfs.sparkCatalog=spark-catalog, put hdfs://name-node/user/username/spark-catalog/ilum/logs
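
Following that warning, an hdfs setup with the history server enabled would pair the two values like this (a sketch built from the example above):

```yaml
# ilum-core values override for hdfs eventLog storage
kubernetes:
  hdfs:
    sparkCatalog: spark-catalog
    logDirectory: hdfs://name-node/user/username/spark-catalog/ilum/logs
```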

3. Job retention in ilum-core chart

Feature

Ilum jobs will now be deleted after the configured retention period expires.

Values added - ilum-core

job retention parameters

| Name | Description | Value |
| --- | --- | --- |
| `job.retain.hours` | spark jobs retention limit in hours | `168` |
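
To keep finished jobs for three days instead of a week (a sketch):

```yaml
# ilum-core values override
job:
  retain:
    hours: 72
```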

RELEASE 6.0.1

RELEASE 6.0.0