
Upgrade Notes

NOTE TEMPLATE

1. Change

Feature:

Feature description

Values deleted - chart name

| Name | Reason |
| --- | --- |
| helm.value | Helm value deletion reason |

Values added - chart name

Values section description
| Name | Description | Value |
| --- | --- | --- |
| helm.value | Helm value description | default value |

Names changed - chart name

| Old Name | New Name |
| --- | --- |
| old name | new name |

⚠️⚠️⚠️ Warnings

Values changed - chart name

| Name | Old value | New value |
| --- | --- | --- |
| helm.value | old value | new value |

NEXT RELEASE

RELEASE 6.3.1

1. Added extra buckets to storage config

Feature

Added the ability to include extra buckets in the Ilum cluster Spark storage configuration.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| kubernetes.s3.extraBuckets | ilum-core default kubernetes cluster S3 storage extra buckets to include | [] |
| kubernetes.gcs.extraBuckets | ilum-core default kubernetes cluster GCS storage extra buckets to include | [] |
| kubernetes.wasbs.extraContainers | ilum-core default kubernetes cluster WASBS storage extra containers to include | [] |
| kubernetes.hdfs.extraCatalogs | ilum-core default kubernetes cluster HDFS storage extra catalogs to include | [] |
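For illustration, the new lists could be set in a custom values file like this (the bucket and container names below are hypothetical examples, not defaults):

```yaml
# ilum-core values override (example names)
kubernetes:
  s3:
    extraBuckets:
      - analytics-archive
      - ml-models
  wasbs:
    extraContainers:
      - reports
```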

2. Enhanced LDAP

Added the ability to map users, groups, and roles (along with their properties and relationships) from an LDAP server to Ilum Core, based on mapping configurations in Helm. Also enabled the option to assign the Admin role to LDAP users.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| security.ldap.userMapping.base | LDAP base of user entries | "" |
| security.ldap.userMapping.filter | LDAP filter used for user search | "uid={0}" |
| security.ldap.userMapping.username | Name of the LDAP attribute that should be mapped to the user's username | uid |
| security.ldap.userMapping.password | Name of the LDAP attribute that should be mapped to the user's password | userPassword |
| security.ldap.userMapping.description | Name of the LDAP attribute that should be mapped to the user's description | "" |
| security.ldap.userMapping.fullname | Name of the LDAP attribute that should be mapped to the user's fullname | "" |
| security.ldap.userMapping.department | Name of the LDAP attribute that should be mapped to the user's department | "" |
| security.ldap.userMapping.email | Name of the LDAP attribute that should be mapped to the user's email | "" |
| security.ldap.userMapping.enabled | Name of the LDAP attribute that should be mapped to the user's state | "" |
| security.ldap.userMapping.enabledValue | Value of the attribute named by userMapping.enabled that stands for ENABLED | "" |
| security.ldap.groupMapping.base | LDAP base for group entries | "" |
| security.ldap.groupMapping.filter | LDAP filter used for group search | (member={0}) |
| security.ldap.groupMapping.name | Name of the LDAP attribute that should be mapped to the group's name | cn |
| security.ldap.groupMapping.description | Name of the LDAP attribute that should be mapped to the group's description | "" |
| security.ldap.groupMapping.memberAttribute | Name of the LDAP attribute that lists users belonging to the group | uid |
| security.ldap.groupMapping.roles | LDAP attribute that lists the roles that the group includes | "" |
| security.ldap.groupMapping.roleFilterAttribute | LDAP attribute of roles that represents a role in the groupMapping.roles attribute | "" |
| security.ldap.groupMapping.enabled | Name of the LDAP attribute that should be mapped to the group's state | "" |
| security.ldap.groupMapping.enabledTrue | Value of the attribute from groupMapping.enabled that stands for ENABLED | "" |
| security.ldap.roleMapping.base | LDAP base for role entries | "" |
| security.ldap.roleMapping.filter | LDAP filter used for role search | "" |
| security.ldap.roleMapping.memberAttribute | Name of the LDAP attribute that lists users having the role | "" |
| security.ldap.roleMapping.name | Name of the LDAP attribute that should be mapped to the role's name | "" |
| security.ldap.roleMapping.description | Name of the LDAP attribute that should be mapped to the role's description | "" |
| security.ldap.roleMapping.enabled | Name of the LDAP attribute that should be mapped to the role's state | "" |
| security.ldap.roleMapping.enabledTrue | Value of the attribute from roleMapping.enabled that stands for ENABLED | "" |
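As a minimal sketch, a user and group mapping against a typical OpenLDAP layout might look like this (the DNs and the mail attribute are assumptions for illustration, not defaults):

```yaml
# ilum-core values override (example LDAP tree)
security:
  ldap:
    userMapping:
      base: "ou=people,dc=example,dc=org"   # example DN
      filter: "uid={0}"
      username: uid
      email: mail                           # assumed attribute name
    groupMapping:
      base: "ou=groups,dc=example,dc=org"   # example DN
      filter: "(member={0})"
      name: cn
      memberAttribute: uid
```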

Values deleted - ilum-core

| Name | Reason |
| --- | --- |
| security.ldap.userSearch | Replaced with security.ldap.userMapping |
| security.ldap.groupSearch | Replaced with security.ldap.groupMapping |

Names changed - ilum-core

| Old Name | New Name |
| --- | --- |
| security.internal.users[*].password | security.internal.users[*].initialPassword |

3. sparkmagic default config

Values changed - ilum-jupyter configuration parameters, changed because of the introduction of the Spark session form.

| Name | Old value | New value |
| --- | --- | --- |
| sparkmagic.config.sessionConfigs.conf | '{ "pyRequirements": "pandas", "cluster": "default", "autoPause": "false", "spark.example.config": "You can change the default configuration in ilum-jupyter-config k8s configmap" }' | '{}' |

Values deleted - ilum-jupyter

| Name | Reason |
| --- | --- |
| sparkmagic.config.sessionConfigs.executorCores | Not needed anymore because of the new Spark session form |
| sparkmagic.config.sessionConfigs.driverMemory | Not needed anymore because of the new Spark session form |

RELEASE 6.3.0

1. Changed default sparkmagic session configuration

Values changed - sparkmagic configuration parameters

| Name | Old value | New value |
| --- | --- | --- |
| sparkmagic.config.sessionConfigs.conf | '{ "pyRequirements": "pandas", "spark.example.config": "You can change the default configuration in ilum-jupyter-config k8s configmap" }' | '{ "pyRequirements": "pandas", "cluster": "default", "autoPause": "false", "spark.example.config": "You can change the default configuration in ilum-jupyter-config k8s configmap" }' |

2. Added property to set kafka address for ilum-core

Feature

Added the ability to set a Kafka address for the ilum-core pod, separate from the global Kafka address configuration (the kafka.address property) that applies to both Spark jobs and the ilum-core pod.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| kafka.ilum.address | ilum-core kafka address only for the ilum-core pod, overrides kafka.address | not defined |

3. Changed ilum job healthcheck tolerance time

Values changed - ilum job healthcheck configuration parameters

| Name | Old value | New value |
| --- | --- | --- |
| job.healthcheck.tolerance | 120 | 3600 |

4. Introducing embedded git repo

Feature

Added Gitea as a module providing a built-in Git server for the Ilum platform.

Values added - gitea

| Name | Description | Value |
| --- | --- | --- |
| gitea.enabled | Enable or disable Gitea deployment | true |
| gitea.image.rootless | Run Gitea in rootless mode | false |
| gitea.gitea.config.database.DB_TYPE | Database type used for Gitea | postgres |
| gitea.gitea.config.database.HOST | Database host and port for Gitea | ilum-postgresql-hl:5432 |
| gitea.gitea.config.database.NAME | Database name for Gitea | gitea |
| gitea.gitea.config.database.USER | Database username for Gitea | ilum |
| gitea.gitea.config.database.PASSWD | Database password for Gitea (change required) | CHANGEMEPLEASE |
| gitea.gitea.admin.existingSecret | Gitea secret to store init credentials | ilum-git-credentials |
| gitea.gitea.admin.email | Gitea admin email | ilum@ilum |
| gitea.gitea.admin.passwordMode | Password mode for admin account | initialOnlyNoReset |
| gitea.gitea.additionalConfigFromEnvs[0].name | Enable push-create user | GITEA__REPOSITORY__ENABLE_PUSH_CREATE_USER |
| gitea.gitea.additionalConfigFromEnvs[0].value | Value for enabling push-create user | true |
| gitea.gitea.additionalConfigFromEnvs[1].name | Enable push-create organization | GITEA__REPOSITORY__ENABLE_PUSH_CREATE_ORG |
| gitea.gitea.additionalConfigFromEnvs[1].value | Value for enabling push-create organization | true |
| gitea.gitea.additionalConfigFromEnvs[2].name | Default repository branch | GITEA__REPOSITORY__DEFAULT_BRANCH |
| gitea.gitea.additionalConfigFromEnvs[2].value | Value for default repository branch | master |
| gitea.gitea.additionalConfigFromEnvs[3].name | Root URL of the Gitea server | GITEA__SERVER__ROOT_URL |
| gitea.gitea.additionalConfigFromEnvs[3].value | Value for Gitea server root URL | http://git.example.com/external/gitea/ |
| gitea.gitea.additionalConfigFromEnvs[4].name | Static URL prefix | GITEA__SERVER__STATIC_URL_PREFIX |
| gitea.gitea.additionalConfigFromEnvs[4].value | Value for static URL prefix | /external/gitea/ |
| gitea.redis-cluster.enabled | Enable or disable Redis cluster | false |
| gitea.redis.enabled | Enable or disable Redis | false |
| gitea.postgresql.enabled | Enable or disable standalone PostgreSQL | false |
| gitea.postgresql-ha.enabled | Enable or disable PostgreSQL HA | false |

Values added - ilum-jupyter

| Name | Description | Value |
| --- | --- | --- |
| ilum-jupyter.git.enabled | Enable or disable Git integration | false |
| ilum-jupyter.git.username | Git username for authentication | ilum |
| ilum-jupyter.git.password | Git password for authentication | ilum |
| ilum-jupyter.git.email | Git email address | ilum@ilum |
| ilum-jupyter.git.repository | Git repository name | jupyter |
| ilum-jupyter.git.address | Git server address | ilum-gitea-http:3000 |
| ilum-jupyter.git.init.image | Git initialization image | bitnami/git:2.48.1 |

Values added - ilum-airflow

| Name | Description | Value |
| --- | --- | --- |
| airflow.dags.gitSync.enabled | Enable or disable Git synchronization for DAGs | true |
| airflow.dags.gitSync.repo | Git repository URL for DAGs | http://ilum-gitea-http:3000/ilum/airflow.git |
| airflow.dags.gitSync.branch | Git branch to sync from | master |
| airflow.dags.gitSync.ref | Git reference to sync | HEAD |
| airflow.dags.gitSync.depth | Git clone depth | 1 |
| airflow.dags.gitSync.maxFailures | Maximum allowed synchronization failures | 0 |
| airflow.dags.gitSync.subPath | Subpath within the repository to sync | "" |
| airflow.dags.gitSync.credentialsSecret | Secret used for Git authentication | ilum-git-credentials |

5. Ilum SQL configuration naming changes

Changed the naming of the Ilum SQL configuration to better reflect the current usage of Kyuubi.

Names changed - ilum-core

| Old Name | New Name |
| --- | --- |
| kyuubi.* | sql.* |

Names changed - ilum-aio

| Old Name | New Name |
| --- | --- |
| ilum-kyuubi.* | ilum-sql.* |

⚠️⚠️⚠️ Warnings

Due to the changes in naming and in the inner workings of the SQL engine launching, and restrictions on what can be done via a helm upgrade, it is required to manually delete the old stateful set (e.g. kubectl delete sts ilum-sql) before upgrading to this version. This will ensure that during the update, a new stateful set is created with the correct configuration. The breaking changes are related to the labels and volume mounts that are used by the ilum-sql stateful set.

6. Add configurations for Ilum Submit for Spark Sql engines

Ilum Submit enhances the process of launching Spark SQL engines via both the Ilum Web Application and the JDBC endpoint by automatically applying the configurations of the selected cluster. This improvement eliminates the need to manually provide Kyuubi's Spark configuration to Ilum Core.

Values deleted - ilum-core

| Name | Reason |
| --- | --- |
| sql.sparkConfig | Unnecessary after the change |

Values added - ilum-kyuubi

| Name | Description | Value |
| --- | --- | --- |
| ilumSubmit.enabled | Flag to enable ilum submit service | false |
| ilumSubmit.ilum.host | Host of Ilum REST service | ilum-core |
| ilumSubmit.ilum.port | Port of Ilum REST service | 9888 |

Values added - ilum-aio

| Name | Description | Value |
| --- | --- | --- |
| ilum-sql.ilumSubmit.enabled | Flag to enable SQL engine creation through Ilum | true |

⚠️⚠️⚠️ Warnings

Since Kyuubi's Spark config is not needed in Ilum Core anymore, the default spark config should be supplied directly to ilum-sql.config.spark.defaults instead of the global value.
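Under that assumption, a default Spark config previously kept in the global value could be moved to the chart value like this (the Spark properties below are illustrative, not defaults):

```yaml
# ilum-aio values override (example Spark defaults)
ilum-sql:
  config:
    spark:
      defaults:
        spark.driver.memory: "2g"         # example setting
        spark.executor.instances: "2"     # example setting
```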

7. Security configuration moved to a dedicated Kubernetes Secret

Feature

Security‑related configuration (including internal user credentials, LDAP, OAuth2, JWT, and authorities settings) has been moved from the config map to a dedicated Kubernetes Secret. This improves the security of sensitive data by isolating it from non‑sensitive configuration.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| security.secret.name | Name of the secret that holds security-related configuration. Use this to override the default secret name. | ilum-security |

8. Changed ilum-ui service type

Because of the problems with kubectl port-forward we are exposing a NodePort by default.

Values changed - ilum-ui service configuration parameters

| Name | Old value | New value |
| --- | --- | --- |
| service.type | ClusterIP | NodePort |
| service.nodePort | `` | 31777 |

RELEASE 6.2.1

1. Changed the value of Kyuubi's URL

Feature

Changed the value of Kyuubi's URL in ilum-core. The default value should now work out of the box.

Values changed - ilum-core

| Name | Old value | New value |
| --- | --- | --- |
| kyuubi.host | ilum-sql-rest | ilum-sql-headless |

RELEASE 6.2.1-RC1

1. Spark job's memory settings configuration in ilum-core

Feature

Added Spark job memory settings configuration to ilum-core. When the default cluster in ilum-core is created, its memory settings will be set to these values.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| job.memorysettings.executors | spark jobs executor count | 2 |
| job.memorysettings.executorMemory | spark jobs executor memory allocation | 1g |
| job.memorysettings.driverMemory | spark jobs driver memory allocation | 1g |
| job.memorysettings.executorCores | spark jobs executor core count | 1 |
| job.memorysettings.driverCores | spark jobs driver core count | 1 |
| job.memorysettings.dynamicAllocationEnabled | spark jobs dynamic allocation enabled flag | false |
| job.memorysettings.minExecutors | spark jobs minimum number of executors | 0 |
| job.memorysettings.initialExecutors | spark jobs initial number of executors | 0 |
| job.memorysettings.maxExecutors | spark jobs maximum number of executors | 20 |
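For example, the defaults could be overridden in a values file like this (the numbers are illustrative, not recommendations):

```yaml
# ilum-core values override (example sizing)
job:
  memorysettings:
    executors: 4
    executorMemory: 2g
    driverMemory: 1g
    dynamicAllocationEnabled: true
    minExecutors: 1
    maxExecutors: 10
```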

2. Spark history server retention parameters addition

Feature

Added Spark History Server retention parameters to ilum-core. These parameters allow the user to configure the retention of Spark History Server logs.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| historyServer.parameters.spark.history.fs.cleaner.enabled | history server cleaner enabled flag | true |
| historyServer.parameters.spark.history.fs.cleaner.interval | history server cleaner interval | 1d |
| historyServer.parameters.spark.history.fs.cleaner.maxAge | history server logs max age | 7d |
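A sketch of a longer retention window, assuming the values are passed straight through to the Spark History Server (the interval and age below are examples):

```yaml
# ilum-core values override (example retention)
historyServer:
  parameters:
    spark.history.fs.cleaner.enabled: "true"
    spark.history.fs.cleaner.interval: "12h"   # example: clean twice a day
    spark.history.fs.cleaner.maxAge: "14d"     # example: keep logs for two weeks
```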

3. Split Kyuubi's URL into host and port

Feature

Split Kyuubi's URL into host and port in ilum-core. This change was necessary to enable creating custom engines.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| kyuubi.host | Kyuubi host | ilum-sql-rest |
| kyuubi.port | Kyuubi port | 10099 |

Values deleted - ilum-core

| Name | Reason |
| --- | --- |
| kyuubi.url | Unnecessary after the change |

4. Extra entries in ilum-ui nginx server config map

Feature

Added enabled flags for History Server, MinIO, ilum-jupyter, Airflow, MLflow, and lineage to ilum-ui. These flags allow the user to enable or disable access to these services through ilum-ui. The values are used in the nginx server config map.

Values added - ilum-ui chart

| Name | Description | Value |
| --- | --- | --- |
| nginx.config.ilum-jupyter.enabled | ilum-ui nginx config ilum-jupyter enabled flag | false |
| nginx.config.airflow.enabled | ilum-ui nginx config airflow enabled flag | false |
| nginx.config.minio.enabled | ilum-ui nginx config minio enabled flag | false |
| nginx.config.historyServer.enabled | ilum-ui nginx config historyServer enabled flag | false |
| nginx.config.mlflow.enabled | ilum-ui nginx config mlflow enabled flag | false |
| nginx.config.lineage.enabled | ilum-ui nginx config lineage enabled flag | false |

5. Superset in ilum-aio chart

Feature

Added Superset to the Ilum AIO chart. Superset is fast, lightweight, intuitive, and loaded with options that make it easy for users of all skill sets to explore and visualize their data, from simple line charts to highly detailed geospatial charts. Superset is one of the modules integrated with the Ilum platform.

Values added - ilum-ui

| Name | Description | Value |
| --- | --- | --- |
| runtimeVars.supersetUrl | superset service url | http://ilum-superset:8088/ |
| nginx.config.superset.enabled | ilum-ui nginx config superset enabled flag | false |

6. Ilum default kubernetes cluster config from helm values

Feature

From now on, the default ilum cluster parameters will be set based on the helm values.

Values added - ilum-core chart

| Name | Description | Value |
| --- | --- | --- |
| kubernetes.defaultCluster.config | ilum-core default kubernetes cluster configuration | see below |

Default value:

```yaml
config:
  spark.driver.extraJavaOptions: "-Divy.cache.dir=/tmp -Divy.home=/tmp"
  spark.kubernetes.container.image: "ilum/spark:3.5.2-delta"
  spark.databricks.delta.catalog.update.enabled: "true"
```

RELEASE 6.2.0

1. Changed image tag version of kyuubi

Values changed - ilum-kyuubi chart

| Name | Old value | New value |
| --- | --- | --- |
| image.tag | 1.9.2-spark | 1.10.0-spark |

2. Changed kyuubi spark configuration in ilum-kyuubi chart

Added spark.driver.memory=2g to global.kyuubi.sparkConfig.

RELEASE 6.2.0-RC2

1. Minio status probe addition

Feature

Added a status probe in ilum-core that checks whether MinIO storage is ready.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| minio.statusProbe.enabled | minio status probe enabled flag | true |
| minio.statusProbe.image | minio status probe image | curlimages/curl:8.5.0 |
| minio.statusProbe.baseUrl | minio base url | "http://ilum-minio:9000" |

2. Kyuubi configuration in ilum-core

Feature

Added Kyuubi configuration to the ilum-core helm chart. Kyuubi allows the user to execute SQL queries on many different data sources using the Ilum UI.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| kyuubi.enabled | Kyuubi enabled flag | true |
| kyuubi.url | URL of Kyuubi's REST service | http://ilum-sql-rest:10099 |

⚠️⚠️⚠️ Warnings

In order to properly manage SQL engines, we need to pass Kyuubi's spark configuration to ilum-core. This is done by configuring Kyuubi's spark in global.kyuubi.sparkConfig and allows the user to write one configuration which can be passed to both Kyuubi and ilum-core.

3. MongoDb uri configuration in ilum-core

Feature

Changed the way the MongoDB URI is passed to ilum-core. It is now passed as a single string, which enables the user to provide more granular configuration such as authSource.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| mongo.uri | MongoDB connection string | mongodb://mongo:27017/ilum-default?replicaSet=rs0 |

Values deleted - ilum-core

| Name | Reason |
| --- | --- |
| mongo.instances | Unnecessary after the change |
| mongo.replicaSetName | Unnecessary after the change |

⚠️⚠️⚠️ Warnings

The mongo.uri, if set incorrectly, will cause the application to not work properly. Make sure to provide the correct connection string.

Previously the format was: mongodb://{ mongo.instances }/ilum-{ release_namespace }?replicaSet={ mongo.replicaSetName }. By default in the ilum-aio chart these values were:

  • mongo.instances - ilum-mongodb-0.ilum-mongodb-headless:27017,ilum-mongodb-1.ilum-mongodb-headless:27017
  • mongo.replicaSetName - rs0
  • release_namespace - default
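For example, a connection string with credentials and a custom authSource could be built from the defaults above (the credentials and authSource here are placeholders for illustration):

```yaml
# ilum-core values override (example URI)
mongo:
  uri: "mongodb://ilum:CHANGEMEPLEASE@ilum-mongodb-0.ilum-mongodb-headless:27017,ilum-mongodb-1.ilum-mongodb-headless:27017/ilum-default?replicaSet=rs0&authSource=admin"
```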

4. Autopausing configuration in ilum-core

Feature

Added autopausing to ilum-core, which periodically checks whether any groups have been idle for the specified time and pauses them. Each group must have autopausing explicitly turned on for this to take place.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| job.autoPause.enabled | Feature flag to enable auto pausing | true |
| job.autoPause.period | Interval in seconds to check the idleness of groups | 180 |
| job.autoPause.idleTime | Time in seconds that a group needs to be idle to be auto paused | 3600 |
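For instance, a more aggressive pausing policy could be sketched like this (the intervals are examples):

```yaml
# ilum-core values override (example timings)
job:
  autoPause:
    enabled: true
    period: 300      # example: check idleness every 5 minutes
    idleTime: 1800   # example: pause groups idle for 30 minutes
```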

5. Graphite exporter in ilum-aio chart

Feature

Added a Graphite exporter to the Ilum AIO chart and Graphite configuration to the ilum-core chart. The Graphite exporter is a Prometheus exporter for metrics exported in the Graphite plaintext protocol.

Values added - graphite-exporter

Newly added whole chart, check its values on the chart's page

6. Graphite configuration in ilum-core

Feature

Added Graphite configuration to the ilum-core helm chart. Graphite allows Spark jobs to send their metrics to a Graphite sink, which is then scraped by Prometheus.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| job.graphite.enabled | Graphite enabled flag | false |
| job.graphite.host | Graphite host | ilum-graphite-graphite-tcp |
| job.graphite.port | Graphite port | 9109 |
| job.graphite.period | Interval between sending job metrics | 10 |
| job.graphite.units | Time unit | seconds |
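Enabling the sink could look like this, reusing the defaults from the table (the reporting period is an example):

```yaml
# ilum-core values override (example Graphite sink)
job:
  graphite:
    enabled: true
    host: ilum-graphite-graphite-tcp
    port: 9109
    period: 30        # example: report every 30 seconds
    units: seconds
```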

RELEASE 6.1.4

1. Jupyter default sparkmagic configuration change

Feature

Changed the method of passing Spark default configs to the Jupyter notebook; they are now passed as a JSON string.

Values added - ilum-jupyter

sparkmagic configuration parameters

| Name | Description | Value |
| --- | --- | --- |
| sparkmagic.config.sessionConfigs.conf | sparkmagic session spark configuration | '{ "pyRequirements": "pandas", "spark.jars.packages": "io.delta:delta-core_2.12:2.4.0", "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension", "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog"}' |
| sparkmagic.config.sessionConfigsDefaults.conf | sparkmagic session defaults spark configuration | '{ "pyRequirements": "pandas", "spark.jars.packages": "io.delta:delta-core_2.12:2.4.0", "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension", "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog"}' |

2. Kyuubi in ilum-aio chart

Feature

Added Kyuubi to the Ilum AIO chart. Kyuubi is a distributed multi-tenant gateway providing SQL query services for data warehouses and lakehouses. It provides both JDBC and ODBC interfaces, and a REST API for clients to interact with.

Values added - ilum-kyuubi

Newly added whole chart, check its values on the chart's page

RELEASE 6.1.3

1. Jupyter configuration and persistent storage

Feature

Added extended configuration of the Jupyter notebook helm chart through helm values. Moreover, added persistent storage to the Jupyter pod. All data saved in the work directory will now be available after a Jupyter restart or update.

Values added - ilum-jupyter

pvc parameters

| Name | Description | Value |
| --- | --- | --- |
| pvc.annotations | persistent volume claim annotations | {} |
| pvc.selector | persistent volume claim selector | {} |
| pvc.accessModes | persistent volume claim accessModes | ReadWriteOnce |
| pvc.storage | persistent volume claim storage requests | 4Gi |
| pvc.storageClassName | persistent volume claim storageClassName | `` |

sparkmagic configuration parameters

| Name | Description | Value |
| --- | --- | --- |
| sparkmagic.config.kernelPythonCredentials.username | sparkmagic python kernel username | "" |
| sparkmagic.config.kernelPythonCredentials.password | sparkmagic python kernel password | "" |
| sparkmagic.config.kernelPythonCredentials.auth | sparkmagic python kernel auth mode | "None" |
| sparkmagic.config.kernelScalaCredentials.username | sparkmagic scala kernel username | "" |
| sparkmagic.config.kernelScalaCredentials.password | sparkmagic scala kernel password | "" |
| sparkmagic.config.kernelScalaCredentials.auth | sparkmagic scala kernel auth mode | "None" |
| sparkmagic.config.kernelRCredentials.username | sparkmagic r kernel username | "" |
| sparkmagic.config.kernelRCredentials.password | sparkmagic r kernel password | "" |
| sparkmagic.config.waitForIdleTimeoutSeconds | sparkmagic timeout waiting for idle state | 15 |
| sparkmagic.config.livySessionStartupTimeoutSeconds | sparkmagic timeout waiting for the session to start | 300 |
| sparkmagic.config.ignoreSslErrors | sparkmagic ignore ssl errors flag | false |
| sparkmagic.config.sessionConfigs.conf | sparkmagic session spark configuration | [pyRequirements: pandas, spark.jars.packages: io.delta:delta-core_2.12:2.4.0, spark.sql.extensions: io.delta.sql.DeltaSparkSessionExtension, spark.sql.catalog.spark_catalog: org.apache.spark.sql.delta.catalog.DeltaCatalog] |
| sparkmagic.config.sessionConfigs.driverMemory | sparkmagic session driver memory | 1000M |
| sparkmagic.config.sessionConfigs.executorCores | sparkmagic session executor cores | 2 |
| sparkmagic.config.sessionConfigsDefaults.conf | sparkmagic session defaults spark configuration | [pyRequirements: pandas, spark.jars.packages: io.delta:delta-core_2.12:2.4.0, spark.sql.extensions: io.delta.sql.DeltaSparkSessionExtension, spark.sql.catalog.spark_catalog: org.apache.spark.sql.delta.catalog.DeltaCatalog] |
| sparkmagic.config.sessionConfigsDefaults.driverMemory | sparkmagic session defaults driver memory | 1000M |
| sparkmagic.config.sessionConfigsDefaults.executorCores | sparkmagic session defaults executor cores | 2 |
| sparkmagic.config.useAutoViz | sparkmagic use auto viz flag | true |
| sparkmagic.config.coerceDataframe | sparkmagic coerce dataframe flag | true |
| sparkmagic.config.maxResultsSql | sparkmagic max sql result | 2500 |
| sparkmagic.config.pysparkDataframeEncoding | sparkmagic pyspark dataframe encoding | utf-8 |
| sparkmagic.config.heartbeatRefreshSeconds | sparkmagic heartbeat refresh seconds | 30 |
| sparkmagic.config.livyServerHeartbeatTimeoutSeconds | sparkmagic livy server heartbeat timeout seconds | 0 |
| sparkmagic.config.heartbeatRetrySeconds | sparkmagic heartbeat retry seconds | 10 |
| sparkmagic.config.serverExtensionDefaultKernelName | sparkmagic server extension default kernel name | pysparkkernel |
| sparkmagic.config.retryPolicy | sparkmagic retry policy | configurable |
| sparkmagic.config.retrySecondsToSleepList | sparkmagic retry seconds to sleep list | [0.2, 0.5, 1, 3, 5] |
| sparkmagic.config.configurableRetryPolicyMaxRetries | sparkmagic retry policy max retries | 8 |

RELEASE 6.1.2

1. Hive metastore in ilum-aio chart

Feature

Added Hive Metastore to the Ilum AIO chart. HMS is a central repository of metadata for Hive tables and partitions in a relational database, and provides clients (including Hive, Impala, and Spark) access to this information via the metastore service API. With Hive Metastore enabled in the Ilum AIO helm stack, Spark jobs run by Ilum can be configured to access it automatically.

Values added - ilum-hive-metastore

Newly added whole chart, check its values on chart page

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| hiveMetastore.enabled | flag for passing hive metastore config to ilum spark jobs | false |
| hiveMetastore.address | hive metastore address | thrift://ilum-hive-metastore:9083 |
| hiveMetastore.warehouseDir | hive metastore warehouse directory | s3a://ilum-data/ |
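Enabling the integration with the defaults from the table could be sketched as:

```yaml
# ilum-core values override
hiveMetastore:
  enabled: true
  address: thrift://ilum-hive-metastore:9083
  warehouseDir: s3a://ilum-data/
```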

2. Postgres extensions added

Feature

A few of the Ilum AIO subcharts use PostgreSQL. To make their deployment easier to manage, we have added a postgres extensions resource that creates PostgreSQL databases for Ilum subcharts.

Values added - ilum-aio

postgresql extensions parameters
| Name | Description | Value |
| --- | --- | --- |
| postgresExtensions.enabled | postgres extensions enabled flag | true |
| postgresExtensions.image | image to run extensions in | bitnami/postgresql:16 |
| postgresExtensions.pullPolicy | image pull policy | IfNotPresent |
| postgresExtensions.imagePullSecrets | image pull secrets | [] |
| postgresExtensions.host | postgresql database host | ilum-postgresql-0.ilum-postgresql-hl |
| postgresExtensions.port | postgresql database port | 5432 |
| postgresExtensions.databasesToCreate | comma separated list of databases to create | marquez,airflow,metastore |
| postgresExtensions.auth.username | postgresql account username | ilum |
| postgresExtensions.auth.password | postgresql account password | CHANGEMEPLEASE |
| postgresExtensions.nodeSelector | postgresql extensions pods node selector | {} |
| postgresExtensions.tolerations | postgresql extensions pods tolerations | [] |

3. Loki and promtail in ilum-aio chart

Feature

Added Loki and Promtail to the Ilum AIO chart. Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. Promtail is an agent that ships the contents of local logs to a Grafana Loki instance. Ilum now uses Loki to aggregate logs from Spark job pods so that cluster resources can be cleaned up after jobs are done. Loki and Promtail are preconfigured to scrape logs only from Spark pods run by Ilum, in order to fetch job logs after they finish.

Values added - ilum-core

log aggregation config
| Name | Description | Value |
| --- | --- | --- |
| global.logAggregation.enabled | ilum log aggregation flag; if enabled, Ilum will fetch logs of finished kubernetes spark pods from loki | false |
| global.logAggregation.loki.url | loki gateway address to access logs | http://ilum-loki-gateway |
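Turning the feature on with the default gateway address could look like this:

```yaml
# global values override
global:
  logAggregation:
    enabled: true
    loki:
      url: http://ilum-loki-gateway
```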

Values added - ilum-aio

log aggregation - loki config

| Name | Description | Value |
| --- | --- | --- |
| loki.nameOverride | subchart name override | ilum-loki |
| loki.monitoring.selfMonitoring.enabled | self monitoring enabled flag | false |
| loki.monitoring.selfMonitoring.grafanaAgent.installOperator | self monitoring grafana agent operator install flag | false |
| loki.monitoring.selfMonitoring.lokiCanary.enabled | self monitoring canary enabled flag | false |
| loki.test.enabled | tests enabled flag | false |
| loki.loki.auth_enabled | authentication enabled flag | false |
| loki.loki.storage.bucketNames.chunks | storage chunks bucket | ilum-files |
| loki.loki.storage.bucketNames.ruler | storage ruler bucket | ilum-files |
| loki.loki.storage.bucketNames.admin | storage admin bucket | ilum-files |
| loki.loki.storage.type | storage type | s3 |
| loki.loki.s3.endpoint | s3 storage endpoint | http://ilum-minio:9000 |
| loki.loki.s3.region | s3 storage region | us-east-1 |
| loki.loki.s3.secretAccessKey | s3 storage secret access key | minioadmin |
| loki.loki.s3.accessKeyId | s3 storage access key id | minioadmin |
| loki.loki.s3.s3ForcePathStyle | s3 storage path style access flag | true |
| loki.loki.s3.insecure | s3 storage insecure flag | true |
| loki.loki.compactor.retention_enabled | logs retention enabled flag | true |
| loki.loki.compactor.deletion_mode | deletion mode | filter-and-delete |
| loki.loki.compactor.shared_store | shared store | s3 |
| loki.loki.limits_config.allow_deletes | allow logs deletion flag | true |

log aggregation - promtail config

| Name | Description | Value |
| --- | --- | --- |
| promtail.config.clients[0].url | first client url | http://ilum-loki-write:3100/loki/api/v1/push |
| promtail.snippets.pipelineStages[0].match.selector | selector of the pipeline stage that drops non-ilum logs | {ilum_logAggregation!="true"} |
| promtail.snippets.pipelineStages[0].match.action | action of the pipeline stage that drops non-ilum logs | drop |
| promtail.snippets.pipelineStages[0].match.drop_counter_reason | drop_counter_reason of the pipeline stage that drops non-ilum logs | non_ilum_log |
| promtail.snippets.extraRelabelConfigs[0].action | action of the relabel config that keeps ilum pod labels | labelmap |
| promtail.snippets.extraRelabelConfigs[0].regex | regex of the relabel config that keeps ilum pod labels | __meta_kubernetes_pod_label_ilum(.*) |
| promtail.snippets.extraRelabelConfigs[0].replacement | replacement of the relabel config that keeps ilum pod labels | ilum${1} |
| promtail.snippets.extraRelabelConfigs[1].action | action of the relabel config that keeps spark pod labels | labelmap |
| promtail.snippets.extraRelabelConfigs[1].regex | regex of the relabel config that keeps spark pod labels | __meta_kubernetes_pod_label_spark(.*) |
| promtail.snippets.extraRelabelConfigs[1].replacement | replacement of the relabel config that keeps spark pod labels | spark${1} |

RELEASE 6.1.1

1. Added health checks for ilum interactive jobs

Feature

To prevent unexpected crashes of Ilum groups, we added healthchecks to make sure they work as they should.

Values added - ilum-core

ilum-job parameters
| Name | Description | Value |
| --- | --- | --- |
| job.healthcheck.enabled | spark interactive jobs healthcheck enabled flag | true |
| job.healthcheck.interval | spark interactive jobs healthcheck interval in seconds | 300 |
| job.healthcheck.tolerance | spark interactive jobs healthcheck response time tolerance in seconds | 120 |

2. Parameterized replica scale for ilum scalable services

Feature

The configuration of the number of replicas for ilum scalable services was extracted to helm values.

Values added - ilum-core

ilum-core common parameters
| Name | Description | Value |
| --- | --- | --- |
| replicaCount | number of ilum-core replicas | 1 |

Values added - ilum-ui

ilum-ui common parameters
| Name | Description | Value |
| --- | --- | --- |
| replicaCount | number of ilum-ui replicas | 1 |

RELEASE 6.1.0

1. Deleted unneeded parameters from ilum cluster wasbs storage

Feature

WASBS storage containers no longer need a SAS token provided in helm values, as it turned out to be unnecessary.

Values deleted - ilum-core

wasbs storage parameters
| Name | Reason |
| --- | --- |
| kubernetes.wasbs.sparkContainer.name | Moved to kubernetes.wasbs.sparkContainer value |
| kubernetes.wasbs.sparkContainer.sasToken | Turned out to be unnecessary |
| kubernetes.wasbs.dataContainer.name | Moved to kubernetes.wasbs.dataContainer value |
| kubernetes.wasbs.dataContainer.sasToken | Turned out to be unnecessary |

Values added - ilum-core

wasbs storage parameters
| Name | Description | Value |
| --- | --- | --- |
| kubernetes.wasbs.sparkContainer | default kubernetes cluster WASBS storage container name to store spark resources | ilum-files |
| kubernetes.wasbs.dataContainer | default kubernetes cluster WASBS storage container name to store ilum tables | ilum-tables |
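For example, custom container names could now be set with plain strings instead of the old nested objects (the names below are hypothetical):

```yaml
# ilum-core values override (example container names)
kubernetes:
  wasbs:
    sparkContainer: spark-resources   # example name
    dataContainer: warehouse-tables   # example name
```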

2. Added init containers to check service availability

Feature

To make Ilum deployment more graceful, Ilum containers now have init containers that wait for the availability of the services they depend on.

Values added - ilum-core

| Name | Description | Value |
| --- | --- | --- |
| mongo.statusProbe.enabled | mongo status probe enabled flag | true |
| mongo.statusProbe.image | image of the init container that waits for mongodb to be available | mongo:7.0.5 |
| kafka.statusProbe.enabled | kafka status probe enabled flag | true |
| kafka.statusProbe.image | image of the init container that waits for kafka to be available | bitnami/kafka:3.4.1 |
| historyServer.statusProbe.enabled | history server ilum-core status probe enabled flag | true |
| historyServer.statusProbe.image | image of the history server init container that waits for ilum-core to be available | curlimages/curl:8.5.0 |

Values added - ilum-livy-proxy

| Name | Description | Value |
| --- | --- | --- |
| statusProbe.enabled | ilum-core status probe enabled flag | true |
| statusProbe.image | image of the init container that waits for ilum-core to be available | curlimages/curl:8.5.0 |

Values added - ilum-ui

NameDescriptionValue
statusProbe.enabledilum-core status probe enabled flagtrue
statusProbe.imageinit container that waits for ilum-core to be available imagecurlimages/curl:8.5.0
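
The status probes are enabled by default; a values sketch disabling them in ilum-core, e.g. when the init container images cannot be pulled (a hypothetical scenario):

```yaml
# ilum-core values fragment: turn off the wait-for-service init containers
mongo:
  statusProbe:
    enabled: false
kafka:
  statusProbe:
    enabled: false
historyServer:
  statusProbe:
    enabled: false
```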

3. Parameterized kafka producers in ilum-core chart

Feature

In Kafka communication mode, Ilum interactive jobs respond to interactive job instances using Kafka producers. With the newly added Helm values, the Kafka consumer used by Ilum jobs can be adapted to match user needs.

Values added - ilum-core

kafka parameters
NameDescriptionValue
kafka.maxPollRecordskafka max.poll.records parameter for the ilum jobs kafka consumer; it determines how many requests the ilum-job kafka consumer will fetch with each poll500
kafka.maxPollIntervalkafka max.poll.interval.ms parameter for the ilum jobs kafka consumer; it determines the maximum delay between invocations of poll, which in the ilum-job context means the time limit for processing the requests fetched in a poll60000
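
A values sketch tuning the new consumer parameters, e.g. for requests that take long to process (the numbers are illustrative):

```yaml
# ilum-core values fragment (illustrative numbers)
kafka:
  maxPollRecords: 100      # fetch fewer requests per poll
  maxPollInterval: 300000  # allow up to 5 minutes of processing between polls
```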

RELEASE 6.1.0-RC1

1. Added support for service annotations

Feature

Annotations of the services in Ilum Helm charts may now be configured through Helm values.

Values added - ilum-core

service parameters
NameDescriptionValue
service.annotationsservice annotations{}
grpc.service.annotationsgrpc service annotations{}
historyServer.service.annotationshistory server service annotations{}

Values added - ilum-jupyter

service parameters
NameDescriptionValue
service.annotationsservice annotations{}

Values added - ilum-livy-proxy

service parameters
NameDescriptionValue
service.annotationsservice annotations{}

Values added - ilum-ui

service parameters
NameDescriptionValue
service.annotationsservice annotations{}

Values added - ilum-zeppelin

service parameters
NameDescriptionValue
service.annotationsservice annotations{}
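
Each chart takes the same annotations map. A sketch for ilum-core using a common cloud load-balancer annotation (the annotation key is only an example):

```yaml
# ilum-core values fragment (example annotation key)
service:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
grpc:
  service:
    annotations: {}
historyServer:
  service:
    annotations: {}
```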

2. Pulled out security oauth2 parameters to global values

Feature

Ilum OAuth2 security configuration is now set through global values.

Values added - ilum-aio

security parameters
NameDescriptionValue
global.security.oauth2.clientIdoauth2 client ID``
global.security.oauth2.issuerUrioauth2 URI that can either be an OpenID Connect discovery endpoint or an OAuth 2.0 Authorization Server Metadata endpoint defined by RFC 8414``
global.security.oauth2.audiencesoauth2 audiences``
global.security.oauth2.clientSecretoauth2 client secret``

Values deleted - ilum-core

security parameters
NameReasonValue
security.oauth2.clientIdoauth2 security parameters are now configured through global values``
security.oauth2.issuerUrioauth2 security parameters are now configured through global values``
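
A minimal global OAuth2 sketch for the ilum-aio chart (client ID, secret, issuer, and audience are placeholders):

```yaml
# ilum-aio values fragment (placeholder identifiers)
global:
  security:
    oauth2:
      clientId: "my-client-id"
      clientSecret: "my-client-secret"
      issuerUri: "https://idp.example.com/realms/ilum"
      audiences: "ilum-api"
```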

3. Runtime environment variables for frontend

Feature

Frontend runtime environment variables can now be configured through the ilum-ui Helm values.

Values added - ilum-ui

runtime variables
NameDescriptionValue
runtimeVars.defaultConfigMap.enableddefault config map for frontend runtime environment variablestrue
runtimeVars.debugdebug logging flagfalse
runtimeVars.backenUrlilum-core backend urlhttp://ilum-core:9888
runtimeVars.historyServerUrlurl of history server uihttp://ilum-history-server:9666
runtimeVars.jupyterUrlurl of jupyter uihttp://ilum-jupyter:8888
runtimeVars.airflowUrlurl of airflow uihttp://ilum-webserver:8080
runtimeVars.minioUrlurl of minio uihttp://ilum-minio:9001
runtimeVars.mlflowUrlurl of mlflow uihttp://mlflow:5000
runtimeVars.historyServerPathilum-ui proxy path to history server ui/external/history-server/
runtimeVars.jupyterPathilum-ui proxy path to jupyter ui/external/jupyter/lab/tree/work/IlumIntro.ipynb
runtimeVars.airflowPathilum-ui proxy path to airflow ui/external/airflow/
runtimeVars.dataPathilum-ui proxy path to minio ui/external/minio/
runtimeVars.mlflowPathilum-ui proxy path to mlflow ui/external/mlflow/
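
A sketch overriding a few of the runtime variables in ilum-ui values (the backend URL is the default from the table above; note the value name is spelled as listed there):

```yaml
# ilum-ui values fragment
runtimeVars:
  defaultConfigMap:
    enabled: true
  debug: true                        # enable debug logging
  backenUrl: http://ilum-core:9888   # ilum-core backend url
```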

Values deleted - ilum-ui

NameReason
debugmoved to runtimeVars section
backenUrlmoved to runtimeVars section
historyServerUrlmoved to runtimeVars section
jupyterUrlmoved to runtimeVars section
airflowUrlmoved to runtimeVars section

4. Kube-prometheus-stack in ilum-aio chart

Feature

Kube-prometheus-stack is now part of the Ilum AIO chart, preconfigured to work with an Ilum deployment out of the box and collect metrics from Ilum pods and Spark jobs run by Ilum. Ilum provides Prometheus service monitors to automatically scrape metrics from the Spark driver pods run by Ilum and from the Ilum backend services. Additionally, the ilum-aio chart provides built-in Grafana dashboards that can be found in the Ilum folder.

Values added - ilum-aio

kube-prometheus-stack variables - for extended configuration check kube-prometheus stack helm chart
NameDescriptionValue
kube-prometheus-stack.enabledkube-prometheus-stack enabled flagfalse
kube-prometheus-stack.releaseLabelkube-prometheus-stack flag to watch resources only from the ilum-aio releasetrue
kube-prometheus-stack.kubeStateMetrics.enabledkube-prometheus-stack Component scraping kube state metrics enabled flagfalse
kube-prometheus-stack.nodeExporter.enabledkube-prometheus-stack node exporter daemon set deployment flagfalse
kube-prometheus-stack.alertmanager.enabledkube-prometheus-stack alert manager flagfalse
kube-prometheus-stack.grafana.sidecar.dashboards.folderAnnotationkube-prometheus-stack, If specified, the sidecar will look for annotation with this name to create folder and put graph heregrafana_folder
kube-prometheus-stack.grafana.sidecar.dashboards.provider.foldersFromFilesStructurekube-prometheus-stack, allow Grafana to replicate dashboard structure from filesystemtrue

Values added - ilum-core

NameDescriptionValue
job.prometheus.enabledprometheus enabled flag, If true spark jobs run by Ilum will share metrics in prometheus formattrue
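
A sketch enabling the monitoring stack in ilum-aio values, assuming the subcharts are keyed by their chart names in the AIO values file (the standard Helm subchart convention):

```yaml
# ilum-aio values fragment (assumes standard subchart value nesting)
kube-prometheus-stack:
  enabled: true
ilum-core:
  job:
    prometheus:
      enabled: true   # Spark jobs run by Ilum share metrics in Prometheus format
```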

5. Marquez OpenLineage in ilum-aio chart

Feature

Marquez OpenLineage is now part of the Ilum AIO chart. Marquez enables consuming, storing, and visualizing OpenLineage metadata from across an organization, serving use cases including data governance, data quality monitoring, and performance analytics. With Marquez enabled in the Ilum AIO Helm stack, Spark jobs run by Ilum will share lineage information with the Marquez backend. The Marquez web interface visualizes the data lineage information collected from Spark jobs and is accessible through the Ilum UI as an iframe.

Values added - ilum-aio

NameDescriptionValue
global.lineage.enabledmarquez enabled flagfalse

Values added - ilum-core

NameDescriptionValue
job.openLineage.transport.typemarquez communication typehttp
job.openLineage.transport.serverUrlmarquez backend urlhttp://ilum-marquez:9555/
job.openLineage.transport.endpointmarquez backend endpoint/external/lineage/api/v1/lineage

Values added - ilum-marquez

Newly added whole chart, check its values on chart page

Values added - ilum-ui

NameDescriptionValue
runtimeVars.lineageUrlurl to provide marquez openlineage UI iframehttp://ilum-marquez-web:9444
runtimeVars.lineagePathilum-ui proxy path to marquez openlineage UI/external/lineage/
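
A sketch enabling lineage in the AIO stack with the defaults from the tables above (assuming ilum-core is keyed by its chart name in the AIO values, as is standard for Helm subcharts):

```yaml
# ilum-aio values fragment (assumes standard subchart value nesting)
global:
  lineage:
    enabled: true
ilum-core:
  job:
    openLineage:
      transport:
        type: http
        serverUrl: http://ilum-marquez:9555/
        endpoint: /external/lineage/api/v1/lineage
```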

RELEASE 6.0.3

1. Parameterized kafka producers max.request.size parameter in ilum-core chart

Feature

In Kafka communication mode, Ilum interactive jobs respond to interactive job instances using Kafka producers. With the newly added Helm value, the max.request.size Kafka producer parameter can be adapted to match the size of the responses.

Values added - ilum-core

kafka parameters
NameDescriptionValue
kafka.requestSizekafka max.request.size parameter for ilum jobs kafka producers20000000
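
A sketch raising the producer request size limit, e.g. for large interactive job responses (the number is illustrative):

```yaml
# ilum-core values fragment (illustrative number)
kafka:
  requestSize: 50000000   # max.request.size in bytes for ilum jobs kafka producers
```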

RELEASE 6.0.2

1. Support for hdfs, gcs and azure blob storage in ilum-core chart

Feature

The Ilum cluster no longer has to be attached to S3 storage; the default cluster can now be configured to use HDFS, GCS, or Azure Blob storage as well. This is achieved with the newly added values in the ilum-core Helm chart.

Values deleted - ilum-core

NameReason
kubernetes.s3.bucketReplaced by two separate buckets that must now be set with the new values kubernetes.s3.sparkBucket and kubernetes.s3.dataBucket

Values added - ilum-core

kubernetes storage parameters
NameDescriptionValue
kubernetes.upgradeClusterOnStartupdefault kubernetes cluster upgrade from values in config map flagfalse
kubernetes.storage.typedefault kubernetes cluster storage type, available options: s3, gcs, wasbs, hdfss3
s3 kubernetes storage parameters
NameDescriptionValue
kubernetes.s3.hostdefault kubernetes cluster S3 storage host to store spark resourcess3
kubernetes.s3.portdefault kubernetes cluster S3 storage port to store spark resources7000
kubernetes.s3.sparkBucketdefault kubernetes cluster S3 storage bucket to store spark resourcesilum-files
kubernetes.s3.dataBucketdefault kubernetes cluster S3 storage bucket to store ilum tablesilum-tables
kubernetes.s3.accessKeydefault kubernetes cluster S3 storage access key to store spark resources""
kubernetes.s3.secretKeydefault kubernetes cluster S3 storage secret key to store spark resources""
gcs kubernetes storage parameters
NameDescriptionValue
kubernetes.gcs.clientEmaildefault kubernetes cluster GCS storage client email""
kubernetes.gcs.sparkBucketdefault kubernetes cluster GCS storage bucket to store spark resources"ilum-files"
kubernetes.gcs.dataBucketdefault kubernetes cluster GCS storage bucket to store ilum tables"ilum-tables"
kubernetes.gcs.privateKeydefault kubernetes cluster GCS storage private key to store spark resources""
kubernetes.gcs.privateKeyIddefault kubernetes cluster GCS storage private key id to store spark resources""
wasbs kubernetes storage parameters
NameDescriptionValue
kubernetes.wasbs.accountNamedefault kubernetes cluster WASBS storage account name""
kubernetes.wasbs.accessKeydefault kubernetes cluster WASBS storage access key to store spark resources""
kubernetes.wasbs.sparkContainer.namedefault kubernetes cluster WASBS storage container name to store spark resources"ilum-files"
kubernetes.wasbs.sparkContainer.sasTokendefault kubernetes cluster WASBS storage container sas token to store spark resources""
kubernetes.wasbs.dataContainer.namedefault kubernetes cluster WASBS storage container name to store ilum tables"ilum-tables"
kubernetes.wasbs.dataContainer.sasTokendefault kubernetes cluster WASBS storage container sas token to store ilum tables""
hdfs kubernetes storage parameters
NameDescriptionValue
kubernetes.hdfs.hadoopUsernamedefault kubernetes cluster HDFS storage hadoop username""
kubernetes.hdfs.configdefault kubernetes cluster HDFS storage dict of config files with name as key and base64 encoded content as value""
kubernetes.hdfs.sparkCatalogdefault kubernetes cluster HDFS storage catalog to store spark resources"ilum-files"
kubernetes.hdfs.dataCatalogdefault kubernetes cluster HDFS storage catalog to store ilum-tables"ilum-tables"
kubernetes.hdfs.keyTabdefault kubernetes cluster HDFS storage keytab file base64 encoded content""
kubernetes.hdfs.principaldefault kubernetes cluster HDFS storage principal name""
kubernetes.hdfs.krb5default kubernetes cluster HDFS storage krb5 file base64 encoded content""
kubernetes.hdfs.trustStoredefault kubernetes cluster HDFS storage trustStore file base64 encoded content""
kubernetes.hdfs.logDirectorydefault kubernetes cluster HDFS storage directory absolute path to store eventLog for history server""

Important! Make sure S3/GCS buckets or WASBS containers are already created and reachable!
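
A sketch of a GCS-backed default cluster, following the value names above (the bucket names, service-account email, and key material are placeholders):

```yaml
# ilum-core values fragment (placeholder credentials)
kubernetes:
  storage:
    type: gcs
  gcs:
    clientEmail: "ilum@my-project.iam.gserviceaccount.com"   # placeholder
    privateKeyId: "abc123"                                   # placeholder
    privateKey: "-----BEGIN PRIVATE KEY-----\n..."           # placeholder
    sparkBucket: ilum-files
    dataBucket: ilum-tables
```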

2. Added spark history server to ilum-core helm chart

Feature

The Spark history server can now be deployed along with ilum-core. The history server config is passed to every Spark job Ilum runs, and the history server UI can now be accessed through the Ilum UI. If enabled, it will use the default Kubernetes cluster storage configured with the kubernetes.[STORAGE_TYPE].[PARAMETER] values as the eventLog storage.

Values added - ilum-core

history server parameters
NameDescriptionValue
historyServer.enabledspark history server flagtrue
historyServer.imagespark history server imageilum/spark-launcher:spark-3.5.3
historyServer.addressspark history server addresshttp://ilum-history-server:9666
historyServer.pullPolicyspark history server image pull policyIfNotPresent
historyServer.imagePullSecretsspark history server image pull secrets[]
historyServer.parametersspark history server custom spark parameters[]
historyServer.resourcesspark history server pod resources
limits:
memory: "500Mi"
requests:
memory: "300Mi"
historyServer.service.typespark history server service typeClusterIP
historyServer.service.portspark history server service port9666
historyServer.service.nodePortspark history server service nodePort""
historyServer.service.clusterIPspark history server service clusterIP""
historyServer.service.loadBalancerIPspark history server service loadbalancerIP""
historyServer.ingress.enabledspark history server ingress flagfalse
historyServer.ingress.versionspark history server ingress version"v1"
historyServer.ingress.classNamespark history server ingress className""
historyServer.ingress.hostspark history server ingress host"host"
historyServer.ingress.pathspark history server ingress path"/(.*)"
historyServer.ingress.pathTypespark history server ingress pathTypePrefix
historyServer.ingress.annotationsspark history server annotationsnginx.ingress.kubernetes.io/rewrite-target: /$1
nginx.ingress.kubernetes.io/proxy-body-size: "600m"
nginx.org/client-max-body-size: "600m"

Warnings

  1. Make sure the HDFS logDirectory (Helm value kubernetes.hdfs.logDirectory) is the absolute path of the configured sparkCatalog with an /ilum/logs suffix! E.g. for kubernetes.hdfs.sparkCatalog=spark-catalog, put hdfs://name-node/user/username/spark-catalog/ilum/logs
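
Following the warning above, a sketch of an HDFS-backed history server setup (the NameNode address, username, and catalog names are placeholders):

```yaml
# ilum-core values fragment (placeholder paths)
historyServer:
  enabled: true
kubernetes:
  storage:
    type: hdfs
  hdfs:
    hadoopUsername: "username"   # placeholder
    sparkCatalog: spark-catalog
    dataCatalog: ilum-tables
    # absolute path of sparkCatalog with the /ilum/logs suffix
    logDirectory: hdfs://name-node/user/username/spark-catalog/ilum/logs
```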

3. Job retention in ilum-core chart

Feature

Ilum jobs will be deleted after the configured retention period expires

Values added - ilum-core

job retention parameters
NameDescriptionValue
job.retain.hoursspark jobs retention hours limit168
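
A sketch shortening job retention to one day (the default above, 168 hours, is seven days):

```yaml
# ilum-core values fragment
job:
  retain:
    hours: 24
```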