Object Storage in Ilum
Overview
Ilum runs entirely on S3-compatible object storage. Every bundled component that reads or writes objects (Trino, Nessie, MLflow, Airflow, Kestra, Langfuse, Jupyter, Loki, Hive Metastore, the Spark History Server, the Ilum core service, and the embedded notebook environment) targets the same Service alias, which routes to whichever provider is currently active. Switching backends is therefore a Helm flag, not a fleet-wide reconfiguration.
This section explains the storage model, lists the supported providers, and links to the operational guides for choosing, migrating between, and extending them.
- Default in 6.7.2 and later: RustFS (Apache-2.0, bundled).
- Still supported and shippable: MinIO (AGPL-3.0, bundled, opt-in).
- Bring your own: any S3-compatible backend reachable from the cluster (AWS S3, Wasabi, Backblaze B2, on-prem MinIO, Google Cloud Storage via S3 interop, and others).
- Planned bundled additions: Garage and SeaweedFS (registry-ready).
The ilum-objectstorage Service alias
A stable Service named ilum-objectstorage is provisioned in the
release namespace. It is a label-selector alias that points at whichever
provider is currently active. The Service exposes two ports:
- 9000 for the S3 API.
- 9001 for the provider's web console (proxied through the Ilum UI's nginx reverse proxy as part of the Object Storage view).
Bundled consumers target this alias by hostname (http://ilum-objectstorage:9000)
rather than provider-specific names like ilum-minio or ilum-rustfs-svc.
Flipping the active provider is therefore a single change to the alias
selector. No consumer reconfiguration is required.
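To check from inside the cluster that the alias currently has a backend behind it, a plain TCP connect to both ports is usually enough. The following is a minimal sketch using only the Python standard library. The function and constant names are illustrative and not part of Ilum, and a successful connect only shows that something is listening, not that it speaks S3.

```python
import socket

# Hostname and ports as described above; the dictionary keys are illustrative labels.
ALIAS_HOST = "ilum-objectstorage"
ALIAS_PORTS = {"s3-api": 9000, "console": 9001}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def probe_alias(host=ALIAS_HOST, ports=ALIAS_PORTS):
    """Probe every alias port and report which ones accept connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Running `probe_alias()` from a pod in the release namespace returns a dictionary such as `{"s3-api": True, "console": True}` when the active provider is up. Both values being `False` suggests the alias has no endpoints.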
The provider registry
The set of providers known to the chart lives under objectStorage.providers
in the helm_aio values:
    objectStorage:
      providers:
        rustfs:
          consolePath: /rustfs/console/
          consoleMode: same-origin
        minio:
          consolePath: /external/minio/
          consoleMode: nginx-rewrite
Each entry declares the provider's iframe path and routing mode for the Ilum UI's Object Storage view. The chart ships entries for the two bundled providers. Adding a third (Garage, SeaweedFS, or any S3-compatible backend) is a values-file edit. See Add a New Object Storage Provider.
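Because Helm merges nested maps from a values overlay into the chart defaults, adding a registry entry does not require restating the bundled ones. The sketch below models that merge in plain Python to show the resulting registry. The `garage` entry, including its `consolePath` and `consoleMode`, is a hypothetical placeholder rather than a value shipped by the chart, and `merge_values` is an illustrative helper, not Helm's implementation.

```python
import copy

# Registry entries for the two bundled providers, as listed above.
CHART_DEFAULTS = {
    "objectStorage": {
        "providers": {
            "rustfs": {"consolePath": "/rustfs/console/", "consoleMode": "same-origin"},
            "minio": {"consolePath": "/external/minio/", "consoleMode": "nginx-rewrite"},
        }
    }
}

# Hypothetical overlay adding a third provider; path and mode are assumptions.
OVERLAY = {
    "objectStorage": {
        "providers": {
            "garage": {"consolePath": "/external/garage/", "consoleMode": "nginx-rewrite"},
        }
    }
}


def merge_values(base, overlay):
    """Recursively merge overlay into a copy of base; overlay wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


registry = merge_values(CHART_DEFAULTS, OVERLAY)["objectStorage"]["providers"]
```

After the merge, `registry` contains the two bundled entries plus the new one, and the defaults are left untouched.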
Active provider, previous provider, cutover
Three flags determine which provider the alias targets when more than one is enabled:
- objectStorage.activeProvider: explicit override. When set to a provider name (rustfs, minio, ...), the alias targets that provider unconditionally. The default value auto defers to the resolution rules below.
- objectStorage.previousProvider: names the data-bearing side during a cutover. Defaults to minio for back-compat with installs that predated the registry.
- objectStorage.cutoverAcknowledged: flips the alias from previousProvider to the other enabled provider once the operator has finished migrating data. Defaults to false.
A legacy rustfs.migrationAcknowledged flag continues to be accepted as
an alias for cutoverAcknowledged so existing values overlays survive
the upgrade unchanged.
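One way to picture the legacy alias is as a fallback when reading the flag. The sketch below assumes that either flag being true counts as an acknowledgement. How the chart resolves two conflicting values is not stated here, so treat the `or` as an assumption. The function name is illustrative.

```python
def effective_cutover_acknowledged(values):
    """Return True if either the current or the legacy flag is set.

    Assumption: the flags are combined with a logical OR; the chart's actual
    precedence for conflicting values may differ.
    """
    object_storage = values.get("objectStorage") or {}
    rustfs = values.get("rustfs") or {}
    current = object_storage.get("cutoverAcknowledged", False)
    legacy = rustfs.get("migrationAcknowledged", False)
    return bool(current or legacy)
```

An overlay that only sets `rustfs.migrationAcknowledged: true` therefore yields the same result as one that sets `objectStorage.cutoverAcknowledged: true`.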
Resolution rules
With activeProvider=auto (the default), the chart resolves the active
provider from the set of enabled providers:
| Enabled providers | cutoverAcknowledged | Alias targets |
|---|---|---|
| None | (irrelevant) | no alias rendered (BYO external S3) |
| One | (irrelevant) | that provider |
| Two | false | previousProvider (data-bearing side) |
| Two | true | the other one (post-cutover) |
| Three or more | (irrelevant) | render-time error asking for an explicit activeProvider |
With an explicit activeProvider, the alias targets the named provider
verbatim. If no pods carry the matching app.kubernetes.io/name label,
the alias has no endpoints and the ilum-core readiness probe surfaces
the misconfiguration in pod logs.
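The resolution rules can be restated as a small function, which is handy for predicting what a given values overlay will do before running an upgrade. This is a sketch of the rules as documented above, not the chart's template code. The function name is illustrative, and the error raised when previousProvider is not one of the two enabled providers is an assumption, since the table does not cover that case.

```python
def resolve_active_provider(enabled, active_provider="auto",
                            previous_provider="minio",
                            cutover_acknowledged=False):
    """Return the provider the alias targets, or None when no alias is rendered."""
    # An explicit override is taken verbatim, regardless of what is enabled.
    if active_provider != "auto":
        return active_provider

    providers = sorted(set(enabled))
    if not providers:
        return None  # BYO external S3: no alias rendered
    if len(providers) == 1:
        return providers[0]
    if len(providers) >= 3:
        raise ValueError("three or more providers enabled: set activeProvider explicitly")

    # Exactly two providers enabled.
    if not cutover_acknowledged:
        return previous_provider  # data-bearing side
    others = [name for name in providers if name != previous_provider]
    if len(others) != 1:
        # Assumption: the source table does not define this case.
        raise ValueError("previousProvider is not one of the enabled providers")
    return others[0]  # post-cutover target
```

For example, `resolve_active_provider(["minio", "rustfs"])` returns `"minio"`, and the same call with `cutover_acknowledged=True` returns `"rustfs"`.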
Common operator scenarios
The following flows are documented elsewhere in this section. Use this list as a map to the right guide for the situation at hand.
- Net-new install. Nothing to do; RustFS is the default. See Provider reference: RustFS.
- Bring your own external S3 from day 0. Disable both bundled providers and point the chart at an external endpoint. See Provider reference: External S3.
- Compare backends before committing. Use the Choose a Provider decision guide and the comparison matrix sourced from the Phase 8 parity report.
- Switch the active provider on a running install. Follow Migrate Between Providers. MinIO to RustFS is the canonical example; the same procedure applies to any source-and-target pair.
- Add a third backend. Follow Add a New Provider. The chart ships placeholder rows for Garage and SeaweedFS so operators can adopt them as soon as upstream stabilizes.
- Rotate credentials. See Rotate Object Storage Credentials.
- Back up and restore bucket data. See Back Up and Restore Object Storage.
- Recover from a misconfiguration. See Troubleshoot Object Storage.
Default bucket layout
The chart provisions seven default buckets on the active provider. Each bucket is owned by one or more bundled consumers. Operators tuning the chart's defaults should consult the table below before renaming or removing entries.
| Bucket | Consumers | Configurable via |
|---|---|---|
| ilum-files | ilum-core (Spark jars, job artifacts), ilum-jupyter, ilum-kyuubi, ilum-hive-metastore, Loki | objectStorage.defaultBuckets |
| ilum-data | ilum-core warehouse, Hive Metastore, Unity Catalog, Trino, Airflow logs | objectStorage.defaultBuckets, per-consumer warehouseDir overrides |
| ilum-tables | ilum-core data tables (Iceberg, Delta, Hudi) | ilum-core.kubernetes.s3.dataBucket |
| ilum-mlflow | MLflow tracking artifacts | mlflow.s3.bucket |
| ilum-kestra | Kestra workflow internal storage | kestra.config.storage_driver.defaults.s3.bucket |
| ilum-ducklake | DuckDB DuckLake catalog | duckdb.ducklake.location |
| ilum-langfuse | Langfuse trace storage | langfuse.s3.bucket |
The bucket list is created idempotently by the bundled init Job
(init-rustfs-buckets for RustFS, init-minio-policies for MinIO).
For external S3 backends, the operator creates the buckets manually
before installing Ilum. See Provider reference: External S3.
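For an external backend, a pre-install check can compare the buckets that already exist against the seven defaults from the table above. The sketch below is a pure function over a list of bucket names. How that list is obtained (for example, from whichever S3 client is already in use) is left open, and the function name is illustrative. If bucket names have been overridden in the values file, pass the customized list as `required`.

```python
# The seven default bucket names from the table above.
DEFAULT_BUCKETS = (
    "ilum-files",
    "ilum-data",
    "ilum-tables",
    "ilum-mlflow",
    "ilum-kestra",
    "ilum-ducklake",
    "ilum-langfuse",
)


def missing_buckets(existing, required=DEFAULT_BUCKETS):
    """Return the required bucket names not present in `existing`, in table order."""
    present = set(existing)
    return [name for name in required if name not in present]
```

An empty result means every default bucket is already in place. Any names returned still need to be created before installing Ilum.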
How the alias plugs into the stack
The same alias hostname routes both the Ilum UI's iframe traffic and every consumer's S3 API calls. Switching providers updates the alias selector; no consumer rewiring is required.
Reference
- Helm values reference: Object Storage Helm Values
- Migration playbook: Migrate Between Providers
- Decision guide: Choose a Provider
- Provider catalog: MinIO · RustFS · External S3 · Garage (planned) · SeaweedFS (planned)
- Operational recovery: Troubleshoot Object Storage