
Target Architecture for Hadoop Migration: Iceberg, Trino, Airflow

The Modernize and Direct paths land a Kubernetes-native lakehouse built on Ilum. This page describes that target platform from the perspective of someone planning a Bifrost migration. For the general-purpose description of the platform itself, see Ilum architecture.

The intent of this page is to make four things concrete:

  • Which services exist on the target platform and how they relate.
  • How a query executes against a migrated table.
  • How data moves from the source cluster into the target platform during migration.
  • How the dual-read bridge lets legacy and migrated tables coexist while the estate is transitioning.

High-Level Topology

The target platform is a set of interoperating Kubernetes services. Identity, query routing, orchestration, catalog, and storage each have a dedicated layer. Ilum is the Spark management plane.

                              +-----------------+
                              |    KEYCLOAK     |
                              |   (OIDC IdP)    |
                              +--------+--------+
                                       |
              +------------------------+-----------------------+
              |                        |                       |
    +---------v--------+    +----------v--------+   +----------v--------+
    |     SUPERSET     |    |   TRINO GATEWAY   |   |     AIRFLOW 3     |
    |  (BI / SQL Lab)  |    |    (L7 Router)    |   |  (Orchestration)  |
    +---------+--------+    +----+----------+---+   +----------+--------+
              |                  |          |                  |
              |          +-------v---+  +---v--------+         |
              |          |   TRINO   |  |   TRINO    |         |
              |          |Interactive|  |    ETL     |         |
              |          | (no retry)|  | (FT exec)  |         |
              |          +-------+---+  +---+--------+         |
              |                  |          |                  |
              +------------------+-----+----+------------------+
                                       |
                            +----------v----------+
                            |   APACHE POLARIS    |
                            |    (Iceberg REST    |
                            |      Catalog)       |
                            +----------+----------+
                                       |
                  +--------------------+--------------------+
                  |                                         |
        +---------v--------+                      +---------v--------+
        |   CEPH OBJECT    |                      |   OPENMETADATA   |
        |  STORAGE (S3A)   |                      |   (Governance)   |
        +------------------+                      +------------------+

                  +-------------------+-------------------+
                  |                   |                   |
        +---------v--------+   +------v-------+   +-------v---------+
        |  SPARK OPERATOR  |   |   YUNIKORN   |   |    CELEBORN     |
        |    (Job CRDs)    |   |  (Scheduler) |   |  (Shuffle Svc)  |
        +------------------+   +--------------+   +-----------------+

                  +-------------------+-------------------+
                  |                   |                   |
        +---------v--------+   +------v-------+   +-------v---------+
        |    ILUM CORE     |   |    KYUUBI    |   |  dbt + COSMOS   |
        |   (Spark Mgmt)   |   |  (SQL GW)    |   |   (Transform)   |
        +------------------+   +--------------+   +-----------------+

Each block is deployed as a Kubernetes workload, typically via Helm. Most components are configured centrally by Bifrost during the bifrost modernize land (or bifrost direct land) phase.

Query Data Flow

When a user runs a query against a migrated Iceberg table, the following chain executes:

User Browser
     |
     v
[1] Superset (authenticates user via Keycloak OIDC)
     |   SQLAlchemy connection to Trino
     v
[2] Trino Gateway (routes to interactive or ETL cluster based on query properties)
     |
     v
[3] Trino Coordinator (authenticated via OAuth2 token from Keycloak)
     |   Parses SQL, resolves table via the catalog
     v
[4] Polaris REST Catalog (authenticates Trino via OAuth2 client_credentials)
     |   Returns Iceberg metadata: manifest list location in object storage
     |   Optionally vends scoped S3 credentials
     v
[5] Trino Workers read Parquet / ORC files directly from object storage via S3A
     |
     v
[6] Results returned to Superset, rendered to the user

The user never interacts with the object store directly. Authorization and row-level or column-level policies are enforced by Trino in collaboration with an OPA sidecar (see Identity and access).
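The routing decision in step [2] happens before any SQL is parsed. A minimal sketch of such a rule, assuming the gateway keys off Trino client tags (X-Trino-Client-Tags is the standard Trino client-tag header; the rule and the cluster names are illustrative, not the shipped gateway configuration):

```python
def route(headers: dict) -> str:
    """Illustrative gateway routing rule: send queries tagged 'etl' to the
    fault-tolerant ETL cluster, everything else to the interactive cluster.
    The backend names 'trino-etl' and 'trino-interactive' are assumptions."""
    tags = headers.get("X-Trino-Client-Tags", "").split(",")
    return "trino-etl" if "etl" in tags else "trino-interactive"
```

A query submitted with client tags "etl,nightly" would land on the ETL cluster; an untagged ad-hoc query from Superset stays on the interactive one.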

Migration Data Flow

Table migration during Modernize or Direct follows a consistent pattern:

[1] bifrost modernize migrate-table --table db.customers --strategy snapshot
     |
     v
[2] Migration driver invokes a Spark job on the target cluster via Ilum
     |   Spark reads Hive Metastore metadata from the legacy catalog
     |   Spark reads data file locations from the legacy storage
     |   Spark creates Iceberg metadata pointing to the existing files (no data copy for snapshot)
     |   Spark registers the table in the Polaris REST catalog
     |
     v
[3] bifrost modernize validate runs data-diff
     |   Row-count comparison: source vs. target
     |   Value-level hash sampling on a configurable percentage of partitions
     |
     v
[4] The decision engine evaluates the results
     |   PROCEED: the table is marked migrated and table redirection is enabled
     |   ABORT: the table is reverted (Iceberg metadata deleted), an alert is sent
     |
     v
[5] Trino table redirection resolves queries to the Iceberg location
     |   Queries to db.customers on the legacy catalog transparently hit the migrated copy

Three strategies are available for table migration: snapshot, migrate, and add_files. Each is described in Per-component migration reference — catalog.
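The validation logic in steps [3] and [4] can be sketched in a few lines. This is a toy model, not the actual data-diff implementation: the sampling scheme, hashing, and return values are assumptions chosen to mirror the row-count check, partition sampling, and PROCEED/ABORT decision described above.

```python
import hashlib
import random

def sample_partitions(partitions, pct, seed=0):
    """Deterministic pseudo-random sample of partitions to hash-check."""
    k = max(1, round(len(partitions) * pct / 100))
    return random.Random(seed).sample(partitions, k)

def row_hash(row):
    """Stable per-row digest; a real data-diff canonicalizes values first."""
    return hashlib.sha256("|".join(map(str, row)).encode()).hexdigest()

def validate(src, dst, partitions, pct=10):
    """Row-count comparison plus hash comparison on sampled partitions.
    src and dst map partition name -> list of rows."""
    if sum(len(v) for v in src.values()) != sum(len(v) for v in dst.values()):
        return "ABORT"  # row counts disagree, no need to sample
    for p in sample_partitions(partitions, pct):
        if {row_hash(r) for r in src[p]} != {row_hash(r) for r in dst[p]}:
            return "ABORT"  # value-level mismatch in a sampled partition
    return "PROCEED"
```

With pct=100 every partition is checked; the production default samples a configurable percentage, trading validation time against coverage.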

Dual-Read Bridge

During Modernize and Direct, the dual-read bridge is the mechanism that lets legacy and migrated tables coexist. Trino is configured with two catalogs: one pointing at the legacy Hive Metastore and one pointing at the Iceberg REST catalog. Table redirection routes queries to the correct catalog per table, transparently.
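Concretely, the two catalogs might be defined along these lines (a sketch with illustrative hostnames and catalog names; hive.iceberg-catalog-name is the Trino Hive connector property that enables Hive-to-Iceberg table redirection):

```properties
# etc/catalog/hive_hdfs.properties -- legacy tables
connector.name=hive
hive.metastore.uri=thrift://legacy-metastore.example:9083
# Redirect tables that already have an Iceberg copy:
hive.iceberg-catalog-name=iceberg_s3

# etc/catalog/iceberg_s3.properties -- migrated tables
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=https://polaris.lakehouse.internal/api/catalog
```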

+-------------------+                   +-------------------+
|   LEGACY HADOOP   |                   |    KUBERNETES     |
|   (bare-metal)    |                   |     PLATFORM      |
|                   |                   |                   |
| +-------------+   |    Sync jobs      |  +-------------+  |
| |    HDFS     |---+------------------>|  |  Ceph RGW   |  |
| |  NameNode   |   |                   |  |    (S3A)    |  |
| +-------------+   |                   |  +------+------+  |
|                   |                   |         |         |
| +-------------+   |   Catalog feed    |  +------v------+  |
| |    Hive     |---+------------------>|  |   Polaris   |  |
| |  Metastore  |   |                   |  |  (Iceberg)  |  |
| +-------------+   |                   |  +------+------+  |
|                   |                   |         |         |
| +-------------+   |                   |  +------v------+  |
| |    YARN     |   |  Dual-mode Spark  |  |  Spark on   |  |
| |  (legacy    |<--+- - - - - - - - - -|  | K8s (Ilum)  |  |
| |  workloads) |   |                   |  +-------------+  |
| +-------------+   |                   |                   |
+-------------------+                   +-------------------+
          |                                       |
          +------- Trino (table redirection) -----+
          |        reads from BOTH catalogs       |
          v                                       v
  hive_hdfs catalog                       iceberg_s3 catalog
   (legacy tables)                        (migrated tables)

The consequence is that BI tools, dbt models, and application queries continue to work unchanged during the transition. Table redirection makes the migration transparent at the query layer.

During this phase, the legacy HDFS cluster remains authoritative. The object storage side is kept in sync by periodic DistCp or continuous replication jobs. Only after validation passes and a silence period elapses is the legacy service decommissioned.

Identity and Access

Authentication across the platform is centralized through Keycloak using OAuth2 and OpenID Connect.

User (browser)
      |
      | [OIDC authorization code flow]
      v
Keycloak --> issues JWT access token with claims:
      - sub: [email protected]
      - groups: [data-engineers, team-analytics]
      - realm_roles: [lakehouse-user]
      |
      +---> Superset: validates JWT, maps groups to Superset roles
      |
      +---> Trino Coordinator: validates JWT via OIDC discovery
      |
      +---> Trino -> Polaris: OAuth2 client_credentials (service-to-service)
      |
      +---> Polaris -> Object storage: static S3 credentials on-prem,
      |     or vended scoped credentials on cloud
      |
      +---> OPA: Trino coordinator calls an OPA sidecar for every query
            through four possible endpoints:
              * opa.policy.uri — allow / deny decisions
              * opa.policy.row-filters-uri — row-level filter expressions
              * opa.policy.column-masking-uri — per-column mask expressions
              * opa.policy.batch-column-masking-uri — batched column masks;
                overrides the per-column endpoint when set
            Row-filter and column-mask endpoints are consulted only for the
            tables and columns involved in the query, and only after the main
            allow check returns allow. OPA evaluates policies against the full
            input structure (input.action, input.context, input.context.identity)
            and returns one response per endpoint call.
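The shape of the allow decision can be mimicked in a few lines. This is a toy stand-in for a Rego policy, not Trino's actual plugin logic; the group name and the permitted operations are assumptions, but the input layout (input.action, input.context.identity) follows the structure described above.

```python
def opa_allow(input_doc: dict) -> bool:
    """Toy allow policy: members of 'data-engineers' may perform any
    operation; everyone else is limited to read-style operations.
    Operation names mirror Trino access-control operations."""
    identity = input_doc["context"]["identity"]
    operation = input_doc["action"]["operation"]
    if "data-engineers" in identity.get("groups", []):
        return True
    return operation in {"ExecuteQuery", "SelectFromColumns", "FilterTables"}
```

A real policy would also consult the resource named in input.action (catalog, schema, table) before answering, and the row-filter and column-mask endpoints would return expressions rather than booleans.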

Credential vending: cloud vs. on-prem

  • Cloud deployments support credential vending end to end. The catalog generates temporary, scoped object storage credentials per table access.
  • On-premises deployments typically use static credentials for object storage access combined with fine-grained OPA policies at the query layer. Ceph RGW has supported STS-based credential vending since the Reef release, but end-to-end integration with the Iceberg REST catalog on-prem still requires extra role-assumption wiring; most production on-prem deployments therefore fall back to static keys. Catalog-level authorization always controls which principals can see which namespaces and tables.

Service account matrix

Service                              Authenticates to   Method
---------------------------------------------------------------------------------------------
Superset                             Keycloak           OIDC authorization code flow
Trino (user-facing)                  Keycloak           OIDC (JWT validation)
Trino to Polaris                     Keycloak           OAuth2 client_credentials
Trino to object storage              Object storage     Static S3 access or vended credentials
Trino to OPA                         OPA sidecar        localhost HTTP (trusted-network co-location)
Airflow                              Keycloak           OIDC authorization code flow
Airflow to Trino                     Trino              Service account
Spark (via Ilum) to object storage   Object storage     Credentials injected via Ilum cluster config
Spark to Polaris                     Keycloak           OAuth2 client_credentials
OpenMetadata                         Keycloak           OIDC authorization code flow
OpenMetadata to Trino                Trino              Service account for metadata ingestion

For the underlying Ilum identity model, see Ilum security.

Networking and Ingress

Bifrost provisions an ingress controller (or reuses the customer's existing one) with the following default hostnames. The domain is customer-specific and comes from the global.domain value in the deployment configuration.

Hostname                          Backend service     Notes
--------------------------------------------------------------------------------------
superset.lakehouse.internal       Superset            WebSocket support for async SQL Lab
trino.lakehouse.internal          Trino Gateway       Routes to interactive or ETL cluster
airflow.lakehouse.internal        Airflow webserver
polaris.lakehouse.internal        Polaris             REST API and OAuth2 token endpoint
openmetadata.lakehouse.internal   OpenMetadata
ilum.lakehouse.internal           Ilum UI
keycloak.lakehouse.internal       Keycloak            Must be reachable from all clients

Legacy to Kubernetes connectivity

During the dual-read bridge phase, legacy Hadoop nodes must reach Kubernetes-hosted services (the object storage endpoint for sync writes), and Kubernetes pods must reach legacy HDFS (for Spark reads during the transition).

  • Kubernetes to Hadoop — Spark on Kubernetes reads HDFS by mounting the legacy Hadoop configuration as ConfigMaps. The Kubernetes pod network must reach the source NameNode and DataNodes.
  • Hadoop to Kubernetes — the object storage endpoint is exposed via a NodePort or LoadBalancer service with a stable address. DistCp uses that endpoint for large data transfers.
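For the Kubernetes-to-Hadoop direction, the legacy client configuration can be carried in a ConfigMap along these lines (the name, namespace, and elided XML bodies are illustrative, not the resource Bifrost generates):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: legacy-hadoop-conf
  namespace: lakehouse
data:
  core-site.xml: |
    <configuration>
      <!-- properties copied from the legacy cluster -->
    </configuration>
  hdfs-site.xml: |
    <configuration>
      <!-- NameNode addresses, HA settings, etc. -->
    </configuration>
```

Spark pods then mount this ConfigMap and point HADOOP_CONF_DIR at the mount path, so reads against hdfs:// URIs resolve to the source NameNode.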

Default network policies

Bifrost ships a default set of NetworkPolicy resources for namespace isolation:

  • Spark executor pods can communicate with the shuffle service, the object storage endpoint, Polaris, Ilum Core, and their own driver pod.
  • Trino workers can communicate with the Trino coordinator, the object storage endpoint, and Polaris.
  • The OPA sidecar accepts traffic from localhost only (it is co-located with the Trino coordinator).
  • Shuffle service workers accept connections from Spark executor pods in the same namespace.

These defaults can be extended or tightened by the customer's platform team; they are a starting point, not a hard limit.
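As an example of what such a default can look like, a NetworkPolicy restricting Trino worker egress might read as follows (labels, names, and namespace are illustrative, not the shipped resource):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: trino-worker-egress
  namespace: lakehouse
spec:
  podSelector:
    matchLabels:
      app: trino-worker          # applies to worker pods only
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: trino-coordinator
    - to:
        - podSelector:
            matchLabels:
              app: polaris
    - to:
        - podSelector:
            matchLabels:
              app: ceph-rgw      # object storage endpoint
```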

Custom Resources

Bifrost tracks migration progress using three Custom Resource Definitions installed under the bifrost.ilum.cloud/v1 API group. These are visible via kubectl get and reflect the state the CLI operates on.

TableMigration

Represents a single table in flight or migrated.

  • Spec fields: sourceTable, sourceCatalog, targetCatalog, strategy (snapshot / migrate / add_files), wave, priority.
  • Status fields: phase (one of discovered, planned, bridged, migrating, validating, migrated, decommissioned, failed), rowCountSource, rowCountTarget, dataDiffResult, lastTransitionTime (RFC-3339), revertAvailable.
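Put together, a TableMigration object might look like this (the field names follow the spec and status lists above; the values are invented for illustration):

```yaml
apiVersion: bifrost.ilum.cloud/v1
kind: TableMigration
metadata:
  name: db-customers
  namespace: bifrost
spec:
  sourceTable: db.customers
  sourceCatalog: hive_hdfs
  targetCatalog: iceberg_s3
  strategy: snapshot
  wave: 2
  priority: high
status:
  phase: validating
  rowCountSource: 1284930
  rowCountTarget: 1284930
  revertAvailable: true
```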

WorkflowMigration

Represents an Oozie-to-Airflow conversion unit.

  • Spec fields: sourceType (oozie), sourcePath, targetType (airflow), targetDagId.
  • Status fields: phase (discovered, analyzing, converting, converted, testing, deployed), actionsTotal, actionsConverted, actionsManual, todoAnnotations.

ServiceMigration

Represents a bulk service-level migration (for example, HDFS-to-Ceph storage migration).

  • Spec fields: sourceService, targetService, sourceCluster.
  • Status fields: phase, storageMigratedTB, storageTotalTB, percentComplete, lastSyncTime.

Inspect in-flight migration state with kubectl get tablemigrations -n bifrost or bifrost modernize status, which reads the same objects.

Further Reading