Migration

Migration from Apache Hadoop to Ilum involves several steps, typically starting with the setup of the new environment, followed by the migration of data and applications, and finally testing and optimization. Here's a general outline of the process:

  1. Preparation: Understand your current Hadoop deployment, including the data, applications, and dependencies it contains. Document all relevant details to ensure nothing is lost in the transition.

  2. Set Up Kubernetes Environment: Install and configure a Kubernetes cluster that meets your organization's needs. This will serve as the foundation for your Ilum-managed Spark clusters.

  3. Install Ilum: Deploy Ilum on your Kubernetes cluster using Helm, a package manager for Kubernetes. Ensure that Ilum is properly configured to manage your Spark clusters.

  4. Data Migration: Begin migrating data from your Hadoop cluster to your new environment. This could involve moving data to a distributed file system accessible by your Kubernetes cluster, or to an S3-compatible storage system if that's part of your new architecture.

  5. Application Migration: Migrate your Spark applications from the Hadoop environment to the new Kubernetes environment. This might involve changes to your applications to adapt them to the differences between Hadoop YARN and Kubernetes.

  6. Update Dependencies: Update any dependencies your applications have, such as changing data sources from HDFS to the new storage location.

  7. Testing: Conduct thorough testing to ensure that your applications are running correctly in the new environment. This should include functional testing as well as performance testing to ensure your applications are performing at least as well as they did in the Hadoop environment.

  8. Optimization: Based on your testing, optimize your Kubernetes and Ilum configurations for the best performance.

  9. Monitoring: Once everything is migrated and optimized, continue monitoring your applications and infrastructure to ensure everything is working smoothly. Ilum provides a web interface that makes it easy to monitor your Spark clusters and jobs.

This is a high-level outline; the specifics will vary depending on your current Hadoop setup, your use cases, and the architecture of your new environment. Migration can be a complex process, so it may be worth working with experts or consulting detailed guides along the way.
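As a rough illustration of steps 4 and 6, the sketch below maps an HDFS URI to an S3-compatible location and prints a matching hadoop distcp command instead of running it. The NameNode address, the bucket name my-ilum-data, the paths, and the helper to_s3a are all hypothetical placeholders, and the sketch assumes the Hadoop cluster already has the S3A connector configured with credentials for the target storage.

```shell
#!/usr/bin/env bash
# Sketch only: derive the new storage location for an HDFS path and
# print (not run) the copy command. All names below are hypothetical.
set -euo pipefail

TARGET_BUCKET="${TARGET_BUCKET:-my-ilum-data}"  # hypothetical bucket name

# Rewrite hdfs://<authority>/<path> to s3a://<bucket>/<path>.
to_s3a() {
  local uri="$1"
  local path="${uri#hdfs://*/}"   # strip the scheme and NameNode authority
  printf 's3a://%s/%s\n' "$TARGET_BUCKET" "$path"
}

src="hdfs://namenode:8020/warehouse/events"     # hypothetical source path
dst="$(to_s3a "$src")"

# Dry run: show the command so it can be reviewed before execution.
echo "hadoop distcp -update $src $dst"
```

With the defaults, the script prints hadoop distcp -update hdfs://namenode:8020/warehouse/events s3a://my-ilum-data/warehouse/events. The same mapping can then be applied to the input and output paths in your Spark applications when updating their data sources.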

Migration Support

Transitioning from Apache Hadoop to a new environment managed by Ilum may seem challenging, but you are not alone in this process. We understand that migrating data and applications, setting up a new environment, and ensuring everything works as expected can be a complex task.

To assist you in this process, our team at Ilum is ready to provide comprehensive support. If you need help with setting up Ilum, migrating your Spark clusters, or any other aspect of the transition process, please feel free to reach out to us. We can provide a Helm chart for easy deployment of Ilum, and guide you through the steps needed to migrate your existing Hadoop cluster to the new environment.

We're committed to making the migration process as smooth as possible for you. Whether you have technical questions, need guidance on best practices, or encounter any issues during the migration, we're here to help.

Please contact us at [email protected] at any time for assistance with your migration to Ilum. Our dedicated support team is ready and eager to assist you in your journey towards efficient and manageable Apache Spark cluster management with Ilum.

Migration Notes

Migrating 5.*.* to 6.0.0

With the release of version 6.0.0, we introduced a new security implementation that requires attention during the migration process. Existing user accounts must be recreated if any changes have been made to the default admin account.

To migrate to version 6.0.0, recreate the accounts as part of the upgrade. The example command below creates two accounts: one for an admin and a second for a regular user.

helm upgrade \
--set "ilum-core.security.internal.users[0].username=admin" \
--set "ilum-core.security.internal.users[0].password=adminPassword" \
--set "ilum-core.security.internal.users[0].roles[0]=ADMIN" \
--set "ilum-core.security.internal.users[1].username=user" \
--set "ilum-core.security.internal.users[1].password=userPassword" \
--set "ilum-core.security.internal.users[1].roles[0]=USER" \
--reuse-values ilum ilum/ilum

For all supported authentication methods and their parameters, see the README.md files in the ilum-core charts.

Migrating 6.0.* to 6.1.0

With the release of version 6.1.0, we introduced a new Ilum Spark storage implementation that requires attention during the migration process. The existing bucket configuration must be updated to match the new schema.

Previously, the S3 bucket that Ilum uses for storing Spark resources was configured with the ilum-core.kubernetes.s3.bucket Helm value. Since version 6.1.0, it has been replaced with two new parameters:

  1. ilum-core.kubernetes.s3.sparkBucket - plays the same role as the previous parameter
  2. ilum-core.kubernetes.s3.dataBucket - configures the bucket used for storing ilum-tables
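
A minimal sketch of how the two parameters might be set during the upgrade, following the same helm upgrade pattern as the 6.0.0 example. The bucket names are hypothetical placeholders, and the APPLY switch is an illustrative convention of this sketch rather than anything Ilum or Helm defines; by default the script only prints the command for review.

```shell
#!/usr/bin/env bash
# Sketch only: set the two 6.1.0 bucket parameters in one upgrade.
# Bucket names are hypothetical; release and chart names follow the
# 6.0.0 example (release "ilum", chart "ilum/ilum").
set -euo pipefail

# Bucket previously configured via ilum-core.kubernetes.s3.bucket (hypothetical name).
SPARK_BUCKET="${SPARK_BUCKET:-my-spark-bucket}"
# Bucket for ilum-tables (hypothetical name).
DATA_BUCKET="${DATA_BUCKET:-my-tables-bucket}"

cmd=(helm upgrade
  --set "ilum-core.kubernetes.s3.sparkBucket=${SPARK_BUCKET}"
  --set "ilum-core.kubernetes.s3.dataBucket=${DATA_BUCKET}"
  --reuse-values ilum ilum/ilum)

# Dry run by default; execute only when APPLY=1 is set in the environment.
if [ "${APPLY:-0}" = "1" ]; then
  "${cmd[@]}"
else
  echo "${cmd[*]}"
fi
```

Reusing the old bucket name for sparkBucket keeps existing Spark resources where they already are, since that parameter plays the same role as the one it replaces.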