Configure Cloud Object Storage (GCS, S3, Azure) for Data Lake
Ilum allows you to link GCS, S3, WASBS, and HDFS storage to your clusters. Once storage is linked, Ilum automatically configures all your jobs to use your cloud data lake, eliminating the need for manual Spark parameter configuration.
Supported Storage Providers
| Provider | Type | Description |
|---|---|---|
| Google Cloud Storage | GCS | Native integration for GCP projects. |
| Amazon S3 | S3 | Standard S3 and S3-compatible storage support. |
| Azure Blob Storage | WASBS/ABFS | Integration for Azure data lakes. |
| HDFS | HDFS | Connect to existing Hadoop Distributed File Systems. |
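Once linked, jobs address each storage with its standard URI scheme. For orientation, a few illustrative paths (all bucket, container, and host names below are placeholders):

```scala
// Illustrative path formats per provider; every name here is a placeholder.
val gcsPath   = "gs://my-ilum-bucket/data/"                                  // Google Cloud Storage
val s3Path    = "s3a://my-ilum-bucket/data/"                                 // Amazon S3 (s3a connector)
val azurePath = "wasbs://my-container@myaccount.blob.core.windows.net/data/" // Azure Blob Storage
val hdfsPath  = "hdfs://namenode:8020/data/"                                 // HDFS
```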
Google Cloud Storage (GCS)
Step 1: Create a GCS Bucket
1. Create a Google Cloud Project
   - Open the Google Cloud Console and go to Project Selector / Manage Resources.
   - Click New Project / Create Project.
   - Enter a Project name, then choose an Organization and Location.
2. Create a GCS Bucket
   - In the Console, navigate to Cloud Storage → Buckets.
   - Click Create.
   - Enter a globally unique Bucket name (e.g., my-ilum-bucket) and select your Region.

   Note: Remember the bucket name you created - you will need it when adding this storage to Ilum.
3. Create a Service Account and JSON Key
   - Go to IAM & Admin → Service Accounts.
   - Click Create Service Account, fill in the details, and grant the Storage Admin role.
   - Click the created service account's email, open the Keys tab, and use Create new key (JSON).
   - Save the downloaded JSON file securely.

   Important: In new organizations, creating service account keys might be disabled by default by organization policy. Contact your administrator if you cannot create keys.
Step 2: Add GCS to Ilum Cluster
1. Navigate to Workloads → Clusters → Edit → Storage → Add Storage.
2. Configure General Settings:
| Parameter | Value Example | Description |
|---|---|---|
| Name | my-gcs-storage | Unique name for this storage config. |
| Type | GCS | Select GCS provider. |
| Spark Bucket | my-ilum-bucket | Bucket for Spark logs/events. |
| Data Bucket | my-ilum-bucket | Bucket for your data. |
3. Configure GCS Authorization: Open your JSON key file and copy the values:
| Parameter | Source Key | Description |
|---|---|---|
| Client Email | client_email | Service account email address. |
| Private Key | private_key | Full key including -----BEGIN.... |
| Private Key ID | private_key_id | Key ID string. |
4. Click Submit to save.
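For context, this is roughly the per-job Hadoop configuration that linking GCS saves you from writing by hand. A minimal sketch assuming the open-source gcs-connector's keyfile-based properties (property names can vary between connector versions; all values are placeholders):

```scala
import org.apache.spark.sql.SparkSession

// A sketch of the manual alternative Ilum automates: keyfile-based auth
// for the GCS Hadoop connector. Values come from the service account JSON key.
val spark = SparkSession.builder()
  .appName("gcs-manual-config-sketch")
  .config("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
  .config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
  .config("spark.hadoop.fs.gs.auth.service.account.email", "<client_email>")
  .config("spark.hadoop.fs.gs.auth.service.account.private.key.id", "<private_key_id>")
  .config("spark.hadoop.fs.gs.auth.service.account.private.key", "<private_key>")
  .getOrCreate()
```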
Amazon S3
The process for adding S3 storage is nearly identical to GCS. You will need to provide your AWS credentials (Access Key and Secret Key) instead of a JSON key file.
- Navigate to Workloads → Clusters → Edit → Storage → Add Storage.
- Select S3 as the Type.
- Fill in the required fields:
| Parameter | Description |
|---|---|
| Name | Unique name for this storage config. |
| Access Key | Your AWS Access Key ID. |
| Secret Key | Your AWS Secret Access Key. |
| Region | AWS Region of your bucket (e.g., us-east-1). |
| Endpoint | (Optional) Custom endpoint for S3-compatible storage (e.g., MinIO). |
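As with GCS, Ilum applies the equivalent s3a settings to each job for you. A minimal sketch of the manual alternative, using standard hadoop-aws property names (values are placeholders; the endpoint settings only matter for S3-compatible stores such as MinIO):

```scala
import org.apache.spark.sql.SparkSession

// A sketch of the manual alternative Ilum automates: s3a credentials and,
// optionally, a custom endpoint for S3-compatible storage. Values are placeholders.
val spark = SparkSession.builder()
  .appName("s3-manual-config-sketch")
  .config("spark.hadoop.fs.s3a.access.key", "<AWS access key ID>")
  .config("spark.hadoop.fs.s3a.secret.key", "<AWS secret access key>")
  .config("spark.hadoop.fs.s3a.endpoint", "<custom endpoint, e.g. a MinIO URL>") // optional
  .config("spark.hadoop.fs.s3a.path.style.access", "true")                       // often needed for MinIO
  .getOrCreate()
```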
Azure Blob Storage
The process for adding Azure storage is nearly identical to GCS and S3. You will need your Azure Storage Account Name and Access Key.
- Navigate to Workloads → Clusters → Edit → Storage → Add Storage.
- Select Azure (or WASBS) as the Type.
- Fill in the required fields:
| Parameter | Description |
|---|---|
| Name | Unique name for this storage config. |
| Account Name | Your Azure Storage Account name. |
| Account Key | Your Azure Storage Account Access Key. |
| Container | Name of the container to use. |
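Here too, linking the storage replaces manual Hadoop configuration. For classic Blob Storage (wasbs://), the account key is normally supplied through a single property scoped to the storage account's host name; a sketch with placeholder names:

```scala
import org.apache.spark.sql.SparkSession

// A sketch of the manual alternative Ilum automates for wasbs://.
// The property name embeds the storage account host; all names are placeholders.
val accountName = "myaccount"
val spark = SparkSession.builder()
  .appName("azure-manual-config-sketch")
  .config(s"spark.hadoop.fs.azure.account.key.$accountName.blob.core.windows.net",
    "<storage account access key>")
  .getOrCreate()

// Paths then look like: wasbs://<container>@myaccount.blob.core.windows.net/<path>
```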
Step 3: Verify Connection
To ensure your storage is correctly configured, run a simple Spark job.
1. Create a Code Service:
   - Go to Workloads → Services → New Service +.
   - Select Type: Code, Language: Scala, and your Cluster.
2. Execute Test Code: Paste and run the following Scala code:

   ```scala
   // Test storage connection: write test data
   val data = Seq(("Alice", 34), ("Bob", 45))
   val df = spark.createDataFrame(data).toDF("name", "age")

   // Replace with your bucket path (e.g., gs://..., s3a://..., wasbs://...)
   val path = "gs://my-ilum-bucket/output/"
   df.write.mode("overwrite").format("csv").save(path)

   // Read the data back
   spark.read.format("csv").load(path).show()
   ```

3. Check Results: If the job completes and displays the data table, your storage connection is active.
Common Issues & FAQ
Why do I get a "Permission Denied" error?
Cause: The service account or user doesn't have permission to access the bucket.
Solution:
- Go to your cloud provider's console (e.g., Google Cloud Console).
- Navigate to the bucket's Permissions tab.
- Grant your service account the Storage Admin or Storage Object Admin role.
Why does it say "Bucket does not exist"?
Cause: The bucket name in your code doesn't match the actual bucket name, or the region is incorrect.
Solution:
- Verify the bucket exists in your cloud console.
- Check that the bucket name in your code matches exactly (names are often case-sensitive).
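To check programmatically, you can probe the bucket with the Hadoop FileSystem API through the same configuration your jobs use (a sketch; replace the URI with your provider's scheme and bucket):

```scala
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}

// Probe the bucket through the Hadoop configuration the Spark session uses.
// Replace the URI with your own bucket (gs://, s3a://, wasbs://, ...).
val bucketUri = "gs://my-ilum-bucket/"
val fs = FileSystem.get(new URI(bucketUri), spark.sparkContext.hadoopConfiguration)
println(fs.exists(new Path(bucketUri))) // false suggests a name typo or a missing bucket
```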
Why do I get "Invalid credentials"?
Cause: The keys (JSON or Access Keys) were not copied correctly.
Solution:
- Re-open your key file.
- Carefully copy the values again. For GCS, ensure you include the -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- lines.
- Re-save the storage configuration in Ilum.
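One GCS-specific gotcha: in the JSON key file, the private_key field stores its line breaks as literal \n escapes. Depending on how you paste the value, you may need to restore real newlines first; a tiny illustrative check (the key material below is a placeholder):

```scala
// In the JSON key file, line breaks inside private_key appear as literal "\n".
// If the value was pasted with the escapes intact, convert them to real newlines.
val rawKeyFromJson = "-----BEGIN PRIVATE KEY-----\\nMIIE...\\n-----END PRIVATE KEY-----\\n"
val normalizedKey  = rawKeyFromJson.replace("\\n", "\n")
assert(normalizedKey.split("\n").head == "-----BEGIN PRIVATE KEY-----")
```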