📄️ SQL Viewer
Execute SQL queries directly in your browser with Ilum's SQL Viewer. Explore data, debug Spark applications, and visualize results without writing code, using either the Spark SQL or Trino engine.
📄️ Schedule
Automate your Spark jobs with Ilum's built-in scheduler. Easily configure periodic tasks for data ingestion, ETL pipelines, and reporting using an intuitive UI or custom CRON expressions.
📄️ File Explorer
Browse and manage files across multiple storage systems, profile data quality, and create tables from raw files. Supports S3, GCS, HDFS, and Azure Storage.
📄️ Table Explorer
Browse and monitor your datasets with the Table Explorer. Inspect table metadata, view data lineage, and interactively visualize data samples using the built-in Data Exploration Tool.
📄️ Data Lineage
Data lineage is the process of tracking data flow from source to destination. Learn how to visualize data, understand the flow of data through pipelines, and implement data lineage for your organization.
📄️ Ilum Tables
Ilum Tables provides a unified Spark wrapper for Delta, Iceberg, and Hudi data formats. Learn how to easily read, write, and stream data using a consistent interface for flexible data management.
🗃️ Notebooks
Discover Ilum's notebook capabilities. Compare JupyterLab, JupyterHub, and Zeppelin environments to choose the best tool for your interactive analytics, data science, and collaborative workflows.
🗃️ Data Catalogs
Explore Ilum's supported data catalogs. Learn about the default Hive Catalog for stability and compatibility, and the optional Nessie Catalog for Git-like version control of your data lake.
📄️ Clusters and Storages
Manage your multi-cluster Spark infrastructure and diverse storage systems from a centralized control plane. Simplify access, enhance security, and streamline job deployment across local, GKE, and other clusters.
📄️ Spark Connect Server
Leverage Spark Connect in Ilum to decouple client applications from Spark clusters. Learn how to run remote Spark jobs, build interactive applications, and connect securely from local environments.
📄️ Airflow
Technical guide for orchestrating Spark on Kubernetes using Apache Airflow and Ilum. Covers LivyOperator configuration, dependency management, logging architecture, and Git Sync.
📄️ Monitoring
A comprehensive guide to monitoring Apache Spark on Kubernetes with Ilum. Learn to configure Prometheus metrics, visualize data in Grafana, analyze logs with Loki, and debug performance issues like OOM errors and data skew.
📄️ Superset
Configure Apache Superset on Kubernetes with Ilum. Comprehensive guide on Spark/Trino integration, SQLAlchemy tuning, Helm deployment strategies, and performance optimization for enterprise BI.
📄️ Tableau
Technical integration guide for connecting Tableau Desktop to Ilum via JDBC.
📄️ MLflow
Technical documentation for MLflow integration in Ilum. Architecture, Spark autologging, model registry operations, and batch inference implementation on Kubernetes.
📄️ NiFi
A comprehensive technical guide on integrating Apache NiFi with Ilum. Learn to orchestrate event-driven data pipelines, trigger Spark jobs via REST API, and manage robust data flows.
📄️ AI Data Analyst Agent
Discover Ilum's AI Data Analyst Agent, an intelligent assistant powered by LLMs that translates natural language into optimized SQL queries, integrates with your data catalog, and simplifies complex data analysis for all users.
📄️ n8n
Automate your data workflows with Ilum's n8n integration. Build visual pipelines connecting Apache Spark, APIs, and databases using a low-code interface with custom nodes for deep platform integration.
📄️ Data Science Platform
Explore Ilum's unified Data Science Platform, offering seamless data access, pre-configured notebook environments, automated MLOps with MLflow, and tools to build and deploy AI applications at scale.
📄️ Kestra
Orchestrate Apache Spark jobs on Kubernetes with Kestra and Ilum. Learn to build declarative, event-driven data pipelines using YAML workflows and to optimize submission latency.
📄️ Resource Control & Governance
Master Kubernetes resource management in Ilum. Deep dive into Resource Quotas, Limit Ranges, and their impact on Spark workloads for multi-tenant cluster stability.
📄️ Mage
Technical guide for integrating Mage with Ilum. Learn how to architect production-grade ETL pipelines using Spark Connect, manage Kubernetes resources, and configure custom Docker environments for distributed data processing.
📄️ Streamlit
Build and deploy enterprise-grade data applications with Streamlit in Ilum. Comprehensive guide on integrating Spark Connect, optimizing performance with caching, and configuring Kubernetes deployments for scalable data dashboards.