RBAC Security Modes
Ilum Core provides two distinct RBAC (Role-Based Access Control) security modes for Kubernetes permissions: Unrestricted and Restricted. Understanding the security implications of each mode is crucial for implementing appropriate security controls in your environment.
Security Overview
The choice between unrestricted and restricted RBAC modes represents a fundamental security decision that affects your deployment's attack surface, compliance posture, and operational security.
Security Principles
- Principle of Least Privilege: Restricted mode implements minimal necessary permissions
- Defense in Depth: Multiple layers of security controls reduce overall risk
- Namespace Isolation: Containment of potential security breaches
- Audit Trail: Clear permission boundaries for security auditing
Security Mode Comparison
Unrestricted RBAC Mode (Default)
Security Profile: High-privilege, cluster-wide access
Risk Level: ⚠️ Higher Risk - Suitable for development and trusted environments
Characteristics:
- Cluster-wide permissions via ClusterRole/ClusterRoleBinding
- Broad access across all namespaces
- Administrative capabilities for cluster resources
- Simplified deployment and management
Restricted RBAC Mode
Security Profile: Minimal-privilege, namespace-scoped access
Risk Level: ✅ Lower Risk - Recommended for production environments
Characteristics:
- Namespace-scoped permissions via Role/RoleBinding
- Limited to deployment namespace only
- Reduced administrative capabilities
- Enhanced security posture
Detailed Security Analysis
Permission Scope Comparison
Unrestricted mode (ClusterRole) is shown first, followed by restricted mode (Role).
# Cluster-wide permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spark-submit-cluster-role-{{ .Release.Namespace }}
rules:
  # spark submit related roles
  - apiGroups: [ "" ]
    resources: [ "pods","configmaps","services","persistentvolumeclaims" ]
    verbs: [ "create","delete","deletecollection","get","list","patch","update","watch" ]
  # monitoring spark
  - apiGroups: [ "" ]
    resources: [ "pods/log" ]
    verbs: [ "get","list","watch" ]
  # managing schedules
  - apiGroups: [ "batch" ]
    resources: [ "cronjobs" ]
    verbs: [ "create","delete","update" ]
  # cluster force deletion
  - apiGroups: [ "batch" ]
    resources: [ "jobs" ]
    verbs: [ "create","delete","get" ]
  # cluster metrics
  - apiGroups: [ "metrics.k8s.io" ]
    resources: [ "nodes", "pods" ]
    verbs: [ "get","list" ]
  # yarn spark jobs prometheus metrics
  - apiGroups: [ "" ]
    resources: [ "endpoints" ]
    verbs: [ "create","delete","patch","update" ]
  - apiGroups: [ "monitoring.coreos.com" ]
    resources: [ "servicemonitors" ]
    verbs: [ "create","delete","get","list","patch","update" ]
  # cluster namespace management
  - apiGroups: [ "" ]
    resources: [ "resourcequotas" ]
    verbs: [ "create","delete","get","update" ]
  - apiGroups: [ "" ]
    resources: [ "limitranges" ]
    verbs: [ "create","delete","get","update" ]
  # cluster namespace management sensitive rules
  - apiGroups: [ "" ]
    resources: [ "namespaces" ]
    verbs: [ "create", "get" ]
  - apiGroups: [ "" ]
    resources: [ "serviceaccounts" ]
    verbs: [ "create" ]
  - apiGroups: [ "rbac.authorization.k8s.io" ]
    resources: [ "clusterrolebindings" ]
    verbs: [ "create", "get", "patch", "update" ]
Security Implications:
- ❌ Cluster-wide access to every resource type listed above
- ❌ Cross-namespace operations
- ❌ Sensitive privileges: namespace, ServiceAccount, and ClusterRoleBinding creation
- ❌ Broad attack surface
# Namespace-scoped permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-submit-ilum-restricted-role-{{ .Release.Namespace }}
rules:
  # spark submit related roles
  - apiGroups: [ "" ]
    resources: [ "pods","configmaps","services","persistentvolumeclaims" ]
    verbs: [ "create","delete","deletecollection","get","list","patch","update","watch" ]
  # monitoring spark
  - apiGroups: [ "" ]
    resources: [ "pods/log" ]
    verbs: [ "get","list","watch" ]
  # managing schedules
  - apiGroups: [ "batch" ]
    resources: [ "cronjobs" ]
    verbs: [ "create","delete","update" ]
  # cluster force deletion
  - apiGroups: [ "batch" ]
    resources: [ "jobs" ]
    verbs: [ "create","delete","get" ]
  # cluster metrics
  - apiGroups: [ "metrics.k8s.io" ]
    resources: [ "nodes", "pods" ]
    verbs: [ "get","list" ]
  # yarn spark jobs prometheus metrics
  - apiGroups: [ "" ]
    resources: [ "endpoints" ]
    verbs: [ "create","delete","patch","update" ]
  - apiGroups: [ "monitoring.coreos.com" ]
    resources: [ "servicemonitors" ]
    verbs: [ "create","delete","get","list","patch","update" ]
  # cluster namespace management
  - apiGroups: [ "" ]
    resources: [ "resourcequotas" ]
    verbs: [ "create","delete","get","update" ]
  - apiGroups: [ "" ]
    resources: [ "limitranges" ]
    verbs: [ "create","delete","get","update" ]
Security Implications:
- ✅ Namespace-limited access
- ✅ Specific resource permissions
- ✅ Minimal required verbs
- ✅ Reduced attack surface
Security Risk Matrix
| Security Aspect | Unrestricted Mode | Restricted Mode |
|---|---|---|
| Blast Radius | 🔴 Entire Cluster | 🟢 Single Namespace |
| Privilege Escalation Risk | 🔴 High | 🟢 Low |
| Cross-Namespace Access | 🔴 Allowed | 🟢 Blocked |
| Compliance Readiness | 🟡 Moderate | 🟢 High |
| Audit Complexity | 🔴 Complex | 🟢 Simple |
| Multi-Tenancy Support | 🔴 Poor | 🟢 Excellent |
Security Benefits of Restricted Mode
1. Principle of Least Privilege Implementation
Restricted mode grants only the minimum permissions required for Spark job execution:
# Example: Minimal pod permissions
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
2. Namespace Isolation
Security Boundary: All operations are contained within the deployment namespace
# Restricted mode prevents cross-namespace access
kubectl auth can-i create pods --as=system:serviceaccount:ilum:ilum-core -n other-namespace
# Result: no
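To repeat this check across several namespaces at once, a small wrapper along the following lines may help. This is only a sketch: the function name `check_pod_create_scope` is hypothetical, and the service account `ilum-core` in namespace `ilum` is assumed from the example above. Only `kubectl auth can-i` itself is an existing command, and `--as` requires that the calling user is allowed to impersonate service accounts.

```shell
# Sketch: report whether a service account may create pods in each given namespace.
# check_pod_create_scope is a hypothetical helper name, not part of Ilum or kubectl.
check_pod_create_scope() {
  sa_namespace="$1"   # namespace of the service account, e.g. ilum
  sa_name="$2"        # service account name, e.g. ilum-core (assumed)
  shift 2
  for target_ns in "$@"; do
    # kubectl auth can-i prints "yes" or "no"; an empty answer is shown as "unknown".
    answer=$(kubectl auth can-i create pods \
      --as="system:serviceaccount:${sa_namespace}:${sa_name}" \
      -n "$target_ns" 2>/dev/null)
    echo "${target_ns}: ${answer:-unknown}"
  done
}

# Usage (requires cluster access):
#   check_pod_create_scope ilum ilum-core ilum other-namespace
```

In restricted mode the expectation is "yes" for the deployment namespace and "no" for every other namespace.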
3. Reduced Attack Surface
Eliminated Capabilities:
- Namespace creation
- ServiceAccount management across namespaces
- ClusterRoleBinding modifications
- Cross-namespace resource access
- Pod command execution (pods/exec)
4. Enhanced Audit Trail
Restricted permissions provide clearer audit boundaries:
# Audit query example
kubectl get events -n ilum --field-selector involvedObject.kind=Pod
Restricted RBAC Guide
The restricted RBAC feature in Ilum Core provides enhanced security by implementing the principle of least privilege for Kubernetes permissions. This feature limits the service account permissions to only what's necessary for Spark job execution, making it ideal for production and security-sensitive environments.
Overview
By default, Ilum Core uses cluster-wide permissions (ClusterRole/ClusterRoleBinding) that provide broad access across the entire Kubernetes cluster. The restricted RBAC mode switches to namespace-scoped permissions (Role/RoleBinding) and removes sensitive cluster-level operations, significantly reducing the security footprint.
Security Benefits
- Principle of Least Privilege: Grants only the minimum permissions required for Spark job execution
- Namespace Isolation: Limits permissions to the specific namespace where Ilum is deployed
- Reduced Attack Surface: Eliminates unnecessary cluster-wide permissions
- Production Ready: Designed for security-conscious production environments
- Compliance Friendly: Helps meet security compliance requirements
Configuration and Implementation
Enabling Restricted RBAC
Before updating to restricted RBAC mode, you MUST delete any existing ClusterRole and ClusterRoleBinding resources from previous Ilum installations to avoid permission conflicts.
# Delete existing cluster-wide RBAC resources before upgrade
kubectl delete clusterrole spark-submit-cluster-role-NAMESPACE --ignore-not-found=true
kubectl delete clusterrolebinding spark-submit-cluster-role-binding-NAMESPACE --ignore-not-found=true
Failure to delete these resources may result in permission conflicts and deployment issues.
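When several namespaces need the same cleanup, the two names can be derived from the namespace instead of being typed by hand. The sketch below assumes the name patterns shown in the commands above; both function names are hypothetical.

```shell
# Sketch: derive the legacy cluster-wide RBAC object names for a namespace.
# The name patterns come from the cleanup commands above; the helper names
# legacy_rbac_names and delete_legacy_rbac are hypothetical.
legacy_rbac_names() {
  ns="$1"
  echo "clusterrole/spark-submit-cluster-role-${ns}"
  echo "clusterrolebinding/spark-submit-cluster-role-binding-${ns}"
}

# Delete each derived object; missing objects are ignored.
delete_legacy_rbac() {
  legacy_rbac_names "$1" | while read -r resource; do
    kubectl delete "$resource" --ignore-not-found=true
  done
}

# Usage (requires cluster access):
#   legacy_rbac_names ilum     # review the names first
#   delete_legacy_rbac ilum
```

Printing the names before deleting gives a chance to confirm that they match what is actually installed.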
To enable restricted RBAC mode, use one of the following: a Helm upgrade of an existing release, a fresh installation, or a values.yaml file.
# Helm upgrade of an existing release with restricted RBAC
helm upgrade ilum ilum/ilum \
--set ilum-core.rbac.restricted.enabled=true \
--reuse-values
# Fresh installation with restricted RBAC
helm install ilum ilum/ilum \
--set ilum-core.rbac.restricted.enabled=true \
--namespace ilum \
--create-namespace
# values-restricted.yaml
ilum-core:
  rbac:
    restricted:
      enabled: true
Deploy with:
helm install ilum ilum/ilum -f values-restricted.yaml --namespace ilum --create-namespace
Configuration Parameters
| Parameter | Description | Type | Default | Valid Options |
|---|---|---|---|---|
| rbac.restricted.enabled | Enable restricted RBAC mode with namespace-scoped permissions | boolean | false | true, false |
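After an install or upgrade, one way to confirm which mode is active is to look for the RBAC object that each mode creates. The sketch below assumes the object names match the manifests shown earlier in this document (spark-submit-ilum-restricted-role-NAMESPACE and spark-submit-cluster-role-NAMESPACE); the function name `detect_rbac_mode` is hypothetical.

```shell
# Sketch: guess the active RBAC mode from the objects present in the cluster.
# Object names are assumed from the manifests in this document;
# detect_rbac_mode is a hypothetical helper name.
detect_rbac_mode() {
  ns="$1"   # Ilum deployment namespace
  if kubectl get role "spark-submit-ilum-restricted-role-${ns}" -n "$ns" >/dev/null 2>&1; then
    echo restricted
  elif kubectl get clusterrole "spark-submit-cluster-role-${ns}" >/dev/null 2>&1; then
    echo unrestricted
  else
    echo unknown
  fi
}

# Usage (requires cluster access):
#   detect_rbac_mode ilum
```

An "unknown" result means neither object was found under the assumed name, which may simply indicate a different naming scheme in the installed chart version.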
What Changes in Restricted Mode
Permissions Granted
In restricted mode, Ilum Core receives the following namespace-scoped permissions:
Core Spark Operations
- Pods: create, delete, deletecollection, get, list, patch, update, watch
- ConfigMaps: create, delete, deletecollection, get, list, patch, update, watch
- Services: create, delete, deletecollection, get, list, patch, update, watch
- PersistentVolumeClaims: create, delete, deletecollection, get, list, patch, update, watch
Monitoring and Logging
- Pod Logs: get, list, watch
- Endpoints: create, delete, patch, update
- ServiceMonitors: create, delete, get, list, patch, update
Job Scheduling
- CronJobs: create, delete, update
- Jobs: create, delete, get
Resource Management
- ResourceQuotas: create, delete, get, update
- LimitRanges: create, delete, get, update
Metrics Access (Cluster-scoped)
- Nodes: get, list (via ClusterRole)
- Pods: get, list (via ClusterRole for metrics)
Permissions Removed
The following sensitive cluster-level permissions are removed in restricted mode:
- Namespace Management: Cannot create or manage namespaces
- ServiceAccount Creation: Cannot create service accounts in other namespaces
- ClusterRoleBinding Management: Cannot create or modify cluster-wide role bindings
- Cross-Namespace Access: Cannot access resources in other namespaces
- Pod Execution: Cannot execute commands in pods (pods/exec)
Important: When RBAC restricted mode is enabled, the ability to change the default ilum cluster namespace is disabled. This occurs because the permissions required to create the necessary RBAC Kubernetes resources are no longer available.
Manual Intervention Required: Users must manually create the required RBAC Kubernetes resources to grant themselves the permissions needed to modify the Kubernetes cluster's namespace. Without this manual step, the namespace modification functionality will not work.
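The granted and removed permissions listed above can be spot-checked against a live cluster. The following is a minimal sketch, assuming the service account is `ilum-core` in namespace `ilum`; the function name `verify_permissions` and its input format are hypothetical. It relies on `kubectl auth can-i` exiting with status 0 when the action is allowed, so any error is also reported as "no".

```shell
# Sketch: compare actual permissions with expectations read from stdin.
# Each input line has the form "<yes|no> <verb> <resource>".
# verify_permissions is a hypothetical helper name.
verify_permissions() {
  subject="$1"      # e.g. system:serviceaccount:ilum:ilum-core (assumed)
  namespace="$2"    # namespace in which to check
  failures=0
  while read -r expected verb resource; do
    [ -n "$expected" ] || continue   # skip blank lines
    # </dev/null keeps kubectl from consuming the expectation list.
    if kubectl auth can-i "$verb" "$resource" --as="$subject" -n "$namespace" \
        </dev/null >/dev/null 2>&1; then
      actual=yes
    else
      actual=no
    fi
    if [ "$actual" != "$expected" ]; then
      echo "MISMATCH: ${verb} ${resource} expected=${expected} actual=${actual}"
      failures=$((failures + 1))
    fi
  done
  # Succeed only when every expectation matched.
  [ "$failures" -eq 0 ]
}

# Usage (requires cluster access):
#   verify_permissions system:serviceaccount:ilum:ilum-core ilum <<'EOF'
#   yes create pods
#   yes delete configmaps
#   no create namespaces
#   no create clusterrolebindings
#   EOF
```

The function prints nothing and returns success when every expectation holds, which makes it usable as a post-upgrade check in a script.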
Migration Guide
From Unrestricted to Restricted Mode
CRITICAL: Before migrating to restricted RBAC mode, you must delete existing ClusterRole and ClusterRoleBinding resources to prevent conflicts:
# Required cleanup before migration
kubectl delete clusterrole spark-submit-cluster-role-NAMESPACE --ignore-not-found=true
kubectl delete clusterrolebinding spark-submit-cluster-role-binding-NAMESPACE --ignore-not-found=true
- Assess Current Usage: Review your current Spark jobs to ensure they don't require cross-namespace operations
- Clean Up Existing Resources: Delete previous cluster-wide RBAC resources:
  kubectl delete clusterrole spark-submit-cluster-role-NAMESPACE --ignore-not-found=true
  kubectl delete clusterrolebinding spark-submit-cluster-role-binding-NAMESPACE --ignore-not-found=true
- Test in Development: Enable restricted mode in a development environment first:
  helm upgrade ilum-dev ilum/ilum \
  --set ilum-core.rbac.restricted.enabled=true \
  --reuse-values
- Validate Functionality: Run your typical Spark workloads to ensure they work correctly
- Apply to Production: Once validated, apply to production:
  helm upgrade ilum-prod ilum/ilum \
  --set ilum-core.rbac.restricted.enabled=true \
  --reuse-values
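For the usage assessment in the first step, one possible starting point is to list the namespaces that currently contain Spark driver pods outside the Ilum namespace. This sketch assumes the pods carry the `spark-role=driver` label that Spark on Kubernetes applies to driver pods; the function name is hypothetical, and the result only reflects pods that exist at the time of the query.

```shell
# Sketch: list namespaces, other than the Ilum namespace, that currently hold
# Spark driver pods. Assumes the spark-role=driver label is present.
# spark_namespaces_outside is a hypothetical helper name.
spark_namespaces_outside() {
  ilum_ns="$1"   # Ilum deployment namespace to exclude
  kubectl get pods --all-namespaces -l spark-role=driver \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}' \
    | sort -u | grep -v -x -F "$ilum_ns"
}

# Usage (requires cluster access):
#   spark_namespaces_outside ilum
```

Any namespace printed here hosts jobs that would stop working under restricted mode unless additional RBAC resources are created for it. The function prints nothing when no such namespace is found.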
Rollback Procedure
If you need to revert to unrestricted mode:
helm upgrade ilum ilum/ilum \
--set ilum-core.rbac.restricted.enabled=false \
--reuse-values
Troubleshooting
Common Issues
Permission Denied Errors
Symptom: Spark jobs fail with permission denied errors
Error: pods is forbidden: User "system:serviceaccount:ilum:ilum-core" cannot create resource "pods" in API group "" in the namespace "other-namespace"
Solution: Ensure all Spark jobs are configured to run in the same namespace as Ilum Core
Missing Resources
Symptom: Jobs cannot find required ConfigMaps or Services
Error: configmaps "spark-config" not found
Solution: Verify that all required resources exist in the Ilum namespace
RBAC Resource Conflicts
Symptom: Deployment fails with RBAC conflicts
Error: clusterroles.rbac.authorization.k8s.io "spark-submit-ilum-cluster-role" already exists
Solution: Delete existing cluster-wide RBAC resources before upgrade:
kubectl delete clusterrole spark-submit-ilum-cluster-role --ignore-not-found=true
kubectl delete clusterrolebinding spark-submit-ilum-cluster-role-binding-ilum --ignore-not-found=true
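Because the resource names can differ between installations, it may be safer to list what actually exists before deleting anything. The sketch below filters cluster-scoped RBAC objects by the "spark-submit" substring that appears in every name used in this document; the function name is hypothetical.

```shell
# Sketch: list cluster-scoped RBAC objects whose names contain "spark-submit",
# so the exact names can be confirmed before deletion.
# list_spark_submit_cluster_rbac is a hypothetical helper name.
list_spark_submit_cluster_rbac() {
  kubectl get clusterrole,clusterrolebinding -o name | grep 'spark-submit'
}

# Usage (requires cluster access):
#   list_spark_submit_cluster_rbac
```

Each printed line has the TYPE/NAME form that `kubectl delete` accepts. Review the list first, since it may include objects that belong to other releases or that are still needed.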
Cross-Namespace Spark Jobs Disabled
Symptom: Cannot run Spark jobs in namespaces other than the Ilum deployment namespace, or cannot change the default cluster namespace within the Ilum application interface
Error: Insufficient permissions to create Spark resources in target namespace
Error: pods is forbidden: User "system:serviceaccount:ilum:ilum-core" cannot create resource "pods" in API group "" in the namespace "spark-jobs"
Cause: Restricted RBAC mode limits permissions to the Ilum deployment namespace only, preventing Spark job execution in other namespaces
Solution: Create additional ClusterRole and ClusterRoleBinding to allow Ilum to create Spark-related resources in different namespaces:
# spark-cross-namespace-clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ilum-spark-cross-namespace-role
rules:
  - apiGroups: [""]
    resources: ["pods", "configmaps", "services", "persistentvolumeclaims"]
    verbs: ["create", "delete", "deletecollection", "get", "list", "patch", "update", "watch"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["resourcequotas", "limitranges"]
    verbs: ["create", "delete", "get", "update"]
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["create", "get", "list"]
  - apiGroups: [""]
    resources: ["serviceaccounts"]
    verbs: ["create", "get", "list"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["roles", "rolebindings"]
    verbs: ["create", "get", "list", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ilum-spark-cross-namespace-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ilum-spark-cross-namespace-role
subjects:
  - kind: ServiceAccount
    name: ilum-core # Replace with your actual service account name
    namespace: ilum # Replace with your Ilum deployment namespace
Apply the configuration:
kubectl apply -f spark-cross-namespace-clusterrole.yaml
Update Spark Configuration: Modify your Ilum cluster's Spark configuration to allow cross-namespace job execution through ilum-ui
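The ClusterRoleBinding above grants these permissions in every namespace. Where only a few target namespaces are needed, a narrower option may be to bind the same ClusterRole with a RoleBinding inside each target namespace. The sketch below renders such a RoleBinding for review. The binding name and function name are hypothetical, and whether Ilum works fully with this narrower grant is an assumption that the source does not confirm. With a RoleBinding, the rule for namespaces (a cluster-scoped resource) has no effect, so the target namespace must already exist.

```shell
# Sketch: render a RoleBinding that grants ilum-spark-cross-namespace-role
# inside ONE target namespace only. Nothing is applied to the cluster.
# The binding name and render_namespace_binding are hypothetical.
render_namespace_binding() {
  target_ns="$1"   # namespace where Spark jobs should run, e.g. spark-jobs
  sa_ns="$2"       # Ilum deployment namespace, e.g. ilum
  sa_name="$3"     # Ilum service account name, e.g. ilum-core (assumed)
  kubectl create rolebinding "ilum-spark-${target_ns}-binding" \
    --clusterrole=ilum-spark-cross-namespace-role \
    --serviceaccount="${sa_ns}:${sa_name}" \
    --namespace="$target_ns" \
    --dry-run=client -o yaml
}

# Usage: review the manifest, then apply it.
#   render_namespace_binding spark-jobs ilum ilum-core
#   render_namespace_binding spark-jobs ilum ilum-core | kubectl apply -f -
```

This keeps the blast radius limited to the namespaces that are explicitly bound, at the cost of one binding per namespace.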
Diagnostic Commands
Check RBAC permissions:
kubectl auth can-i create pods --as=system:serviceaccount:ilum:ilum-core -n ilum
Check service account permissions:
kubectl describe rolebinding spark-submit-ilum-role-binding-ilum -n ilum
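If the binding name in your installation is different, the RoleBindings that reference the service account can be looked up by subject instead of by name. This is a sketch under the assumption that the service account is named `ilum-core`; the function name is hypothetical, and the match is on the subject name only, not on the subject's namespace.

```shell
# Sketch: print the RoleBindings in a namespace that list a given service
# account as a subject. find_bindings_for_sa is a hypothetical helper name.
find_bindings_for_sa() {
  ns="$1"        # namespace to search, e.g. ilum
  sa_name="$2"   # service account name, e.g. ilum-core (assumed)
  # Emit "<binding-name><TAB><Kind>/<name> <Kind>/<name> ..." per binding,
  # then keep the bindings whose subjects include the service account.
  kubectl get rolebinding -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .subjects[*]}{.kind}/{.name} {end}{"\n"}{end}' \
    | grep -F "ServiceAccount/${sa_name} " | cut -f1
}

# Usage (requires cluster access):
#   find_bindings_for_sa ilum ilum-core
```

Each printed name can then be passed to `kubectl describe rolebinding NAME -n ilum`.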
Risk Assessment and Mitigation
Security Risks by Mode
Unrestricted Mode Risks
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Privilege Escalation | High | Medium | Network segmentation, monitoring |
| Cross-Namespace Breach | High | Medium | Additional RBAC controls |
| Cluster Compromise | Critical | Low | Defense in depth, monitoring |
Restricted Mode Risks
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Functionality Limitation | Medium | High | Thorough testing, documentation |
| Namespace Modification Disabled | Medium | High | Manual RBAC resource creation |
| Operational Complexity | Low | Medium | Training, automation |
| Namespace Lock-in | Low | Low | Proper architecture planning |
Risk Mitigation Strategies
- Gradual Migration: Implement restricted mode in stages
- Comprehensive Testing: Validate all use cases before production
- Monitoring Enhancement: Implement security-focused monitoring
- Documentation: Maintain clear operational procedures
- Training: Ensure team understands security implications
Best Practices
Security Recommendations
- Always Use in Production: Enable restricted RBAC for all production deployments
- Regular Audits: Periodically review and audit the granted permissions
- Namespace Isolation: Use dedicated namespaces for different environments
- Monitor Access: Implement monitoring for unusual permission usage
- Clean Migration: Always delete existing ClusterRole/ClusterRoleBinding before upgrade
Operational Guidelines
- Test Thoroughly: Always test in non-production environments first
- Document Dependencies: Maintain documentation of namespace-specific resources
- Backup Configurations: Keep backups of your Helm values and configurations
- Monitor Performance: Watch for any performance impacts after enabling
- Resource Cleanup: Properly clean up old RBAC resources during migration
- Namespace Modification: Restricted mode disables changing cluster namespaces through the Ilum interface; prepare manual RBAC resources if this functionality is needed
Conclusion
The choice between unrestricted and restricted RBAC modes represents a critical security decision. While unrestricted mode offers operational simplicity, restricted mode provides significant security benefits that align with modern security best practices and compliance requirements.
Key Takeaways:
- Critical Warning: Always delete existing ClusterRole and ClusterRoleBinding resources before upgrading to restricted mode
- Migration: Follow the gradual migration approach with proper testing and validation
Recommendations:
- Development/Testing: Unrestricted mode is acceptable with proper network isolation
- Production: Restricted mode is strongly recommended
- Compliance-Required: Regulated environments that mandate least-privilege access will likely require restricted mode
The restricted RBAC feature provides a significant security enhancement for Ilum deployments by implementing least-privilege access controls. While it may limit some advanced cluster-wide operations, it's the recommended configuration for production environments where security is a primary concern.