Feature Overview

Immutable Infrastructure support provides cluster and cloud infrastructure management for enterprise Kubernetes deployments. It builds on automation and infrastructure-as-code principles so that infrastructure is provisioned reliably, scales with demand, and stays maintainable.

Cluster Management

The platform manages the full Kubernetes cluster lifecycle on an immutable OS, so deployments are consistent and reproducible across environments.

Cluster Creation

  • Immutable OS Support: Create clusters using immutable OS patterns for enhanced security and consistency
  • Automated Compute Provisioning: Automatic provisioning of compute instances with pre-configured specifications
  • Bootstrap Automation: Automated cluster bootstrapping with minimal manual intervention

Cluster Deletion

  • Complete Resource Cleanup: Deleting a cluster removes all resources associated with it
  • Provider Resource Release: Proper deallocation of provider resources to prevent orphaned instances

Cluster Scaling

  • Horizontal Scaling: Add or remove worker nodes to meet workload demands
  • Automated Compute Management: Automatic creation and release of compute instances during scaling operations
  • Zero-Downtime Scaling: Scaling operations complete without interrupting running services

Cluster Upgrades

  • Kubernetes Version Management: Seamless upgrades to newer Kubernetes versions

Supported Infrastructure Providers

The platform follows a pluggable provider model aligned with Cluster API infrastructure providers, so the same workflows can target multiple infrastructure platforms. Today, the DCS and VMware vSphere infrastructure providers are supported, with additional providers in progress.

  • Provider-Agnostic Design: Core workflows are consistent across providers
  • Current Support: DCS infrastructure provider, VMware vSphere infrastructure provider
  • Roadmap: Additional providers are being added

Compute Resource Management

The platform manages the virtual machine lifecycle and exposes the configuration options needed to tune resource utilization and performance.

Compute Lifecycle Operations

  • Create Compute Instances: Provision instances with customizable specifications and configurations
  • Delete Compute Instances: Secure deletion with proper resource cleanup

Compute Configuration Options

  • Instance/Flavor Selection: Choose from predefined instance types or flavors optimized for different workloads
  • Size Customization: Flexible sizing options from small development instances to large production workloads
  • Resource Allocation: Precise control over CPU, memory, and storage allocation
  • Network Configuration: Advanced networking options including custom subnets and security groups
  • Storage Options: Multiple storage types and classes (for example: SSD, HDD, NVMe) for different performance requirements

Use Cases

Use Case 1: Highly Available Control Plane

Scenario: Deploy a production cluster with a highly available control plane to ensure cluster stability.

Implementation:

  • Deploy a 3-node control plane with automatic failover
  • Control plane nodes are distributed across different availability zones (when supported by the infrastructure)
  • A load balancer distributes API server traffic across the control plane nodes
  • Automatic recovery of failed control plane components
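
The choice of three control plane nodes can be checked with majority arithmetic. The sketch below assumes the control plane state is kept in a majority-quorum store (as etcd is in a typical Kubernetes control plane); the function names are illustrative and not part of the platform.

```python
def quorum(members: int) -> int:
    """Smallest majority of a voting group of the given size."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Number of members that can fail while a majority still remains."""
    return members - quorum(members)


if __name__ == "__main__":
    # Compare common control plane sizes.
    for size in (1, 2, 3, 5):
        print(f"nodes={size} quorum={quorum(size)} "
              f"tolerated_failures={tolerated_failures(size)}")
```

Under this assumption a 2-node control plane tolerates no failures, while a 3-node control plane tolerates one, which is why odd sizes are the usual choice.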

Benefits:

  • No single point of failure in the control plane
  • Cluster remains operational even if one control plane node fails
  • Automatic recovery reduces manual intervention

Use Case 2: Horizontal Scaling for Workload Demands

Scenario: Respond to increased application load by adding worker nodes, then scale down when demand decreases.

Implementation:

  • Adjust the replicas field in the MachineDeployment resource
  • Cluster API automatically provisions new nodes based on the Machine Template
  • New nodes automatically join the cluster and become ready for workloads
  • When scaling down, nodes are drained and deleted gracefully
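
As a minimal sketch of the first step, the snippet below builds a JSON merge patch that changes spec.replicas on a MachineDeployment. The helper name, the resource name in the comment, and the minimum-size guard are assumptions made for illustration; how the patch is applied (kubectl, a client library, or a GitOps workflow) depends on your environment.

```python
import json


def replicas_patch(current: int, delta: int, minimum: int = 0) -> str:
    """Return a JSON merge patch that moves spec.replicas by `delta`.

    `minimum` is a hypothetical safety floor, not a Cluster API field.
    """
    target = current + delta
    if target < minimum:
        raise ValueError(f"target {target} is below the minimum of {minimum}")
    return json.dumps({"spec": {"replicas": target}})


if __name__ == "__main__":
    # Scale a pool from 3 to 5 workers.
    patch = replicas_patch(current=3, delta=2)
    print(patch)
    # One possible way to apply it ("workers-pool" is a hypothetical name):
    #   kubectl patch machinedeployment workers-pool --type merge -p '<patch>'
```

Scaling down uses a negative delta; the guard prevents a typo from shrinking a pool below a size you consider safe.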

Benefits:

  • Respond to workload changes in minutes, not hours
  • Automated scaling reduces operational overhead

Use Case 3: Rolling Upgrades with Zero Downtime

Scenario: Upgrade the Kubernetes version or VM template without disrupting running applications.

Implementation:

  • Update the Kubernetes version or the referenced Machine Template in the control plane resource or the MachineDeployment
  • Cluster API performs a rolling upgrade: creates new nodes, waits for them to be ready, then deletes old nodes
  • Configurable maxSurge and maxUnavailable parameters control upgrade behavior
  • Pods are automatically drained from old nodes and rescheduled on new nodes
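
To see how maxSurge and maxUnavailable bound a rollout, the sketch below computes the largest number of machines that may exist and the smallest number that must stay available during an upgrade. It assumes the rounding convention used by Kubernetes rolling updates (percentage maxSurge rounds up, percentage maxUnavailable rounds down); verify this against the provider version you run. The function names are illustrative.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a percentage string such as "25%" into a count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        scaled = percent * replicas
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int, max_surge, max_unavailable):
    """Return (max machines during rollout, min available machines)."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # With no room to add or remove a machine, the rollout cannot progress.
        raise ValueError("maxSurge and maxUnavailable cannot both resolve to 0")
    return replicas + surge, replicas - unavailable


if __name__ == "__main__":
    print(rollout_bounds(3, 1, 0))           # one extra node, none unavailable
    print(rollout_bounds(10, "25%", "25%"))  # percentage form
```

Setting maxUnavailable to 0 with a maxSurge of at least 1 keeps full capacity throughout the upgrade at the cost of temporarily running extra machines.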

Benefits:

  • Zero-downtime upgrades for mission-critical applications
  • Gradual rollout allows for early problem detection
  • Easy rollback if issues are discovered
  • No manual node re-provisioning required

Use Case 4: Multi-Node Pool Management

Scenario: Run different types of workloads on dedicated node pools with different configurations.

Implementation:

  • Create multiple MachineDeployments, each with different Machine Templates
  • Configure different resource allocations (CPU, memory, storage) per pool
  • Use node labels and taints to control workload placement
  • Scale each pool independently based on workload requirements

Example Pools:

  • General Purpose Pool: Balanced CPU/memory for typical workloads
  • Compute-Optimized Pool: High CPU for batch processing or build workloads
  • Memory-Optimized Pool: High memory for databases or caching
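
The interaction between labels and taints can be modeled in a few lines. This is a deliberately simplified sketch of placement rules, not the Kubernetes scheduler: it handles only exact-match node selectors and exact-match tolerations, and the label key "pool" and taint key "dedicated" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str = "NoSchedule"


@dataclass
class NodePool:
    name: str
    labels: dict = field(default_factory=dict)
    taints: list = field(default_factory=list)


@dataclass
class Workload:
    name: str
    node_selector: dict = field(default_factory=dict)
    tolerations: list = field(default_factory=list)


def can_schedule(workload: Workload, pool: NodePool) -> bool:
    """True if the pool's labels satisfy the selector and every taint is tolerated."""
    selector_ok = all(pool.labels.get(k) == v
                      for k, v in workload.node_selector.items())
    taints_ok = all(taint in workload.tolerations for taint in pool.taints)
    return selector_ok and taints_ok


if __name__ == "__main__":
    memory_taint = Taint("dedicated", "memory")
    general = NodePool("general-purpose", labels={"pool": "general"})
    memory = NodePool("memory-optimized", labels={"pool": "memory"},
                      taints=[memory_taint])
    cache = Workload("cache", node_selector={"pool": "memory"},
                     tolerations=[memory_taint])
    web = Workload("web")
    for workload in (cache, web):
        for pool in (general, memory):
            print(workload.name, pool.name, can_schedule(workload, pool))
```

The taint keeps untolerating workloads off the dedicated pool, while the node selector steers the intended workload onto it; using both gives isolation in each direction.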

Benefits:

  • Optimize resource allocation for different workload types
  • Isolate workloads for security and performance
  • Independent scaling per workload type
  • Cost optimization through right-sized resources