Hardware Multiplane Spectrum-X Quick Start (Tech Preview)

Warning

Hardware Multiplane (hwplb) is tech preview in Network Operator 26.4.0 and is not part of the validated Spectrum-X Reference Architecture. Use only for evaluation purposes.

Note

You can automate the configuration of this use case with NVIDIA Kubernetes Launch Kit. For more details, see Configuration Assistance with Kubernetes Launch Kit.

This walkthrough deploys a Hardware Multiplane Spectrum-X cluster on Kubernetes using ConnectX-8 SuperNICs (nicType: 1023). NIC LAG and Hardware Plane Load Balancing (hwplb) handle per-plane fan-out at the hardware layer, so each rail still uses a single CIDRPool while exposing multiple per-plane PFs. This mode is used on B300 and GB300 platforms: set numberOfPlanes: 2 for Dual-Plane, or numberOfPlanes: 4 for Quad-Plane (B300 only). The configuration uses RA 2.2 with multiplaneMode: hwplb, and the example below uses numberOfPlanes: 2. Replace all TODO_* values with your cluster-specific values before applying the manifests.
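
Because every manifest in this walkthrough carries TODO_* placeholders, a quick scan before each kubectl apply can catch values that were missed. The following is a minimal sketch and not part of the Network Operator tooling: the function name find_todos is hypothetical, and it assumes the manifests are saved locally under the file names used in the steps of this walkthrough.

```shell
# find_todos: print every line that still contains a TODO_* placeholder.
# Hypothetical helper for illustration; it only inspects local files.
find_todos() {
  # -H prefixes the file name, -n the line number.
  # grep exits non-zero when no placeholder is found.
  grep -Hn 'TODO_[A-Z0-9_]*' "$@"
}

# Example: stop while placeholders remain.
# if find_todos cidrpool.yaml; then echo "replace the values above first" >&2; fi
```

The exit status follows grep, so the function can be used directly as a condition in a script.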

Step 1: Install the Helm Chart

Add the NVIDIA NGC Helm repository:

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

Install the Network Operator. The Spectrum-X Operator and the NIC Configuration Operator are deployed by the same chart and enabled later through NicClusterPolicy.

helm install network-operator nvidia/network-operator \
  -n nvidia-network-operator \
  --create-namespace \
  --version 26.4.0-rc.1 \
  --set sriovNetworkOperator.enabled=true \
  --wait

Verify the installation:

kubectl -n nvidia-network-operator get pods
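
When the pod list is long, it may help to print only the pods that are not yet healthy. The sketch below filters the output of kubectl get pods --no-headers. The function name pods_not_ready is hypothetical, and it assumes the default column layout (NAME, READY, STATUS, ...).

```shell
# pods_not_ready: read `kubectl get pods --no-headers` output on stdin and
# print the names of pods whose STATUS is neither Running nor Completed.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage (requires cluster access):
# kubectl -n nvidia-network-operator get pods --no-headers | pods_not_ready
```

An empty result only means that no pod is in a failing or pending phase. A pod in the Running phase can still have containers that are not ready, so the READY column is worth a look as well.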

Step 2: Apply NicClusterPolicy

Enable the NIC Configuration Operator, NV-IPAM, Spectrum-X Operator (with XPlane), and the secondary network components.

apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  nicConfigurationOperator:
    operator:
      image: nic-configuration-operator
      repository: nvcr.io/nvstaging/mellanox
      version: network-operator-v26.4.0-rc.1
    configurationDaemon:
      image: nic-configuration-operator-daemon
      repository: nvcr.io/nvstaging/mellanox
      version: network-operator-v26.4.0-rc.1
    nicFirmwareStorage:
      create: true
      pvcName: nic-fw-storage-pvc
      storageClassName: nic-fw-storage-class
      availableStorageSize: 1Gi
    logLevel: info
  nvIpam:
    image: nvidia-k8s-ipam
    repository: nvcr.io/nvstaging/mellanox
    version: network-operator-v26.4.0-rc.1
    enableWebhook: false
  spectrumXOperator:
    image: spectrum-x-operator
    repository: nvcr.io/nvstaging/mellanox
    version: network-operator-v26.4.0-rc.1
    xPlane:
      image: xplane
      repository: nvcr.io/nvstaging/mellanox
      version: network-operator-v26.4.0-rc.1
  secondaryNetwork:
    cniPlugins:
      image: plugins
      repository: nvcr.io/nvstaging/mellanox
      version: network-operator-v26.4.0-rc.1
    multus:
      image: multus-cni
      repository: nvcr.io/nvstaging/mellanox
      version: network-operator-v26.4.0-rc.1

Save the manifest as nicclusterpolicy.yaml and apply it:

kubectl apply -f nicclusterpolicy.yaml

Step 3: NicInterfaceNameTemplate

Map PCI addresses to rail/plane indices and define interface naming. Replace TODO_PCI_* with the PCI addresses of the Spectrum-X NICs on your nodes.

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicInterfaceNameTemplate
metadata:
  name: spectrum-x-interface-names
  namespace: nvidia-network-operator
spec:
  pfsPerNic: 2
  rdmaDevicePrefix: "rdma_rail%rail_id%_plane%plane_id%"
  netDevicePrefix: "net_rail%rail_id%_plane%plane_id%"
  railPciAddresses:
    - ["TODO_PCI_RAIL0_NIC0", "TODO_PCI_RAIL0_NIC1"]
    - ["TODO_PCI_RAIL1_NIC0", "TODO_PCI_RAIL1_NIC1"]

Save the manifest as nicinterfacenametemplate.yaml and apply it:

kubectl apply -f nicinterfacenametemplate.yaml

Step 4: NicConfigurationTemplate

Configure the ConnectX-8 SuperNICs for Spectrum-X RA 2.2 with hwplb multiplane mode. hwplb requires ConnectX-8 SuperNIC (nicType: 1023); it is not supported on BlueField-3 SuperNIC or ConnectX-7 NIC. For Quad-Plane (B300 only), set numberOfPlanes: 4.

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicConfigurationTemplate
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  nicSelector:
    nicType: "1023"  # ConnectX-8 SuperNIC (B300, GB300)
  template:
    numVfs: 1
    linkType: Ethernet
    spectrumXOptimized:
      enabled: true
      version: "RA2.2"
      overlay: "none"
      multiplaneMode: "hwplb"
      numberOfPlanes: 2

Save the manifest as nicconfigurationtemplate.yaml and apply it:

kubectl apply -f nicconfigurationtemplate.yaml

Step 5: CIDRPool (per rail)

With hwplb, load balancing happens at the NIC layer, so each rail uses a single CIDRPool covering all of its planes. Replace TODO_* with subnets that match your cluster’s east-west topology.

apiVersion: nv-ipam.nvidia.com/v1alpha1
kind: CIDRPool
metadata:
  name: rail-0
  namespace: nvidia-network-operator
spec:
  cidr: TODO_RAIL0_CIDR             # e.g., 10.0.0.0/15
  gatewayIndex: 0
  perNodeNetworkPrefix: 31
  perNodeExclusions:
    - startIndex: 1
      endIndex: 1
  routes:
    - dst: TODO_RAIL0_SUBNET        # same as cidr
    - dst: TODO_EAST_WEST_SUBNET
---
apiVersion: nv-ipam.nvidia.com/v1alpha1
kind: CIDRPool
metadata:
  name: rail-1
  namespace: nvidia-network-operator
spec:
  cidr: TODO_RAIL1_CIDR
  gatewayIndex: 0
  perNodeNetworkPrefix: 31
  perNodeExclusions:
    - startIndex: 1
      endIndex: 1
  routes:
    - dst: TODO_RAIL1_SUBNET
    - dst: TODO_EAST_WEST_SUBNET

Save both pools in cidrpool.yaml and apply them:

kubectl apply -f cidrpool.yaml
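
When choosing the rail subnets, it can help to work out how many per-node blocks a pool yields. With the example value 10.0.0.0/15 and perNodeNetworkPrefix: 31, each block holds two addresses. The sketch below is plain shell arithmetic for IPv4 prefixes. The function name per_node_blocks is hypothetical, and how NV-IPAM assigns blocks to nodes is outside its scope.

```shell
# per_node_blocks POOL_PREFIX NODE_PREFIX
# Number of NODE_PREFIX-sized blocks that fit in a pool of POOL_PREFIX.
per_node_blocks() {
  echo $(( 1 << ($2 - $1) ))
}

per_node_blocks 15 31   # prints 65536
```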

Step 6: SpectrumXRailPoolConfig

With hwplb, railTopology has one entry per rail. Each entry lists all per-plane PF netdev names belonging to that rail and references a single per-rail CIDRPool.

apiVersion: spectrumx.nvidia.com/v1alpha2
kind: SpectrumXRailPoolConfig
metadata:
  name: rails
  namespace: nvidia-network-operator
spec:
  draEnabled: true
  networkNamespace: default
  numVfs: 1
  railTopology:
    - name: rail0
      nicSelector:
        pfNames: ["net_rail0_plane0", "net_rail0_plane1"]
      cidrPoolRef: rail-0
      mtu: 9216
    - name: rail1
      nicSelector:
        pfNames: ["net_rail1_plane0", "net_rail1_plane1"]
      cidrPoolRef: rail-1
      mtu: 9216

Save the manifest as spectrumxrailpoolconfig.yaml and apply it:

kubectl apply -f spectrumxrailpoolconfig.yaml
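
The pfNames entries follow the netDevicePrefix pattern from Step 3 (net_rail%rail_id%_plane%plane_id%). A Quad-Plane setup would presumably list more names per rail, and writing them by hand is error-prone. The sketch below prints the list for one rail. The function name pf_names is hypothetical, and it assumes plane indices run from 0 to numberOfPlanes - 1, as they do in the Dual-Plane example.

```shell
# pf_names RAIL_ID NUMBER_OF_PLANES
# Print a YAML flow sequence of per-plane PF names for one rail.
pf_names() {
  rail=$1; planes=$2; plane=0; out=""
  while [ "$plane" -lt "$planes" ]; do
    # Add a ", " separator before every name except the first.
    out="${out}${out:+, }\"net_rail${rail}_plane${plane}\""
    plane=$((plane + 1))
  done
  printf '%s\n' "[${out}]"
}

pf_names 0 2   # prints ["net_rail0_plane0", "net_rail0_plane1"]
```

Compare the generated names with the interfaces present on the node before using them, since the actual names depend on the NicInterfaceNameTemplate applied in Step 3.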

Step 7: Deploy a Test Pod

Request one VF per rail. The network annotation references the rails created by SpectrumXRailPoolConfig.

apiVersion: v1
kind: Pod
metadata:
  name: spectrum-x-test
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: rail0,rail1
spec:
  containers:
    - name: spectrum-x-test
      image: nvcr.io/nvidia/doca/doca:3.3.0-full-rt-host
      command: ["/bin/bash", "-c", "sleep infinity"]
      securityContext:
        capabilities:
          add: ["IPC_LOCK", "NET_RAW"]
      resources:
        requests:
          nvidia.com/rail_0: "1"
          nvidia.com/rail_1: "1"
        limits:
          nvidia.com/rail_0: "1"
          nvidia.com/rail_1: "1"

Save the manifest as pod.yaml, apply it, and list the RDMA links visible inside the pod:

kubectl apply -f pod.yaml
kubectl -n default exec -it spectrum-x-test -- rdma link
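
To spot links that did not come up, the rdma link output can be filtered for any state other than ACTIVE. The sketch below assumes the usual iproute2 line layout, "link <device>/<port> state <STATE> ...". The function name inactive_links is hypothetical.

```shell
# inactive_links: read `rdma link` output on stdin and print the
# device/port of every link whose state is not ACTIVE.
inactive_links() {
  awk '{
    # Locate the "state" keyword instead of relying on a fixed column.
    for (i = 1; i < NF; i++)
      if ($i == "state" && $(i + 1) != "ACTIVE") print $2
  }'
}

# Usage (requires cluster access):
# kubectl -n default exec spectrum-x-test -- rdma link | inactive_links
```

An empty result means every listed link reports ACTIVE. It does not confirm that the expected number of links is present.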