[TECH PREVIEW] NVIDIA Spectrum-X NIC Configuration

The NVIDIA NIC Configuration Operator provides NVIDIA Spectrum-X-specific NIC configuration for different versions of the Reference Architecture.

Note

Currently, only ConnectX-8 (device ID 1023) and BlueField-3 SuperNIC (device ID a2dc) devices are supported for this configuration.
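
As a quick pre-check before applying this configuration, the two device IDs above can be matched against the PCI IDs present on a node. The following is a minimal sketch, not part of the operator: it parses text in the format printed by `lspci -n` (slot, class, vendor:device). The helper name and the sample lines are hypothetical, and it assumes the Mellanox/NVIDIA networking PCI vendor ID 15b3.

```python
import re

# Device IDs taken from the note above.
SUPPORTED_DEVICES = {
    "1023": "ConnectX-8",
    "a2dc": "BlueField-3 SuperNIC",
}
# Assumption: Mellanox (NVIDIA networking) PCI vendor ID.
VENDOR_ID = "15b3"

# Matches lines such as "03:00.0 0200: 15b3:1023 (rev 01)" as printed by `lspci -n`.
LSPCI_LINE = re.compile(
    r"^(\S+)\s+[0-9a-f]{4}:\s+([0-9a-f]{4}):([0-9a-f]{4})", re.IGNORECASE
)


def find_supported_nics(lspci_output):
    """Return (pci_address, device_name) pairs for devices listed as supported."""
    found = []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if not match:
            continue
        address, vendor, device = match.groups()
        if vendor.lower() == VENDOR_ID and device.lower() in SUPPORTED_DEVICES:
            found.append((address, SUPPORTED_DEVICES[device.lower()]))
    return found


# Hypothetical sample; on a real node, pass the output of `lspci -n` instead.
sample = """\
00:1f.3 0403: 8086:a348 (rev 10)
03:00.0 0200: 15b3:1023
82:00.0 0200: 15b3:a2dc (rev 01)
c1:00.0 0200: 15b3:1021
"""
for address, name in find_supported_nics(sample):
    print(address, name)
```

Devices from the same vendor with other device IDs (the last sample line) are skipped, which mirrors the restriction stated in the note.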

Warning

This is a Tech Preview feature.

Install and configure the NIC Configuration Operator

To install the operator, and for more information about the CRDs, see the NIC FW Configuration and Configuration Details documentation articles.

Provision the DOCA SPC-X CC algorithm package

To enable the DOCA SPC-X CC algorithm on NIC devices, the DOCA SPC-X CC .deb package for Ubuntu 22.04 is required. This configuration step will be removed once the DOCA SPC-X CC algorithm is publicly available. To obtain the package, contact your NVIDIA CPM. Make the package available at a URL that is reachable from the cluster, then provide that URL in the docaSpcXCCUrlSource field of the NicFirmwareSource CR:

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareSource
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  # should point to the URL of the DOCA SPC-X CC .deb package for Ubuntu 22.04
  docaSpcXCCUrlSource: "https://example.com/doca-spcx-cc_3.1.0105-1_amd64.deb"
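
Debian package file names follow the `name_version_architecture.deb` convention, so the package version can be read from the URL before it is placed in the CR. The sketch below is a hypothetical helper (not part of the operator) that extracts the three parts; the version it finds for the example URL is the same string as the docaCCVersion value listed under Configuration details. Whether the operator itself enforces that match is not stated in this document.

```python
import posixpath
from urllib.parse import urlparse


def deb_package_info(url):
    """Split a Debian package URL into (name, version, architecture)."""
    filename = posixpath.basename(urlparse(url).path)
    if not filename.endswith(".deb"):
        raise ValueError(f"not a .deb package: {filename!r}")
    parts = filename[: -len(".deb")].split("_")
    if len(parts) != 3:
        raise ValueError(f"expected name_version_arch.deb, got {filename!r}")
    name, version, arch = parts
    return name, version, arch


url = "https://example.com/doca-spcx-cc_3.1.0105-1_amd64.deb"
name, version, arch = deb_package_info(url)
print(name, version, arch)  # doca-spcx-cc 3.1.0105-1 amd64
```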

If the firmware on the devices also needs to be updated, extend the NicFirmwareSource CR with the fields for ConnectX and BlueField firmware. Use the firmware that matches your devices.

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareSource
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  # should point to the URL of the DOCA SPC-X CC .deb package for Ubuntu 22.04
  docaSpcXCCUrlSource: "https://example.com/doca-spcx-cc_3.1.0105-1_amd64.deb"
  # a list of firmware binary zip archives from the Mellanox website; each entry can point to any URL accessible from the cluster
  binUrlSources:
    - https://www.mellanox.com/downloads/firmware/fw-ConnectX8-rel-40_46_3048-900-9X85E-00NX-MC0_Ax-UEFI-14.39.14-FlexBoot-3.8.100.signed.bin.zip
  # a URL to the BlueField Bundle (BFB) file, can point to any URL accessible from the cluster
  bfbUrlSource: "https://example.com/bf-fwbundle-3.1.0-77_25.07-prod.bfb"

Configure and apply the NicFirmwareTemplate CR:

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareTemplate
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  nicSelector:
    nicType: "a2dc" # BlueField-3 SuperNIC, can also be "1023" for ConnectX-8
  template:
    nicFirmwareSourceRef: spectrum-x-configuration
    updatePolicy: Update

Enable SPC-X optimizations for devices

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicConfigurationTemplate
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  nicSelector:
    nicType: "a2dc" # BlueField-3 SuperNIC, can also be "1023" for ConnectX-8
  template:
    numVfs: 1
    linkType: Ethernet
    spectrumXOptimized:
      enabled: true
      version: "RA2.0" # For Reference Architecture v1.3, use "RA1.3" value for this field.
      overlay: "none" # For L3 overlay, use "l3" value for this field.
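
The template above has three values that vary per deployment: the device ID, the Reference Architecture version, and the overlay. The following is a minimal sketch of generating the same manifest for another combination. The function name is hypothetical, and the allowed-value sets contain only the values mentioned in this document; the operator may accept others. The output is JSON, which `kubectl apply -f` accepts in place of YAML.

```python
import json

# Only the values mentioned in this document; treat these sets as assumptions.
DEVICE_IDS = {"a2dc", "1023"}  # BlueField-3 SuperNIC, ConnectX-8
RA_VERSIONS = {"RA2.0", "RA1.3"}
OVERLAYS = {"none", "l3"}


def spectrum_x_template(nic_type, version="RA2.0", overlay="none",
                        name="spectrum-x-configuration"):
    """Build a NicConfigurationTemplate manifest as a plain dict."""
    for value, allowed, field in (
        (nic_type, DEVICE_IDS, "nicType"),
        (version, RA_VERSIONS, "version"),
        (overlay, OVERLAYS, "overlay"),
    ):
        if value not in allowed:
            raise ValueError(f"unsupported {field}: {value!r}")
    return {
        "apiVersion": "configuration.net.nvidia.com/v1alpha1",
        "kind": "NicConfigurationTemplate",
        "metadata": {"name": name, "namespace": "nvidia-network-operator"},
        "spec": {
            "nodeSelector": {
                "feature.node.kubernetes.io/network-sriov.capable": "true"
            },
            "nicSelector": {"nicType": nic_type},
            "template": {
                "numVfs": 1,
                "linkType": "Ethernet",
                "spectrumXOptimized": {
                    "enabled": True,
                    "version": version,
                    "overlay": overlay,
                },
            },
        },
    }


# ConnectX-8 with Reference Architecture v1.3 and an L3 overlay.
manifest = spectrum_x_template("1023", version="RA1.3", overlay="l3")
print(json.dumps(manifest, indent=2))
```

Keeping nicType as a string matters for "1023": an unquoted value would be read as an integer by a YAML parser.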

Configuration details

The following configuration parameters are applied when spectrumXOptimized.enabled == true and spectrumXOptimized.version == "RA2.0":

  - name: NIC mode
    value: NIC
    dmsPath: /nvidia/mode/config/mode
    valueType: string
    deviceId: "a2dc"
  - name: RoCE Adaptive Routing
    value: true
    dmsPath: /nvidia/roce/config/adaptive-routing
    valueType: bool
  - name: Programmable Congestion Control
    value: true
    dmsPath: /nvidia/cc/config/user-programmable
    valueType: bool
  - name: RoCE TX Scheduling Locality Mode
    value: TX_SCHED_LOCALITY_ACCUMULATIVE
    dmsPath: /nvidia/roce/config/tx-sched-locality-mode
    valueType: string
  - name: RoCE Multipath DSCP
    value: MULTIPATH_DSCP_DEFAULT
    dmsPath: /nvidia/roce/config/multipath-dscp
    valueType: string
  - name: CNP DSCP
    value: 0
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp
    valueType: int
  - name: CNP DSCP mode
    value: RTT_RESP_DSCP_DEFAULT
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp-mode
    valueType: string
  - name: RoCE CC Steering Ext
    value: ENABLED
    dmsPath: /nvidia/roce/config/cc-steering-ext
    valueType: string
runtimeConfig:
  roce:
    - name: Trust
      value: dscp
      dmsPath: /interfaces/interface/nvidia/qos/config/trust-mode
      valueType: string
      alternativeValue: QOS_TRUST_MODE_DSCP
    - name: PFC
      value: "00010000"
      dmsPath: /interfaces/interface/nvidia/qos/config/pfc
      valueType: string
    - name: Type of Service
      value: 96
      dmsPath: /interfaces/interface/nvidia/roce/config/tos
      valueType: int
  adaptiveRouting:
    - name: Adaptive Retransmission
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-retransmission
      valueType: bool
    - name: Tx Window
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/tx-window
      valueType: bool
    - name: Slow Restart
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart
      valueType: bool
    - name: Slow Restart Idle
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart-idle
      valueType: bool
    - name: Adaptive Routing Force
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-routing-force
      valueType: bool
  congestionControl:
    - name: Congestion Control on RP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/rp_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control on NP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/np_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/enabled
      valueType: bool
    - name: Congestion Control with Counters
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/counter_enable
      valueType: bool
    - name: DCQCN
      value: false
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=15]/config/enabled
      valueType: bool
    - name: Bandwidth
      value: 400
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
    - name: Responsiveness Alpha Factor
      value: 6553
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=1]/config/value
      valueType: int
    - name: Maximum Decrease Factor
      value: 63570
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=2]/config/value
      valueType: int
    - name: Maximum Increase Factor
      value: 69468
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=3]/config/value
      valueType: int
    - name: Additive Increase Step Size
      value: 36
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=4]/config/value
      valueType: int
    - name: High Additive Increase Step Size
      value: 1200
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=5]/config/value
      valueType: int
    - name: High Additive Increase Interval Period
      value: 7000000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=6]/config/value
      valueType: int
    - name: Base Round Trip Time
      value: 15000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=7]/config/value
      valueType: int
    - name: Maximum Queuing Delay
      value: 250000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=8]/config/value
      valueType: int
    - name: Rate on First Congestion
      value: 524288
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=9]/config/value
      valueType: int
    - name: Delay Only
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=10]/config/value
      valueType: int
    - name: CNP Validity
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=11]/config/value
      valueType: int
    - name: Transmit Rate Decrement Step
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=12]/config/value
      valueType: int
    - name: Fixed Transmission Rate
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=13]/config/value
      valueType: int
    - name: Fast Scheduling Factor
      value: 2097152
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=14]/config/value
      valueType: int
    - name: Topology Awareness
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=15]/config/value
      valueType: int
    - name: Advanced Features
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=16]/config/value
      valueType: int
    - name: Troubleshooting Capabilities
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=17]/config/value
      valueType: int
  interPacketGap:
    pureL3:
      name: Inter Packet Gap for no overlay
      value: 25
      dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
      valueType: int
    l3EVPN:
      name: Inter Packet Gap for L3 EVPN overlay
      value: 33
      dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
      valueType: int
docaCCVersion: 3.1.0105-1
useSoftwareCCAlgorithm: true
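
Two of the RoCE runtime values above are easier to read once decoded. The IP Type of Service byte carries the DSCP in its upper six bits and the ECN field in its lower two, so a ToS of 96 corresponds to DSCP 24. The sketch below also decodes the PFC string under two labeled assumptions that this document does not confirm: that the leftmost character of "00010000" stands for priority 0, and that DSCP maps to priority as `dscp // 8`. The helper names are hypothetical.

```python
def split_tos(tos):
    """Split an IP Type of Service byte into (DSCP, ECN)."""
    if not 0 <= tos <= 255:
        raise ValueError("ToS must fit in one byte")
    return tos >> 2, tos & 0b11


def pfc_priorities(mask):
    """List the priorities enabled in an 8-character PFC flag string.

    Assumption: the leftmost character corresponds to priority 0.
    """
    if len(mask) != 8 or set(mask) - {"0", "1"}:
        raise ValueError("PFC mask must be eight characters of '0' or '1'")
    return [priority for priority, flag in enumerate(mask) if flag == "1"]


dscp, ecn = split_tos(96)           # "Type of Service" value from the list above
lossless = pfc_priorities("00010000")  # "PFC" value from the list above
print(dscp, ecn)   # 24 0
print(lossless)    # [3]

# Assumption: DSCP-to-priority mapping of dscp // 8.
print(dscp // 8 in lossless)  # True
```

Under those assumptions, RoCE traffic marked with ToS 96 lands on the one priority that has PFC enabled.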