Add virt capacity benchmark test (#180)
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
ygalblum authored Feb 25, 2025
1 parent 1fb71b2 commit 6e31617
Showing 13 changed files with 744 additions and 4 deletions.
73 changes: 70 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ This plugin is a very opinionated OpenShift wrapper designed to simplify the exe
Executed with `kube-burner-ocp`, it looks like:

```console
$ kube-burner-ocp --help
kube-burner plugin designed to be used with OpenShift clusters as a quick way to run well-known workloads

Usage:
@@ -29,6 +29,8 @@ Available Commands:
pvc-density Runs pvc-density workload
udn-density-l3-pods Runs udn-density-l3-pods workload
version Print the version number of kube-burner
virt-capacity-benchmark Runs capacity-benchmark workload
virt-density Runs virt-density workload
web-burner-cluster-density Runs web-burner-cluster-density workload
web-burner-init Runs web-burner-init workload
web-burner-node-density Runs web-burner-node-density workload
@@ -86,7 +88,7 @@ kube-burner-ocp cluster-density-v2 --iterations=1 --churn-duration=2m0s --churn-
### metrics-endpoints.yaml

```yaml
- endpoint: prometheus-k8s-openshift-monitoring.apps.rook.devshift.org
metrics:
- metrics.yml
alerts:
@@ -97,7 +99,7 @@ kube-burner-ocp cluster-density-v2 --iterations=1 --churn-duration=2m0s --churn-
defaultIndex: {{.ES_INDEX}}
type: opensearch
- endpoint: https://prometheus-k8s-openshift-monitoring.apps.rook.devshift.org
token: {{ .TOKEN }}
metrics:
- metrics.yml
indexer:
@@ -387,6 +389,71 @@ Input parameters specific to the workload:
| dpdk-cores | Number of cores assigned for each DPDK pod (should fill all the isolated cores of one NUMA node) | 2 |
| performance-profile | Name of the performance profile implemented on the cluster | default |


## Virt Workloads

This workload family is focused on Virtualization, creating different objects across the cluster.

The different variants are:
- [virt-density](#virt-density)
- [virt-capacity-benchmark](#virt-capacity-benchmark)

### Virt Density

### Virt Capacity Benchmark

Tests how many Virtual Machines and Volumes the cluster and a specific storage class can support.

#### Environment Requirements

To verify that the `VirtualMachines` completed their boot and that volume resizes propagated successfully, the test uses `virtctl ssh`.
Therefore, `virtctl` must be installed and available in the `PATH`.

See the [Temporary SSH Keys](#temporary-ssh-keys) section for details on the SSH keys used for the test.
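
The check script shipped with this workload invokes `virtctl ssh` non-interactively; the invocation has roughly this shape (the VM name, namespace, and key path below are placeholders):

```console
$ virtctl ssh --local-ssh-opts="-o StrictHostKeyChecking=no" \
      --local-ssh-opts="-o UserKnownHostsFile=/dev/null" \
      -n virt-capacity-benchmark -i /path/to/private-key \
      --username fedora -c "ls" <vm-name>
```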

#### Test Sequence

The test runs a workload in a loop without deleting previously created resources. By default, it continues until a failure occurs.
Each loop consists of the following steps:
- Create VMs
- Resize the root and data volumes
- Restart the VMs
- Snapshot the VMs
- Migrate the VMs
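
The steps above can be sketched as a simple control loop. This is a minimal illustration of the documented behavior, not the actual implementation; the step functions are hypothetical stand-ins:

```python
# Minimal sketch of the virt-capacity-benchmark loop: resources accumulate
# across iterations, and the test stops at the first failure (or after
# max_iterations when it is non-zero). Step functions are hypothetical.
def run_capacity_benchmark(steps, max_iterations=0):
    iteration = 0
    while max_iterations == 0 or iteration < max_iterations:
        iteration += 1
        for step in steps:  # create, resize, restart, snapshot, migrate
            if not step(iteration):
                return ("failed", iteration)
    return ("passed", iteration)
```

Here `steps` would be a list such as `[create_vms, resize_volumes, restart_vms, snapshot_vms, migrate_vms]`, each returning `True` on success.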

#### Tested StorageClass

By default, the test searches for the `StorageClass` to use in the following order:

1. Use the default `StorageClass` for Virtualization, annotated with `storageclass.kubevirt.io/is-default-virt-class`
2. If it does not exist, use the general default `StorageClass`, annotated with `storageclass.kubernetes.io/is-default-class`
3. If neither exists, fail the test before starting

To use a different `StorageClass`, provide its name with `--storage-class`.
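
The selection order can be expressed as a short lookup. This is a sketch under assumed data shapes (`storage_classes` is a plain list of dicts, not a real Kubernetes client call):

```python
VIRT_DEFAULT = "storageclass.kubevirt.io/is-default-virt-class"
K8S_DEFAULT = "storageclass.kubernetes.io/is-default-class"

def select_storage_class(storage_classes, override=None):
    """Pick the StorageClass name per the documented order, or fail early."""
    if override:  # --storage-class takes precedence over any default
        return override
    for annotation in (VIRT_DEFAULT, K8S_DEFAULT):
        for sc in storage_classes:
            if sc.get("annotations", {}).get(annotation) == "true":
                return sc["name"]
    raise RuntimeError("no default StorageClass found; failing before the test starts")
```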

Please note that regardless of which `StorageClass` is used, it must:
- Support Volume Expansion: `allowVolumeExpansion: true`.
- Have a corresponding `VolumeSnapshotClass` using the same provisioner.
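
A `StorageClass`/`VolumeSnapshotClass` pair meeting these requirements looks roughly like this (the names and provisioner are illustrative placeholders):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-sc            # placeholder name
provisioner: example.com/csi  # placeholder provisioner
allowVolumeExpansion: true    # required for the resize step
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: example-snap-class    # placeholder name
driver: example.com/csi       # must match the StorageClass provisioner
deletionPolicy: Delete
```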

#### Test Namespace

All `VirtualMachines` are created in the same namespace.

By default, the namespace is `virt-capacity-benchmark`. Set it by passing `--namespace` (or `-n`).

#### Test Size Parameters

Users may control the workload sizes by passing the following arguments:
- `--max-iterations` - Maximum number of iterations, or 0 (default) for infinite. In any case, the test stops upon failure
- `--vms` - Number of VMs for each iteration (default 5)
- `--data-volume-count` - Number of data volumes for each VM (default 9)
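
Putting the flags together, a bounded run might look like this (the flag values and namespace are illustrative):

```console
$ kube-burner-ocp virt-capacity-benchmark --max-iterations=2 --vms=10 \
      --data-volume-count=4 --storage-class=my-storage-class -n my-namespace
```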

#### Temporary SSH Keys

The test generates the SSH key pair automatically.
By default, it stores the pair in a temporary directory.
Users may choose to store the keys in a specific directory by setting `--ssh-key-path`.

## Custom Workload: Bring your own workload

To kickstart kube-burner-ocp with a custom workload, `init` becomes your go-to command. This command is equipped with flags that enable you to seamlessly integrate and run your personalized workloads. Here's a breakdown of the flags accepted by the `init` command:
107 changes: 107 additions & 0 deletions cmd/config/virt-capacity-benchmark/check.sh
@@ -0,0 +1,107 @@
#!/usr/bin/env bash
# Usage: check.sh COMMAND LABEL_KEY LABEL_VALUE NAMESPACE IDENTITY_FILE REMOTE_USER EXPECTED_ROOT_SIZE EXPECTED_DATA_SIZE
COMMAND=$1              # Check to run for each VM (check_vm_running or check_resize)
LABEL_KEY=$2            # Label key used to select the VMs
LABEL_VALUE=$3          # Label value used to select the VMs
NAMESPACE=$4            # Namespace containing the VMs
IDENTITY_FILE=$5        # SSH private key passed to virtctl ssh
REMOTE_USER=$6          # Remote user for the SSH connection
EXPECTED_ROOT_SIZE=$7   # Expected root disk size as reported by lsblk
EXPECTED_DATA_SIZE=$8   # Expected size of each data volume as reported by lsblk

# Wait up to ~60 minutes
MAX_RETRIES=130
# In the first retries use a shorter sleep
MAX_SHORT_WAITS=12
SHORT_WAIT=5
LONG_WAIT=30

if virtctl ssh --help | grep -qc "\--local-ssh " ; then
    LOCAL_SSH="--local-ssh"
else
    LOCAL_SSH=""
fi

get_vms() {
    local namespace=$1
    local label_key=$2
    local label_value=$3

    local vms
    vms=$(kubectl get vm -n "${namespace}" -l "${label_key}"="${label_value}" -o json | jq .items | jq -r '.[] | .metadata.name')
    local ret=$?
    if [ $ret -ne 0 ]; then
        echo "Failed to get VM list"
        exit 1
    fi
    echo "${vms}"
}

remote_command() {
    local namespace=$1
    local identity_file=$2
    local remote_user=$3
    local vm_name=$4
    local command=$5

    local output
    output=$(virtctl ssh ${LOCAL_SSH} --local-ssh-opts="-o StrictHostKeyChecking=no" --local-ssh-opts="-o UserKnownHostsFile=/dev/null" -n "${namespace}" -i "${identity_file}" -c "${command}" --username "${remote_user}" "${vm_name}" 2>/dev/null)
    local ret=$?
    if [ $ret -ne 0 ]; then
        return 1
    fi
    echo "${output}"
}

check_vm_running() {
    local vm=$1
    remote_command "${NAMESPACE}" "${IDENTITY_FILE}" "${REMOTE_USER}" "${vm}" "ls"
    return $?
}

check_resize() {
    local vm=$1

    local blk_devices
    blk_devices=$(remote_command "${NAMESPACE}" "${IDENTITY_FILE}" "${REMOTE_USER}" "${vm}" "lsblk --json -v --output=NAME,SIZE")
    local ret=$?
    if [ $ret -ne 0 ]; then
        return $ret
    fi

    local size
    size=$(echo "${blk_devices}" | jq .blockdevices | jq -r --arg name "vda" '.[] | select(.name == $name) | .size')
    if [[ $size != "${EXPECTED_ROOT_SIZE}" ]]; then
        return 1
    fi

    local datavolume_sizes
    datavolume_sizes=$(echo "${blk_devices}" | jq .blockdevices | jq -r --arg name "vda" '.[] | select(.name != $name) | .size')
    for datavolume_size in ${datavolume_sizes}; do
        if [[ $datavolume_size != "${EXPECTED_DATA_SIZE}" ]]; then
            return 1
        fi
    done

    return 0
}

VMS=$(get_vms "${NAMESPACE}" "${LABEL_KEY}" "${LABEL_VALUE}")

for vm in ${VMS}; do
    for attempt in $(seq 1 $MAX_RETRIES); do
        if ${COMMAND} "${vm}"; then
            break
        fi
        if [ "${attempt}" -lt $MAX_RETRIES ]; then
            if [ "${attempt}" -lt $MAX_SHORT_WAITS ]; then
                sleep "${SHORT_WAIT}"
            else
                sleep "${LONG_WAIT}"
            fi
        else
            echo "Failed waiting on ${COMMAND} for ${vm}" >&2
            exit 1
        fi
    done
    echo "${COMMAND} finished successfully for ${vm}"
done
6 changes: 6 additions & 0 deletions cmd/config/virt-capacity-benchmark/templates/resize_pvc.yml
@@ -0,0 +1,6 @@
apiVersion: v1
kind: PersistentVolumeClaim
spec:
  resources:
    requests:
      storage: {{ .storageSize }}
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Secret
metadata:
  name: "{{ .name }}-{{ .counter }}"
type: Opaque
data:
  key: {{ .publicKeyPath | ReadFile | b64enc }}
14 changes: 14 additions & 0 deletions cmd/config/virt-capacity-benchmark/templates/vm-snapshot.yml
@@ -0,0 +1,14 @@
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  name: "{{ .name }}-{{ .counter }}-{{ .Replica }}"
  labels:
  {{range $key, $value := .snapshotLabels }}
    {{ $key }}: {{ $value }}
  {{end}}
spec:
  deletionPolicy: delete
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: "{{ .name }}-{{ .counter }}-{{ .Replica }}"
106 changes: 106 additions & 0 deletions cmd/config/virt-capacity-benchmark/templates/vm.yml
@@ -0,0 +1,106 @@
{{- $storageClassName := .storageClassName -}}
{{- $dataVolumeLabels := .dataVolumeLabels -}}
{{- $dataVolumeSize := (default "1Gi" .dataVolumeSize) -}}
{{- $name := .name -}}
{{- $counter := .counter -}}
{{- $replica := .Replica }}
{{- $accessMode := .accessMode -}}

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: "{{ $name }}-{{ $counter }}-{{ $replica }}"
  labels:
  {{range $key, $value := .vmLabels }}
    {{ $key }}: {{ $value }}
  {{end}}
spec:
  dataVolumeTemplates:
  - metadata:
      name: "{{ $name }}-{{ $counter }}-{{ $replica }}-root"
      labels:
      {{range $key, $value := .rootVolumeLabels }}
        {{ $key }}: {{ $value }}
      {{end}}
    spec:
      source:
        registry:
          url: "docker://{{ .rootDiskImage }}"
      storage:
        accessModes:
        - {{ $accessMode }}
        storageClassName: {{ .storageClassName }}
        resources:
          requests:
            storage: {{ default "10Gi" .rootVolumeSize }}
  {{ range $dataVolumeIndex := .dataVolumeCounters }}
  - metadata:
      name: "{{ $name }}-{{ $counter }}-{{ $replica }}-data-{{ $dataVolumeIndex }}"
      labels:
      {{range $key, $value := $dataVolumeLabels }}
        {{ $key }}: {{ $value }}
      {{end}}
    spec:
      source:
        blank: {}
      storage:
        accessModes:
        - {{ $accessMode }}
        storageClassName: {{ $storageClassName }}
        resources:
          requests:
            storage: {{ $dataVolumeSize }}
  {{ end }}
  running: true
  template:
    spec:
      accessCredentials:
      - sshPublicKey:
          propagationMethod:
            noCloud: {}
          source:
            secret:
              secretName: "{{ .sshPublicKeySecret }}-{{ .counter }}"
      architecture: amd64
      domain:
        resources:
          requests:
            memory: {{ default "512Mi" .vmMemory }}
        devices:
          disks:
          - disk:
              bus: virtio
            name: rootdisk
            bootOrder: 1
          {{ range $dataVolumeIndex := .dataVolumeCounters }}
          - disk:
              bus: virtio
            name: "data-{{ $dataVolumeIndex }}"
          {{ end }}
          interfaces:
          - name: default
            masquerade: {}
            bootOrder: 2
        machine:
          type: pc-q35-rhel9.4.0
      networks:
      - name: default
        pod: {}
      volumes:
      - dataVolume:
          name: "{{ .name }}-{{ .counter }}-{{ .Replica }}-root"
        name: rootdisk
      {{ range $dataVolumeIndex := .dataVolumeCounters }}
      - dataVolume:
          name: "{{ $name }}-{{ $counter }}-{{ $replica }}-data-{{ $dataVolumeIndex }}"
        name: "data-{{ . }}"
      {{ end }}
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            chpasswd:
              expire: false
            password: {{ uuidv4 }}
            user: fedora
            runcmd: []
        name: cloudinitdisk