Semantic conventions for Kubernetes metrics

Status: Development

K8s Metrics

This document describes instruments and attributes for common K8s level metrics in OpenTelemetry. These metrics are collected from technology-specific, well-defined APIs (e.g. Kubelet's API).

Metrics from instruments in the `k8s.` namespace SHOULD be attached to a K8s Resource and therefore inherit its attributes, such as `k8s.pod.name` and `k8s.pod.uid`.

Pod Metrics

Description: Pod level metrics captured under the namespace k8s.pod.

Metric: k8s.pod.uptime

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.pod.uptime` | Gauge | `s` | The time the Pod has been running [1] | Development |

[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available. The actual accuracy would depend on the instrumentation and operating system.
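
As a minimal sketch of the note above, the following computes uptime as a floating-point number of seconds. The function name and the idea of deriving uptime from an RFC 3339 start timestamp are assumptions made for illustration; the convention itself does not prescribe a data source.

```python
from datetime import datetime, timezone


def pod_uptime_seconds(start_time: str, now: datetime) -> float:
    """Return the seconds elapsed since start_time as a float (hypothetical helper)."""
    # Assumes an RFC 3339 UTC timestamp such as "2024-01-01T00:00:00Z".
    started = datetime.fromisoformat(start_time.replace("Z", "+00:00"))
    # total_seconds() keeps sub-second precision instead of truncating to an int.
    return (now - started).total_seconds()


now = datetime(2024, 1, 1, 0, 1, 30, 500000, tzinfo=timezone.utc)
print(pod_uptime_seconds("2024-01-01T00:00:00Z", now))  # 90.5
```

The value would then be recorded on a gauge of type double; the achievable precision still depends on the clock available to the instrumentation.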

Metric: k8s.pod.cpu.time

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.pod.cpu.time` | Counter | `s` | Total CPU time consumed [1] | Development |

[1]: Total CPU time consumed by the specific Pod on all available CPU cores.

Metric: k8s.pod.cpu.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.pod.cpu.usage` | Gauge | `{cpu}` | Pod's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1] | Development |

[1]: CPU usage of the specific Pod on all available CPU cores, averaged over the sample window.
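
One way to read "averaged over the sample window" is as the CPU time consumed during the window divided by the window length. The sketch below illustrates that arithmetic with two cumulative CPU-time samples. The function name and the reset handling are assumptions; a real source may report usage directly rather than deriving it.

```python
def cpu_usage(prev_cpu_time_s: float, curr_cpu_time_s: float, window_s: float) -> float:
    """Average CPU usage in {cpu} over one sample window (hypothetical helper)."""
    if window_s <= 0:
        raise ValueError("sample window must be positive")
    delta = curr_cpu_time_s - prev_cpu_time_s
    if delta < 0:
        # A cumulative counter should never decrease; treat this as a reset.
        raise ValueError("cpu time decreased between samples")
    return delta / window_s


# 15 CPU-seconds consumed in a 10-second window is 1.5 CPUs on average.
print(cpu_usage(120.0, 135.0, 10.0))  # 1.5
```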

Metric: k8s.pod.memory.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.pod.memory.usage` | Gauge | `By` | Memory usage of the Pod [1] | Development |

[1]: Total memory usage of the Pod.

Metric: k8s.pod.network.io

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.pod.network.io` | Counter | `By` | Network bytes for the Pod | Development |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| `network.interface.name` | string | The network interface name. | `lo`; `eth0` | Recommended | Development |
| `network.io.direction` | string | The network IO operation direction. | `transmit` | Recommended | Development |

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| `receive` | receive | Development |
| `transmit` | transmit | Development |
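
To illustrate how the two attributes partition the counter, the sketch below turns per-interface byte counts into one data point per interface and direction. The dictionary layout of the data points and the input shape are assumptions for illustration only, not an SDK API.

```python
from typing import Dict, List, Tuple

# Well-known values of network.io.direction listed above.
WELL_KNOWN_DIRECTIONS = ("receive", "transmit")


def network_io_points(interfaces: Dict[str, Tuple[int, int]]) -> List[dict]:
    """Build data points from {interface: (rx_bytes, tx_bytes)} (hypothetical shape)."""
    points = []
    for name, (rx_bytes, tx_bytes) in sorted(interfaces.items()):
        # Pair each direction with its cumulative byte count.
        for direction, value in zip(WELL_KNOWN_DIRECTIONS, (rx_bytes, tx_bytes)):
            points.append({
                "metric": "k8s.pod.network.io",
                "unit": "By",
                "value": value,
                "attributes": {
                    "network.interface.name": name,
                    "network.io.direction": direction,
                },
            })
    return points


points = network_io_points({"eth0": (1024, 2048)})
for point in points:
    print(point["attributes"], point["value"])
```

Summing the points over both attributes gives the total bytes for the Pod, which is why each direction is a separate point rather than a separate metric.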

Metric: k8s.pod.network.errors

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.pod.network.errors` | Counter | `{error}` | Pod network errors | Development |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| `network.interface.name` | string | The network interface name. | `lo`; `eth0` | Recommended | Development |
| `network.io.direction` | string | The network IO operation direction. | `transmit` | Recommended | Development |

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| `receive` | receive | Development |
| `transmit` | transmit | Development |

Node Metrics

Description: Node level metrics captured under the namespace k8s.node.

Metric: k8s.node.uptime

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.node.uptime` | Gauge | `s` | The time the Node has been running [1] | Development |

[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available. The actual accuracy would depend on the instrumentation and operating system.

Metric: k8s.node.cpu.time

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.node.cpu.time` | Counter | `s` | Total CPU time consumed [1] | Development |

[1]: Total CPU time consumed by the specific Node on all available CPU cores.

Metric: k8s.node.cpu.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.node.cpu.usage` | Gauge | `{cpu}` | Node's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1] | Development |

[1]: CPU usage of the specific Node on all available CPU cores, averaged over the sample window.

Metric: k8s.node.memory.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.node.memory.usage` | Gauge | `By` | Memory usage of the Node [1] | Development |

[1]: Total memory usage of the Node.

Metric: k8s.node.network.io

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.node.network.io` | Counter | `By` | Network bytes for the Node | Development |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| `network.interface.name` | string | The network interface name. | `lo`; `eth0` | Recommended | Development |
| `network.io.direction` | string | The network IO operation direction. | `transmit` | Recommended | Development |

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| `receive` | receive | Development |
| `transmit` | transmit | Development |

Metric: k8s.node.network.errors

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.node.network.errors` | Counter | `{error}` | Node network errors | Development |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| `network.interface.name` | string | The network interface name. | `lo`; `eth0` | Recommended | Development |
| `network.io.direction` | string | The network IO operation direction. | `transmit` | Recommended | Development |

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| `receive` | receive | Development |
| `transmit` | transmit | Development |

Deployment Metrics

Description: Deployment level metrics captured under the namespace k8s.deployment.

Metric: k8s.deployment.desired_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.deployment.desired_pods` | UpDownCounter | `{pod}` | Number of desired replica pods in this deployment [1] | Development |

[1]: This metric aligns with the replicas field of the K8s DeploymentSpec.

This metric SHOULD, at a minimum, be reported against a k8s.deployment resource.

Metric: k8s.deployment.available_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.deployment.available_pods` | UpDownCounter | `{pod}` | Total number of available replica pods (ready for at least minReadySeconds) targeted by this deployment [1] | Development |

[1]: This metric aligns with the availableReplicas field of the K8s DeploymentStatus.

This metric SHOULD, at a minimum, be reported against a k8s.deployment resource.
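
The field alignment for the two Deployment metrics can be sketched as a plain mapping from a Deployment object, decoded from JSON into a dict, to metric values plus the resource they are reported against. The function name and output shape are assumptions. The defaults reflect the Kubernetes API behavior that `replicas` defaults to 1 and that an omitted `availableReplicas` means 0.

```python
def deployment_metrics(deployment: dict) -> dict:
    """Map a Deployment object to the two metrics above (hypothetical helper)."""
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})
    return {
        # Resource attributes identify the k8s.deployment resource.
        "resource": {"k8s.deployment.name": deployment["metadata"]["name"]},
        # DeploymentSpec.replicas; the API documents a default of 1.
        "k8s.deployment.desired_pods": spec.get("replicas", 1),
        # DeploymentStatus.availableReplicas; omitted from the object when 0.
        "k8s.deployment.available_pods": status.get("availableReplicas", 0),
    }


example = {
    "metadata": {"name": "web"},
    "spec": {"replicas": 3},
    "status": {"availableReplicas": 2},
}
print(deployment_metrics(example))
```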

ReplicaSet Metrics

Description: ReplicaSet level metrics captured under the namespace k8s.replicaset.

Metric: k8s.replicaset.desired_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.replicaset.desired_pods` | UpDownCounter | `{pod}` | Number of desired replica pods in this replicaset [1] | Development |

[1]: This metric aligns with the replicas field of the K8s ReplicaSetSpec.

This metric SHOULD, at a minimum, be reported against a k8s.replicaset resource.

Metric: k8s.replicaset.available_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.replicaset.available_pods` | UpDownCounter | `{pod}` | Total number of available replica pods (ready for at least minReadySeconds) targeted by this replicaset [1] | Development |

[1]: This metric aligns with the availableReplicas field of the K8s ReplicaSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.replicaset resource.

ReplicationController Metrics

Description: ReplicationController level metrics captured under the namespace k8s.replicationcontroller.

Metric: k8s.replicationcontroller.desired_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.replicationcontroller.desired_pods` | UpDownCounter | `{pod}` | Number of desired replica pods in this replication controller [1] | Development |

[1]: This metric aligns with the replicas field of the K8s ReplicationControllerSpec.

This metric SHOULD, at a minimum, be reported against a k8s.replicationcontroller resource.

Metric: k8s.replicationcontroller.available_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.replicationcontroller.available_pods` | UpDownCounter | `{pod}` | Total number of available replica pods (ready for at least minReadySeconds) targeted by this replication controller [1] | Development |

[1]: This metric aligns with the availableReplicas field of the K8s ReplicationControllerStatus.

This metric SHOULD, at a minimum, be reported against a k8s.replicationcontroller resource.

StatefulSet Metrics

Description: StatefulSet level metrics captured under the namespace k8s.statefulset.

Metric: k8s.statefulset.desired_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.statefulset.desired_pods` | UpDownCounter | `{pod}` | Number of desired replica pods in this statefulset [1] | Development |

[1]: This metric aligns with the replicas field of the K8s StatefulSetSpec.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

Metric: k8s.statefulset.ready_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.statefulset.ready_pods` | UpDownCounter | `{pod}` | The number of replica pods created for this statefulset with a Ready Condition [1] | Development |

[1]: This metric aligns with the readyReplicas field of the K8s StatefulSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

Metric: k8s.statefulset.current_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.statefulset.current_pods` | UpDownCounter | `{pod}` | The number of replica pods created by the statefulset controller from the statefulset version indicated by currentRevision [1] | Development |

[1]: This metric aligns with the currentReplicas field of the K8s StatefulSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

Metric: k8s.statefulset.updated_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.statefulset.updated_pods` | UpDownCounter | `{pod}` | Number of replica pods created by the statefulset controller from the statefulset version indicated by updateRevision [1] | Development |

[1]: This metric aligns with the updatedReplicas field of the K8s StatefulSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.
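
Because each StatefulSet metric aligns with exactly one API field, the four alignments can be kept in a single lookup table. The sketch below is one possible arrangement; the table name, function name, and default values are assumptions (the defaults follow the API convention that `replicas` defaults to 1 and that omitted status counts mean 0).

```python
# metric name -> (object section, field name, value assumed when the field is absent)
STATEFULSET_FIELDS = {
    "k8s.statefulset.desired_pods": ("spec", "replicas", 1),
    "k8s.statefulset.ready_pods": ("status", "readyReplicas", 0),
    "k8s.statefulset.current_pods": ("status", "currentReplicas", 0),
    "k8s.statefulset.updated_pods": ("status", "updatedReplicas", 0),
}


def statefulset_metrics(statefulset: dict) -> dict:
    """Read every StatefulSet metric value from a decoded object (hypothetical helper)."""
    values = {}
    for metric, (section, field, default) in STATEFULSET_FIELDS.items():
        values[metric] = statefulset.get(section, {}).get(field, default)
    return values


rolling_update = {
    "spec": {"replicas": 5},
    "status": {"readyReplicas": 4, "currentReplicas": 3, "updatedReplicas": 2},
}
print(statefulset_metrics(rolling_update))
```

During a rolling update the current and updated counts differ, as in the example input, which is the situation these two metrics are meant to expose.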

HorizontalPodAutoscaler Metrics

Description: HorizontalPodAutoscaler level metrics captured under the namespace k8s.hpa.

Metric: k8s.hpa.desired_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.hpa.desired_pods` | UpDownCounter | `{pod}` | Desired number of replica pods managed by this horizontal pod autoscaler, as last calculated by the autoscaler [1] | Development |

[1]: This metric aligns with the desiredReplicas field of the K8s HorizontalPodAutoscalerStatus.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

Metric: k8s.hpa.current_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.hpa.current_pods` | UpDownCounter | `{pod}` | Current number of replica pods managed by this horizontal pod autoscaler, as last seen by the autoscaler [1] | Development |

[1]: This metric aligns with the currentReplicas field of the K8s HorizontalPodAutoscalerStatus.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

Metric: k8s.hpa.max_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.hpa.max_pods` | UpDownCounter | `{pod}` | The upper limit for the number of replica pods to which the autoscaler can scale up [1] | Development |

[1]: This metric aligns with the maxReplicas field of the K8s HorizontalPodAutoscalerSpec.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

Metric: k8s.hpa.min_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.hpa.min_pods` | UpDownCounter | `{pod}` | The lower limit for the number of replica pods to which the autoscaler can scale down [1] | Development |

[1]: This metric aligns with the minReplicas field of the K8s HorizontalPodAutoscalerSpec.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

DaemonSet Metrics

Description: DaemonSet level metrics captured under the namespace k8s.daemonset.

Metric: k8s.daemonset.current_scheduled_nodes

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.daemonset.current_scheduled_nodes` | UpDownCounter | `{node}` | Number of nodes that are running at least 1 daemon pod and are supposed to run the daemon pod [1] | Development |

[1]: This metric aligns with the currentNumberScheduled field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.

Metric: k8s.daemonset.desired_scheduled_nodes

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.daemonset.desired_scheduled_nodes` | UpDownCounter | `{node}` | Number of nodes that should be running the daemon pod (including nodes currently running the daemon pod) [1] | Development |

[1]: This metric aligns with the desiredNumberScheduled field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.

Metric: k8s.daemonset.misscheduled_nodes

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.daemonset.misscheduled_nodes` | UpDownCounter | `{node}` | Number of nodes that are running the daemon pod, but are not supposed to run the daemon pod [1] | Development |

[1]: This metric aligns with the numberMisscheduled field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.

Metric: k8s.daemonset.ready_nodes

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.daemonset.ready_nodes` | UpDownCounter | `{node}` | Number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready [1] | Development |

[1]: This metric aligns with the numberReady field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.
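
The K8s API exposes these counts as absolute values, while a synchronous UpDownCounter is updated with deltas. One possible way to bridge the two, sketched below, is to remember the last observed value and add only the difference. Both classes are illustrative stand-ins and not an OpenTelemetry SDK API; an asynchronous (observable) UpDownCounter, which reports absolute values from a callback, would avoid this bookkeeping entirely.

```python
class UpDownCounterSketch:
    """Stand-in for a synchronous UpDownCounter that only accepts deltas."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, delta: int) -> None:
        self.value += delta


class DeltaReporter:
    """Converts absolute observations into deltas for a counter (hypothetical)."""

    def __init__(self, counter: UpDownCounterSketch) -> None:
        self._counter = counter
        self._last = 0

    def observe(self, current: int) -> None:
        # Add only the change since the previous observation.
        self._counter.add(current - self._last)
        self._last = current


ready_nodes = UpDownCounterSketch()
reporter = DeltaReporter(ready_nodes)
for number_ready in (3, 5, 4):  # successive numberReady values from DaemonSetStatus
    reporter.observe(number_ready)
print(ready_nodes.value)  # 4
```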

Job Metrics

Description: Job level metrics captured under the namespace k8s.job.

Metric: k8s.job.active_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.job.active_pods` | UpDownCounter | `{pod}` | The number of pending and actively running pods for a job [1] | Development |

[1]: This metric aligns with the active field of the K8s JobStatus.

This metric SHOULD, at a minimum, be reported against a k8s.job resource.

Metric: k8s.job.failed_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.job.failed_pods` | UpDownCounter | `{pod}` | The number of pods which reached phase Failed for a job [1] | Development |

[1]: This metric aligns with the failed field of the K8s JobStatus.

This metric SHOULD, at a minimum, be reported against a k8s.job resource.

Metric: k8s.job.successful_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.job.successful_pods` | UpDownCounter | `{pod}` | The number of pods which reached phase Succeeded for a job [1] | Development |

[1]: This metric aligns with the succeeded field of the K8s JobStatus.

This metric SHOULD, at a minimum, be reported against a k8s.job resource.

Metric: k8s.job.desired_successful_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.job.desired_successful_pods` | UpDownCounter | `{pod}` | The desired number of successfully finished pods the job should be run with [1] | Development |

[1]: This metric aligns with the completions field of the K8s JobSpec.

This metric SHOULD, at a minimum, be reported against a k8s.job resource.

Metric: k8s.job.max_parallel_pods

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.job.max_parallel_pods` | UpDownCounter | `{pod}` | The max desired number of pods the job should run at any given time [1] | Development |

[1]: This metric aligns with the parallelism field of the K8s JobSpec.

This metric SHOULD, at a minimum, be reported against a k8s.job resource.
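
The Job metrics mix status counts with spec fields that can be unset. The sketch below treats the two differently: an absent status count is read as 0, while an unset spec field produces no data point at all, since a `completions` of null means the Job has no fixed completion count rather than a target of zero. The function name and the choice to skip unset fields are assumptions for illustration.

```python
def job_metrics(job: dict) -> dict:
    """Map a decoded Job object to metric values (hypothetical helper)."""
    spec = job.get("spec", {})
    status = job.get("status", {})
    metrics = {
        # JobStatus counts are omitted from the object when they are 0.
        "k8s.job.active_pods": status.get("active", 0),
        "k8s.job.failed_pods": status.get("failed", 0),
        "k8s.job.successful_pods": status.get("succeeded", 0),
    }
    optional_spec_fields = {
        "k8s.job.desired_successful_pods": "completions",
        "k8s.job.max_parallel_pods": "parallelism",
    }
    for metric, field in optional_spec_fields.items():
        # Skip unset fields instead of inventing a value for them.
        if spec.get(field) is not None:
            metrics[metric] = spec[field]
    return metrics


work_queue_job = {"spec": {"parallelism": 4, "completions": None}, "status": {"active": 4}}
print(job_metrics(work_queue_job))
```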

CronJob Metrics

Description: CronJob level metrics captured under the namespace k8s.cronjob.

Metric: k8s.cronjob.active_jobs

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.cronjob.active_jobs` | UpDownCounter | `{job}` | The number of actively running jobs for a cronjob [1] | Development |

[1]: This metric aligns with the active field of the K8s CronJobStatus.

This metric SHOULD, at a minimum, be reported against a k8s.cronjob resource.

Namespace Metrics

Description: Namespace level metrics captured under the namespace k8s.namespace.

Metric: k8s.namespace.phase

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| `k8s.namespace.phase` | UpDownCounter | `{namespace}` | The number of K8s namespaces that are currently in a given phase. [1] | Development |

[1]: This metric SHOULD, at a minimum, be reported against a k8s.namespace resource.

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| `k8s.namespace.phase` | string | The phase of the K8s namespace. [1] | `active`; `terminating` | Required | Development |

[1] k8s.namespace.phase: This attribute aligns with the phase field of the K8s NamespaceStatus.


k8s.namespace.phase has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| `active` | Active namespace phase as described by K8s API | Development |
| `terminating` | Terminating namespace phase as described by K8s API | Development |
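
The K8s API reports namespace phases as `Active` and `Terminating`, while the well-known attribute values above are lowercase. The sketch below maps one to the other and emits one data point per namespace, so that summing the points by attribute yields the number of namespaces in each phase. The function name, the data point layout, and the lowercasing of unrecognized phases are assumptions for illustration.

```python
from collections import Counter

# K8s NamespaceStatus.phase value -> well-known attribute value.
API_PHASE_TO_ATTRIBUTE = {"Active": "active", "Terminating": "terminating"}


def namespace_phase_point(namespace: dict) -> dict:
    """Build one data point for a decoded Namespace object (hypothetical helper)."""
    api_phase = namespace["status"]["phase"]
    return {
        "metric": "k8s.namespace.phase",
        "value": 1,  # this namespace is currently in exactly one phase
        "resource": {"k8s.namespace.name": namespace["metadata"]["name"]},
        "attributes": {
            # Unrecognized phases fall back to a lowercased custom value (assumption).
            "k8s.namespace.phase": API_PHASE_TO_ATTRIBUTE.get(api_phase, api_phase.lower()),
        },
    }


namespaces = [
    {"metadata": {"name": "default"}, "status": {"phase": "Active"}},
    {"metadata": {"name": "kube-system"}, "status": {"phase": "Active"}},
    {"metadata": {"name": "old-team"}, "status": {"phase": "Terminating"}},
]
points = [namespace_phase_point(ns) for ns in namespaces]

totals = Counter()
for point in points:
    totals[point["attributes"]["k8s.namespace.phase"]] += point["value"]
print(dict(totals))  # {'active': 2, 'terminating': 1}
```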