Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(docs): generate docs/monitoring/metrics.md file #5117

Merged

Conversation

ivankatliarchuk
Copy link
Contributor

@ivankatliarchuk ivankatliarchuk commented Feb 23, 2025

Description

No issue. On one of the proposals, there were mentioning about metrics #5079 (comment). So I've tried to understand how metrics/observability is configured. Turns out, that there were no page about that, only FAQ with outdated data.

This pull request is very similar to #4983

It adds

  • generation of all available metrics to a document
  • monitoring documentation
  • added debug logs for metric instrumentation

This PR is massive, I'm could split it if required, just need a guidance how to slice it.

Checklist

  • Unit tests updated
  • End user documentation updated

Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 23, 2025
@ivankatliarchuk
Copy link
Contributor Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 23, 2025
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@ivankatliarchuk ivankatliarchuk changed the title chore(docs): generate docs/monitoring/metrics.md file WIP chore(docs): generate docs/monitoring/metrics.md file Feb 23, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2025
@ivankatliarchuk ivankatliarchuk marked this pull request as draft February 23, 2025 16:05
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@ivankatliarchuk ivankatliarchuk changed the title WIP chore(docs): generate docs/monitoring/metrics.md file chore(docs): generate docs/monitoring/metrics.md file Feb 23, 2025
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@ivankatliarchuk ivankatliarchuk marked this pull request as ready for review February 23, 2025 17:27
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2025
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@mloiseleur
Copy link
Contributor

Overall PR LGTM.
Wdyt of moving the template in a separate file ?

It's the case for the chart and it's clearly easier to review & update.

Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@ivankatliarchuk
Copy link
Contributor Author

Changed to a template file.

Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
Copy link
Contributor

@mloiseleur mloiseleur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please add CI on this ? (like for flags)
On my system, it generates two additional metrics:

diff --git a/docs/monitoring/metrics.md b/docs/monitoring/metrics.md
index 6c7782af..effbd242 100644
--- a/docs/monitoring/metrics.md
+++ b/docs/monitoring/metrics.md
@@ -80,6 +80,8 @@ curl https://localhost:7979/metrics
 | http_request_duration_seconds |
 | process_cpu_seconds_total |
 | process_max_fds |
+| process_network_receive_bytes_total |
+| process_network_transmit_bytes_total |
 | process_open_fds |
 | process_resident_memory_bytes |
 | process_start_time_seconds |

@ivankatliarchuk
Copy link
Contributor Author

The Go runtime's behavior differs considerably across systems due to factors like Docker containerization, the host operating system, and architecture (ARM, AMD64, etc.). This inconsistency is why its tests are disabled. We can either add them manually or remove them entirely.

The following Go runtime metrics are available for scraping. Please note that they may change over time and they are OS dependent.

@ivankatliarchuk
Copy link
Contributor Author

ivankatliarchuk commented Feb 25, 2025

process_network_receive_bytes_total
+| process_network_receive_bytes_total |
+| process_network_transmit_bytes_total |

I'm unsure how to catch them from within a code, they are only available when promhttp.Handler() is activated.

Signed-off-by: ivan katliarchuk <ivan.katliarchuk@gmail.com>
@ivankatliarchuk
Copy link
Contributor Author

I've added manually this two metrics. It's a best effort at the moment.

@ivankatliarchuk
Copy link
Contributor Author

Current behaviour for CI tests

  • Will fail if new custom metric is added to a code but not to a file
  • Will ingore if Go adds/removes runtime metrics. So no CI tests to fail in such case

@ivankatliarchuk
Copy link
Contributor Author

Need guidance here.

Co-authored-by: Michel Loiseleur <97035654+mloiseleur@users.noreply.github.com>
@mloiseleur
Copy link
Contributor

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 4, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mloiseleur

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 4, 2025
@k8s-ci-robot k8s-ci-robot merged commit 1e8e5e0 into kubernetes-sigs:master Mar 4, 2025
13 checks passed
@ivankatliarchuk ivankatliarchuk deleted the chore-docs-generation-metrics branch March 4, 2025 15:49
ivankatliarchuk added a commit to gofogo/k8s-sigs-external-dns-fork that referenced this pull request Mar 6, 2025
* master: (198 commits)
  fix(aws-sd): service instances registration and deregistration (kubernetes-sigs#5135)
  chore(docs): generate docs/monitoring/metrics.md file (kubernetes-sigs#5117)
  feat(chart): add helm-unittest framework (kubernetes-sigs#5137)
  feat(chart): add helm-unittest framework
  feat(aws): always create AAAA alias records in route53 (kubernetes-sigs#5111)
  feat(aws): fetch zones with tags batching (kubernetes-sigs#5058)
  docs: openwrt webhook (kubernetes-sigs#5132)
  docs(proposal): ipv6 internal node ip rollback plan (kubernetes-sigs#5081)
  docs(proposal): update date format
  chore(deps): bump the dev-dependencies group across 1 directory with 7 updates
  Update README.md with proper link to dev guide
  Add OpenStack Designate webook provider to readme
  chore(deps): bump the dev-dependencies group with 3 updates
  chore(deps): bump the dev-dependencies group with 20 updates
  chore(deps): bump azure/setup-helm in the dev-dependencies group
  style: formatting
  fix: remove broken test
  fix test name
  chore: upgrade ExternalDNS to go 1.24
  chore-makefile-coverage
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants