Update cos_agent lib with generic HostHealth rules #232

Merged 7 commits on Feb 5, 2025

MichaelThamm (Contributor) commented on Jan 13, 2025

Issue

Currently, the grafana-agent host health rules are hard-coded in a .rule file. Once the tandem PR is merged, the UX would otherwise differ between VM and k8s charms.

Solution

Match the UX of the k8s charms by injecting the alert rules on the fly in the cos_agent Provider. Remove the host_health .rule file to avoid collisions and deduplication conflicts.
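As a rough illustration of what "injecting the alert rules on the fly" can look like, here is a minimal sketch that builds a Prometheus rule group in code instead of reading it from a .rule file. It is not the code from this PR: the function name, the `topology` argument, the group name, and the `for`/`severity` values are assumptions made for the example. Only the alert names HostDown and HostMetricsMissing and the up/absent idea come from this description.

```python
from typing import Any, Dict, List


def generic_host_health_rules(topology: Dict[str, str]) -> Dict[str, Any]:
    """Build a rule group with generic up/absent alerts (hypothetical helper).

    `topology` is an assumed mapping of label names to values
    (for example juju_model, juju_application) used to scope the rules.
    """
    # Render a label matcher such as {juju_application="zoo",juju_model="lxd"}.
    matcher = ",".join(f'{key}="{value}"' for key, value in sorted(topology.items()))
    selector = f"{{{matcher}}}" if matcher else ""

    rules: List[Dict[str, Any]] = [
        {
            # Fires when the scrape target is reachable but reports down.
            "alert": "HostDown",
            "expr": f"up{selector} < 1",
            "for": "5m",  # assumed duration
            "labels": {"severity": "critical"},  # assumed severity
        },
        {
            # Fires when no `up` series exists at all for the selector.
            "alert": "HostMetricsMissing",
            "expr": f"absent(up{selector})",
            "for": "5m",  # assumed duration
            "labels": {"severity": "critical"},  # assumed severity
        },
    ]
    return {"groups": [{"name": "HostHealth", "rules": rules}]}


group = generic_host_health_rules({"juju_model": "lxd", "juju_application": "zoo"})
for rule in group["groups"][0]["rules"]:
    print(rule["alert"], "->", rule["expr"])
```

Generating the rules in one place means every charm that uses the library gets the same definitions, which is the UX parity this PR is after.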

Context

In tandem with:

Testing Instructions

  1. Copy lib/charms/grafana_agent/v0/cos_agent.py into the zookeeper charm (it is the provider side of the relation) and pack the charm
  2. Pack the grafana-agent charm
  3. Deploy the bundles:

k8s (in a model named "prom")

bundle: kubernetes
applications:
  alertmanager:
    charm: alertmanager-k8s
    channel: latest/stable
    revision: 138
    scale: 1
    trust: true
  prom:
    charm: prometheus-k8s
    channel: latest/stable
    revision: 221
    scale: 1
    trust: true
  traefik:
    charm: traefik-k8s
    channel: latest/stable
    revision: 223
    scale: 1
    trust: true
relations:
- - traefik:ingress-per-unit
  - prom:ingress
- - traefik:ingress
  - alertmanager:ingress
- - prom:alertmanager
  - alertmanager:alerting
- - prom:metrics-endpoint
  - traefik:metrics-endpoint
--- # overlay.yaml
applications:
  prom:
    offers:
      prom:
        endpoints:
        - receive-remote-write
        acl:
          admin: admin

lxd

series: jammy

saas:
  prom:
    url: microk8s:admin/prom.prom
applications:
  gagent:
    # charm path is relative to the bundle file
    charm: ./grafana-agent_ubuntu-22.04-amd64.charm
    series: jammy
  zoo:
    # charm path is relative to the bundle file
    charm: ./zookeeper_ubuntu@22.04-amd64.charm
    series: jammy
    scale: 1
relations:
- - zoo:cos-agent
  - gagent:cos-agent
- - gagent:send-remote-write
  - prom:receive-remote-write

  4. juju exec --application zoo "sudo snap stop charmed-zookeeper"

    1. wait for the HostDown alert to fire
    2. start the zookeeper snap again
  5. juju exec --application gagent "sudo snap stop grafana-agent"

    1. wait for the HostMetricsMissing alert to fire
    2. start the grafana-agent snap again
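To check the "wait for the alert to fire" steps without the UI, one option is to read Prometheus's HTTP API endpoint `/api/v1/alerts` and look for the alert name. The sketch below only parses a response body. How to reach the prom unit is deployment-specific and not shown, and the sample payload and the function name are made up for illustration.

```python
import json
from typing import List


def firing_alert_names(alerts_json: str) -> List[str]:
    """Return the sorted, de-duplicated names of alerts in the "firing" state."""
    payload = json.loads(alerts_json)
    alerts = payload.get("data", {}).get("alerts", [])
    return sorted(
        {alert["labels"]["alertname"] for alert in alerts if alert.get("state") == "firing"}
    )


# Illustrative payload shaped like a /api/v1/alerts response (values are invented).
sample = json.dumps(
    {
        "status": "success",
        "data": {
            "alerts": [
                {"labels": {"alertname": "HostDown"}, "state": "firing"},
                {"labels": {"alertname": "HostMetricsMissing"}, "state": "pending"},
            ]
        },
    }
)

print(firing_alert_names(sample))
```

An alert in the "pending" state has matched its expression but has not yet been active for the rule's `for` duration, so it is excluded here.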

Upgrade Notes

Fetching the new libs adds a set of new alerts automatically. If a charm already defines its own up/absent alerts, the result is duplicated alerts and rules.

  • up/absent alerts are ubiquitous and are handled by the libs modified in this PR. Any custom alerts duplicating this behaviour can be removed.
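When cleaning up after the upgrade, a quick heuristic scan can help spot custom rules that now duplicate the generic ones. The sketch below works on an already-parsed rule file (a plain dict) and flags alerts whose expression only tests `up` or `absent(up...)`. The regular expression, the function name, and the sample rule names are assumptions for illustration, and the match is deliberately narrow, so review the results by hand before deleting anything.

```python
import re
from typing import Any, Dict, List

# Heuristic: the whole expression is `absent(up...)`, `up... == 0`, or `up... < 1`.
_UP_ONLY = re.compile(r"^\s*(absent\(\s*up\b.*\)|up\b[^()]*(==\s*0|<\s*1))\s*$")


def find_redundant_rules(rule_file: Dict[str, Any]) -> List[str]:
    """Return names of alert rules that look like plain up/absent checks."""
    names: List[str] = []
    for group in rule_file.get("groups", []):
        for rule in group.get("rules", []):
            # Recording rules have no "alert" key and are skipped.
            if "alert" in rule and _UP_ONLY.match(str(rule.get("expr", ""))):
                names.append(rule["alert"])
    return names


# Hypothetical custom rules a charm might ship.
custom_rules = {
    "groups": [
        {
            "name": "zookeeper",
            "rules": [
                {"alert": "ZookeeperDown", "expr": 'up{job="zookeeper"} == 0'},
                {"alert": "ZookeeperMissing", "expr": 'absent(up{job="zookeeper"})'},
                {"alert": "ZookeeperTooManyRequests", "expr": "rate(requests_total[5m]) > 100"},
            ],
        }
    ]
}

print(find_redundant_rules(custom_rules))
```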

* Inject generic alert rules via cos_agent
Signed-off-by: Michael Thamm <mike.thamm@canonical.com>
MichaelThamm merged commit 4de78c4 into main on Feb 5, 2025
12 checks passed
MichaelThamm deleted the feature/host-health branch on February 5, 2025 at 17:48