fix: don't fail metricbeat/windows/perfmon when no data is available #42803

Open
wants to merge 6 commits into
base: main
from


stefans-elastic
Contributor

@stefans-elastic stefans-elastic commented Feb 20, 2025

Proposed commit message

This bug affects the iis/webserver module when IIS isn't installed (meaning there is no data to collect) and causes elastic-agent to go into the DEGRADED state:
Screenshot 2025-02-17 at 12 56 07 PM

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Closes #42802

Use cases

Screenshots

Screenshot 2025-02-20 at 3 01 53 PM

Logs

@stefans-elastic stefans-elastic self-assigned this Feb 20, 2025
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Feb 20, 2025
@stefans-elastic stefans-elastic added the Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team label Feb 20, 2025
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Feb 20, 2025
Contributor

mergify bot commented Feb 20, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @stefans-elastic? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@stefans-elastic stefans-elastic added the backport-active-all Automated backport with mergify to all the active branches label Feb 20, 2025
@stefans-elastic stefans-elastic marked this pull request as ready for review February 20, 2025 14:36
@stefans-elastic stefans-elastic requested a review from a team as a code owner February 20, 2025 14:36
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Feb 21, 2025
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

Contributor

mergify bot commented Feb 21, 2025

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b iis-module-agent-issue upstream/iis-module-agent-issue
git merge upstream/main
git push upstream iis-module-agent-issue

re.log.Warnf("%s %v", collectFailedMsg, err)
} else if err == pdh.PDH_NO_DATA { //nolint:errorlint // the same thing as above ^
Contributor

Any chance I could convince you to turn this into a switch statement? Usually, once you need an else if, a switch ends up being a little cleaner.

re.log.Warnf("%s %v", collectFailedMsg, err)

// without the return statement here it still fails when trying to get counter values
return nil, nil
Contributor

For PDH_NO_COUNTERS, we still try to get values, which would return an error.
Why are we calling getvalues in that case?

IMO, we are doing the right thing by returning when PDH_NO_DATA is hit and not propagating the error to getvalues.
thoughts on this behaviour difference between the two error handling?

Contributor Author

thoughts on this behaviour difference between the two error handling?

I'm not sure. That is the reason I implemented it this way (instead of doing something like if err == pdh.PDH_NO_COUNTERS || err == pdh.PDH_NO_DATA): I didn't want to change the original behavior, since I'm not sure why it was done this way. Now that I think about it again, though, I think it is a bug in the original behavior, and I should probably change it to something like

if err == pdh.PDH_NO_COUNTERS || err == pdh.PDH_NO_DATA {
    re.log.Warnf("%s %v", collectFailedMsg, err)

    // without the return statement here it still fails when trying to get counter values
    return nil, nil
}

WDYT?

@ishleenk17 ishleenk17 requested a review from a team February 24, 2025 06:20
@ishleenk17
Contributor

@stefans-elastic: It would be good to add tests for this check.
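One possible shape for such a test, sketched as a table-driven check. The helper `shouldSkipCollection` and the sentinel errors are hypothetical stand-ins for the combined `pdh.PDH_NO_COUNTERS` / `pdh.PDH_NO_DATA` condition proposed above. In the repository this would more likely live in a `_test.go` file using the `testing` package; it is written as a plain program here so it can run without the Windows-only PDH bindings.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for pdh.PDH_NO_COUNTERS and pdh.PDH_NO_DATA.
var (
	errNoCounters = errors.New("PDH_NO_COUNTERS")
	errNoData     = errors.New("PDH_NO_DATA")
)

// shouldSkipCollection reports whether a collect error means "nothing to
// report" rather than a real failure (hypothetical helper).
func shouldSkipCollection(err error) bool {
	return err == errNoCounters || err == errNoData
}

// checkCases runs the table-driven cases and returns the first mismatch.
func checkCases() error {
	cases := []struct {
		name string
		err  error
		want bool
	}{
		{"no error", nil, false},
		{"no counters", errNoCounters, true},
		{"no data", errNoData, true},
		{"unrelated error", errors.New("access denied"), false},
	}
	for _, c := range cases {
		if got := shouldSkipCollection(c.err); got != c.want {
			return fmt.Errorf("%s: got %v, want %v", c.name, got, c.want)
		}
	}
	return nil
}

func main() {
	if err := checkCases(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

The "unrelated error" case matters most: it guards against the check becoming so broad that real collection failures are silently swallowed.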

Labels
backport-active-all (Automated backport with mergify to all the active branches), bug, Team:Elastic-Agent-Data-Plane (Label for the Agent Data Plane team), Team:Obs-InfraObs (Label for the Observability Infrastructure Monitoring team)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[IIS]: Agent displayed as unhealthy when no data found
5 participants