[pull] master from netdata:master #388

Merged 3 commits on Feb 25, 2025

@pull pull bot commented Feb 25, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)


Summary by Sourcery

This pull request enhances the daemon status file with host and OS information, improves the daemon startup process with detailed logging, fixes issues related to disk space and read-only conditions, and ensures the machine GUID is always available. It also refactors directory verification and updates signal handling for better logging and status management.

Bug Fixes:

  • Fixes a startup crash that could occur when the disk is read-only or full: Netdata now checks free disk space and read-only status and reports both in the daemon status file.
  • Fixes a bug where the runtime directory was not correctly detected, causing issues with spawn server creation.
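
The disk check described in the first bug fix can be illustrated with POSIX `statvfs`. This is only a minimal sketch, not the code from this pull request: `disk_state_t` and `check_disk_state` are hypothetical names, and how Netdata actually records the result in its status file is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/statvfs.h>

// Hypothetical result type; not a Netdata structure.
typedef struct {
    bool ok;             // statvfs() succeeded for the path
    bool read_only;      // filesystem is mounted read-only
    uint64_t free_bytes; // bytes available to unprivileged processes
} disk_state_t;

// Query the filesystem that holds `path`.
static disk_state_t check_disk_state(const char *path) {
    disk_state_t st = { false, false, 0 };
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return st; // path missing or not accessible

    st.ok = true;
    st.read_only = (vfs.f_flag & ST_RDONLY) != 0;
    st.free_bytes = (uint64_t)vfs.f_bavail * (uint64_t)vfs.f_frsize;
    return st;
}
```

A caller could run this once against the cache or database directory early in startup and refuse to write when `read_only` is set or `free_bytes` is zero.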

Enhancements:

  • The daemon status file now includes host and OS fields for better reporting and crash analysis.
  • Improves the daemon startup process by adding more detailed logging and status updates to the daemon status file, which helps in diagnosing startup issues.
  • The registry now generates a machine GUID upon creation, ensuring it is always available for claiming and other processes.

Chores:

  • Refactors directory verification to use a single function, improving code maintainability and reducing redundancy.
  • Updates the handling of signals to ensure proper logging and saving of daemon status before exiting on fatal signals.

ktsaou and others added 3 commits February 25, 2025 10:57
* fix runtime directory; annotate daemon status file

* chmod the spawn server socket

* do not collect system info during initialization
* spawn an init spawn server while netdata runs; then stop it and run the final one

* stop the old one before dropping permissions

* remove the leading dot from spawn server filenames

* save the status file on every step during startup

* minor update

* add clarity about the double use of the function
@pull pull bot added the ⤵️ pull label Feb 25, 2025
@pull pull bot merged commit 728e365 into webfutureiorepo:master Feb 25, 2025

sourcery-ai bot commented Feb 25, 2025

Reviewer's Guide by Sourcery

This pull request includes several important changes to improve the reliability and security of Netdata. It introduces more granular logging of the startup process, enhances directory validation and creation, updates the spawn server to use NETDATA_RUN_DIR, modifies the way the machine GUID is retrieved and stored, improves signal handling, and ensures the registry directory is created with the correct permissions.

Sequence diagram for Netdata startup process

sequenceDiagram
    participant Netdata Main
    participant Daemon Status File
    participant Registry

    Netdata Main->>Daemon Status File: daemon_status_file_check_crash()
    Daemon Status File->>Daemon Status File: daemon_status_file_load()
    alt last_session_status.host_id is zero
        Daemon Status File->>Registry: registry_get_this_machine_guid(false)
        Registry-->>Daemon Status File: machine_guid
    end
    Netdata Main->>Netdata Main: Various initialization steps
    loop For each static thread
        Netdata Main->>Netdata Main: Initialize static threads
    end
    Netdata Main->>Daemon Status File: daemon_status_file_startup_step(step)
    Daemon Status File->>Daemon Status File: daemon_status_file_save(DAEMON_STATUS_NONE)
    Netdata Main->>Daemon Status File: daemon_status_file_save(DAEMON_STATUS_RUNNING)

Updated class diagram for directory validation

classDiagram
    class verify_required_directory {
        +verify_required_directory(env: const char*, dir: const char*, create_it: bool, perms: int)
    }
    note for verify_required_directory "Validates and optionally creates required directories with specified permissions."

File-Level Changes

Change / Details / Files
Introduces more granular logging of the Netdata startup process by adding a delta_startup_time macro that logs the time elapsed between startup steps and updates a status file.
  • Adds daemon_status_file_startup_step to update the status file with the current startup step.
  • Adds logging of time elapsed between startup steps.
  • Adds a startup step to the daemon status file.
src/daemon/main.c
src/daemon/daemon-status-file.c
src/daemon/daemon-status-file.h
Modifies the environment setup for plugins and scripts to use verify_required_directory for directory validation and creation, enhancing security and error handling.
  • Replaces the previous directory verification and creation logic with verify_required_directory.
  • The verify_required_directory function now receives an environment variable name, the directory path, a flag to create the directory if it doesn't exist, and the permissions to set on the directory.
  • The function now checks each component of the path to ensure it exists and is a directory.
  • The function now checks if the directory is accessible.
src/daemon/environment.c
src/daemon/daemon.h
Updates the spawn server to use NETDATA_RUN_DIR instead of NETDATA_CACHE_DIR and modifies the socket path.
  • The spawn server now uses NETDATA_RUN_DIR to create the socket.
  • The socket path is changed to netdata-spawn-<name>.sock.
  • Adds a chmod call to the socket path to set the permissions to 0770.
src/libnetdata/spawn_server/spawn_server_nofork.c
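
The socket naming and permission change can be sketched as below. The path pattern `netdata-spawn-<name>.sock` and the mode 0770 come from the description above; the function names, the backlog of 16, and the removal of a stale socket before `bind` are assumptions made for this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

// Build "<run_dir>/netdata-spawn-<name>.sock". Returns 0, or -1 if it does not fit.
static int spawn_socket_path(char *dst, size_t size, const char *run_dir, const char *name) {
    int n = snprintf(dst, size, "%s/netdata-spawn-%s.sock", run_dir, name);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

// Create a listening UNIX socket at `path` and restrict it to owner and group.
// Returns the file descriptor, or -1 on failure.
static int spawn_socket_listen(const char *path) {
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    strcpy(addr.sun_path, path);

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    unlink(path); // assumption: a stale socket from a previous run may exist

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        chmod(path, 0770) != 0 ||
        listen(fd, 16) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling `chmod` after `bind` sets the mode exactly, independent of the umask in effect when the socket file was created.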
Modifies the way the machine GUID is retrieved and stored, ensuring it is created if it doesn't exist and setting it as an environment variable.
  • The registry_get_this_machine_guid function now receives a flag to create the GUID if it doesn't exist.
  • The registry_get_this_machine_guid function now sets the NETDATA_REGISTRY_UNIQUE_ID environment variable.
src/registry/registry_internals.c
src/registry/registry.h
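
A "get, or create if asked" GUID lookup that also exports the value could be sketched like this. Only the `create_it` flag and the `NETDATA_REGISTRY_UNIQUE_ID` variable come from the description above; the function names, the file-based storage, and the use of `/dev/urandom` to build a version 4 UUID are assumptions, not the registry's actual code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GUID_LEN 36

// Build a random (version 4) UUID string from /dev/urandom.
static bool guid_generate(char out[GUID_LEN + 1]) {
    unsigned char b[16];
    FILE *fp = fopen("/dev/urandom", "rb");
    if (!fp) return false;
    size_t got = fread(b, 1, sizeof(b), fp);
    fclose(fp);
    if (got != sizeof(b)) return false;

    b[6] = (unsigned char)((b[6] & 0x0F) | 0x40); // version 4
    b[8] = (unsigned char)((b[8] & 0x3F) | 0x80); // RFC 4122 variant
    snprintf(out, GUID_LEN + 1,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x",
             b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
             b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
    return true;
}

// Load the GUID from `file`; when it is missing and `create_it` is true,
// generate and store one. On success, export it to the environment.
static bool machine_guid_get(const char *file, bool create_it, char out[GUID_LEN + 1]) {
    bool found = false;
    FILE *fp = fopen(file, "r");
    if (fp) {
        found = fgets(out, GUID_LEN + 1, fp) != NULL && strlen(out) == GUID_LEN;
        fclose(fp);
    }

    if (!found) {
        if (!create_it || !guid_generate(out)) return false;
        fp = fopen(file, "w");
        if (!fp) return false;
        bool written = fputs(out, fp) >= 0;
        if (fclose(fp) != 0 || !written) return false;
    }

    return setenv("NETDATA_REGISTRY_UNIQUE_ID", out, 1) == 0;
}
```

Passing `create_it` as false lets read-only callers, such as a crash check, look up an existing GUID without creating state as a side effect.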
Fixes a potential crash during startup by ensuring system info is available before proceeding.
  • The get_daemon_status_fields_from_system_info function now checks if the system info is available before proceeding.
src/database/rrdhost-system-info.c
Improves signal handling by ensuring fatal signals trigger an immediate exit and save the daemon status.
  • The nd_process_signals function now calls daemon_status_file_save before exiting on a fatal signal.
  • The nd_process_signals function now calls nd_log_limits_unlimited before exiting on a fatal signal.
src/daemon/signals.c
Ensures the registry directory is created with the correct permissions during initialization.
  • The registry_init function now uses verify_required_directory to create the registry directory.
src/registry/registry_init.c
Updates the claiming process to ensure the machine GUID is available.
  • The claim_agent function now calls registry_get_this_machine_guid with the create_it flag set to true.
src/claim/claim-with-api.c
src/claim/cloud-conf.c
Updates the RRD initialization process to ensure the machine GUID is available.
  • The rrd_init function now calls registry_get_this_machine_guid with the create_it flag set to true.
src/database/rrd.c
Updates the OS run directory detection to set the environment variable.
  • The detect_run_dir function now calls nd_setenv to set the NETDATA_RUN_DIR environment variable.
src/libnetdata/os/run_dir.c
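
Detecting a run directory and exporting it might be sketched as below. Only the `NETDATA_RUN_DIR` variable name comes from the description above; `pick_run_dir`, the candidate list, and the plain `setenv` call (in place of `nd_setenv`) are assumptions for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Return the first candidate that is a writable, traversable directory and
// export it as NETDATA_RUN_DIR. Returns NULL if none qualifies.
static const char *pick_run_dir(const char *const *candidates, size_t n) {
    for (size_t i = 0; i < n; i++) {
        struct stat st;
        if (stat(candidates[i], &st) != 0 || !S_ISDIR(st.st_mode))
            continue;
        if (access(candidates[i], W_OK | X_OK) != 0)
            continue;
        if (setenv("NETDATA_RUN_DIR", candidates[i], 1) != 0)
            return NULL;
        return candidates[i];
    }
    return NULL;
}
```

Exporting the result means child processes, such as a spawn server, resolve the same directory without repeating the detection.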
