Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IFC-873 Remove shared storage requirements for git workers #4867

Merged
merged 22 commits into from
Nov 9, 2024

Conversation

gmazoyer
Copy link
Contributor

@gmazoyer gmazoyer commented Nov 6, 2024

Fixes #4789
Fixes #4296

@github-actions github-actions bot added the group/backend Issue related to the backend (API Server, Git Agent) label Nov 6, 2024
Copy link

codspeed-hq bot commented Nov 6, 2024

CodSpeed Performance Report

Merging #4867 will not alter performance

Comparing gma-20241106-4789 (c521456) with develop (c5e1456)

Summary

✅ 10 untouched benchmarks

@@ -58,6 +60,8 @@
"git.repository.import_objects": GitRepositoryImportObjects,
"schema.migration.path": SchemaMigrationPath,
"schema.validator.path": SchemaValidatorPath,
"refresh.git.clone": GitRepositoryClone,
"refresh.git.fetch": GitRepositoryFetch,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small comment on the naming here. We've always followed the naming conventions so that the class matches the routing key.

So following the current convention it would be:

    "refresh.git.clone": RefreshGitClone,
    "refresh.git.fetch": RefreshGitFetch,

This might sound a bit off and it could be that we should find other names. In this case it would probably have been better if we hadn't already "used" git.*.* as a binding.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it make sense to create a new binding for all git messages like git.*.* that would be reserved for the task-workers
Currently all consumers even the API server are subscribing to refresh.*.*

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit different:

class InfrahubMessageBus:
    DELIVER_TIMEOUT: int = 30 * 60  # 30 minutes
    worker_bindings: list[str] = [
        "check.*.*",
        "event.*.*",
        "finalize.*.*",
        "git.*.*",
        "refresh.webhook.*",
        "request.*.*",
        "send.*.*",
        "schema.*.*",
        "transform.*.*",
        "trigger.*.*",
    ]
    event_bindings: list[str] = ["refresh.registry.*"]

As it is not anything under refresh.registry.* would be sent to all worker types both API workers and Prefect workers. And anything under git.*.* would be sent to the competing work queue that gets pick up by the first Prefect worker available. Which means that in order to use anything under "git." for broadcasts we'd have to modify the existing routing (i.e. have a call to .unbind() for that queue at startup, alternatively if it's a fresh instance of RabbitMQ everything would be recreated from scratch)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the clarification
I think we need to convert git.*.* to a broadcasts queue and ensure that only the task workers will subscribe to it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd need to move some of the tasks, for instance we wouldn't want git.file.get to be a broadcast task. That one we'd only want to arrive at one worker. I.e. it should be sent to the queue that all workers subscribe to but only one worker should process the task. Perhaps we can spend a few minutes on this in the backend meeting.

@github-actions github-actions bot added the type/documentation Improvements or additions to documentation label Nov 7, 2024
@gmazoyer gmazoyer changed the title Remove shared storage requirements for git workers IFC-873 Remove shared storage requirements for git workers Nov 7, 2024
@gmazoyer gmazoyer marked this pull request as ready for review November 8, 2024 14:44
Copy link
Collaborator

@dgarros dgarros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great

@gmazoyer gmazoyer merged commit 37f378c into develop Nov 9, 2024
31 checks passed
@gmazoyer gmazoyer deleted the gma-20241106-4789 branch November 9, 2024 23:36
fatih-acar added a commit that referenced this pull request Dec 2, 2024
Fixes #4867

Signed-off-by: Fatih Acar <fatih@opsmill.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
group/backend Issue related to the backend (API Server, Git Agent) type/documentation Improvements or additions to documentation
Projects
None yet
3 participants