Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci as a state of the art, part 2 #20

Merged
merged 57 commits into from
Jan 31, 2025
Merged

Conversation

fedordikarev
Copy link
Collaborator

@fedordikarev fedordikarev commented Jan 23, 2025

ok, that branch and PR were started as "lets have a matrix for os and arch, so we can have workflow to cross-build binaries and do it in parallel".
Eventually it evolved to the bigger one, so now it's time to merge it and continue will smaller changes at a time.

What that PR does:

  1. Introduces cross-build for Golang program, by passing GOOS and GOARCH env variables.
  2. Giving empty input for the workflow will lead to binaries build based on the runner os and arch, I think it's safe approach.
  3. Adding os and arch into cache keys to avoid issues when build cache is platform dependant
  4. Splitting cache into go-mod-cache and go-build-cache.
  5. Go mod cache depends on go.mod and go.sum files, and should be updated when these files changed
  6. For the go-build cache it's harder to predict what is the best strategy for the key here, so use the approach when for Pull Requests we take cache from the base branch, as we expect PRs close enough to the base branch.
  7. And recreate build cache from scratch on push event, so only actual data is there.
  8. Thanks @jcgruenhage for the hint and to improve readability and maintability, setting cache keys moved into separate step
  9. For the packing result into docker image, currently there is a placeholder.
  10. I expect to add docker part as part 4, part 3 will be for the job with updating go-mod-cache only once, and all following parallel builds will reuse that cache.
  11. For now arch and os are limited to linux / {amd64, arm64} only. Need an option with kind of "preset" so it could be easily adjusted by the user.

Copy link

@orca-security-us orca-security-us bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Orca Security Scan Summary

Status Check Issues by priority
Passed Passed Infrastructure as Code high 0   medium 0   low 0   info 0 View in Orca
Passed Passed Secrets high 0   medium 0   low 0   info 0 View in Orca
Passed Passed Vulnerabilities high 0   medium 0   low 0   info 0 View in Orca

goos: 'darwin'
- goarch: 'arm'
goos: 'darwin'
uses: ./.github/workflows/50_go_build-one-arch-one-os-one-path.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a big benefit to splitting .github/workflows/50_go_build-one-component.yml and .github/workflows/50_go_build-one-arch-one-os-one-path.yml. I'd keep them in one workflow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current thoughts are:

  1. build-one-arch-one-os-one-path is already pretty big, and with assumption unit testing and packing into Docker will be added there, will make it even more complicated
  2. so my preference were not overcomplicate it as much as possible
  3. I also keep an idea in my head, to make each workflow useful by itself.
  4. and thinking about next scenario: we have workflow for regular PR, when we want by the end to have containers for all the aarchs we use in production
  5. for the PRs in 'Draft' mode, dev may want faster feedback, in charge of not all the archs build.
  6. For that build-one-arch can be good fit, while keeping 50_go_build-one-component for the main workflow
  7. also I have an idea of adding arch-os presets, so for example user can set: build for linux only. or having workflow for i386 only.. that will add some complexity into 50_go_build-one-component workflow, and I'd like to have it separated

With the items 1 and 2 as main reasons, 7 as an extra reason, and 3-6 as nice bonus, do you think we could keep it that way for now?
If we found nothing from 1, 2 or 7 useful in practice, it will be easy to combine them in one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is already pretty big

it's ~10% of build_and_test.yml from the neon repo, I think we can bare a few more lines here.

will make it even more complicated

I don't think it's going to be more complicated, if anything I think it'd be less complicated because you don't have to jump through as many files to understand what's going on

for the PRs in 'Draft' mode, dev may want faster feedback, in charge of not all the archs build

We can do that without splitting it into multiple workflows. Look at https://github.com/neondatabase/neon/blob/c8fbbb9b65587d25b9dbd3c8f21266ce07159d02/.github/workflows/pre-merge-checks.yml#L80-L88 vs https://github.com/neondatabase/neon/blob/main/.github/workflows/build_and_test.yml#L165-L171

If we found nothing from 1, 2 or 7 useful in practice, it will be easy to combine them in one.

I think that's the wrong way around: If we find the less complex and less nested solution not to be sufficient, then we should go for the more complex and more nested approach. But I see that this depends on what people perceive as complex. For me, a more nested workflow adds more complexity than a single workflow encompassing more complexity in itself.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, with the moving matrix to one-arch-one-os I encounter next problem working toward adding tests.

  • In current setup, I have workflow "build one component" which makes matrix and call one-arch-one-os
  • in one-arch-one-os I had preparation steps to build keys and actual build step, that used these keys for fetching cache and actual go build
  • Adding go test as step prevents it to run in parallel, and talking to @matyaskuti in our setup build could be 1-2 minutes, and unit tests up to 5-7 minutes, so we definitely want to run them in parallel

and here comes issue with a number of ways to solve it. I don't know yet which one will be best in our case, we will know that when we apply that workflow to our main repos. From the discussion with @matyaskuti for us: "no one size fit all", so there is a chance we will use all of them: either based on component build and amount of tests it have, or based on either it's regular PR or fix, or could be also different to PR vs Draft PR to achieve the best dev experience.

Here are approaches I considered so far:

  1. Have one-component workflow with matrix calling one-arch-one-os. Inside one-arch-one-os I have preparation job, that prepare keys, and both build and test jobs needs prepare. By doing that I'm sure that keys are consistent for the build and for the test and no mistype here could be introduced in the future.
  2. We could move matrix to the file with build and test jobs. I was thinking about similar approach: having matrix of prepare jobs and each build and test jobs need prepare job. Just here we have to copy/paste that matrix statement across all 3 jobs: prepare, build and test. We need also parametrize needs block, but that shouldnt be a big problem.
  3. I could make separate workflows build and test, maybe even in separate files. And each of them will have prepare as a step for a job, and matrix to do the task for set of arches and oses.

Some of these problem will go away when we start using templates/jsonnet for building workflow, so I also expect that part will be changed completly in next 3-4 months.

And being said: as I see we will need to rewrite that part anyway when doing that workflow for our main repos and trying to fit all their requirements and complexity.
So now I keen more just to merge that "part 2" to go further and get sooner to start adding it to our main repos.

Lets sync in Slack on what approach is better for now and for being more flexible for next steps.

@fedordikarev
Copy link
Collaborator Author

after some more thoughts now I see the next approach on build workflows:

  • Build and Test should be separated
  • if there any errors in one step, we could either cancel second one
  • or we may want to run all of them (for example for PRs in Draft mode), so devs will get more insights on their changes from one PR
  • artifacts of Test are test reports, and conclusion of the Test we could directly connect to the "final conclusion" step
  • while for "Build" artifacts are binaries/docker images, we will pass to e2e tests before "final conclusion"
  • same way we could run other tests like "Linters" in parallel and don't embed them into Build workflow

I will go that way now, and will submit changes shortly

Copy link
Member

@jcgruenhage jcgruenhage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with those two nitpicks out of the way it should be good to merge

@fedordikarev fedordikarev merged commit e360861 into main Jan 31, 2025
9 checks passed
@fedordikarev fedordikarev deleted the feat/try_multiarch_build branch January 31, 2025 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants