ci as a state of the art, part 2 #20

fedordikarev · 2025-01-23T17:00:55Z

OK, this branch and PR started as "let's have a matrix for OS and arch, so we can have a workflow that cross-builds binaries in parallel".
Eventually it grew into a bigger change, so now it's time to merge it and continue with smaller changes, one at a time.

What that PR does:

Introduces cross-builds for the Go program by passing the GOOS and GOARCH env variables.
Empty workflow inputs lead to binaries built for the runner's OS and arch; I think that's a safe default.
Adds OS and arch to the cache keys to avoid issues when the build cache is platform dependent.
Splits the cache into go-mod-cache and go-build-cache.
The go-mod cache depends on the go.mod and go.sum files and should be updated when these files change.
For the go-build cache it's harder to predict the best key strategy, so for pull requests we take the cache from the base branch, as we expect PRs to be close enough to the base branch.
On push events the build cache is recreated from scratch, so it only contains current data.
Thanks to @jcgruenhage for the hint: to improve readability and maintainability, setting the cache keys moved into a separate step.
Packing the result into a Docker image is currently a placeholder.
I expect to add the Docker part as part 4; part 3 will be a job that updates go-mod-cache only once, so that all following parallel builds reuse that cache.
For now arch and OS are limited to linux / {amd64, arm64} only. We need some kind of "preset" option so the user can easily adjust it.
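
The points above can be sketched as a reusable workflow. This is only an illustration of the described behaviour, not the PR's actual file: the input names `goos`/`goarch`, the step id `keys`, the key layout and the `go build ./...` target are all assumptions, and the cache paths are the Go defaults on a Linux runner.

```yaml
# Illustrative sketch; names and key layout are assumptions, not the PR's files.
on:
  workflow_call:
    inputs:
      goos:
        type: string
        default: ''
      goarch:
        type: string
        default: ''

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
          cache: false # caches are managed explicitly below

      - name: Set cache keys
        id: keys
        env:
          # The go tool treats an empty GOOS/GOARCH as unset, so "go env"
          # reports the runner's own platform when the inputs are empty.
          GOOS: ${{ inputs.goos }}
          GOARCH: ${{ inputs.goarch }}
          # Pull requests reuse the cache written for the base branch commit.
          CACHE_SHA: ${{ github.event.pull_request.base.sha || github.sha }}
          MOD_HASH: ${{ hashFiles('go.mod', 'go.sum') }}
        run: |
          platform="$(go env GOOS)-$(go env GOARCH)"
          echo "mod=go-mod-${platform}-${MOD_HASH}" >> "$GITHUB_OUTPUT"
          echo "build=go-build-${platform}-${CACHE_SHA}" >> "$GITHUB_OUTPUT"

      - name: Go module cache
        uses: actions/cache@v4
        with:
          path: ~/go/pkg/mod
          key: ${{ steps.keys.outputs.mod }}

      - name: Restore build cache (pull requests only)
        if: github.event_name == 'pull_request'
        uses: actions/cache/restore@v4
        with:
          path: ~/.cache/go-build
          key: ${{ steps.keys.outputs.build }}

      - name: Build
        env:
          GOOS: ${{ inputs.goos }}
          GOARCH: ${{ inputs.goarch }}
        run: go build ./...

      - name: Save a fresh build cache (push only)
        if: github.event_name == 'push'
        uses: actions/cache/save@v4
        with:
          path: ~/.cache/go-build
          key: ${{ steps.keys.outputs.build }}
```

Because the push run never restores the build cache, whatever it saves contains only objects produced by the current commit, which matches the "recreate from scratch on push" intent.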

orca-security-us

Orca Security Scan Summary

Status	Check	Issues by priority
Passed	Infrastructure as Code	0 0 0 0	View in Orca
Passed	Secrets	0 0 0 0	View in Orca
Passed	Vulnerabilities	0 0 0 0	View in Orca

jcgruenhage · 2025-01-27T16:05:30Z

.github/workflows/50_go_build-one-component.yml

+            goos: 'darwin'
+          - goarch: 'arm'
+            goos: 'darwin'
+    uses: ./.github/workflows/50_go_build-one-arch-one-os-one-path.yml


I don't see a big benefit to splitting .github/workflows/50_go_build-one-component.yml and .github/workflows/50_go_build-one-arch-one-os-one-path.yml. I'd keep them in one workflow.
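
For readers without the diff at hand, the split under discussion looks roughly like the caller below. It is a sketch reconstructed from the hunk above and the commit "exclude 32 bits from darwin"; the full matrix values and the `with:` input names are assumptions.

```yaml
# .github/workflows/50_go_build-one-component.yml (illustrative sketch)
on:
  workflow_call:

jobs:
  build:
    strategy:
      fail-fast: false
      matrix:
        goos: ['linux', 'darwin']
        goarch: ['amd64', 'arm64', '386', 'arm']
        exclude:
          # darwin has no 32-bit targets
          - goarch: '386'
            goos: 'darwin'
          - goarch: 'arm'
            goos: 'darwin'
    # One call of the per-platform workflow for each matrix combination.
    uses: ./.github/workflows/50_go_build-one-arch-one-os-one-path.yml
    with:
      goos: ${{ matrix.goos }}     # input names are assumptions
      goarch: ${{ matrix.goarch }}
```

Keeping everything in one workflow, as suggested, would mean moving this `strategy` block onto the build job of the called file and dropping the `uses:` indirection.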

fedordikarev · Jan 28, 2025

My current thoughts are:

1. build-one-arch-one-os-one-path is already pretty big, and with the assumption that unit testing and packing into Docker will be added there, that will make it even more complicated.

2. So my preference is to avoid overcomplicating it as much as possible.

3. I also keep an idea in my head: to make each workflow useful by itself.

4. Consider the following scenario: we have a workflow for regular PRs, where in the end we want containers for all the archs we use in production.

5. For PRs in 'Draft' mode, a dev may want faster feedback, at the cost of not building all the archs.

6. For that, build-one-arch can be a good fit, while 50_go_build-one-component stays for the main workflow.

7. I also have an idea of adding arch-os presets, so that, for example, a user can say "build for linux only" or have a workflow for i386 only. That will add some complexity to the 50_go_build-one-component workflow, and I'd like to keep it separate.

With items 1 and 2 as the main reasons, 7 as an extra reason, and 3-6 as a nice bonus, do you think we could keep it that way for now?
If we find nothing from 1, 2 or 7 useful in practice, it will be easy to combine them into one.

jcgruenhage · Jan 28, 2025

> is already pretty big

It's ~10% of build_and_test.yml from the neon repo, I think we can bear a few more lines here.

> will make it even more complicated

I don't think it's going to be more complicated. If anything, I think it'd be less complicated, because you don't have to jump through as many files to understand what's going on.

> For PRs in 'Draft' mode, a dev may want faster feedback, at the cost of not building all the archs

We can do that without splitting it into multiple workflows. Look at https://github.com/neondatabase/neon/blob/c8fbbb9b65587d25b9dbd3c8f21266ce07159d02/.github/workflows/pre-merge-checks.yml#L80-L88 vs https://github.com/neondatabase/neon/blob/main/.github/workflows/build_and_test.yml#L165-L171

> If we find nothing from 1, 2 or 7 useful in practice, it will be easy to combine them into one.

I think that's the wrong way around: if we find the less complex and less nested solution not to be sufficient, then we should go for the more complex and more nested approach. But I see that this depends on what people perceive as complex. For me, a more nested workflow adds more complexity than a single workflow encompassing more complexity in itself.
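
One way to get fewer builds for Draft PRs inside a single workflow is to compute the matrix from the event payload. The sketch below is an assumption about how that could look here; it is not copied from the linked neon workflows, and the arch lists and build command are placeholders.

```yaml
# Illustrative sketch: a single workflow whose matrix shrinks for draft PRs.
on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        goos: ['linux']
        # Draft PRs build amd64 only; ready PRs build both arches.
        goarch: ${{ fromJSON(github.event.pull_request.draft && '["amd64"]' || '["amd64", "arm64"]') }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
      - name: Build
        env:
          GOOS: ${{ matrix.goos }}
          GOARCH: ${{ matrix.goarch }}
        run: go build ./...
```

The `ready_for_review` trigger type makes the full matrix run as soon as a draft is marked ready, without waiting for a new push.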

fedordikarev · Jan 29, 2025

OK, with moving the matrix into one-arch-one-os I run into the following problem while working toward adding tests.

In the current setup, I have the "build one component" workflow, which creates the matrix and calls one-arch-one-os.

In one-arch-one-os I had preparation steps that build the keys, and the actual build step that used these keys for fetching the cache and running go build.

Adding go test as a step prevents it from running in parallel with the build. From talking to @matyaskuti, in our setup the build could take 1-2 minutes and unit tests up to 5-7 minutes, so we definitely want to run them in parallel.

And here comes the issue: there are a number of ways to solve it. I don't know yet which one will be best in our case; we will know once we apply this workflow to our main repos. From the discussion with @matyaskuti, for us there is no "one size fits all", so there is a chance we will use all of them, depending on the component being built and the number of tests it has, on whether it's a regular PR or a fix, or on PR vs Draft PR, to achieve the best dev experience.

Here are the approaches I have considered so far:

1. Have the one-component workflow with a matrix calling one-arch-one-os. Inside one-arch-one-os there is a prepare job that prepares the keys, and both the build and test jobs need prepare. That way I'm sure the keys are consistent between build and test, and no typo can be introduced here in the future.

2. We could move the matrix to the file with the build and test jobs. I was thinking about a similar approach: a matrix of prepare jobs, with each build and test job needing its prepare job. But then we have to copy/paste the matrix statement across all 3 jobs: prepare, build and test. We also need to parametrize the needs block, but that shouldn't be a big problem.

3. I could make separate build and test workflows, maybe even in separate files. Each of them would have prepare as a step of its job, and a matrix to do the task for a set of arches and OSes.
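
The first approach can be sketched as follows. This is a minimal illustration, assuming the called workflow takes hypothetical `goos`/`goarch` inputs and that only the module cache key is shared; job names, output names and key layout are not taken from the PR.

```yaml
# Illustrative sketch: keys are computed once, then build and test run in parallel.
on:
  workflow_call:
    inputs:
      goos:
        type: string
        default: ''
      goarch:
        type: string
        default: ''

jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      mod-key: ${{ steps.keys.outputs.mod }}
    steps:
      - uses: actions/checkout@v4
      - name: Compute cache keys once
        id: keys
        env:
          PLATFORM: ${{ inputs.goos }}-${{ inputs.goarch }}
          MOD_HASH: ${{ hashFiles('go.mod', 'go.sum') }}
        run: echo "mod=go-mod-${PLATFORM}-${MOD_HASH}" >> "$GITHUB_OUTPUT"

  build:
    needs: prepare
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
          cache: false
      - uses: actions/cache@v4
        with:
          path: ~/go/pkg/mod
          key: ${{ needs.prepare.outputs.mod-key }}
      - name: Build
        env:
          GOOS: ${{ inputs.goos }}
          GOARCH: ${{ inputs.goarch }}
        run: go build ./...

  test:
    needs: prepare # same keys as build, but no dependency on build itself
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
          cache: false
      - uses: actions/cache/restore@v4 # restore only, so two jobs don't race to save
        with:
          path: ~/go/pkg/mod
          key: ${{ needs.prepare.outputs.mod-key }}
      - name: Unit tests (runner's own platform)
        run: go test ./...
```

Since both jobs read the key from `needs.prepare.outputs`, the key expression exists in exactly one place, which is the consistency argument made for this approach.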

Some of these problems will go away when we start using templates/jsonnet for building workflows, so I also expect this part to change completely in the next 3-4 months.

That being said: as I see it, we will need to rewrite this part anyway when building this workflow for our main repos and trying to fit all their requirements and complexity.
So now I lean more toward just merging this "part 2", to move on and start adding it to our main repos sooner.

Let's sync in Slack on which approach is better for now and more flexible for the next steps.

.github/workflows/50_go_build-one-arch-one-os-one-path.yml

fedordikarev · 2025-01-31T10:06:50Z

After some more thought, I now see the following approach for the build workflows:

Build and Test should be separated:
if there are any errors in one of them, we could cancel the other,
or we may want to run both to completion (for example for PRs in Draft mode), so devs get more insight into their changes from one PR;
the artifacts of Test are test reports, and the conclusion of Test can be connected directly to the "final conclusion" step,
while the artifacts of Build are binaries/Docker images, which we pass to e2e tests before the "final conclusion";
in the same way we could run other checks such as linters in parallel instead of embedding them into the Build workflow.

I will go that way now and will submit changes shortly.
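
As a rough picture of that layout, the entry-point workflow could wire the pieces together as below. All workflow file names here are hypothetical placeholders, and the strict "everything must succeed" rule in the last job is just one possible policy.

```yaml
# Illustrative sketch: build, test and lint run in parallel; e2e consumes build output.
on:
  pull_request:

jobs:
  build:
    uses: ./.github/workflows/build.yml # hypothetical file name
  test:
    uses: ./.github/workflows/test.yml # hypothetical file name
  lint:
    uses: ./.github/workflows/lint.yml # hypothetical file name

  e2e:
    needs: build # needs the binaries/images produced by build
    uses: ./.github/workflows/e2e.yml # hypothetical file name

  final-conclusion:
    if: always() # run even when an upstream job failed, to report it
    needs: [test, lint, e2e]
    runs-on: ubuntu-latest
    steps:
      - name: Fail unless every required job succeeded
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') || contains(needs.*.result, 'skipped')
        run: exit 1
      - name: Report success
        run: echo "all required jobs succeeded"
```

Because build and test do not depend on each other, a failure in one does not stop the other, which gives the "run all of them" behaviour; cancelling the sibling on failure would need extra logic on top of this.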

jcgruenhage

with those two nitpicks out of the way it should be good to merge

Dockerfiles/go-app-common.Dockerfile

.github/workflows/codeql.yml

try cross-builds

6eee738

orca-security-us bot reviewed Jan 23, 2025

fedordikarev added 28 commits January 23, 2025 18:05

trigger build for both arches

6c78c49

it was nice, but dont work

7b07be9

fix

15e83b2

add GOOS

f65d343

lets try full builds

ea8227a

fix goarch

134fd40

exclude 32 bits from darwin

327d6fc

trigger action

b7f9b25

try to build on small-arm64

ae49c82

trigger build

8c0b968

refactor a bit

bd61e60

back to amd64 runners

9aeba14

names for the builds

5ee9bcb

pack-to-docker-image

6d70e72

cache: restore or update

f011625

added 00_push-entry-point.yml

5555e70

fix base_sha

cdd1dc0

on push always rebuild to update build cache

17884e0

show github event

9598445

show context

d539d03

fix event.after

8cada75

fix base_sha -> base-sha

79128e7

move keys setup to separate step and reuse it

e63edf4

fix CMD_PATH substitions

8895df8

Revert trigger in history-exporter

b8182c2

limit builds to linux/{amd64, arm64} only for now

1d549bd

set gh_repo name

86d768a

fix repo full name and add comments

98194c4

jcgruenhage requested changes Jan 28, 2025

fedordikarev added 22 commits January 28, 2025 16:54

add test and use job to setup keys for both build and test

615b759

Trigger build

fd804f4

fix needs

1fc73cf

fix

235187e

fix

c849916

give names for jobs

147f664

add 10_workflow-sanity-checks.yml

c050c8b

add workflows-sanity-checks

58cac24

some actionlint fixes

9b9f9dd

actionlint fix

9b9336e

fix for pre 4.18.1 yq

f52b82f

fix

b25f90a

split file-filter to 'trigger_all' and per-cmd filters

db2dfad

add docker build

7dac74d

disable CodeQL for now for PRs to main branch

8ed2051

shellcheck

8368988

shellcheck

6b1caff

fix build tag and add platform

3475ed7

use head_ref

183ffc0

align folders

b1a4fec

adjust head-sha

304751f

put ARG BINARY_TO_ADD to the right context

3510158

jcgruenhage approved these changes Jan 31, 2025

Dockerfiles/go-app-common.Dockerfile (outdated, resolved)

.github/workflows/codeql.yml (outdated, resolved)

fedordikarev added 3 commits January 31, 2025 13:45

Dockefile: move binary to /usr/local/bin and add entrypoint

1ecea86

undo trigger build

1ccdd38

push workflow on main branch

2988541

fedordikarev merged commit e360861 into main Jan 31, 2025
9 checks passed

fedordikarev deleted the feat/try_multiarch_build branch January 31, 2025 13:06

jcgruenhage Jan 27, 2025

fedordikarev Jan 28, 2025

jcgruenhage Jan 28, 2025

fedordikarev Jan 29, 2025
