-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
run PHD tests in parallel by default #882
base: master
Are you sure you want to change the base?
Conversation
6615594
to
f95fa59
Compare
f95fa59
to
e0d4da0
Compare
after being mostly perplexed on Friday afternoon: there are a few wrinkles that show up with this form of concurrency in CI that i wasn't seeing locally. the more confusing one is now, it's not great to download artifacts N times, or extract them potentially N times, but it's weird that the extra rounds of extracting |
if phd-runner decides to run with parallelism > 1, the tmpdir and artifact directories, which phd-runner will write to from each framework, are rewritten to be unique per-framework. this means the directory layout changes based on available parallelism, and makes the `phd-run-with-args.sh` script somewhat more gross, and generally is pretty unfortunate. at this point it's just to demonstrate that something is possible, but this really deserves the requisite surgery to share one log dir and one artifact dir across the Frameworks.
f09ce77
to
589a5ea
Compare
589a5ea leaves this in an "it.... works, i guess" place - instead of doing the requisite surgery for especially since i'm doing this only if the inferred parallelism is >1 it's pretty gross. since at this point i only intend to show that we could do something if we were gonna sweat the time, i'm moving this back to draft. in the future i'd love to do the right kind of surgery to just share state between runners here! |
the run for |
this adds a
--parallelism
flag tophd-runner
(and by extension,cargo xtask phd run
), which when set tellsphd-runner
what the upper bound is for how many tests to run concurrently.if
--parallelism
is not passed,phd-runner
will try to guess a level of parallelism that won't cause tests to fail. this involves a few pretty arbitrary choices that i think are reasonable generally, definitely fine for the environments i know most about (my workstation and lab CI), but if they're wrong will be kind of annoying.MAX_VMS_GUESS
assumes that no more than three VMs are running in a test at the same time. i think the actual number here is two. the heuristic also assumes it's safe to use up to a quarter of physical memory if the reservoir is not configured. if these are wrong the worst case is we'll try running too many test VMs concurrently and either have slower tests (reclaiming ARC memory, paging stuff out, ...) or outright fail to allocate for a VM and fail the test for that reason.and of course, if PHD is run with
--omicron-build
and there's no reservoir, that'll fail regardless of parallelism. so really, please set the reservoir to some non-zero size! but havingphd-runner
do that implies having topfexec cargo xtask phd run
and that seems in poor form. so i implore the i'm-sure-studious-reader-of-test-logs to set a reservoir size for tests.on my workstation the PHD tests run with the Alpine image in about 320 seconds. out of the box this infers a max parallelism of 5, wherein the tests take 71 seconds. or with a manual
--parallelism 8
, 42 seconds. seems good! i've run with--parallelism 8
in a loop for a bit and haven't seen anything untoward.