Modify p4studio interactive to use limited parallel jobs #78
Conversation
Signed-off-by: Andy Fingerhut <andy_fingerhut@alum.wustl.edu>
…active' Signed-off-by: Andy Fingerhut <andy_fingerhut@alum.wustl.edu>
I have tested this, and right at the beginning I do see the 2 lines of output from the function that calculates the number of parallel jobs to use, so these changes are tested and working.
```diff
@@ -166,7 +166,8 @@ def create_default_profile() -> Profile:

 def build_sde(context: Context, profile: Profile) -> None:
-    plan = ProfileExecutionPlan(profile, None, os.cpu_count())
+    jobs = calculate_jobs_from_available_cpus_and_memory()
+    plan = ProfileExecutionPlan(profile, None, jobs)
```
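For readers without the sources handy, the helper imported from `profile_command.py` presumably does something along these lines. This is only a hedged sketch of the idea, not the actual implementation: the function body, the keyword parameters, and the 4 GB-per-job threshold are all assumptions.

```python
import os

def calculate_jobs_from_available_cpus_and_memory(cpus=None,
                                                  available_mem_gb=None,
                                                  gb_per_job=4.0):
    """Sketch of the likely logic: cap the number of parallel jobs by
    both the CPU count and the available memory (assumed ~4 GB/job)."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    if available_mem_gb is None:
        # Linux-specific fallback: read MemAvailable (in kB) from /proc/meminfo.
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    available_mem_gb = int(line.split()[1]) / (1024 * 1024)
                    break
        if available_mem_gb is None:
            available_mem_gb = cpus * gb_per_job  # no memory info: no memory cap
    # At least one job; otherwise the smaller of the two limits.
    return max(1, min(cpus, int(available_mem_gb // gb_per_job)))
```

With 8 CPUs and 16 GB available this yields 4 jobs (memory-bound); with 2 CPUs and 64 GB it yields 2 (CPU-bound).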
Probably should be an argument from the previous calculation?
I am sorry, but I do not understand the comment. The new first line of `build_sde` calculates and stores a value in the variable `jobs`. The second line calls `ProfileExecutionPlan`, with the 3rd parameter being `jobs` instead of `os.cpu_count()`.

What are you recommending I change?
> `jobs = calculate_jobs_from_available_cpus_and_memory()`

This calculation is already done here, no? Ideally, we should only do this once and then reuse the result in other contexts.
In case it is not obvious: the code in this file runs when you use the command `p4studio interactive`, which is what the `install.sh` bash script invokes. If you do this, the other Python source file that defines `calculate_jobs_from_available_cpus_and_memory` has none of its code executed at all, unless it is explicitly called from somewhere in this file. That is what this change adds: an explicit call from this file into the function defined in the other file, which calculates a reasonable number of parallel jobs to use.

This is not a repeated call of that function in this code path. It is the only one.
What is the common ancestor between the profile and the interactive command? I was thinking that this ancestor should call the function, then pass the result to either one.
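The pattern being suggested here could be sketched as follows. Everything in this snippet is hypothetical: none of these names besides `calculate_jobs_from_available_cpus_and_memory` (stubbed out below) come from the p4studio sources.

```python
# Hypothetical sketch of "compute once in the common ancestor, pass down".

def calculate_jobs_from_available_cpus_and_memory():
    # Stub standing in for the real calculation in profile_command.py.
    return 4

def run_profile_command(jobs):
    return f"profile build with {jobs} jobs"

def run_interactive_command(jobs):
    return f"interactive build with {jobs} jobs"

def main(command):
    # The common ancestor: calculate the job count exactly once,
    # then hand the result to whichever command was requested.
    jobs = calculate_jobs_from_available_cpus_and_memory()
    if command == "profile":
        return run_profile_command(jobs)
    return run_interactive_command(jobs)
```

The trade-off discussed below is that in the real code the dispatch is done by `click`, so there is no obvious hand-written ancestor in which to place the call.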
I think the common ancestor is some `click` module Python code that parses the command line arguments and decides, via `click` annotations on Python methods, which function to invoke. If you do `p4studio profile <some args>` on the command line, it invokes a method in the file `profile_command.py`. If you do `p4studio interactive`, it invokes a method in this file being modified.
Note this line early in this file:

`from profile.profile_command import execute_plan, calculate_jobs_from_available_cpus_and_memory`

There is an `execute_plan` function, called from both `profile_command.py` and `interactive_command.py`, that does most of the work of building and installing the software. Whoever wrote this code decided to define it in `profile_command.py` and call it from here, rather than putting `execute_plan` in some common 3rd file and calling it from both places. I'm not adding new code flows between files that didn't already exist here; I'm adding one more cross-file call where there was already a more significant one before.
Hmm, I think we can leave it there then. It seems like checks like these should be set from both workflows, but it may be that the interactive command has fewer restrictions and doesn't even invoke system checks or common code.

One thing: for CI we should try to use the max CPUs, simply to speed things up. If it locks up we can revert that change.
I can create a separate PR for that. It is just a matter of adding a command line option like `--jobs`.

Or I can just toss that one-line change into this PR. Give me a minute.

Yeah, I just noted that because I saw the current CI uses 3 jobs instead of 4.

Feel free to do this in a separate PR. Either works for me.
Signed-off-by: Andy Fingerhut <andy_fingerhut@alum.wustl.edu>
If it is a 4 vCPU, 16 GB RAM instance with no swap, it could end up significantly slower, or failing. At least that is what Vlad saw on an AWS instance he tried this on a while ago.
The reason is that the Tofino back end unified build uses somewhere around 5-6 GB of RAM on a very few individual processes. Vlad suggested the idea of disabling the unified build option: more processes to run, but no individual huge-memory ones that make you want to reduce the number of parallel jobs.
In my experience a unified build outperforms even multiple cores, but we can tweak the "chunk size" of the build to reduce the memory usage.
My suggestion is to have a swap file and then implement #72. Essentially, all we need is to take the min of 15% of RAM and 25% of swap, and add that amount to the total size of RAM before calculating the amount of RAM per CPU. Having 15% of RAM in the swap does not usually impact performance. It also makes sense to have swap anyway, just for the sake of safety.
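As a sketch of the arithmetic being proposed above (the function names and the 4 GB-per-job budget are illustrative assumptions, not taken from #72):

```python
# Hedged sketch of the swap-aware job calculation described above.

GB_PER_JOB = 4.0  # assumed memory budget per parallel compile job

def effective_ram_gb(ram_gb, swap_gb):
    # Credit min(15% of RAM, 25% of swap) as extra usable memory.
    return ram_gb + min(0.15 * ram_gb, 0.25 * swap_gb)

def parallel_jobs(ram_gb, swap_gb, cpus):
    # Never exceed the CPU count, never go below one job.
    return max(1, min(cpus, int(effective_ram_gb(ram_gb, swap_gb) // GB_PER_JOB)))
```

For a 16 GB RAM, 16 GB swap instance, the effective memory would be 16 + min(2.4, 4.0) = 18.4 GB before dividing by the per-job budget.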
I think this PR is ready to merge, if you approve. All of the suggestions for detecting swap space and using it to determine the number of parallel jobs already have a separate issue created to track them. I am not planning to implement that in this PR.
No description provided.