How to produce a performance breakdown like the one in Fig. 3? #10

Open
chenhongyu2048 opened this issue Aug 16, 2023 · 2 comments

Comments

@chenhongyu2048

While reading this article, I noticed that the time breakdown in Figure 3 is very precise. Which tool did you use to take the time measurements?

@yzhaiustc

yzhaiustc commented Aug 16, 2023

I measured the time manually by placing a tic and a toc around each segment, which gives T1, T2, ..., Tn.
T_total = T1 + T2 + ... + Tn, so the percentage for each segment is Ti / T_total * 100%.
This measurement is valid because BERT inference is a single-stream computational pipeline: the segments run one after another, so their times add up to the total.
Alternatively, you could use NVIDIA Nsight Systems to measure elapsed time, and I would expect you to see a similar result.
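
A minimal sketch of the tic/toc approach, in plain Python rather than the actual measurement script used for the figure. The segment names and the `fake_*` functions are hypothetical stand-ins for the real pipeline stages. The optional `sync` argument is also an assumption: GPU kernel launches are asynchronous with respect to the host, so on a GPU you would pass a device synchronization call there (for example `torch.cuda.synchronize` if the code is PyTorch) so that each toc is taken only after the segment has finished.

```python
import time


def time_segments(segments, sync=None):
    """Run each (name, fn) pair in order and return elapsed seconds per name.

    `sync` is an optional callable invoked before each tic and each toc.
    On a GPU it would be a device synchronization; on CPU it can stay None.
    """
    elapsed = {}
    for name, fn in segments:
        if sync is not None:
            sync()
        tic = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        toc = time.perf_counter()
        # Accumulate so a segment that appears several times is summed.
        elapsed[name] = elapsed.get(name, 0.0) + (toc - tic)
    return elapsed


def breakdown(elapsed):
    """Convert per-segment times T_i into percentages T_i / T_total * 100."""
    total = sum(elapsed.values())
    return {name: 100.0 * t / total for name, t in elapsed.items()}


# Hypothetical stand-ins for pipeline segments (not real BERT layers).
def fake_attention():
    sum(i * i for i in range(200_000))


def fake_feed_forward():
    sum(i * i for i in range(100_000))


elapsed = time_segments([
    ("attention", fake_attention),
    ("feed_forward", fake_feed_forward),
])
percentages = breakdown(elapsed)
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```

Averaging over several warm-up and timed iterations would give steadier numbers than a single pass.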

@chenhongyu2048
Author

chenhongyu2048 commented Aug 17, 2023

Thank you for your reply. By "single-stream computational pipeline", do you mean that the time spent loading model weights from HBM into the cache is counted as part of the computation time? Or is the loading time overlapped with computation?
