While reading this article, I noticed that the time breakdown in Figure 3 is very precise. What tool did you use to take the time measurements?
I measured the time manually by placing a tic and toc around each segment, which gave T1, T2, ..., Tn.
The total is T_total = T1 + T2 + ... + Tn, so the percentage for each segment can be computed as, for example, T1 / T_total * 100%.
This approach is valid because BERT inference is a single-stream computational pipeline, so the segments run back to back.
Alternatively, you could use Nsight Systems to measure the elapsed time, and I would expect you to see a similar result.
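For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of the tic/toc approach in Python. It is not the script used for Figure 3: the segment names and the `time.sleep` stand-ins are hypothetical placeholders, and `time_segments` and `percentages` are helper names made up for this example. If the segments launch GPU work asynchronously, each one would also need a device synchronization before the toc is read, which this CPU-only sketch leaves out.

```python
import time


def time_segments(segments):
    """Run each (name, fn) pair in order and return elapsed seconds per segment."""
    elapsed = {}
    for name, fn in segments:
        tic = time.perf_counter()   # tic: start of this segment
        fn()
        toc = time.perf_counter()   # toc: end of this segment
        elapsed[name] = toc - tic
    return elapsed


def percentages(elapsed):
    """Convert per-segment times T_i into T_i / T_total * 100."""
    total = sum(elapsed.values())
    return {name: t / total * 100.0 for name, t in elapsed.items()}


# Hypothetical stand-ins for pipeline segments (assumption, not real BERT layers).
segments = [
    ("embedding", lambda: time.sleep(0.01)),
    ("attention", lambda: time.sleep(0.03)),
    ("feed_forward", lambda: time.sleep(0.02)),
]

elapsed = time_segments(segments)
for name, pct in percentages(elapsed).items():
    print(f"{name}: {pct:.1f}%")
```

Because the segments are timed one after another, the percentages sum to 100% by construction. That only reflects the real wall-clock split when the segments do not overlap in time.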
Thank you for your reply. By "single-stream computational pipeline", do you mean that the time spent loading model weights from HBM into the cache is counted as part of the computation time? Or is the loading time overlapped with computation?