From aed2a8477b0304f38519b62b15ef37e78a0412cd Mon Sep 17 00:00:00 2001
From: Daniel Strano
Date: Fri, 15 Nov 2024 09:40:15 -0500
Subject: [PATCH] PSTRIDEPOW note in README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 8f794e089..347507b38 100644
--- a/README.md
+++ b/README.md
@@ -239,7 +239,7 @@ Specify a device list for `QUnitMulti` the same way you would for `QPager`, with
 `QBdtHybrid` sets a threshold for "hybridization" between "quantum binary decision diagrams" (see Acknowledgements at bottom of document) and state vector simulation, based on how efficiently the "diagram" or "tree" can be "compressed." The environment variable `QRACK_QBDT_HYBRID_THRESHOLD` (typically taking values between 0 and 1) sets a multiplicative fraction for maximally-compressed size of the tree, as fraction of node count vs. equivalent state vector amplitude count, before switching over to state vector simulation. Note that maximum `QBdt` node count is _twice_ the count of amplitudes in the equivalent state vector simulation, so set the variable to 2 or higher to completely suppress switching and recover `QBdt`-only simulation in all cases.

 ## Build and environment options for CPU engines
-`QEngineCPU` and `QHybrid` batch work items in groups of 2^`PSTRIDEPOW` before dispatching them to single CPU threads, potentially greatly reducing waiting on mutexes without signficantly hurting utilization and scheduling. The default for this option can be controlled at build time, by passing `-DPSTRIDEPOW=n` to CMake, with "n" being an integer greater than or equal to 0. This can be overridden at run time by the enviroment variable `QRACK_PSTRIDEPOW=n`. If an environment variable is not defined for this option, the default from CMake build will be used. (The default is meant to work well across different typical consumer systems, but it might benefit from system-tailored tuning via the environment variable.)
+`QEngineCPU` and `QHybrid` batch work items in groups of 2^`PSTRIDEPOW` before dispatching them to single CPU threads, potentially greatly reducing waiting on mutexes without significantly hurting utilization and scheduling. The default for this option can be controlled at build time, by passing `-DPSTRIDEPOW=n` to CMake, with "n" being an integer greater than or equal to 0. This can be overridden at run time by the environment variable `QRACK_PSTRIDEPOW=n`. If an environment variable is not defined for this option, the default from the CMake build will be used. (The default is meant to work well across different typical consumer systems, but it can benefit greatly from system-tailored tuning via the environment variable. **This can be critical to performance of CPU-based simulation**, so you should _always_ tune it when deploying Qrack on a new system.)

 `-DENABLE_QUNIT_CPU_PARALLEL=OFF` disables asynchronous dispatch of `QStabilizerHybrid` and low width `QEngineCPU`/`QHybrid` gates with `std::future`. This option is on by default. Typically, `QUnit` stays safely under maximum thread count limits, but situations arise where async CPU simulation causes `QUnit` to dispatch too many CPU threads for the operating system. This build option can also reduce overall thread usage when Qrack user code operates in a multi-threaded or multi-shell environment. (Linux thread count limits might be smaller than Windows.)