AMDGPU v0.4.3
Closed issues:
- Queue selection test fail (#274)
Merged pull requests:
- Add device quirks from CUDA.jl, enhance @rocprintf (#269) (@jpsamaroo)
- Use an optimized norm function for ROCBLASArray (#282) (@amontoison)
- Add rocBLAS_jll and rocSPARSE_jll deps (#284) (@jpsamaroo)
- active_kernels: Use WeakKeyDict (#285) (@jpsamaroo)
- CI: Add gfx90a to more jobs (#289) (@jpsamaroo)
- build: Remove build step, run at toplevel (#290) (@jpsamaroo)
- Mapreducedim support for AnyROCArray (#291) (@matinraayai)
- Parallelize tests (#293) (@jpsamaroo)
- Fix precompilation (#294) (@pxl-th)
- Do not rethrow EOF (#296) (@pxl-th)
- Use correct queue for kernels (#297) (@pxl-th)
- Implement kernel hashing system (#302) (@jpsamaroo)