Releases: JuliaGPU/AMDGPU.jl

v0.4.7

06 Feb 16:41
cfc1603

AMDGPU v0.4.7

Diff since v0.4.6

Closed issues:

  • Segfault on ROCm 5.0 (#239)

Merged pull requests:

  • Add more MIOpen functionality (#370) (@pxl-th)
  • Store HSA.Signal instead of Ref{HSA.Signal} in ROCSignal (#375) (@pxl-th)

v0.4.6

01 Feb 16:59
408fab2

AMDGPU v0.4.6

Diff since v0.4.5

Closed issues:

  • Implement occupancy API (#271)
  • getinfo should determine the Ref output container automatically (#273)

Merged pull requests:

v0.4.5

24 Jan 02:51

AMDGPU v0.4.5

Diff since v0.4.4

Closed issues:

  • Mem.alloc: Allow using hipMalloc to service allocations (#286)
  • rocBLAS GEMM ignores @view (#319)
  • sincospi intrinsic is broken (#334)
  • #jps/dev segfaults on MI250x (#340)
  • method ambiguity in rand! (#343)
  • Add function or macro for AMDGPU.jl equivalent to CUDA.CuDynamicSharedArray (and CUDA.CuStaticSharedArray) (#347)
  • Free KernelState in finalizer (#352)
  • --check-bounds=no is broken on Julia 1.9.0-beta3 (#354)

Merged pull requests:

v0.4.4

26 Oct 15:29
f92a5f5

AMDGPU v0.4.4

Diff since v0.4.3

Closed issues:

  • Repetitive AMDGPU.ones calls crash runtime (#299)
  • Add AMDGPU.jl equivalent to CUDA.CuDynamicSharedArray (and CUDA.CuStaticSharedArray) (#304)
  • Segfault with basic kernel from AMDGPU.jl doc on LUMI (#308)
  • ROC kernel faulting upon having AMDGPU and CUDA loaded (#312)
  • AMDGPU.rand failing to create a ROCArray (#315)

Merged pull requests:

v0.4.3

03 Oct 19:25
696f840

AMDGPU v0.4.3

Diff since v0.4.2

Closed issues:

  • Queue selection test fail (#274)

Merged pull requests:

v0.4.2

08 Sep 16:22
287712e

AMDGPU v0.4.2

Diff since v0.4.1

Closed issues:

  • build failure on Julia 1.8.1 (#278)

Merged pull requests:

v0.4.1

01 Aug 16:23
5f723f8

AMDGPU v0.4.1

Diff since v0.4.0

Closed issues:

  • Add option to disable automatic mark/wait of specific arrays (#126)
  • Limit multi-dimensional groupsize properly (#150)
  • Optimize kernarg allocations in kernel construction (#247)
  • Add priority kwarg to ROCQueue ctor (#256)

Merged pull requests:

v0.4.0

24 Jul 00:35
e00143e

AMDGPU v0.4.0

Diff since v0.3.7

Closed issues:

  • Use atomics/locking in hostcall (#4)
  • Bump Setfield compat (#242)

Merged pull requests:

v0.3.7

11 May 17:56
d29e3d4

AMDGPU v0.3.7

Diff since v0.3.6

Closed issues:

  • Move standard warnings of using AMDGPU on system without AMD GPU to @debug? (#199)

Merged pull requests:

v0.3.6

09 May 16:07
f37a908

AMDGPU v0.3.6

Diff since v0.3.5

Merged pull requests: