Bug: sycl-ls fails on new Battlemage B580 #788

Closed
HeyItsBATMAN opened this issue Dec 19, 2024 · 3 comments

Comments

@HeyItsBATMAN

System: Arch Linux
GPU: Intel Arc B580
Kernel: 6.12.4
Basekit versions: 2024.1 and 2025.0 (same issue)

After installing the basekit and sourcing setvars.sh, any attempt to run sycl-ls fails with the error:
'sycl-ls' terminated by signal SIGBUS (Misaligned address error)

Running clinfo detects the GPU correctly:

$ sudo clinfo -l
Platform #0: Intel(R) OpenCL Graphics
 `-- Device #0: Intel(R) Graphics [0xe20b]

On another note: I've tested various IPEX and SYCL guides, both from llama.cpp and from the intel-analytics team (https://github.com/intel-analytics/ipex-llm), and those applications fail with the same error.

If any more info is needed, I'm happy to test and experiment.

@JablonskiMateusz
Contributor

Hi @HeyItsBATMAN
Please try to capture a call stack for the crash.

@HeyItsBATMAN
Author

Okay, so I've installed debugging symbols and ran gdb with the settings below.
I also ran with SYCL_PI_TRACE=2; that trace is attached below the gdb backtrace.

gdbinit

set debuginfod enabled on
add-auto-load-safe-path /opt/intel/oneapi/compiler/2024.1/lib/libsycl.so.7.1.0-gdb.py

gdb backtrace

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGBUS, Bus error.
0x00007fffd28f2061 in memcpy (__dest=<optimized out>, __src=<optimized out>, __len=<optimized out>, __dest=<optimized out>, __src=<optimized out>, __len=<optimized out>) at /usr/include/bits/string_fortified.h:29
29	 return __builtin___memcpy_chk (__dest, __src, __len,
#0  0x00007fffd28f2061 in memcpy (__dest=<optimized out>, __src=<optimized out>, __len=<optimized out>, __dest=<optimized out>, __src=<optimized out>, __len=<optimized out>) at /usr/include/bits/string_fortified.h:29
#1  memcpy_s (dst=<optimized out>, destSize=<optimized out>, src=<optimized out>, count=<optimized out>, dst=<optimized out>, destSize=<optimized out>, src=<optimized out>, count=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/helpers/string.h:71
#2  NEO::BindlessHeapsHelper::BindlessHeapsHelper (this=0x555557a39d20, rootDevice=<optimized out>, isMultiOsContextCapable=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/helpers/bindless_heaps_helper.cpp:46
#3  std::make_unique<NEO::BindlessHeapsHelper, NEO::Device*&, bool&> () at /usr/include/c++/14.2.1/bits/unique_ptr.h:1076
#4  NEO::RootDeviceEnvironment::createBindlessHeapsHelper (this=0x5555579d3970, rootDevice=<optimized out>, availableDevices=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/execution_environment/root_device_environment.cpp:144
#5  NEO::RootDevice::createBindlessHeapsHelper (this=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/root_device.cpp:58
#6  NEO::RootDevice::createBindlessHeapsHelper (this=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/root_device.cpp:52
#7  0x00007fffd281c1a3 in NEO::Device::initDeviceFully (this=this@entry=0x5555579da5d0) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/device.cpp:251
#8  0x00007fffd281d5dd in NEO::Device::initDeviceFully (this=0x5555579da5d0) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/device.cpp:510
#9  0x00007fffd28463be in NEO::Device::createDeviceImpl (this=0x5555579da5d0) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/device.cpp:155
#10 NEO::Device::createDeviceImpl (this=0x5555579da5d0) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/device.cpp:128
#11 NEO::Device::createDeviceInternals<NEO::RootDevice> (device=0x5555579da5d0) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/device.h:227
#12 NEO::Device::create<NEO::RootDevice, NEO::ExecutionEnvironment*, unsigned int&> () at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/device/device.h:97
#13 operator() (__closure=0x0, executionEnvironment=..., rootDeviceIndex=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/os_interface/device_factory.cpp:275
#14 _FUN () at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/os_interface/device_factory.cpp:276
#15 0x00007fffd251cd4e in NEO::DeviceFactory::createDevices (executionEnvironment=...) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/shared/source/os_interface/device_factory.cpp:265
#16 L0::DriverImp::initialize (this=<optimized out>, result=<optimized out>) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/level_zero/core/source/driver/driver.cpp:77
#17 0x00007fffd2513bd1 in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<L0::DriverImp::driverInit(unsigned int)::{lambda()#1}>(std::once_flag&, L0::DriverImp::driverInit(unsigned int)::{lambda()#1}&&)::{lambda()#1}>(L0::DriverImp::driverInit(unsigned int)::{lambda()#1}&)::{lambda()#1}::_FUN() [clone .lto_priv.0] () at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/level_zero/core/source/driver/driver.cpp:106
#18 0x00007ffff76a88fb in __pthread_once_slow (once_control=0x7fffd2ade04c <_ZN2L0L9driverImpE.lto_priv.0+12>, init_routine=0x7ffff78e02e0 <std::__once_proxy()>) at pthread_once.c:116
#19 0x00007ffff76a8979 in ___pthread_once (once_control=<optimized out>, init_routine=<optimized out>) at pthread_once.c:143
#20 0x00007fffd24cf49c in __gthread_once (__once=0x7fffd2ade04c <_ZN2L0L9driverImpE.lto_priv.0+12>, __func=<optimized out>) at /usr/include/c++/14.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:713
#21 std::call_once<L0::DriverImp::driverInit(ze_init_flags_t)::<lambda()> > (__once=..., __f=...) at /usr/include/c++/14.2.1/mutex:916
#22 L0::DriverImp::driverInit (this=0x7fffd2ade040 <_ZN2L0L9driverImpE.lto_priv.0>, flags=1) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/level_zero/core/source/driver/driver.cpp:104
#23 L0::init (flags=1) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/level_zero/core/source/driver/driver.cpp:157
#24 L0::zeInit (flags=1) at /usr/src/debug/intel-compute-runtime/compute-runtime-24.45.31740.9/level_zero/api/core/ze_driver_api_entrypoints.h:17
#25 0x00007ffff713f95f in loader::context_t::init_driver (this=this@entry=0x555555587c20, driver=..., flags=flags@entry=1, desc=desc@entry=0x0, globalInitStored=globalInitStored@entry=0x555555589e98, sysmanGlobalInitStored=sysmanGlobalInitStored@entry=0x55555558a640, sysmanOnly=false) at /usr/src/debug/level-zero/level-zero-1.18.5/source/loader/ze_loader.cpp:282
#26 0x00007ffff71423e2 in loader::context_t::check_drivers (this=<optimized out>, flags=<optimized out>, desc=<optimized out>, globalInitStored=<optimized out>, sysmanGlobalInitStored=<optimized out>, requireDdiReinit=<optimized out>, sysmanOnly=<optimized out>) at /usr/src/debug/level-zero/level-zero-1.18.5/source/loader/ze_loader.cpp:167
#27 0x00007ffff715c662 in zelLoaderDriverCheck (flags=<optimized out>, desc=<optimized out>, globalInitStored=<optimized out>, sysmanGlobalInitStored=<optimized out>, requireDdiReinit=<optimized out>, sysmanOnly=<optimized out>) at /usr/src/debug/level-zero/level-zero-1.18.5/source/loader/ze_loader_api.cpp:38
#28 0x00007ffff713489c in ze_lib::context_t::Init (this=0x555555589e00, flags=1, sysmanOnly=<optimized out>, desc=0x0) at /usr/src/debug/level-zero/level-zero-1.18.5/source/lib/ze_lib.cpp:117
#29 0x00007ffff7134978 in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<zeInit::{lambda()#1}>(std::once_flag&, zeInit::{lambda()#1}&&)::{lambda()#1}>(zeInit::{lambda()#1}&)::{lambda()#1}::_FUN() () at /usr/src/debug/level-zero/level-zero-1.18.5/source/lib/ze_libapi.cpp:48
#30 0x00007ffff76a88fb in __pthread_once_slow (once_control=0x555555589e00, init_routine=0x7ffff78e02e0 <std::__once_proxy()>) at pthread_once.c:116
#31 0x00007ffff76a8979 in ___pthread_once (once_control=<optimized out>, init_routine=<optimized out>) at pthread_once.c:143
#32 0x00007ffff712a5a4 in __gthread_once (__once=0x555555589e00, __func=<optimized out>) at /usr/include/c++/14.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:713
#33 std::call_once<zeInit(ze_init_flags_t)::<lambda()> > (__once=..., __f=...) at /usr/include/c++/14.2.1/mutex:916
#34 zeInit (flags=<optimized out>) at /usr/src/debug/level-zero/level-zero-1.18.5/source/lib/ze_libapi.cpp:48
#35 0x00007ffff591c041 in std::__1::__function::__func<ur_adapter_handle_t_::ur_adapter_handle_t_()::$_0, std::__1::allocator<ur_adapter_handle_t_::ur_adapter_handle_t_()::$_0>, void (Result<std::__1::vector<std::__1::unique_ptr<ur_platform_handle_t_, std::__1::default_delete<ur_platform_handle_t_> >, std::__1::allocator<std::__1::unique_ptr<ur_platform_handle_t_, std::__1::default_delete<ur_platform_handle_t_> > > > >&)>::operator()(Result<std::__1::vector<std::__1::unique_ptr<ur_platform_handle_t_, std::__1::default_delete<ur_platform_handle_t_> >, std::__1::allocator<std::__1::unique_ptr<ur_platform_handle_t_, std::__1::default_delete<ur_platform_handle_t_> > > > >&) () from /opt/intel/oneapi/compiler/2024.1/lib/libpi_level_zero.so
#36 0x00007ffff598ae55 in std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) () from /opt/intel/oneapi/compiler/2024.1/lib/libpi_level_zero.so
#37 0x00007ffff595562b in urPlatformGet () from /opt/intel/oneapi/compiler/2024.1/lib/libpi_level_zero.so
#38 0x00007ffff596ad27 in piPlatformsGet () from /opt/intel/oneapi/compiler/2024.1/lib/libpi_level_zero.so
#39 0x00007ffff7e8204a in _pi_result sycl::_V1::detail::plugin::call_nocheck<(sycl::_V1::detail::PiApiKind)0, int, decltype(nullptr), unsigned int*>(int, decltype(nullptr), unsigned int*) const () from /opt/intel/oneapi/compiler/2024.1/lib/libsycl.so.7
#40 0x00007ffff7e7ca6b in sycl::_V1::detail::platform_impl::get_platforms()::$_0::operator()(std::shared_ptr<sycl::_V1::detail::plugin>&) const () from /opt/intel/oneapi/compiler/2024.1/lib/libsycl.so.7
#41 0x00007ffff7e7bdc1 in sycl::_V1::detail::platform_impl::get_platforms() () from /opt/intel/oneapi/compiler/2024.1/lib/libsycl.so.7
#42 0x00007ffff7f7ff69 in sycl::_V1::platform::get_platforms() () from /opt/intel/oneapi/compiler/2024.1/lib/libsycl.so.7
#43 0x000055555555b334 in main ()
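
For anyone who needs to collect a similar backtrace without an interactive session, here is a minimal sketch that drives gdb in batch mode from Python. The helper names `gdb_backtrace_command` and `capture_backtrace` are made up for this example and are not part of any Intel tooling. It assumes gdb is on PATH and that the oneAPI environment (setvars.sh) has already been sourced in the calling shell.

```python
import subprocess


def gdb_backtrace_command(program, *args):
    """Build a gdb command line that runs `program` and prints a backtrace.

    Uses only standard gdb options: -batch (exit when done), -ex (run a gdb
    command) and --args (treat everything after it as the program and its
    arguments).
    """
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", program, *args]


def capture_backtrace(program, *args, env=None):
    """Run the program under gdb and return gdb's combined text output.

    `env=None` inherits the current environment, so variables exported by
    setvars.sh are visible to the debugged program.
    """
    result = subprocess.run(
        gdb_backtrace_command(program, *args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep gdb messages and program output together
        text=True,
        env=env,
    )
    return result.stdout
```

Calling `capture_backtrace("sycl-ls")` should return text similar to the log above if the crash reproduces. To add the SYCL trace as well, pass a copy of `os.environ` with `SYCL_PI_TRACE` set to `"2"` as `env`.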

sycl trace (as file due to length)
sycl-trace.txt

@HeyItsBATMAN
Author

For anyone encountering the same issue: enable Resizable BAR (ReBAR) in your BIOS.
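
One way to check from Linux whether the BIOS setting took effect is to look at the sizes of the GPU's PCI memory regions. The sketch below parses the text of a sysfs `resource` file (`/sys/bus/pci/devices/<address>/resource`), where each line holds a start address, end address and flags in hex. The function names are invented for this example, and the idea that a small largest region (commonly 256 MiB) points to ReBAR being off is a general rule of thumb, not something confirmed in this thread.

```python
MIB = 1024 * 1024


def parse_bar_sizes(resource_text):
    """Return the size in bytes of each used region in a sysfs PCI `resource` file.

    Each line has three hex fields: start, end, flags. Unused regions are
    reported as all zeros and are skipped.
    """
    sizes = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore anything that is not a start/end/flags triple
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused region
        sizes.append(end - start + 1)
    return sizes


def largest_bar_mib(resource_text):
    """Return the largest region size in MiB, or 0 if no region is in use."""
    sizes = parse_bar_sizes(resource_text)
    return max(sizes) // MIB if sizes else 0
```

To use it, read the `resource` file for the GPU's PCI address (shown by `lspci`) and pass its contents to `largest_bar_mib`. A value far below the card's VRAM size suggests that Resizable BAR is still disabled.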
