Zig 0.14.0 Build: unable to open '': FileNotFound #23120

Closed
timbess opened this issue Mar 6, 2025 · 8 comments · Fixed by #23121 · May be fixed by #23172
Labels
bug: Observed behavior contradicts documented or intended behavior.
regression: It worked in a previous version of Zig, but stopped working.

timbess commented Mar 6, 2025

Zig Version

0.14.0

Steps to Reproduce and Observed Behavior

From the second build onwards, I hit this issue:

unable to open '': FileNotFound
error: the following build command failed with exit code 1:
/Users/tbess/src/sandbox/.zig-cache/o/0df7f29757d824113e0d80a506c813f3/build /Users/tbess/zig/0.14.0/files/zig /Users/tbess/zig/0.14.0/files/lib /Users/tbess/src/sandbox /Users/tbess/src/sandbox/.zig-cache /Users/tbess/.cache/zig --seed 0xb16b2bb7 -Zfd3003892dbb5e1f

It seems like it's failing to create the build graph at all since I'm unable to even use zig build --help.

Occasionally, if I keep deleting the cache and rerunning the build, I can get it to segfault as well. I just did a dev build of the zig 0.14.0 tag, and this is the backtrace in LLDB:

❯ rm -rf ~/.cache/zig .zig-cache zig-out
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xaaaaaaaaaaaaaaaa)
  * frame #0: 0x00000001000c0670 zig`zig.stringEscape__anon_32972(bytes=(ptr = "", len = 12297829382473034410), writer=(context = 0x000000016fdf99d0, writeFn = 0x0000000100124948)) at zig.zig:474:10
    frame #1: 0x0000000100a9d6e4 zig`Build.Cache.Path.format__anon_329283(self=Build.Cache.Path @ 0x000000016fdf94c8, options=fmt.FormatOptions @ 0x00000001028055e0, writer=(context = 0x000000016fdf99d0, writeFn = 0x0000000100124948)) at Path.zig:156:29
    frame #2: 0x000000010076171c zig`fmt.formatType__anon_257873(value=<unavailable>, options=fmt.FormatOptions @ 0x00000001028055e0, writer=(context = 0x000000016fdf99d0, writeFn = 0x0000000100124948), max_depth=3) at fmt.zig:515:32
    frame #3: 0x000000010045b6e8 zig`fmt.format__anon_201647(writer=(context = 0x000000016fdf99d0, writeFn = 0x0000000100124948), args=struct { Build.Cache.Path } @ 0x000000016fdf99a8) at fmt.zig:193:23
    frame #4: 0x00000001002e977c zig`io.Writer.print__anon_176094(self=<unavailable>, args=<unavailable>) at Writer.zig:24:26
    frame #5: 0x0000000100202798 zig`Package.Fetch.JobQueue.createDependenciesSource [inlined] io.GenericWriter(*array_list.ArrayListAligned(u8,null),error{OutOfMemory},(function 'appendWrite')).print at io.zig:312:47
    frame #6: 0x000000010020277c zig`Package.Fetch.JobQueue.createDependenciesSource(jq=0x000000016fdfcfb0, buf=0x000000016fdfd548) at Fetch.zig:205:35
    frame #7: 0x00000001001fed14 zig`main.cmdBuild(gpa=mem.Allocator @ 0x00000001030a6c18, arena=mem.Allocator @ 0x000000016fdfee70, args=[][]const u8 @ 0x000000016fdfaaa8) at main.zig:5275:55
    frame #8: 0x00000001001119f4 zig`main.mainArgs(gpa=mem.Allocator @ 0x00000001030a6c18, arena=mem.Allocator @ 0x000000016fdfee70, args=[][]const u8 @ 0x000000016fdfea50) at main.zig:293:24
    frame #9: 0x000000010010fda4 zig`main.main at main.zig:212:20
    frame #10: 0x000000010010fba4 zig`start.main [inlined] start.callMain at start.zig:656:37
    frame #11: 0x000000010010fb98 zig`start.main [inlined] start.callMainWithArgs at start.zig:616:20
    frame #12: 0x000000010010fb20 zig`start.main(c_argc=2, c_argv=0x000000016fdff348, c_envp=0x000000016fdff360) at start.zig:631:28
    frame #13: 0x000000019df10274 dyld`start + 2840
(lldb)

And here's the zig error trace if I let it run:

❯ ../zig/zig-out/bin/zig build
thread 16098309 panic: Panic while preparing callstack
/Users/tbess/zig/0.14.0/files/lib/std/debug.zig:315:31: 0x104982237 in dumpCurrentStackTrace (zig)
        writeCurrentStackTrace(stderr, debug_info, io.tty.detectConfig(io.getStdErr()), start_addr) catch |err| {
                              ^
/Users/tbess/src/zig/src/crash_report.zig:283:44: 0x10496016f in dumpStackTrace (zig)
                debug.dumpCurrentStackTrace(ct.ret_addr);
                                           ^
/Users/tbess/src/zig/src/crash_report.zig:435:39: 0x1049139c7 in reportStack (zig)
        state.panic_ctx.dumpStackTrace();
                                      ^
/Users/tbess/src/zig/src/crash_report.zig:509:9: 0x10490ba5b in initPanic (zig)
        @call(.auto, func, args);
        ^
/Users/tbess/src/zig/src/crash_report.zig:509:9: 0x104905903 in dispatch (zig)
        @call(.auto, func, args);
        ^
/Users/tbess/src/zig/src/crash_report.zig:340:21: 0x10490578b in preDispatch (zig)
            dispatch(null, .{ .current = .{ .ret_addr = null } }, "Panic while preparing callstack");
                    ^
/Users/tbess/src/zig/src/crash_report.zig:163:28: 0x1049019df in compilerPanic (zig)
    PanicSwitch.preDispatch();
                           ^
/Users/tbess/zig/0.14.0/files/lib/std/debug.zig:75:17: 0x104906bfb in incorrectAlignment (zig)
            call("incorrect alignment", @returnAddress());
                ^
/Users/tbess/src/zig/src/crash_report.zig:213:39: 0x104b275af in handleSegfaultPosix (zig)
        => StackContext{ .exception = @ptrCast(@alignCast(ctx_ptr)) },
                                      ^
???:?:?: 0x19e2c6de3 in ??? (libsystem_platform.dylib)
???:?:?: 0x9098001053996e3 in ??? (???)
/Users/tbess/zig/0.14.0/files/lib/std/Build/Cache/Path.zig:156:29: 0x1053996e3 in format__anon_329283 (zig)
            try stringEscape(p, f, options, writer);
                            ^
/Users/tbess/zig/0.14.0/files/lib/std/fmt.zig:515:32: 0x10505d71b in formatType__anon_257873 (zig)
        return try value.format(actual_fmt, options, writer);
                               ^
/Users/tbess/zig/0.14.0/files/lib/std/fmt.zig:193:23: 0x104d576e7 in format__anon_201647 (zig)
        try formatType(
                      ^
/Users/tbess/zig/0.14.0/files/lib/std/io/Writer.zig:24:26: 0x104be577b in print__anon_176094 (zig)
    return std.fmt.format(self, format, args);
                         ^
/Users/tbess/zig/0.14.0/files/lib/std/io.zig:312:47: 0x104afe797 in createDependenciesSource (zig)
            return @errorCast(self.any().print(format, args));
                                              ^
/Users/tbess/src/zig/src/main.zig:5275:55: 0x104afad13 in cmdBuild (zig)
                try job_queue.createDependenciesSource(&source_buf);
                                                      ^
/Users/tbess/src/zig/src/main.zig:293:24: 0x104a0d9f3 in mainArgs (zig)
        return cmdBuild(gpa, arena, cmd_args);
                       ^
/Users/tbess/src/zig/src/main.zig:212:20: 0x104a0bda3 in main (zig)
    return mainArgs(gpa, arena, args);
                   ^
/Users/tbess/zig/0.14.0/files/lib/std/start.zig:656:37: 0x104a0bba3 in main (zig)
            const result = root.main() catch |err| {
                                    ^
???:?:?: 0x19df10273 in ??? (???)
???:?:?: 0x4f51ffffffffffff in ??? (???)
[1]    69226 abort      ../zig/zig-out/bin/zig build

Expected Behavior

zig build should work after the cache is built.

timbess added the bug label Mar 6, 2025
andrewrk added this to the 0.14.1 milestone Mar 6, 2025
andrewrk added the regression label Mar 6, 2025
Author

timbess commented Mar 6, 2025

Oh, weirdly, with a clean cache I get a different error trace. If I then run it again, I get the error trace above.

❯ ../zig/zig-out/bin/zig build
thread 16108232 panic: attempt to use null value
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:1641:47: 0x104f2fb2f in allocRegs (zig)
                .inst => |inst| inst.toIndex().?,
                                              ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:1753:23: 0x104f3283f in binOpRegister (zig)
    try self.allocRegs(
                      ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:2297:50: 0x104f2d743 in shiftExact (zig)
                    return try self.binOpRegister(mir_tag_register, lhs_bind, rhs_bind, lhs_ty, lhs_ty, maybe_inst);
                                                 ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:2325:48: 0x104f2dac7 in shiftNormal (zig)
                    .shl => try self.shiftExact(.shl_exact, lhs_bind, rhs_bind, lhs_ty, rhs_ty, maybe_inst),
                                               ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:2460:41: 0x104ce9d97 in airBinOp (zig)
            .shl => try self.shiftNormal(tag, lhs_bind, rhs_bind, lhs_ty, rhs_ty, inst),
                                        ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:660:50: 0x104a93f13 in genBody (zig)
            .shl             => try self.airBinOp(inst, .shl),
                                                 ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:524:25: 0x104a90f4b in gen (zig)
        try self.genBody(self.air.getMainBody());
                        ^
/Users/tbess/src/zig/src/arch/aarch64/CodeGen.zig:382:17: 0x10483bc1b in generate (zig)
    function.gen() catch |err| switch (err) {
                ^
/Users/tbess/src/zig/src/codegen.zig:72:51: 0x1045726bf in generateFunction (zig)
            return importBackend(backend).generate(lf, pt, src_loc, func_index, air, liveness, code, debug_output);
                                                  ^
/Users/tbess/src/zig/src/link/MachO/ZigObject.zig:799:33: 0x104574227 in updateFunc (zig)
    try codegen.generateFunction(
                                ^
/Users/tbess/src/zig/src/link/MachO.zig:3071:44: 0x1041b6a9f in updateFunc (zig)
    return self.getZigObject().?.updateFunc(self, pt, func_index, air, liveness);
                                           ^
/Users/tbess/src/zig/src/link.zig:752:82: 0x103e14fef in updateFunc (zig)
                return @as(*tag.Type(), @fieldParentPtr("base", base)).updateFunc(pt, func_index, air, liveness);
                                                                                 ^
/Users/tbess/src/zig/src/Zcu/PerThread.zig:1705:22: 0x1039e4f0b in linkerUpdateFunc (zig)
        lf.updateFunc(pt, func_index, air, liveness) catch |err| switch (err) {
                     ^
/Users/tbess/src/zig/src/link.zig:1599:36: 0x1036a0123 in doTask (zig)
                pt.linkerUpdateFunc(func.func, func.air) catch |err| switch (err) {
                                   ^
/Users/tbess/src/zig/src/link.zig:1405:34: 0x10336da3f in flushTaskQueue (zig)
        for (tasks) |task| doTask(comp, tid, task);
                                 ^
/Users/tbess/zig/0.14.0/files/lib/std/Thread/Pool.zig:182:50: 0x10336dc4b in runFn (zig)
            @call(.auto, func, .{id.?} ++ closure.arguments);
                                                 ^
/Users/tbess/zig/0.14.0/files/lib/std/Thread/Pool.zig:295:32: 0x10332664f in worker (zig)
            run_node.data.runFn(&run_node.data, id);
                               ^
/Users/tbess/zig/0.14.0/files/lib/std/Thread.zig:488:13: 0x1031b9d3f in callFn__anon_182465 (zig)
            @call(.auto, f, args);
            ^
/Users/tbess/zig/0.14.0/files/lib/std/Thread.zig:757:30: 0x10309e9c3 in entryFn (zig)
                return callFn(f, args_ptr.*);
                             ^
???:?:?: 0x19e2902e3 in ??? (libsystem_pthread.dylib)
???:?:?: 0xce3c80019e28b0fb in ??? (???)
[1]    73100 abort      ../zig/zig-out/bin/zig build

Contributor

simeks commented Mar 6, 2025

I encountered the same issue with very similar dependencies (vulkan and zglfw). I managed to work around it by bumping my dependencies to use the new hash format instead.

Member

andrewrk commented Mar 6, 2025

I'm hitting it in https://github.com/allyourcodebase/zlib which only has the one dependency, even with the dependency updated to the new hash format.

this diff fixes the problem:

--- a/lib/std/Build/Step/InstallArtifact.zig
+++ b/lib/std/Build/Step/InstallArtifact.zig
@@ -189,9 +189,9 @@ fn make(step: *Step, options: Step.MakeOptions) !void {
                 const src_dir_path = dir.source.getPath3(b, step);
                 const full_h_prefix = b.getInstallPath(h_dir, dir.dest_rel_path);
 
-                var src_dir = src_dir_path.root_dir.handle.openDir(src_dir_path.sub_path, .{ .iterate = true }) catch |err| {
-                    return step.fail("unable to open source directory '{s}': {s}", .{
-                        src_dir_path.sub_path, @errorName(err),
+                var src_dir = src_dir_path.root_dir.handle.openDir(src_dir_path.subPathOrDot(), .{ .iterate = true }) catch |err| {
+                    return step.fail("unable to open source directory '{}': {s}", .{
+                        src_dir_path, @errorName(err),
                     });
                 };
                 defer src_dir.close();

which looks like a reasonable change to make regardless, but it does not explain why it worked before.
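
The error text in the original report, unable to open '': FileNotFound, is consistent with an empty sub-path reaching an open call, which is the case the switch to subPathOrDot() guards against. The following is a minimal sketch of that idea in Python rather than Zig, assuming a POSIX system where opening an empty path relative to a directory handle fails with ENOENT. The helper names sub_path_or_dot and try_open_dir are hypothetical and are not part of the Zig standard library.

```python
import os


def sub_path_or_dot(sub_path: str) -> str:
    """Hypothetical stand-in for the idea behind subPathOrDot(): map "" to "."."""
    return sub_path if sub_path else "."


def try_open_dir(root_fd: int, sub_path: str) -> str:
    """Open sub_path relative to root_fd and report the outcome as a short string."""
    try:
        fd = os.open(sub_path, os.O_RDONLY, dir_fd=root_fd)
    except FileNotFoundError:
        # POSIX reports ENOENT for an empty pathname.
        return "FileNotFound"
    os.close(fd)
    return "ok"


# Use the current directory as the "root_dir" handle for the demonstration.
root_fd = os.open(".", os.O_RDONLY)
try:
    raw_result = try_open_dir(root_fd, "")
    fixed_result = try_open_dir(root_fd, sub_path_or_dot(""))
finally:
    os.close(root_fd)

print(raw_result, fixed_result)
```

This only illustrates why an empty sub-path is a problem for an open call. It does not explain why the same code path worked in earlier versions.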

Author

timbess commented Mar 6, 2025

> I encountered the same issue with very similar dependencies (vulkan and zglfw). I managed to work around it by bumping my dependencies to use the new hash format instead.

That's interesting. I made that change, and now it only happens in roughly 1 out of 5 builds rather than every time.

Member

andrewrk commented Mar 7, 2025

43f73af looks relevant but that was a long time ago

Member

andrewrk commented Mar 7, 2025

I can reproduce the problem with 6b6c1b1 which is before the new hash change.

Contributor

simeks commented Mar 7, 2025

Oh, interesting. I realized that for zglfw I pointed to a local copy (via .path) while experimenting, so that might also have been what fixed it. If it helps, I know I had no issues on 6fe1993 before updating to v0.14.0.

Author

timbess commented Mar 9, 2025

So I think this fixed a related issue, but I'm still seeing the original problem when I check out master and run the build on my example repo.

When I use a debug build of zig, I see this issue:

Image

Seems like it's interpolating uninitialized memory when it tries to template in fetch.package_root in Fetch.zig.
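
The LLDB backtrace earlier in this thread is consistent with that reading: Zig debug builds fill undefined memory with 0xaa bytes, and both the faulting address and the slice length in frame #0 match that pattern. A quick check, written in Python purely as arithmetic (the variable names are illustrative only):

```python
# Eight 0xAA bytes interpreted as a 64-bit unsigned integer.
pattern = int.from_bytes(bytes([0xAA]) * 8, "little")

# Values copied from the LLDB backtrace in the original report.
reported_len = 12297829382473034410   # len field of the `bytes` slice in frame #0
reported_addr = 0xAAAAAAAAAAAAAAAA    # faulting address in the stop reason

print(hex(pattern), pattern == reported_len, pattern == reported_addr)
```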

I'm not very familiar with this code, but I sat down to debug it a bit. I think the issue is that my transitive dependency graph includes two copies of the exact same dependency on https://github.com/zig-gamedev/system_sdk/archive/d1e724748d15cfcbf50c45ec7c7019688d45b16a.tar.gz, but one is marked as lazy and the other is not. When the JobQueue hits the eager version first, it properly downloads everything and ignores the lazy version. When it schedules the lazy one first, it exits early and doesn't download it; then, once we hit the eager version, it mutates the original table entry to mark it as eager after the job has already run without downloading anything. That would explain why it seems to happen non-deterministically, since it depends on the ordering of the dependency fetches.

I made a PR with a patch that seems to fix it for me locally: #23172.
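
The ordering hypothesis above can be modeled with a small sketch. This is a toy model in Python, not the actual Fetch.zig logic: the Entry class, the run_queue function, and the "system_sdk" key are all hypothetical, and the model assumes that a duplicate request only flips the lazy flag on the existing table entry without re-running the fetch.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    lazy: bool
    downloaded: bool = False


def run_queue(requests):
    """Process (package, lazy) requests in scheduling order and return the table."""
    table = {}
    for pkg, lazy in requests:
        entry = table.get(pkg)
        if entry is None:
            # First time this package is seen: create the table entry.
            entry = Entry(lazy=lazy)
            table[pkg] = entry
            if not lazy:
                entry.downloaded = True  # eager request: fetch immediately
            # lazy request: exit early without fetching
        elif not lazy:
            # Duplicate seen later as eager: the flag is flipped,
            # but the fetch that already ran is not repeated.
            entry.lazy = False
    return table


eager_first = run_queue([("system_sdk", False), ("system_sdk", True)])
lazy_first = run_queue([("system_sdk", True), ("system_sdk", False)])

print(eager_first["system_sdk"])
print(lazy_first["system_sdk"])
```

In this model the lazy-first ordering ends with an entry that is marked eager but was never downloaded, which is the state in which a later consumer would read a package root that was never filled in.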
