Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow when referencing global pointer to std.StaticBitSet(1_000_000_000) #22426

Open
robinheydon opened this issue Jan 5, 2025 · 1 comment
Labels
bug Observed behavior contradicts documented or intended behavior

Comments

@robinheydon
Copy link

Zig Version

0.14.0-dev.2599+ff4f2753e

Steps to Reproduce and Observed Behavior

with the code:

const std = @import("std");

const BitSet = std.StaticBitSet(1_000_000_000);
var bit_set : *BitSet = undefined;

pub fn main () !void
{
    var gpa = std.heap.GeneralPurposeAllocator (.{}) {};
    const allocator = gpa.allocator ();
    defer std.debug.assert (gpa.deinit () == .ok);

    bit_set = try allocator.create (BitSet);
    defer allocator.destroy (bit_set);

    clear_bitset ();
}

fn clear_bitset () void
{
    bit_set.setRangeValue (.{ .start = 0, .end = bit_set.capacity () - 1}, false);
}

and compiled with

$ zig build-exe test.zig

The executable seg faults.

By the looks of the objdump -d this is attempting to do a __zig_probe_stack with the size 0x7735950 - this obviously fails.

000000000103b270 <test.clear_bitset>:
 103b270:       55                      push   %rbp
 103b271:       48 89 e5                mov    %rsp,%rbp
 103b274:       b8 50 59 73 07          mov    $0x7735950,%eax
 103b279:       e8 c2 60 0b 00          call   10f1340 <__zig_probe_stack>
 103b27e:       48 29 c4                sub    %rax,%rsp
 103b281:       48 8b 04 25 00 50 0f    mov    0x10f5000,%rax
 103b288:       01 
 103b289:       48 89 85 b8 a6 8c f8    mov    %rax,-0x7735948(%rbp)
 103b290:       48 8b 34 25 00 50 0f    mov    0x10f5000,%rsi
 103b297:       01 
 103b298:       48 8d bd c0 a6 8c f8    lea    -0x7735940(%rbp),%rdi
 103b29f:       ba 40 59 73 07          mov    $0x7735940,%edx
 103b2a4:       e8 d7 60 0b 00          call   10f1380 <memcpy>
 103b2a9:       48 8b bd b8 a6 8c f8    mov    -0x7735948(%rbp),%rdi
 103b2b0:       48 be 70 05 00 01 00    movabs $0x1000570,%rsi
 103b2b7:       00 00 00 
 103b2ba:       31 d2                   xor    %edx,%edx
 103b2bc:       e8 2f 4c 00 00          call   103fef0 <bit_set.ArrayBitSet(usize,1000000000).setRangeValue>
 103b2c1:       48 81 c4 50 59 73 07    add    $0x7735950,%rsp
 103b2c8:       5d                      pop    %rbp
 103b2c9:       c3                      ret    
 103b2ca:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

It also appears to be doing a memcpy from the global bit_set (address 0x10f5000).

As far as I can tell, ArrayBitSet(usize, 1_000_000_000).setRangeValue takes a pointer to the bitset so there should be no need to copy this to the stack before calling this function.

Expected Behavior

The code should just run without a seg fault.

Actually, I expect that no memcpy of the global pointer into the local stack would happen, and the call to setRangeValue would just take the pointer to the global.

@robinheydon robinheydon added the bug Observed behavior contradicts documented or intended behavior label Jan 5, 2025
@robinheydon
Copy link
Author

Ok, after a little bit of investigation, I think I've worked out what is going on.

The following code works as expected. It only references the bit_set global through its pointer.

const std = @import("std");

const ArrayBitSet = struct {
    const num_data = 1_000_000_000 / 32;
    data: [num_data]usize,

    const Self = @This();

    pub fn capacity(_: Self) usize {
        return num_data;
    }

    pub fn clear(self: *Self, value: usize) void {
        const len = self.capacity();
        for (0..len) |i| {
            self.data[i] = value;
        }
    }
};

var bit_set: *ArrayBitSet = undefined;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer std.debug.assert(gpa.deinit() == .ok);

    bit_set = try allocator.create(ArrayBitSet);
    defer allocator.destroy(bit_set);

    std.debug.print("{}\n", .{bit_set.capacity()});

    clear_bitset();
}

fn clear_bitset() void {
    const len = bit_set.capacity();
    _ = len;
    bit_set.clear(0);
}

If just just make the capacity function inline, it fails.

    pub inline fn capacity(_: Self) usize {
        return num_data;
    }

This is despite the obvious fact that the Self parameter is not used, and it is only referencing the global defined in the ArrayBitSet function return type. This is illustrated by looking at the generated code for the non-inline capacity function.

000000000103b410 <test.ArrayBitSet.capacity>:
 103b410:	55                   	push   %rbp
 103b411:	48 89 e5             	mov    %rsp,%rbp
 103b414:	b8 50 d6 dc 01       	mov    $0x1dcd650,%eax
 103b419:	5d                   	pop    %rbp
 103b41a:	c3                   	ret    

However, when the inline function is 'called', a complete copy of the bit_set global variable is created on the function call stack and then the function is called.

This is therefore easy to compare the call sites for the two instances of capacity.

The non-inlined call-site.

 103b5f4:	48 83 ec 10          	sub    $0x10,%rsp
 103b5f8:	48 8b 3c 25 00 40 0f 	mov    0x10f4000,%rdi
 103b5ff:	01 
 103b600:	e8 0b fe ff ff       	call   103b410 <test.ArrayBitSet.capacity>
 103b605:	48 89 45 f8          	mov    %rax,-0x8(%rbp)

And the inlined call-site.

 103b5e4:	b8 90 b2 e6 0e       	mov    $0xee6b290,%eax
 103b5e9:	e8 d2 5b 0b 00       	call   10f11c0 <__zig_probe_stack>
 103b5ee:	48 29 c4             	sub    %rax,%rsp
 103b5f1:	48 8b 34 25 00 40 0f 	mov    0x10f4000,%rsi
 103b5f8:	01 
 103b5f9:	48 8d bd 78 4d 19 f1 	lea    -0xee6b288(%rbp),%rdi
 103b600:	ba 80 b2 e6 0e       	mov    $0xee6b280,%edx
 103b605:	e8 f6 5b 0b 00       	call   10f1200 <memcpy>
 103b60a:	48 c7 45 f8 50 d6 dc 	movq   $0x1dcd650,-0x8(%rbp)
 103b611:	01 

Only the last instruction at 103b60a is actually relevant here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior
Projects
None yet
Development

No branches or pull requests

1 participant