store: Only close self-pipe when we're done #672

cgwalters · 2024-10-22T22:46:51Z

This took a crazy long time to debug but after lots of false starts I think this is right. Basically what's going on is we have async tasks that are talking over a pipe() inside our own process.

We must not close the read side of the pipe until the writer is done.

I believe this is dependent on tokio task scheduling order, and it's way easier to reproduce when pinned to a single CPU.

Closes: #657

jeckersb · 2024-10-23T13:17:02Z

Seems to fix it for me on my single-cpu reproducer, although it looks like the CI test failed because now it's getting stuck during pull?

This took a crazy long time to debug but after lots of false starts I think this is right. Basically what's going on is we have async tasks that are talking over a `pipe()` inside our own process. We must not close the read side of the pipe until the writer is done. I believe this is dependent on tokio task scheduling order, and it's way easier to reproduce when pinned to a single CPU. Closes: ostreedev#657 Signed-off-by: Colin Walters <[email protected]>

cgwalters · 2024-10-23T16:51:22Z

Seems to fix it for me on my single-cpu reproducer, although it looks like the CI test failed because now it's getting stuck during pull?

Yeah that's concerning. I have no working theory for why it'd work when run from ostree-ext but not bootc...I updated the PR to more clearly ensure that we're just borrowing the reader which I think makes ownership clearer at least, let's see if it will help.

A hang is definitely worse than a flake here and symptomatic of a larger problem.

cgwalters · 2024-10-23T16:53:32Z

@jeckersb do you want to play with this more and see if you can reproduce with bootc and if so get the stack trace etc?

jeckersb · 2024-10-23T17:12:13Z

@jeckersb do you want to play with this more and see if you can reproduce with bootc and if so get the stack trace etc?

Yeah I'll try with bootc. I let it run all morning with just ostree-ext and didn't have any problems fwiw.

jeckersb · 2024-10-23T18:03:11Z

First stab with gdb, helpful but it also gets confused by the re-exec dance.

Thread 1 ".tmpL16TtM" received signal SIGINT, Interrupt.
warning: BFD: reopening /tmp/.tmpL16TtM: No such file or directory
warning: Can't read data for section '.eh_frame' in file '/tmp/.tmpL16TtM'
0x00007fa93f21fa32 in epoll_wait () from /lib64/libc.so.6
(gdb) t a a bt                                                                                                                                                                                                                                          

Thread 5 (Thread 0x7fa93e0006c0 (LWP 6171) "tokio-runtime-w"):
#0  0x00007fa93f211e4a in read () from /lib64/libc.so.6
#1  0x000055c6855df76f in std::sys::pal::unix::fd::FileDesc::read_buf () at std/src/sys/pal/unix/fd.rs:156
#2  std::sys::pal::unix::fd::{impl#1}::read_buf () at std/src/sys/pal/unix/fd.rs:556
#3  std::io::default_read_to_end<&std::sys::pal::unix::fd::FileDesc> () at std/src/io/mod.rs:477
#4  std::io::Read::read_to_end<&std::sys::pal::unix::fd::FileDesc> () at std/src/io/mod.rs:862
#5  std::sys::pal::unix::fd::FileDesc::read_to_end () at std/src/sys/pal/unix/fd.rs:125
#6  0x000055c6855db92b in std::sys::pal::unix::pipe::AnonPipe::read_to_end () at std/src/sys/pal/unix/pipe.rs:74
#7  std::process::{impl#16}::read_to_end () at std/src/process.rs:488
#8  std::process::Child::wait_with_output () at std/src/process.rs:2263
#9  0x000055c68502bcc5 in containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure#0} () at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/containers-image-proxy-0.7.0/src/imageproxy.rs:321
#10 tokio::runtime::blocking::task::{impl#2}::poll<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}, core::result::Result<std::process::Output, std::io::error::Error>> (self=..., _cx=<error reading variable:
 Cannot access memory at address 0x0>) at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/task.rs:42
#11 tokio::runtime::task::core::{impl#6}::poll::{closure#0}<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedul
e> (ptr=<optimized out>) at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331
#12 tokio::loom::std::unsafe_cell::UnsafeCell<tokio::runtime::task::core::Stage<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>>>::with_mut<tokio::runtime::task
::core::Stage<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>>, core::task::poll::Poll<core::result::Result<std::process::Output, std::io::error::Error>>, tokio
::runtime::task::core::{impl#6}::poll::{closure_env#0}<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>> (
self=<optimized out>, f=...) at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16
#13 tokio::runtime::task::core::Core<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::poll<tokio::runtime
::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedule> (self=<optimized out>, cx=...) at /var/home/jeckersb/.cargo/registry/sr
c/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320
#14 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fa93e57bb40 (LWP 6028) ".tmpL16TtM"):
#0  0x00007fa93f21fa32 in epoll_wait () from /lib64/libc.so.6
#1  0x000055c685417344 in mio::sys::unix::selector::Selector::select (timeout=<optimized out>, self=<optimized out>, events=<optimized out>) at src/sys/unix/mod.rs:8
#2  mio::poll::Poll::poll (self=<optimized out>, events=<optimized out>, timeout=...) at src/poll.rs:435
#3  0x000000003b9aca00 in ?? ()
#4  0x7fffffffffffffff in ?? ()
#5  0x000055c685406090 in tokio::runtime::io::driver::Driver::turn (self=<optimized out>, handle=<optimized out>, max_wait=...) at src/runtime/io/driver.rs:149
#6  0x0000000000000001 in ?? ()
#7  0x00007ffed7a2ecb8 in ?? ()
#8  0x000055c6853ed908 in tokio::runtime::time::wheel::Wheel::next_expiration (self=<optimized out>) at src/runtime/time/wheel/mod.rs:181
#9  0x0000000000000000 in ?? ()

jeckersb · 2024-10-23T18:08:16Z

Also the progress gets stuck, it hangs for maybe 5 seconds, and then some threads exit (skopeo?):

layers already present: 0; layers needed: 65 (832.3 MB)
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/65
 └ Fetching ░░░░░░░░░░░░░░░░░░░░ 0 B/42.69 MiB (0 B/s) ostree chunk c4f42b3a58c66f41e8c48
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/65
 └ Fetching ░░░░░░░░░░░░░░░░░░░░ 416.00 KiB/42.69 MiB (159.59 KiB/s) ostree chunk c4f42b3a58c66f41e8c48
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/65
 └ Fetching ████████████████████ 96.46 MiB/42.69 MiB (114.98 MiB/s) ostree chunk c4f42b3a58c66f41e8c48
[Thread 0x7fcdfd6006c0 (LWP 6687) exited]
[Thread 0x7fcdfd2006c0 (LWP 6729) exited]
[Thread 0x7fcdfce006c0 (LWP 6730) exited]
[Thread 0x7fcdfde006c0 (LWP 6682) exited]

jeckersb · 2024-10-23T18:20:24Z

and then some threads exit (skopeo?):

Actually it's just other tokio worker threads. I C-c'd it after it hung but before they terminated to get their stacks too:

(gdb) t a a bt         

Thread 11 (Thread 0x7fc2dc8006c0 (LWP 8031) "tokio-runtime-w"):
#0  0x00007fc2de9903dd in syscall () from /lib64/libc.so.6
#1  0x000055fb49582bfb in std::sys::pal::unix::futex::futex_wait () at std/src/sys/pal/unix/futex.rs:67
#2  std::sys::sync::condvar::futex::Condvar::wait_optional_timeout () at std/src/sys/sync/condvar/futex.rs:50
#3  std::sys::sync::condvar::futex::Condvar::wait_timeout () at std/src/sys/sync/condvar/futex.rs:38
warning: Can't read data for section '.debug_loc' in file '/tmp/.tmpf4wMNV'
#4  0x000055fb4938f883 in std::sync::condvar::Condvar::wait_timeout<tokio::runtime::blocking::pool::Shared> (self=<optimized out>, guard=..., dur=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sync/condvar.rs:388
#5  tokio::runtime::blocking::pool::Inner::run (self=<optimized out>, worker_thread_id=4) at src/runtime/blocking/pool.rs:527
#6  0x00007fc2d8112600 in ?? ()
#7  0x000055fb6fe000e0 in ?? ()
#8  0x0000000000000000 in ?? ()

Thread 10 (Thread 0x7fc2dcc006c0 (LWP 8030) "tokio-runtime-w"):
#0  0x00007fc2de9903dd in syscall () from /lib64/libc.so.6
#1  0x000055fb49582bfb in std::sys::pal::unix::futex::futex_wait () at std/src/sys/pal/unix/futex.rs:67
#2  std::sys::sync::condvar::futex::Condvar::wait_optional_timeout () at std/src/sys/sync/condvar/futex.rs:50
#3  std::sys::sync::condvar::futex::Condvar::wait_timeout () at std/src/sys/sync/condvar/futex.rs:38
#4  0x000055fb4938f883 in std::sync::condvar::Condvar::wait_timeout<tokio::runtime::blocking::pool::Shared> (self=<optimized out>, guard=..., dur=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sync/condvar.rs:388
#5  tokio::runtime::blocking::pool::Inner::run (self=<optimized out>, worker_thread_id=3) at src/runtime/blocking/pool.rs:527
#6  0x00007fc2d8112600 in ?? ()
#7  0x000055fb6fe000e0 in ?? ()
#8  0x00007fc2de980800 in fchown () from /lib64/libc.so.6
#9  0x000055fb6fe000e0 in ?? ()
#10 0x00007fc2dcbff900 in ?? ()
#11 0x00007fc2c4000cd0 in ?? ()
#12 0x00007fc2df6d8000 in ?? ()
#13 0x000055fb4990a8d8 in ?? ()
#14 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7fc2dd0006c0 (LWP 7989) "tokio-runtime-w"):
#0  0x00007fc2de9903dd in syscall () from /lib64/libc.so.6
#1  0x000055fb49582bfb in std::sys::pal::unix::futex::futex_wait () at std/src/sys/pal/unix/futex.rs:67
#2  std::sys::sync::condvar::futex::Condvar::wait_optional_timeout () at std/src/sys/sync/condvar/futex.rs:50
#3  std::sys::sync::condvar::futex::Condvar::wait_timeout () at std/src/sys/sync/condvar/futex.rs:38
#4  0x000055fb4938f883 in std::sync::condvar::Condvar::wait_timeout<tokio::runtime::blocking::pool::Shared> (self=<optimized out>, guard=..., dur=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sync/condvar.rs:388
#5  tokio::runtime::blocking::pool::Inner::run (self=<optimized out>, worker_thread_id=2) at src/runtime/blocking/pool.rs:527
#6  0x000055fb70049a00 in ?? ()
#7  0x000055fb6fe000e0 in ?? ()
#8  0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7fc2dd4006c0 (LWP 7985) "tokio-runtime-w"):
#0  0x00007fc2de984e4a in read () from /lib64/libc.so.6
#1  0x000055fb4957976f in std::sys::pal::unix::fd::FileDesc::read_buf () at std/src/sys/pal/unix/fd.rs:156
#2  std::sys::pal::unix::fd::{impl#1}::read_buf () at std/src/sys/pal/unix/fd.rs:556
#3  std::io::default_read_to_end<&std::sys::pal::unix::fd::FileDesc> () at std/src/io/mod.rs:477
#4  std::io::Read::read_to_end<&std::sys::pal::unix::fd::FileDesc> () at std/src/io/mod.rs:862
#5  std::sys::pal::unix::fd::FileDesc::read_to_end () at std/src/sys/pal/unix/fd.rs:125
#6  0x000055fb4957592b in std::sys::pal::unix::pipe::AnonPipe::read_to_end () at std/src/sys/pal/unix/pipe.rs:74
#7  std::process::{impl#16}::read_to_end () at std/src/process.rs:488
#8  std::process::Child::wait_with_output () at std/src/process.rs:2263
#9  0x000055fb48fc5cc5 in containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure#0} () at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/containers-image-proxy-0.7.0/src/imageproxy.rs:321
#10 tokio::runtime::blocking::task::{impl#2}::poll<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}, core::result::Result<std::process::Output, std::io::error::Error>> (self=..., _cx=<error reading variable:
 Cannot access memory at address 0x0>) at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/task.rs:42
#11 tokio::runtime::task::core::{impl#6}::poll::{closure#0}<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedul
e> (ptr=<optimized out>) at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331
#12 tokio::loom::std::unsafe_cell::UnsafeCell<tokio::runtime::task::core::Stage<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>>>::with_mut<tokio::runtime::task
::core::Stage<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>>, core::task::poll::Poll<core::result::Result<std::process::Output, std::io::error::Error>>, tokio
::runtime::task::core::{impl#6}::poll::{closure_env#0}<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>> (
self=<optimized out>, f=...) at /var/home/jeckersb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16
#13 tokio::runtime::task::core::Core<tokio::runtime::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::poll<tokio::runtime
::blocking::task::BlockingTask<containers_image_proxy::{impl#5}::new_with_config::{async_fn#0}::{async_block#0}::{closure_env#0}>, tokio::runtime::blocking::schedule::BlockingSchedule> (self=<optimized out>, cx=...) at /var/home/jeckersb/.cargo/registry/sr
c/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320
#14 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7fc2dd8006c0 (LWP 7984) "tokio-runtime-w"):
#0  0x00007fc2de9903dd in syscall () from /lib64/libc.so.6
#1  0x000055fb49582bfb in std::sys::pal::unix::futex::futex_wait () at std/src/sys/pal/unix/futex.rs:67
#2  std::sys::sync::condvar::futex::Condvar::wait_optional_timeout () at std/src/sys/sync/condvar/futex.rs:50
#3  std::sys::sync::condvar::futex::Condvar::wait_timeout () at std/src/sys/sync/condvar/futex.rs:38
#4  0x000055fb4938f883 in std::sync::condvar::Condvar::wait_timeout<tokio::runtime::blocking::pool::Shared> (self=<optimized out>, guard=..., dur=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sync/condvar.rs:388
#5  tokio::runtime::blocking::pool::Inner::run (self=<optimized out>, worker_thread_id=0) at src/runtime/blocking/pool.rs:527
#6  0x000055fb6ff27700 in ?? ()
#7  0x000055fb6fe000e0 in ?? ()
#8  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fc2ddce0b40 (LWP 7848) ".tmpf4wMNV"):
#0  0x00007fc2de992a32 in epoll_wait () from /lib64/libc.so.6
#1  0x000055fb493b1344 in mio::sys::unix::selector::Selector::select (timeout=<optimized out>, self=<optimized out>, events=<optimized out>) at src/sys/unix/mod.rs:8
#2  mio::poll::Poll::poll (self=<optimized out>, events=<optimized out>, timeout=...) at src/poll.rs:435
#3  0x000000003b9aca00 in ?? ()
#4  0x7fffffffffffffff in ?? ()
#5  0x000055fb493a0090 in tokio::runtime::io::driver::Driver::turn (self=<optimized out>, handle=<optimized out>, max_wait=...) at src/runtime/io/driver.rs:149
#6  0x0000000000000001 in ?? ()
#7  0x00007ffe832e49c8 in ?? ()
#8  0x000055fb49387908 in tokio::runtime::time::wheel::Wheel::next_expiration (self=<optimized out>) at src/runtime/time/wheel/mod.rs:181
#9  0x0000000000000000 in ?? ()

jeckersb · 2024-10-23T18:54:23Z

Relevant bits with RUST_LOG=trace

[Detaching after fork from child process 8319]
DEBUG new_with_config: Spawned skopeo pid=8319 config=ImageProxyConfig { authfile: None, auth_data: None, auth_anonymous: true, certificate_directory: None, decryption_keys: None, insecure_skip_tls_verification: None, skopeo_cmd: None }
TRACE new_with_config:impl_request: sending request Initialize config=ImageProxyConfig { authfile: None, auth_data: None, auth_anonymous: true, certificate_directory: None, decryption_keys: None, insecure_skip_tls_verification: None, skopeo_cmd: None } self=ImageProxy method="Initialize"
TRACE new_with_config:impl_request: completed request config=ImageProxyConfig { authfile: None, auth_data: None, auth_anonymous: true, certificate_directory: None, decryption_keys: None, insecure_skip_tls_verification: None, skopeo_cmd: None } self=ImageProxy method="Initialize"
DEBUG new_with_config: Remote protocol version: 0.2.6 config=ImageProxyConfig { authfile: None, auth_data: None, auth_anonymous: true, certificate_directory: None, decryption_keys: None, insecure_skip_tls_verification: None, skopeo_cmd: None }
DEBUG open_image: opening image self=ImageProxy imgref="containers-storage:localhost/bootc-gdb:latest@sha256:28acbffedf1b833c3dd0cd97b5d6c9e293df7916145baaf77aca263c4384016b"
TRACE open_image:impl_request: sending request OpenImage self=ImageProxy imgref="containers-storage:localhost/bootc-gdb:latest@sha256:28acbffedf1b833c3dd0cd97b5d6c9e293df7916145baaf77aca263c4384016b" self=ImageProxy method="OpenImage"
TRACE open_image:impl_request: completed request self=ImageProxy imgref="containers-storage:localhost/bootc-gdb:latest@sha256:28acbffedf1b833c3dd0cd97b5d6c9e293df7916145baaf77aca263c4384016b" self=ImageProxy method="OpenImage"
TRACE impl_request: sending request GetManifest self=ImageProxy method="GetManifest"
TRACE impl_request: completed request self=ImageProxy method="GetManifest"
DEBUG finish_pipe: closing pipe self=ImageProxy pipeid=8
TRACE finish_pipe:impl_request: sending request FinishPipe self=ImageProxy pipeid=8 self=ImageProxy method="FinishPipe"
TRACE finish_pipe:impl_request: completed request self=ImageProxy pipeid=8 self=ImageProxy method="FinishPipe"
TRACE impl_request: sending request GetFullConfig self=ImageProxy method="GetFullConfig"
TRACE impl_request: completed request self=ImageProxy method="GetFullConfig"
DEBUG finish_pipe: closing pipe self=ImageProxy pipeid=8
TRACE finish_pipe:impl_request: sending request FinishPipe self=ImageProxy pipeid=8 self=ImageProxy method="FinishPipe"
TRACE finish_pipe:impl_request: completed request self=ImageProxy pipeid=8 self=ImageProxy method="FinishPipe"
layers already present: 0; layers needed: 66 (2.5 GB)
DEBUG Fetching base
DEBUG get_layer_info: Getting layer info self=ImageProxy img=OpenedImage(1)
TRACE get_layer_info:impl_request: sending request GetLayerInfo self=ImageProxy img=OpenedImage(1) self=ImageProxy method="GetLayerInfo"
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/66
████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 0/0TRACE get_layer_info:impl_request: completed request self=ImageProxy img=OpenedImage(1) self=ImageProxy method="GetLayerInfo"
DEBUG fetching sha256:e9d9ddf12c645b3917d7f2866a170b52a09fe5823619fda23bdc58f8e10b0710
DEBUG get_blob: fetching blob self=ImageProxy img=OpenedImage(1) digest=Digest { algorithm: Sha256, digest: "e9d9ddf12c645b3917d7f2866a170b52a09fe5823619fda23bdc58f8e10b0710" } size=101180416
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/66
 └ Fetching ░░░░░░░░░░░░░░░░░░░░ 0 B/96.49 MiB (0 B/s) ostree chunk e9d9ddf12c645b3917d7f                                                                                                                                                                       TRACE get_blob:impl_request: completed request self=ImageProxy img=OpenedImage(1) digest=Digest { algorithm: Sha256, digest: "e9d9ddf12c645b3917d7f2866a170b52a09fe5823619fda23bdc58f8e10b0710" } size=101180416 self=ImageProxy method="GetBlob"
DEBUG finish_pipe: closing pipe self=ImageProxy pipeid=8
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/66
 └ Fetching ░░░░░░░░░░░░░░░░░░░░ 432.00 KiB/96.49 MiB (176.22 KiB/s) ostree chunk e9d9ddf12c645b3917d7f
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/66
 └ Fetching ██████████████░░░░░░ 71.00 MiB/96.49 MiB (90.90 MiB/s) ostree chunk e9d9ddf12c645b3917d7f
Fetching layers ░░░░░░░░░░░░░░░░░░░░ 0/66
 └ Fetching ███████████████████░ 94.09 MiB/96.49 MiB (110.13 MiB/s) ostree chunk e9d9ddf12c645b3917d7f                                                                                                                                                          TRACE finish_pipe:impl_request: completed request self=ImageProxy pipeid=8 self=ImageProxy method="FinishPipe"
DEBUG Imported 254 content objects
DEBUG Wrote ostree/container/blob/sha256_3A_e9d9ddf12c645b3917d7f2866a170b52a09fe5823619fda23bdc58f8e10b0710 => c56e6e293081397af2dcd9879be0e8903902cf2c6e9f7878073d1f5d5fedcf95
[Thread 0x7f63a9a006c0 (LWP 8278) exited]
[Thread 0x7f63a92006c0 (LWP 8324) exited]
[Thread 0x7f63a96006c0 (LWP 8323) exited]
[Thread 0x7f63a9e006c0 (LWP 8277) exited]
[Thread 0x7f63aa6006c0 (LWP 8272) exited]

jeckersb · 2024-10-23T19:00:20Z

Meanwhile skopeo appears to idle waiting for a request

Thread 4 (Thread 0x7f9b01e006c0 (LWP 8320) "skopeo"):
#0  runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:558
#1  0x0000562332b92787 in runtime.futexsleep (addr=<optimized out>, val=<optimized out>, ns=<optimized out>) at /usr/lib/golang/src/runtime/os_linux.go:75
#2  0x0000562332b692f5 in runtime.notetsleep_internal (n=0x5623345137c0 <runtime[sched]+256>, ns=60000000000, ~r0=<optimized out>) at /usr/lib/golang/src/runtime/lock_futex.go:212
#3  0x0000562332b69429 in runtime.notetsleep (n=0xfffffffffffffdfc, ns=0, ~r0=<optimized out>) at /usr/lib/golang/src/runtime/lock_futex.go:235
#4  0x0000562332ba5349 in runtime.sysmon () at /usr/lib/golang/src/runtime/proc.go:6001
#5  0x0000562332b9bb35 in runtime.mstart1 () at /usr/lib/golang/src/runtime/proc.go:1729
#6  0x0000562332b9ba76 in runtime.mstart0 () at /usr/lib/golang/src/runtime/proc.go:1686
#7  0x0000562332bc9d45 in runtime.mstart () at /usr/lib/golang/src/runtime/asm_amd64.s:394
#8  0x00005623336d1738 in crosscall1 () at gcc_amd64.S:42
#9  0x00007ffc1f71ae70 in ?? ()
#10 0x0000000000000016 in ?? ()
#11 0xffffffffffffff78 in ?? ()
#12 0x000000c000006a80 in ?? ()
#13 0x00007f9b01dffdc0 in ?? ()
#14 0x0000562332bc9d40 in ?? ()
#15 0x00005623336d1039 in threadentry (v=<optimized out>) at gcc_linux_amd64.c:89
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 3 (Thread 0x7f9b014006c0 (LWP 8321) "skopeo"):
#0  runtime/internal/syscall.Syscall6 () at /usr/lib/golang/src/runtime/internal/syscall/asm_linux_amd64.s:36
#1  0x0000562332b5edc5 in runtime/internal/syscall.EpollWait (epfd=<optimized out>, events=..., maxev=<optimized out>, waitms=<optimized out>, n=<optimized out>, errno=<optimized out>) at /usr/lib/golang/src/runtime/internal/syscall/syscall_linux.go:55
#2  0x0000562332b92412 in runtime.netpoll (delay=<optimized out>, ~r0=..., ~r1=<optimized out>) at /usr/lib/golang/src/runtime/netpoll_epoll.go:118
#3  0x0000562332b9eb06 in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /usr/lib/golang/src/runtime/proc.go:3470
#4  0x0000562332ba0071 in runtime.schedule () at /usr/lib/golang/src/runtime/proc.go:3868
#5  0x0000562332ba064c in runtime.park_m (gp=0xc000006fc0) at /usr/lib/golang/src/runtime/proc.go:4036
#6  0x0000562332bc9dd0 in runtime.mcall () at /usr/lib/golang/src/runtime/asm_amd64.s:458
#7  0x01ffffffffffff78 in ?? ()
#8  0x0000000000800000 in ?? ()
#9  0x000000c000007340 in ?? ()
#10 0x00007f9b013ffdc0 in ?? ()
#11 0x0000562332bc9d45 in runtime.mstart () at /usr/lib/golang/src/runtime/asm_amd64.s:394
#12 0x00005623336d1738 in crosscall1 () at gcc_amd64.S:42
#13 0x00007ffc1f71ad30 in ?? ()
#14 0x0000000000000016 in ?? ()
#15 0xffffffffffffff78 in ?? ()
#16 0x000000c000007340 in ?? ()
#17 0x00007f9b013ffdc0 in ?? ()
#18 0x0000562332bc9d40 in ?? ()
#19 0x00005623336d1039 in threadentry (v=<optimized out>) at gcc_linux_amd64.c:89
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 0x7f9b00a006c0 (LWP 8322) "skopeo"):
#0  runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:558
#1  0x0000562332b92710 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=94709175081987) at /usr/lib/golang/src/runtime/os_linux.go:69
#2  0x0000562332b691c7 in runtime.notesleep (n=0x5623345736f8 <runtime[newmHandoff]+24>) at /usr/lib/golang/src/runtime/lock_futex.go:170
#3  0x0000562332b9d311 in runtime.templateThread () at /usr/lib/golang/src/runtime/proc.go:2760
#4  0x0000562332b9bb35 in runtime.mstart1 () at /usr/lib/golang/src/runtime/proc.go:1729
#5  0x0000562332b9ba76 in runtime.mstart0 () at /usr/lib/golang/src/runtime/proc.go:1686
#6  0x0000562332bc9d45 in runtime.mstart () at /usr/lib/golang/src/runtime/asm_amd64.s:394
#7  0x00005623336d1738 in crosscall1 () at gcc_amd64.S:42
#8  0x00007ffc1f71af10 in ?? ()
#9  0x0000000000000016 in ?? ()
#10 0xffffffffffffff78 in ?? ()
#11 0x000000c0000076c0 in ?? ()
#12 0x00007f9b009ffdc0 in ?? ()
#13 0x0000562332bc9d40 in ?? ()
#14 0x00005623336d1039 in threadentry (v=<optimized out>) at gcc_linux_amd64.c:89
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 0x7f9b489e2840 (LWP 8319) "skopeo"):
#0  runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:558
#1  0x0000562332b92710 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=94709175081987) at /usr/lib/golang/src/runtime/os_linux.go:69
#2  0x0000562332b691c7 in runtime.notesleep (n=0x562334510a80 <runtime.m0+320>) at /usr/lib/golang/src/runtime/lock_futex.go:170
#3  0x0000562332b9d42c in runtime.mPark () at /usr/lib/golang/src/runtime/proc.go:1761
#4  runtime.stopm () at /usr/lib/golang/src/runtime/proc.go:2782
#5  0x0000562332b9ef9f in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /usr/lib/golang/src/runtime/proc.go:3512
#6  0x0000562332ba0071 in runtime.schedule () at /usr/lib/golang/src/runtime/proc.go:3868
#7  0x0000562332ba064c in runtime.park_m (gp=0xc000006e00) at /usr/lib/golang/src/runtime/proc.go:4036
#8  0x0000562332bc9dd0 in runtime.mcall () at /usr/lib/golang/src/runtime/asm_amd64.s:458
#9  0x00007ffc1f71b318 in ?? ()
#10 0x0000562332bce765 in runtime.newproc (fn=0x562332bc9cd3 <runtime.rt0_go+307>) at <autogenerated>:1
#11 0x0000562332bc9d45 in runtime.mstart () at /usr/lib/golang/src/runtime/asm_amd64.s:394
#12 0x0000562332bc9cd3 in runtime.rt0_go () at /usr/lib/golang/src/runtime/asm_amd64.s:358
#13 0x00007ffc00000003 in ?? ()
#14 0x00007ffc1f71b478 in ?? ()
#15 0x00007ffc1f71b3a0 in ?? ()
#16 0x0000000000000003 in ?? ()
#17 0x00007ffc1f71b478 in ?? ()
#18 0x00007f9b48dc7088 in __libc_start_call_main (main=0x0, argc=0, argv=0x0) at ../sysdeps/nptl/libc_start_call_main.h:58
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

jeckersb · 2024-10-23T19:02:14Z

So if I'm following correctly, I think we've lost the handle on whatever is supposed to be driving things with skopeo? The skopeo process is still there, and we're still waiting in the blocking task for the skopeo process to exit, but we don't have any stack frames on the bootc side to drive the next API action with skopeo?

cgwalters · 2024-10-23T20:55:27Z

Yeah I don't think this is right in practice...or at least I'm not confident in it enough. Will comment back in the issue

cgwalters mentioned this pull request Oct 22, 2024

Parsing layer blob: Broken pipe #657

Closed

cgwalters force-pushed the ownership-blob-self branch from 374fa43 to 3b2c098 Compare October 23, 2024 16:46

cgwalters closed this Oct 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

store: Only close self-pipe when we're done #672

store: Only close self-pipe when we're done #672

cgwalters commented Oct 22, 2024

jeckersb commented Oct 23, 2024

cgwalters commented Oct 23, 2024

cgwalters commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

cgwalters commented Oct 23, 2024

store: Only close self-pipe when we're done #672

store: Only close self-pipe when we're done #672

Conversation

cgwalters commented Oct 22, 2024

jeckersb commented Oct 23, 2024

cgwalters commented Oct 23, 2024

cgwalters commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

jeckersb commented Oct 23, 2024

cgwalters commented Oct 23, 2024