Read & Write race condition #75
This is a really good catch – I do wish that there was an official way to make this check (both the initial implementation and your fix rely on private/undocumented properties), but I know that at the time of original authorship at least, there wasn't one. Just to clarify your reproduction of this, were payloads of the listed sizes able to consistently produce this error, or was it still nondeterministic? I'm just thinking in terms of adding tests for this case.
Hi @mike-marcacci, the payload size is irrelevant. I've tried it with 100 bytes, 16KiB, 16KiB + 1B, and, if I recall, another higher value. The issue is really non-deterministic, so I can't think of a way to create a test for it either. I still don't understand what can cause the zero-byte reads. I was only really able to replicate it during a high-concurrency scenario on my local machine, as discussed on the graphql-upload issue.
If you modify this repo to add a counter, like in https://github.com/captainjapeng/graphql-upload-zero-buffer (see the image below), you'll see that the error sometimes occurs at an index of less than 1000 but can also occur near the very end of the load test. One thing to take note of is that the problem is less reproducible when the Node debugger is turned on, probably because it adds overhead and slows everything down. Another thing is that sometimes…
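For illustration only, instrumentation along these lines could be wrapped around the library's fs.read call. Everything here (the readIndex counter, the instrumentedRead wrapper, the writerFinished flag) is hypothetical and is not part of fs-capacitor or graphql-upload:

```js
const fs = require("fs");

let readIndex = 0; // counts fs.read callbacks across the whole load test

// Hypothetical wrapper around the read call inside the stream implementation.
function instrumentedRead(fd, size, position, writerFinished, callback) {
  const buffer = Buffer.alloc(size);
  fs.read(fd, buffer, 0, size, position, (error, bytesRead) => {
    readIndex += 1;
    // The suspicious case: zero bytes read even though the writer has finished
    // and more data should still be available at this position.
    if (!error && bytesRead === 0 && writerFinished) {
      console.warn(`zero-byte read at index ${readIndex}, position ${position}`);
    }
    callback(error, bytesRead, buffer);
  });
}
```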
Ah, this is really helpful – thanks for all the additional details. I doubt there are any tests we can add to protect against regressions in future versions of Node, but I'm glad to have this documented here.
This could potentially be a lead: I encountered what appears to be exactly this bug. The fix is only in the latest versions of fs-capacitor, and since I wanted to prove to myself that upgrading to the latest version actually resolves it, I needed a way to reproduce the corruption. What does seem to work for me is simply concurrency. The below isn't a reusable test case for you, but the gist should be pretty obvious; it references a few helpers from my own test suite (getMultiStream, LazyPipeline, objects):

```js
it("...", async () => {
  // This parameter is critical. There's a bug in older versions of fs-capacitor that causes
  // some bytes to potentially be missed; guard against any possible version (or upstream) regression.
  const N = 15;
  const streams = Array.from({ length: N }).map(() => getMultiStream());
  const results = await Promise.all(streams.map((s) => new LazyPipeline(s).toArray()));
  expect(results).toEqual(Array.from({ length: N }).map(() => objects.flat()));
});
```

By the way, if the above is of any help (and even if it isn't), would you consider backporting the fix for this bug to a version of the library that's compatible with CommonJS? ESM may be the way of the future, but an awful lot of people work on projects stuck in CJS land. Right now, nearly 80% of your users are using a version of your library with a known data corruption bug, which is not maximally user friendly.
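A rough, self-contained variant of the same idea can be sketched directly against fs-capacitor. This assumes the older CommonJS-compatible API (new WriteStream() plus capacitor.createReadStream()) and is only an illustration, not the test quoted above:

```js
const { WriteStream } = require("fs-capacitor"); // assumes the pre-ESM, CJS-compatible API
const crypto = require("crypto");

// Write a payload into a capacitor, then read it back in full.
async function roundTrip(payload) {
  const capacitor = new WriteStream();
  const readStream = capacitor.createReadStream();
  capacitor.end(payload);

  const chunks = [];
  for await (const chunk of readStream) chunks.push(chunk);
  capacitor.destroy();
  return Buffer.concat(chunks);
}

async function main() {
  const payload = crypto.randomBytes(1024 * 16);
  // Many concurrent round trips; with the race described in this thread,
  // some of them may occasionally come back short.
  const results = await Promise.all(Array.from({ length: 15 }, () => roundTrip(payload)));
  results.forEach((result, i) => {
    if (!result.equals(payload)) {
      console.error(`round trip ${i} lost ${payload.length - result.length} bytes`);
    }
  });
}

main().catch(console.error);
```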
Upon further investigation following jaydenseric/graphql-upload#318, there is a chance that fs.read might return zero bytes when processing a lot of files. I was able to replicate this issue with 100, 1024*8, 1024*16, 1024*16 + 1, and 1024*16*2 byte files.
I have pinpointed the issue to these lines of code: https://github.com/mike-marcacci/fs-capacitor/blob/main/src/index.ts#L61-L70. Basically, it ends the stream whenever the write stream has already finished, even if the read stream has not yet fully read the file, as seen in the image.
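To make the failure mode concrete, here is a simplified sketch of that pattern (illustrative only, not the actual fs-capacitor source): the read loop treats "writer finished + zero-byte read" as end-of-stream, so a spurious zero-byte read ends the stream before every written byte has been consumed.

```js
const fs = require("fs");

// Illustrative only: a read loop that ends the stream on a zero-byte read
// once the writer has finished, which is the race described above.
function pump(fd, position, writerFinished, bytesWritten, push, end) {
  const buffer = Buffer.alloc(64 * 1024);
  fs.read(fd, buffer, 0, buffer.length, position, (error, bytesRead) => {
    if (error) throw error;

    if (bytesRead === 0) {
      if (writerFinished()) {
        // Bug: if fs.read spuriously reports 0 bytes here, everything between
        // `position` and `bytesWritten()` is silently dropped.
        return end();
      }
      // Writer still running: try again shortly.
      return setTimeout(() => pump(fd, position, writerFinished, bytesWritten, push, end), 10);
    }

    push(buffer.slice(0, bytesRead));
    pump(fd, position + bytesRead, writerFinished, bytesWritten, push, end);
  });
}

// A safer end condition only stops once the reader has caught up with the writer:
//   if (writerFinished() && position >= bytesWritten()) return end();
```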