-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
postgresMgr.js null dereference #10
Comments
How reproducible is this problem for you? Are you (or can you) run with --abort-on-uncaught-exception and share the resulting core file? |
Uh, pretty reproducible (as in I can make it happen again with stress testing, not that it happens often). Due to the vagaries of how we're running it I'm slightly reticent to go down this path, especially when I think the nature of the crash is relatively clear from the debug logging I had added (both error handlers are called and in an order that makes the second one crash). |
I think I understand what you're saying, but without understanding this better, I'm not sure what the most appropriate fix is. (I don't want to blindly add a defensive check that papers over other potential problems.) Do you happen to still have the debug output, or anything about which errors were emitted? Was it the same underlying error that was reported on both the query and the client? Thanks again for reporting this. |
I super hate to be that guy, but I'm not sure what I can or can't share so I'm in a weird spot. At the least the error the query handler logged was:
The error the client error handler logged (right before dying) was:
This was immediately after it decided it was going to stop postgres to assume the role of sync (last normal line above first error was the "trying SIGINT" message). |
Thanks. Interesting that they were two different errors. |
Sorry to be so sparing with the actual data, but the powers that be move
|
I put up a sketch of a rewrite of some of the state management which fixes the problem, for values of "fixes" equal to "based on my limited understanding of javascript" and "could no longer repro crash". I got the go-ahead to post a core dump so if you'd still rather see that I guess I can remove the proposed fix locally, reproduce again, etc. I guess let me know either way. |
Thanks for taking a swing at fixing this. I'm not sure I understand the new invariants, though. Before the change, the invariants were that there was only ever one active client, and it was represented by self._pgClient. There was only ever one outstanding request, and it's represented by self._pgRequestOutstanding. Your change obviously assumes that it's possible to get a client's error callback fired even after it's no longer the only client. Is that what happened in your test case? The change also assumes that when a query ends, it may not be the currently outstanding request. How can that happen? At 1645 in the new file, it seems like you leak the 'error' listener with each new query. If we issue a request that succeeds, but there's a subsequent client error (say, 10 requests later), won't we fire 'error' callbacks for all of the successful requests that happened before it? Thanks again for taking a look. I know this code is rather hairy, and it's also tricky to get right. (It could also be made a lot simpler with a fix for node-postgres#718.) |
Uh, I am not strictly sure whether or not I observed that. I chose
Again, I don't have an execution path that leads to that. Maybe client errors
Uh, sure does seem that way. At the least _pgQueryFini won't do anything |
Updated the PR (now that the new cut has been through a few nights of not breaking). |
Observed in (a private branch based on) 99fd1bf "Uncaught TypeError: Cannot call method 'removeAllListeners' of null" on line analogous to line 1634 in 99fd1bf.
Admittedly we are on a private branch with other changes and it is based pretty far back in history but I believe the problem is still relevant.
Having plastered the code with debug logging and reproduced, the basic order of events is that that query error handler happens, nulls out _pgClient, and then that client error handler happens and crashes.
The text was updated successfully, but these errors were encountered: