Questions related to thrift errors #73
An HTTP 502 in particular is probably a network blip on our side. We've had an increased number of those since our move to Google's cloud - turns out no system has a 100% uptime SLA :), so their front-end load balancer downtime now stacks on top of whatever blips we have on our end.

The 302 is a little more confusing - do you know what URL it's redirecting to? Certain requests redirect to a maintenance page when a shard is unavailable (e.g. during a restart, or during our weekly service release), but I didn't think thrift requests would return 302s for that reason. Are you seeing them mostly on Wednesdays?

There isn't really any way to hide them from the client directly - the thrift clients we use internally use connection pools and have retry logic baked in to deal with transient errors like that. In terms of filling up memory, make sure you close your connection even on an error, not just on success, and that your objects get garbage collected.
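To make the retry and cleanup advice above concrete, here is a minimal sketch in TypeScript. It assumes the failing call rejects with an error carrying an HTTP `statusCode`, and `release` stands in for whatever the application uses to return a connection or free buffers; neither name comes from the Evernote SDK.

```typescript
type AsyncCall<T> = () => Promise<T>;

// Statuses treated as transient gateway/network blips worth retrying.
const TRANSIENT_STATUS = new Set([502, 503, 504]);

async function withRetry<T>(call: AsyncCall<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      lastError = err;
      // Anything that doesn't look transient is rethrown immediately.
      if (!TRANSIENT_STATUS.has(err?.statusCode)) throw err;
      // Exponential backoff: 200 ms, 400 ms, 800 ms, ...
      await new Promise((resolve) => setTimeout(resolve, 200 * 2 ** (attempt - 1)));
    }
  }
  throw lastError;
}

async function callWithCleanup<T>(call: AsyncCall<T>, release: () => void): Promise<T> {
  try {
    return await withRetry(call);
  } finally {
    // Run cleanup on the error path too, not just on success, so failed
    // requests don't leave connections or buffers pinned in memory.
    release();
  }
}
```

The two points the comment above makes are reflected here: only transient gateway errors are retried, and cleanup sits in a `finally` block so it also runs when a request fails.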
It looks like those thrift errors are logged by the SDK, so it's hard for me to tell exactly which requests they match up with. I'm not sure if or how they are surfaced to us. Internally, we map any Evernote error that's not related to tokens (error codes 8 & 9) or rate limits (error code 19) to a 500 (sketched after this comment). Our approach dates back to the old SDK, so it might need to be overhauled. Anyway, when I look at the graph of 500 errors from Evernote, the spikes seem more or less random, though Tuesdays and Thursdays seem to be more affected:
Our main load is definitely on Mondays, so I don't think it's related to the number of requests we are making. Regarding memory, I think we are actually facing the issue described in #71: the same stack trace is scattered through our logs, surfacing as an uncaught exception.
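A rough TypeScript sketch of the error mapping described in the comment above. The symbolic names in the comments correspond to the EDAM error codes as documented in Evernote's Thrift IDL; the specific 401/429 responses for the token and rate-limit cases are assumptions for illustration, since the comment only states that codes 8, 9 and 19 are handled specially and everything else becomes a 500.

```typescript
const INVALID_AUTH = 8;        // EDAMErrorCode.INVALID_AUTH
const AUTH_EXPIRED = 9;        // EDAMErrorCode.AUTH_EXPIRED
const RATE_LIMIT_REACHED = 19; // EDAMErrorCode.RATE_LIMIT_REACHED

interface EdamError {
  errorCode: number;
  rateLimitDuration?: number; // seconds to wait, present on rate-limit errors
}

// Map an Evernote error to the status surfaced to our own clients.
function toHttpStatus(err: EdamError): number {
  switch (err.errorCode) {
    case INVALID_AUTH:
    case AUTH_EXPIRED:
      return 401; // token problem: ask the user to reconnect the account
    case RATE_LIMIT_REACHED:
      return 429; // back off for err.rateLimitDuration seconds, then retry
    default:
      return 500; // anything else is treated as a server-side failure
  }
}
```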
Closing this issue, as the thrift errors are not actually causing crashes; the MemBuffer overrun issues are (follow-up in #71).
Since we moved to v2, I find a lot of these in our logs:

They coincide with the following pattern on our servers:
Is there any reason why those errors would come in waves every few days? Given the error codes, I don't think they are related to specific users updating their accounts - could they be due to server issues on your side?
Is there any way to shield from them?
Any clues on why they would tend to fill up memory?