Prevent infinite/recursive tracing of gRPC storage #5979

Closed
wants to merge 1 commit into from

@ldlb9527 ldlb9527 commented Sep 12, 2024

fixes: #5971

I'm not sure if this is reasonable, so please let me know if there are any issues.

codecov bot commented Sep 18, 2024

Codecov Report

Attention: Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.

Project coverage is 96.78%. Comparing base (b7e8884) to head (1f58769).
Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| plugin/storage/grpc/factory.go | 85.71% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5979      +/-   ##
==========================================
- Coverage   96.79%   96.78%   -0.02%     
==========================================
  Files         348      348              
  Lines       16559    16563       +4     
==========================================
+ Hits        16029    16031       +2     
- Misses        342      343       +1     
- Partials      188      189       +1     

| Flag | Coverage Δ |
|---|---|
| badger_v1 | 8.02% <0.00%> (-0.01%) ⬇️ |
| badger_v2 | 1.82% <0.00%> (-0.01%) ⬇️ |
| cassandra-4.x-v1 | 16.60% <0.00%> (-0.01%) ⬇️ |
| cassandra-4.x-v2 | 1.75% <0.00%> (-0.01%) ⬇️ |
| cassandra-5.x-v1 | 16.60% <0.00%> (-0.01%) ⬇️ |
| cassandra-5.x-v2 | 1.75% <0.00%> (-0.01%) ⬇️ |
| elasticsearch-6.x-v1 | 18.76% <0.00%> (-0.03%) ⬇️ |
| elasticsearch-7.x-v1 | 18.83% <0.00%> (+<0.01%) ⬆️ |
| elasticsearch-8.x-v1 | 19.02% <0.00%> (-0.01%) ⬇️ |
| elasticsearch-8.x-v2 | ? |
| grpc_v1 | 9.54% <86.36%> (+0.03%) ⬆️ |
| grpc_v2 | 7.15% <0.00%> (-0.01%) ⬇️ |
| kafka-v1 | 9.73% <0.00%> (-0.01%) ⬇️ |
| kafka-v2 | 1.82% <0.00%> (-0.01%) ⬇️ |
| memory_v2 | 1.82% <0.00%> (+0.01%) ⬆️ |
| opensearch-1.x-v1 | 18.87% <0.00%> (-0.03%) ⬇️ |
| opensearch-2.x-v1 | 18.87% <0.00%> (-0.03%) ⬇️ |
| opensearch-2.x-v2 | 1.81% <0.00%> (-0.02%) ⬇️ |
| tailsampling-processor | 0.46% <0.00%> (-0.01%) ⬇️ |
| unittests | 95.27% <90.90%> (-0.02%) ⬇️ |

@ldlb9527 ldlb9527 force-pushed the fix-factory branch 3 times, most recently from 2754064 to 1f58769 Compare September 18, 2024 05:11
@yurishkuro yurishkuro changed the title Prevent infinite loop in gRPC tracing during span storage. Prevent infinite/recursive tracing of gRPC storage Sep 24, 2024
@ldlb9527 ldlb9527 force-pushed the fix-factory branch 2 times, most recently from f35efec to 1cb3b7a Compare September 24, 2024 19:06
// TODO needs to be joined with the metricsFactory
LeveledMeterProvider: func(_ configtelemetry.Level) metric.MeterProvider {
return noopmetric.NewMeterProvider()
},
}
- newClientFn := func(opts ...grpc.DialOption) (conn *grpc.ClientConn, err error) {
+ newClientFn := func(telSettings component.TelemetrySettings, opts ...grpc.DialOption) (conn *grpc.ClientConn, err error) {

Please elaborate on this part. In newRemoteStorage() you never modify telset when calling newClientFn twice, so why do you need to pass it vs. how it was passed already here via closure?

@yurishkuro yurishkuro left a comment

can we add tests that confirm that the write methods don't get traces?
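
A rough idea of what such a check could look like, written as a standalone, stdlib-only model rather than against the real factory. Every name here (`tracer`, `spanRecorder`, `newStorage`, `recordedOps`) is hypothetical and stands in for the actual OpenTelemetry and gRPC types; the only thing it demonstrates is the assertion shape: a recording tracer sees the read call but not the write call.

```go
package main

import "fmt"

// tracer is a hypothetical stand-in for a tracing API (not the OpenTelemetry interface).
type tracer interface {
	Start(op string)
}

// spanRecorder remembers the name of every span started through it.
type spanRecorder struct{ ops []string }

func (r *spanRecorder) Start(op string) { r.ops = append(r.ops, op) }

// noop drops every span.
type noop struct{}

func (noop) Start(string) {}

// client models an instrumented connection: each call starts one span.
type client struct{ t tracer }

func (c client) call(op string) { c.t.Start(op) }

// storage holds separate reader and writer clients.
type storage struct{ reader, writer client }

// newStorage (hypothetical) gives the reader the supplied tracer and the
// writer a noop tracer, so write calls are never traced.
func newStorage(t tracer) storage {
	return storage{reader: client{t: t}, writer: client{t: noop{}}}
}

// recordedOps performs one write and one read and returns the spans recorded.
func recordedOps() []string {
	rec := &spanRecorder{}
	s := newStorage(rec)
	s.writer.call("WriteSpan")
	s.reader.call("GetTrace")
	return rec.ops
}

func main() {
	fmt.Println(recordedOps())
}
```

In a real test the recorder would presumably be replaced by an in-memory span exporter attached to the tracer provider handed to the factory, with the same two assertions: no spans after a write, at least one after a read.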

yurishkuro pushed a commit that referenced this pull request Oct 27, 2024

## Which problem is this PR solving?
- Fixes #5971 
- Towards #6113 and #5859

## Description of the changes
- This PR fixes an issue where the gRPC remote storage client was
given a tracer, which caused an infinite loop of trace generation:
writing a trace to storage generated a new trace, which in turn had to
be written, and so on. The fix uses a noop tracer for the writer
clients, so no traces are generated on the write path while the read
path is still traced.
- This is likely just a temporary fix and we'll want to monitor
open-telemetry/opentelemetry-collector#10663
for a better long-term fix.
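
The feedback loop described above can be illustrated with a small stdlib-only model. This is only a sketch under stated assumptions: `Tracer`, `recordingTracer`, `noopTracer`, `storageClient`, and `countWrites` are hypothetical names, not Jaeger or OpenTelemetry APIs, and the `limit` field exists purely so the demonstration terminates.

```go
package main

import "fmt"

// Tracer is a hypothetical stand-in for a tracing API.
type Tracer interface {
	StartSpan(name string)
}

// recordingTracer forwards every span it creates to a sink, the way a real
// tracer eventually hands finished spans to an exporter.
type recordingTracer struct {
	sink func(name string)
}

func (t *recordingTracer) StartSpan(name string) { t.sink(name) }

// noopTracer drops every span.
type noopTracer struct{}

func (noopTracer) StartSpan(string) {}

// storageClient models an instrumented remote storage writer.
type storageClient struct {
	tracer Tracer
	writes int
	limit  int // safety cap so the demo stops; a real loop would not
}

// WriteSpan stores one span and, being instrumented, starts a span of its own.
func (c *storageClient) WriteSpan(name string) {
	if c.writes >= c.limit {
		return
	}
	c.writes++
	c.tracer.StartSpan("WriteSpan")
}

// countWrites sends a single span to storage and reports how many writes
// resulted, with either a noop tracer or a tracer that feeds back into storage.
func countWrites(useNoop bool, limit int) int {
	c := &storageClient{limit: limit}
	if useNoop {
		c.tracer = noopTracer{}
	} else {
		// Spans produced by a write are themselves written to storage.
		c.tracer = &recordingTracer{sink: c.WriteSpan}
	}
	c.WriteSpan("user-request")
	return c.writes
}

func main() {
	fmt.Println("with tracer on write path:", countWrites(false, 100))
	fmt.Println("with noop tracer:", countWrites(true, 100))
}
```

With the feedback tracer, one incoming span produces writes until the cap is reached; with the noop tracer it produces exactly one write.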

## How was this change tested?
- Added the healthcheck endpoint which was previously failing in #6113.

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

## Co-Authors 
This PR is a continuation of
#5979
Co-authored-by: cx <[email protected]>

---------

Signed-off-by: Mahad Zaryab <[email protected]>
@yurishkuro

implemented in #6125

@yurishkuro yurishkuro closed this Oct 31, 2024
Successfully merging this pull request may close these issues.

[jaeger-v2] Dangerous use of tracer in plugin/storage/grpc/factory.go
2 participants